July 3, 2025 10 mins

This episode explores the critical importance of measuring and optimizing platform stability within digital media organizations. It delves into how consistently tracking key metrics allows publishers to validate the success of initiatives aimed at eliminating platform fragility and drives continuous improvement across various operational areas.

The discussion will outline five essential categories of metrics that ensure a robust platform:

  • Performance Metrics: Learn how to deliver fast, reliable user experiences by monitoring Site Speed using tools like Google Lighthouse or Real User Monitoring (RUM), tracking Uptime/Downtime (noting that elite platforms aim to restore services in under an hour during disruptions), and identifying technical issues through Error Rates.
  • Workflow Efficiency Metrics: Discover how optimized workflows reduce bottlenecks and enhance productivity. Key measures include Time-to-Publish (from content ideation to publication) and Task Completion Rates for editorial and development tasks.
  • Developer Metrics (DORA Metrics): Understand how these highlight agility and release quality within development teams. This includes tracking Deployment Frequency (high-performing teams deploy multiple times per day), Lead Time for Changes (efficient teams achieve production deployment in less than a day), Change Failure Rate (aiming for below 15% requiring fixes), and Mean Time to Recovery (MTTR) for incident resolution.
  • Audience Impact Metrics: Explore how platform improvements directly translate into better user experiences. This involves analyzing Engagement Metrics such as bounce rates, session durations, and conversion rates, and testing Traffic Handling capabilities during spikes using tools like Catchpoint.
  • Observability Tools: Understand the vital role of real-time monitoring tools like Datadog, CloudWatch, or Catchpoint in providing actionable insights for proactive issue resolution.

By applying these metrics, media organizations can ensure their platform delivers exceptional user experiences while supporting operational excellence. The episode also touches on broader considerations for publishers, including the importance of Gutenberg optimization for editorial efficiency and navigating the hidden risks of AI + WordPress integration to build resilience in the publishing landscape.

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Audrey (00:00):
This is Media Endeavor. We dive into the real stories of

(00:04):
those shaping the digital media publishing space. From editors creating content, developers making it all possible, marketers and designers handling the audience experience, to leaders driving it all forward and shaping the future. We explore how they've built and scaled their platforms,

(00:25):
navigated industry shifts, and adapted to the ever-changing digital landscape. With a focus on the intersection of content, technology, and operational strategy, we deliver actionable insights for media executives and digital publishers.

Owen (00:41):
Imagine you're running a big digital media site. You know, every second it feels slow or, worse, it actually stumbles. Well, you're not just watching metrics dip, are you? You're actually losing reader trust.

Alice (00:51):
Right.

Owen (00:51):
Losing them to competitors. That fear, that sort of platform fragility idea, it's very real. So the big question is how do you actually measure that? And then how do you improve it? How do you, you know, build a really solid digital platform?
That's what we wanna dig into today.
Yeah. Think of this deep dive as maybe a shortcut, a way to really grasp the key metrics that show if your efforts to

(01:13):
make things more stable are genuinely working. We're gonna unpack the important stuff across performance, internal workflows, the whole development side of things, and maybe most importantly, how it all impacts you, the audience. So our mission today is to give you the insights you need to make your platform robust, you know, so it thrives even under pressure.

Alice (01:31):
That's a great way to put it because these metrics, they're not just diagnostic tools for when something breaks. Not just reactive. They're really about driving continuous improvement. They help you validate where your efforts are paying off and make sure the platform supports, well, operational excellence day to day. It's really about shifting to being proactive.

Owen (01:51):
Okay. Right. So let's unpack this then. Starting with the basics, the foundation really, performance. When we talk about a fast, reliable user experience, what are we actually measuring?
First up is site speed. And this isn't just one simple number, is it? It's about how quickly the page loads, sure, but also how responsive it feels. Mhmm. You know, from click to fully

(02:12):
interactive.
You've got tools like Google Lighthouse, which are great for a lab view, a synthetic test. But for what real users see, you absolutely need real user monitoring, RUM, that captures performance from actual people out there.

Alice (02:25):
Mhmm. Different devices, different networks.

Owen (02:27):
Exactly. All that variety. Yeah. Because what looks speedy
in the lab might be, well, painfully slow for someone on a
weak mobile signal somewhere.

Alice (02:34):
And that RUM data, the real-world stuff, that's what directly ties performance back to whether users are happy or, you know, frustrated. You can optimize in a controlled setting forever, but if it doesn't work for your real audience, well, it doesn't really build that user trust, does it? Every millisecond genuinely counts there.
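To make the RUM idea a bit more concrete, here is a minimal Python sketch of how field measurements might be rolled up into a 75th-percentile figure, the kind of number RUM dashboards typically report. The sample values and field names are invented for illustration, not taken from any real tool.

    # Sketch: aggregating hypothetical RUM samples into p75 figures.
    from statistics import quantiles

    # Each dict is one invented page view reported from a real user's browser.
    rum_samples = [
        {"lcp_ms": 1800, "ttfb_ms": 350, "device": "mobile"},
        {"lcp_ms": 4200, "ttfb_ms": 900, "device": "mobile"},   # slow cellular connection
        {"lcp_ms": 1200, "ttfb_ms": 200, "device": "desktop"},
        {"lcp_ms": 2600, "ttfb_ms": 480, "device": "desktop"},
        {"lcp_ms": 5100, "ttfb_ms": 1100, "device": "mobile"},
    ]

    def p75(values):
        """75th percentile: the experience three out of four users get or better."""
        return quantiles(values, n=4)[2]

    for metric in ("lcp_ms", "ttfb_ms"):
        overall = p75([s[metric] for s in rum_samples])
        mobile = p75([s[metric] for s in rum_samples if s["device"] == "mobile"])
        print(f"{metric}: p75 overall={overall:.0f}ms, p75 mobile={mobile:.0f}ms")

Segmenting by device or network, as the hosts note, is usually where the lab-versus-field gap shows up.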

Owen (02:51):
Yeah. Definitely. And following right on from speed, you've got uptime and downtime. Obviously, you need those uptime percentages. But the really interesting part, I think, is mean time to recovery.
MTTR. Mhmm. I saw this stat that elite platforms can actually restore services in under an hour during a disruption. That just sounds incredibly fast. How do they even manage that?

Alice (03:12):
Yeah. That under-an-hour figure, it's not just about being quick. It speaks volumes about their investment in things like automation, having response plans, prebuilt playbooks ready to go, and also, crucially, a culture where they learn from incidents without blame. It means they figured out how to limit the damage, the blast radius of any problem. So they protect revenue, protect their reputation by bouncing

(03:33):
back super fast.
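As a rough illustration of how those numbers relate, here is a small Python sketch deriving an uptime percentage and a mean time to recovery from a list of incident records; the timestamps and the 30-day window are invented for the example.

    # Sketch: uptime percentage and MTTR from hypothetical incident records.
    from datetime import datetime, timedelta

    # Invented outages over a 30-day window: (service went down, service restored).
    incidents = [
        (datetime(2025, 6, 3, 14, 10), datetime(2025, 6, 3, 14, 52)),
        (datetime(2025, 6, 17, 2, 5),  datetime(2025, 6, 17, 3, 40)),
    ]
    window = timedelta(days=30)

    downtime = sum((end - start for start, end in incidents), timedelta())
    uptime_pct = 100 * (1 - downtime / window)
    mttr = downtime / len(incidents)

    print(f"Downtime: {downtime}, uptime: {uptime_pct:.3f}%")
    print(f"MTTR: {mttr} (the 'elite' benchmark mentioned above is under an hour)")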

Owen (03:35):
Right. Minimizes the impact. Okay. And the last bit for performance, error rates. We're talking specifically about things like server-side errors, maybe HTTP five hundreds or failed API calls. Not just like a user finding a broken link.

Alice (03:51):
Exactly. Those point to deeper system issues.

Owen (03:54):
Catching those early helps you find those core instabilities before they really affect a lot of users. It really just boils down to measuring what matters, doesn't it? 'Cause with speed and errors, lost time literally means lost money or, just as bad, a user clicking away annoyed.

Alice (04:09):
Absolutely.
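For the error-rate piece, here is a hedged sketch of counting server-side failures (HTTP 5xx) as a share of total requests, using status codes pulled out of access-log lines; the log lines here are made up.

    # Sketch: server-side error rate from hypothetical access-log status codes.
    import re

    # Made-up log lines; in practice these would come from your web server or CDN.
    log_lines = [
        '10.0.0.1 - - "GET /article/123 HTTP/1.1" 200 5123',
        '10.0.0.2 - - "GET /api/related HTTP/1.1" 502 0',
        '10.0.0.3 - - "GET /article/456 HTTP/1.1" 200 7311',
        '10.0.0.4 - - "POST /api/comments HTTP/1.1" 500 0',
        '10.0.0.5 - - "GET /missing HTTP/1.1" 404 312',
    ]

    status_re = re.compile(r'" (\d{3}) ')  # status code sits right after the closing quote

    statuses = [int(m.group(1)) for line in log_lines if (m := status_re.search(line))]
    server_errors = sum(1 for s in statuses if 500 <= s <= 599)
    error_rate = 100 * server_errors / len(statuses)

    # The 404 above is a content/client issue; the 5xx responses point at the platform itself.
    print(f"{server_errors}/{len(statuses)} requests failed server-side ({error_rate:.1f}%)")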

Owen (04:10):
Okay. So we've covered the user-facing side, but, you know, all this great content, these features, someone has to create them, get them published. So how do we make sure the teams doing that work are efficient? Because that efficiency, or lack of it, can definitely impact stability too.

Alice (04:23):
Right.

Owen (04:23):
Which brings us to workflow efficiency metrics.
Things like time to publish.

Alice (04:27):
Right. How long does it take for my idea to go live?

Owen (04:30):
Yeah. Exactly. Yeah. Especially measuring that after
you've maybe tried to improve your internal processes.

Alice (04:35):
And then there's task completion rates. Basically, are editorial and development tasks getting done on time?
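Here is a minimal sketch of how those two workflow numbers might be computed from CMS records; the record structure, dates, and on-time flags are hypothetical and not tied to any particular CMS.

    # Sketch: time-to-publish and task completion rate from hypothetical CMS records.
    from datetime import datetime
    from statistics import median

    # Invented editorial items: when the idea/draft was created vs. when it went live.
    items = [
        {"created": datetime(2025, 6, 1, 9, 0),  "published": datetime(2025, 6, 2, 15, 30), "on_time": True},
        {"created": datetime(2025, 6, 3, 11, 0), "published": datetime(2025, 6, 3, 17, 45), "on_time": True},
        {"created": datetime(2025, 6, 5, 8, 30), "published": datetime(2025, 6, 9, 10, 0),  "on_time": False},
    ]

    hours_to_publish = [
        (item["published"] - item["created"]).total_seconds() / 3600 for item in items
    ]
    completion_rate = 100 * sum(item["on_time"] for item in items) / len(items)

    print(f"Median time-to-publish: {median(hours_to_publish):.1f} hours")
    print(f"Task completion rate: {completion_rate:.0f}% finished on schedule")

Tracked before and after a process change, the same two numbers show whether the change actually removed a bottleneck.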

Owen (04:41):
And these might feel like internal, behind-the-scenes numbers, but they are so important. When you optimize those workflows, you get rid of bottlenecks, you improve productivity, and that directly affects how quickly you can get content out, how responsive you are. Think of it like a busy restaurant kitchen. If the cooks can't get ingredients or the stations are disorganized, the food's not getting out right no

(05:03):
matter how good the recipe is. So these workflow metrics are about making sure the digital kitchen runs smoothly.
That's a great analogy. Okay. So if the kitchen's running smoothly, what about the folks actually, you know, building and maintaining the restaurant itself? The platform developers. How do we ensure they're working effectively and reliably?
This generally brings up developer agility, release quality.

Alice (05:24):
Mhmm.

Owen (05:25):
Which leads us straight into what many call the holy grail for dev teams, the DORA metrics. Now I suspect many listeners know DORA, but let's maybe look at them through this stability lens.

Alice (05:36):
Good idea. They're fundamental here.

Owen (05:38):
So first, deployment frequency. Simply, how often are you deploying code changes? You hear about high-performing teams
doing it multiple times a day, which is, well, kind of wild.
Then lead time for changes. That's the time from when code is committed, like saved by the developer, all the way until it's live in production. The benchmark for efficient teams is

(05:58):
often less than a day. Now let's pause on that one. It sounds simple, less than a day. But getting from committed to live, that's where things often get stuck.
Right? Manual testing, approvals, just not having automated pipelines. If that less than a day feels like a huge leap, that's often a big clue about where stability work needs to happen, wouldn't you say?

Alice (06:18):
Oh, absolutely. That's spot on. What's really powerful
about DORA, particularly lead time and deployment frequency, is they reflect more than just technical skill. They're really strong signals about the organization's overall health. Bottlenecks there often point to deeper process issues or even cultural hurdles, not just a tech problem.
That's the real insight.

Owen (06:36):
Interesting. Okay. Then you've got change failure rate,
CFR. What percentage of your deployments end up needing a fix or a rollback? The goal is typically aiming for less than 15%.
A low CFR often suggests developers feel safe deploying,
you

Alice (06:52):
know, psychological safety. Yeah.

Owen (06:53):
They know failures are caught, learned from, not punished. And finally, MTTR again, mean time to recovery. We mentioned it for performance, but here it's specifically about how quickly the development team can resolve incidents caused by changes. It really shows their resilience, their ability to respond when, inevitably, something goes wrong with the
deployment.

Alice (07:13):
Exactly. These four DORA metrics together are so powerful
because they measure both the speed and the quality, the reliability of development work. Focusing on them helps build that culture of continuous improvement. It means you can deliver value faster, more reliably, and just build confidence in the whole platform change process.
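To ground those four numbers, here is a small Python sketch computing them from a hypothetical list of deployment records; the fields and values are invented, and the printed benchmarks just echo the figures mentioned in the conversation.

    # Sketch: the four DORA metrics from hypothetical deployment records.
    from datetime import datetime, timedelta
    from statistics import mean

    # Invented deployments over one week; recovery_minutes is set only when a deploy failed.
    deploys = [
        {"committed": datetime(2025, 6, 2, 9, 0),  "deployed": datetime(2025, 6, 2, 13, 0),  "failed": False, "recovery_minutes": None},
        {"committed": datetime(2025, 6, 3, 10, 0), "deployed": datetime(2025, 6, 3, 11, 30), "failed": True,  "recovery_minutes": 42},
        {"committed": datetime(2025, 6, 4, 15, 0), "deployed": datetime(2025, 6, 5, 9, 0),   "failed": False, "recovery_minutes": None},
        {"committed": datetime(2025, 6, 6, 8, 0),  "deployed": datetime(2025, 6, 6, 16, 0),  "failed": False, "recovery_minutes": None},
    ]

    days_observed = 7
    deploy_frequency = len(deploys) / days_observed                       # deploys per day
    lead_times = [d["deployed"] - d["committed"] for d in deploys]        # commit -> production
    avg_lead_time = sum(lead_times, timedelta()) / len(lead_times)
    failures = [d for d in deploys if d["failed"]]
    change_failure_rate = 100 * len(failures) / len(deploys)              # % needing a fix or rollback
    mttr_minutes = mean(d["recovery_minutes"] for d in failures)          # recovery time for those

    print(f"Deployment frequency: {deploy_frequency:.1f}/day")
    print(f"Lead time for changes: {avg_lead_time} (benchmark: under a day)")
    print(f"Change failure rate: {change_failure_rate:.0f}% (benchmark: under 15%)")
    print(f"Mean time to recovery: {mttr_minutes:.0f} minutes")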

Owen (07:31):
Makes sense. So all this effort, better performance,
smoother workflows, agile development, it all has to circle back to the main reason we do it. Right? A better
experience for the audience.

Alice (07:40):
Ultimately, yes.

Owen (07:41):
And that's where audience impact metrics fit in. Things
like engagement metrics. So looking at bounce rates, how long people stay, session duration, maybe conversion rates. Are the changes actually making users happier or more engaged? And then there's traffic handling. How does the platform hold up during big traffic spikes?

(08:01):
You test this using synthetic load testing tools, maybe like
Catchpoint.

Alice (08:06):
Yeah. This is really where the rubber meets the road, as they say. Your platform has to perform well under pressure. Think breaking news, a story going viral, a big launch. Keeping that user experience positive during those high-stress times is critical.
It proves the platform isn't just stable day to day, but it's genuinely scalable and reliable when it counts the most.
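This is not Catchpoint, but as a minimal illustration of what a synthetic spike test does, here is a standard-library Python sketch that fires concurrent requests at a placeholder URL and reports latency percentiles and failures.

    # Sketch: a tiny synthetic spike test using only the Python standard library.
    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor
    from statistics import quantiles

    TARGET = "https://example.com/"   # placeholder, not a real endpoint under test
    REQUESTS = 50
    CONCURRENCY = 10                  # rough stand-in for a burst of simultaneous readers

    def hit(_):
        """Fetch the target once and return (latency in ms, succeeded?)."""
        start = time.perf_counter()
        try:
            with urllib.request.urlopen(TARGET, timeout=10) as resp:
                resp.read()
            ok = True
        except OSError:               # covers URLError, HTTPError, and timeouts
            ok = False
        return (time.perf_counter() - start) * 1000, ok

    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        results = list(pool.map(hit, range(REQUESTS)))

    latencies = [ms for ms, _ in results]
    failures = sum(1 for _, ok in results if not ok)
    p50 = quantiles(latencies, n=4)[1]     # median latency under load
    p95 = quantiles(latencies, n=20)[18]   # tail latency, what the slowest readers feel

    print(f"p50={p50:.0f}ms  p95={p95:.0f}ms  failures={failures}/{REQUESTS}")

Purpose-built tools add realistic geographic distribution, much higher concurrency, and browser-level timings, but the question is the same: does latency or the failure count climb as the load does?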

Owen (08:26):
Right. It has to handle the peaks. So how do organizations actually get all this data, all these different metrics? That brings us to observability and the tools that enable it.

Alice (08:34):
Mhmm. The eyes and ears.

Owen (08:36):
Exactly. Your real time monitoring. Tools like Datadog,
CloudWatch, Catchpoint again

Alice (08:41):
or Yeah.

Owen (08:42):
There are many. Now these tools sound amazing, like the solution to everything, but I imagine for companies just starting out, the sheer amount of data coming from them could be overwhelming. What's maybe the biggest pitfall you see when teams first adopt these kinds of observability tools?

Alice (08:56):
That's a really good question. I think the biggest mistake is focusing too much on just collecting everything, drowning in data. Yeah. Instead of first defining what questions do we actually need answers to? What actions will we take based on this data?
It has to be about actionable insights, not just having more charts and dashboards to look at. Because these tools, used right, they aren't just for reacting when something breaks.

(09:18):
They let you be proactive. They help you anticipate issues, prevent them, and really make operational excellence a continuous reality, not just something you're aiming for.
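As a sketch of what question-first, actionable monitoring can look like, here is a small Python example that starts from explicit questions expressed as thresholds and only surfaces the metrics that cross them; the thresholds, metric names, and current readings are all invented.

    # Sketch: question-first alerting - define the questions, then check only those.
    # Each rule is a question the team actually needs answered, with an action attached.
    rules = [
        {"question": "Are readers seeing errors?",       "metric": "error_rate_pct",    "max": 1.0,
         "action": "Page on-call; check the most recent deploys"},
        {"question": "Is the site slow for real users?",  "metric": "p75_lcp_ms",        "max": 2500,
         "action": "Review CDN/cache hit rate and heavy third-party scripts"},
        {"question": "Can we still publish quickly?",     "metric": "time_to_publish_h", "max": 24,
         "action": "Look for stuck editorial or development handoffs"},
    ]

    # Hypothetical current readings, e.g. pulled from an observability tool's API.
    current = {"error_rate_pct": 2.3, "p75_lcp_ms": 1900, "time_to_publish_h": 30}

    for rule in rules:
        value = current[rule["metric"]]
        if value > rule["max"]:
            print(f"ALERT: {rule['question']} ({rule['metric']}={value} > {rule['max']})")
            print(f"  -> {rule['action']}")

Everything below a threshold stays quiet, which keeps the dashboards tied to decisions rather than to data volume.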

Owen (09:27):
Okay. So pulling it all together for today's deep dive,
we've journeyed through, what, five key areas of metrics. Performance first, then workflow efficiency, developer agility powered by those DORA metrics, then direct audience impact, and finally the observability tools that tie it all together. It seems like consistently tracking these different types of metrics

(09:48):
is what really allows media organizations to see if their stability work is paying off. And crucially, it helps them spot where they can optimize further.
The end goal being that robust platform delivering great user experiences and, well, operational excellence.

Alice (10:03):
Exactly. And perhaps that raises an important question for
you listening right now. Thinking about your own context or maybe your organization, which of these metric categories do you think would have the single biggest impact on improving platform stability efforts, and why?

Owen (10:17):
Oh, that's a great question to leave folks with.
Definitely something to mull over. We hope you'll reflect on that, maybe explore how these metrics apply to your world. Keep that curiosity going. Thanks for joining us, and we'll
see you on the next deep dive.

Alice (10:29):
Stay tuned for a new episode every Monday afternoon.