
August 6, 2025 56 mins


In Part II of the Multi-Party Dilemma (MPD) drill retrospective, we reconvene to dig deeper into the implications and nuances of the simulated incident exercise hosted on the Uptime Labs platform. Eric Dobbs (incident analyst), Alex Elman (deputy IC), and Sarah Butt (incident commander) continue their debrief with Courtney, reflecting on how team behavior evolved under stress, the importance of expertise in managing non-technical aspects of an incident like saturation, and how deeply held assumptions often go unspoken until tested under pressure.

This episode emphasizes the complex social and cognitive dimensions of incident response, such as how people coordinate, communicate, and construct shared understanding. It highlights the value of analyzing drills not for failure points, but for what they reveal about real work, adaptation, and human coordination.

Key Highlights

  • Incident Analysis as a Practice:
    • Eric Dobbs emphasized understanding how people make sense of unfolding events, rather than judging decisions in hindsight.
    • The goal is to study the “why it made sense at the time,” not what was “right” or “wrong.”
  • Drills Expose Hidden Assumptions:
    • Even experienced responders bring unspoken mental models into incidents.
    • The drill revealed assumptions about communication flows, authority boundaries, and vendor interactions that were not made explicit in planning.
  • The Value of Human Expertise:
    • Everyone involved in this incident brought an unparalleled level of expertise to the work. 
    • Often this kind of expertise goes unnoticed or is taken for granted; however, this kind of knowledge is precisely what makes for smoother, better-coordinated, and sometimes faster incident response.
  • Importance of Framing:
    • The way questions are asked in retrospectives can shape what is revealed—e.g., “What made that hard?” is more productive than “What did you miss?”
    • Reframing incidents around constraints and tradeoffs leads to deeper insight.
  • Team Learning and Culture:
    • Safe, high-trust environments enable better learning during drills.
    • Psychological safety allows team members to admit confusion or raise alternate interpretations during real incidents.


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
A few moments later.
So we had an incident while recording our incident retro.
For those of you who didn't watch part one of this, go back and watch part one, because this one's not gonna make any sense.
We're gonna drop you right into it.
But Eric had a power outage.
I got a migraine.
Sarah I got a migraine.
We decided to break, but we're back.

(00:20):
Lucky you.
We got the band back together.
So, buckle up, strap in.
Here we go folks.

Courtney (00:25):
Here we are again.
Podcast recording two.
Electric Boogaloo.

Eric Dobbs (00:31):
What I remember of what Sarah was saying before I dropped was commenting on the comparison of me claiming I would be buried under those circumstances.
Sarah's handling it fine.
She comes back with, yeah, but I had Alex, and there's reasons that that matters, and then a lot of detail which just goes

(00:52):
on, like, and I hope we have the recording, because the detail reveals yet more expertise about Sarah's understanding of overload and being able to share the work with Alex.

Sarah Butt (01:04):
I remember it. I think the two things that I had said, functionally, were: Eric had said, I would've been overwhelmed.
And I said, keep in mind there were two people here, not just one.
And I had not only a deputy, but a really fantastic deputy.
Um, and there were a lot of unique benefits to the fact that it was Alex and I working together.
And we have, even though we've never run an incident together before, we

(01:26):
have a really strong friendship and we have a lot of common ground.
And so there's two pieces to this.
One is that I know Alex.
And I know that he knows how to handle saturation and monitor his own saturation.
And I trust him.
So I threw stuff at him initially, knowing that he would, like, trusting that he would prioritize the right things and

(01:48):
offload or otherwise shed load however he needed to.
And so I basically just sent it to him and then knew that, well, unless he said, I cannot do this, it was gonna be taken care of, and he would manage his own saturation, because he is a very mature incident responder.
The second thing is that Alex and I have a strong friendship and that we've not run an incident together before.

(02:13):
We've written papers together and traveled together and all sorts of stuff.
And we come from similar incident response backgrounds as far as training.
So I don't have to say a lot of the normal sort of pleasantries that I would do with a person that I didn't know.
I can be pretty terse.
I can be like, I need a CAN.
I need this, I need that.
Go, go, go.
We share a language.
So I'm not sitting there going, this is what a CAN report is.

(02:35):
I'm saying, Alex, I need a CAN.
Alex, can you pop to the main channel and just head off, uh, Tinus and Bez and say we're aware of an issue.
It appears right now to be affecting primarily EMEA.
We've currently engaged lead, all of that.
Just get them in a box.
This basic stuff, go.
Alex, put Bez in a box.
Go.
I'm not explaining what it means to put an executive in a box.

(02:55):
I'm not telling him how to do it.
And I think that piece helps tremendously.
Courtney's laughing and I don't know why.

Courtney (03:04):
Uh, I'm like, well, nobody needs, uh, to be told how to put an executive in the box, uh, out of this crew.
So, yeah.

Alex Elman (03:12):
Also, if you can mention what your experience with Bez was during the incident, because I thought Bez was very disruptive on my side, but you mentioned that you didn't see much of Bez.

Sarah Butt (03:23):
I saw him, like, maybe once or twice he popped his head in.
It was once when he popped his head in and he got after the customer support person for posting the status page too early, which, regardless of whether or not he'd been involved, like, that entire conversation, I was gonna kick it to biz comms 'cause I just didn't want it in the middle of the troubleshooting.
And then he popped in once to get sort of angry about the data
(03:45):
center, and that's when I think I said, because I didn't have the bandwidth to look at the biz comms channel the entire incident.
So that was when I actually said, Alex, can you put Bez in a box for me?
I had no idea that Alex was already, like, actively, continually putting the guy back in the bullpen.
Yeah, like the box was being repeatedly broken open and Alex
(04:06):
was just reconstructing it actively.
And it's funny, because I think it speaks a little bit to different people's demeanor during incidents as well.
I was talking with John Alba about this incident, and he's like, well, when we talked about it, we're like, Sarah, you're like Tigger.
And Alex is like, not like Eeyore, but like, it's just the energy level.
Alex is very calm, he's very composed. You kind of hit it
(04:28):
with a lot of energy and wear a little bit more of the emotion.
So the most that you hear Alex say is, like, this very calm, like, executives are, they have a lot of questions.

Alex Elman (04:39):
Sarah, Bez seems pretty incensed and is asking for a status update on number of customers, percentage of customers impacted.

Sarah Butt (04:47):
Like, it's something like that.
It's very measured.
So I had no idea that Bez was having a meltdown in the other channel, because Alex just completely buffered that out.

Alex Elman (04:56):
And I was so, so worried that Bez was being disruptive, because every time I would try to satisfy his demands, he would go quiet.
And I didn't know if he was going quiet because he went elsewhere or because for the moment he was okay with what I said.
But I'm glad to hear that it maybe was effective.

Sarah Butt (05:14):
Hamed, what do you think?

Courtney (05:15):
Okay,

Sarah Butt (05:15):
I don't know who was playing Bez or helping the AI behind the scenes with Bez, but can you, like, put a Bez hat on?
Do you know what Bez would say about his experience during the incident?

Hamed (05:29):
So Ed was playing Bez.
I was Tinus, Tanya, and Hamed.

Courtney (05:38):
There was another Uptime Labs person behind the scenes, not you, that isn't here with us today.
And that was who was running Bez, is that correct?

Hamed (05:46):
It was.
There were two of us,

Courtney (05:49):
Yeah.

Hamed (05:50):
because it's a very involved drill.
But I spoke to Ed about it.
How was Bez's experience during the incident?
So, I'm quoting Ed here.
On one hand, I think he enjoyed the
(06:15):
information that Alex was sharing with him. At the same time, he was feeling impatient because of the magnitude of the issue.
So all along he had this struggle of staying back, coming in, and the data center issues, like, really blew him off: four
(06:38):
hours, all our business is down.
And I think that was where he had a bit of a burst, if I'm not wrong, Sarah.

Sarah Butt (06:49):
Yeah, you're right.
Actually, you're right.
I remember him coming in.
That was exactly when he burst in.
And what I did, and it's one of my sort of tricks when I need to manage someone in leadership who (a) has the opportunity to be very helpful for me, but (b) also has the opportunity to potentially really derail an incident, is, um, I find a helpful task for them to do that is uniquely
(07:12):
suited to them.
And one of the things that executives are amazing at, because they tend to have a whole different set of connections than the standard support channels and stuff that you use, and I do it often if I need to, is: Hey, you know, whoever the senior person is, I need you to go get in touch with this company.
(07:32):
Do you know the account executive?
Do you know someone in leadership over there from a prior piece in your career?
Are you the person who signed the contract?
I have literally sent executives and said, where they've said, I don't know how to get ahold, I said, you and procurement go figure out a different way to get in touch with this vendor, because I cannot get in touch with them yet.
Or I want every possible inroad.
So as soon as he hit, I basically, like, tried to U-turn
(07:53):
him back and say, you have two options here.
You can help escalate, or you can go get information from Alex, but you cannot be in the middle of the technical bridge just, like, churning.
You just don't get to do that.

Hamed (08:05):
And I think so.
Now I remember it. That was a critical moment, because Ed brought up as well that Bez, like, for the first time in that incident, felt that he could be useful.
So he had a task, he had a purpose, he stopped disrupting, and that was interesting.
We have had drills many times, probably over 40 times

(08:31):
already, and it was the first time that someone actually gave a task to Bez.

Eric Dobbs (08:38):
This is, this is the thing.
I'm so excited to hear this.
Sarah, here's another thing about expertise, right?
So I geeked out, before we lost power last time, about what a powerful experience it is to witness expertise in action.
Sarah, I don't know if you have any sense of how deep your own
(09:02):
expertise is.
As many times as they've had people run this drill, you're the first one to show up with a project for Bez, like a concretely useful thing, to give Bez something to do.

Sarah Butt (09:16):
No.

Eric Dobbs (09:17):
And what I hear you saying is you have experience.
You have played this game before.

Sarah Butt (09:22):
So it's interesting for me, because, like, to some extent, I don't, and you can hear this on Incident Fest, uh, there's a webinar that I did with Beth and a few other folks who had also run this drill.
And I did talk a little bit about the law of fluency, but I think the other thing that's interesting is this is actually something that we train. This is in our advanced incident commander training.
And so I learned this from several other incident
(09:47):
commanders who kind of came before me and trained me in all of that when I was, you know, probably four or five years ago, when I was first starting with this employer.
But it's a pretty standard strategy we use, and we use it for two reasons, but it's not entirely just, oh, I wanna get the executive off the bridge.
Like, it's not that at all, but there's a whole interesting discussion, and I don't know if Incident
(10:08):
Fest got into it at all with the webinar.
There's a really interesting discussion about, like, what's the role of an executive on a bridge and different ways to use them?
And I don't have time to go into all of that, but what I will say is, executives, nobody gets on a bridge intending to be disruptful.
And often people don't recognize how intimidating they
(10:28):
can be just based on their title.
Like, I know when I was in leadership, when I was leading, uh, SRE for an organization, people had to pull me aside as I was learning and say, like, Hey, you don't realize this, but when you get on an incident and you've got people three or four levels, you know, like, it's a big deal.
It's disruptive.
You don't think you are, you think you're just getting on to help.
And I think that's magnified by a hundred x when you have an EVP
(10:52):
on the incident.
So executives are humans, and they get on as humans just wanting to help.
One of the things that I think is valuable in an organization is, how do you figure out the best way for them to help?
And they often actually do, if you give them the right structure, have unique ways to help.
There have been times where I have literally LinkedIn searched
(11:14):
during an incident if I needed to get in touch with a vendor.
I don't know if I've done this with my current employer, but certainly with past employers, I have LinkedIn searched our company for people who previously worked at the vendor I'm trying to get in touch with, who are executives at our company, called them cold, and said, Hey, I have an incident.
(11:35):
It's for this part.
I know you're not responsible for this piece of the business, but would you be willing to step in and call whoever you know at that company to help us try and get moving?
It's normally done in parallel with the official paths.
But I do think, like, there are a lot of ways that executives can lift internal and external roadblocks, and they're totally willing to
(11:55):
do it.
In my experience, it's a matter of making a path for success in how they land on the incident and how they interact with your responders.

Eric Dobbs (12:07):
This is,

Courtney (12:07):
amazing, y'all.

Eric Dobbs (12:08):
solid gold.

Courtney (12:10):
I, so I wanna keep going though on the themes if
we're okay.
Eric, and let.

Eric Dobbs (12:16):
I was about to intervene in the same way, Courtney.
Here's the thing, and I'm gonna go a little meta.
I'm trying to facilitate a discussion where we learn stuff about the incident, where we learn stuff that's not about the incident.
We learn the stuff about incidents in general or about our business in general.
So where we were in the plot and where we were in the, in the
(12:39):
document was I was trying to introduce what even is a theme.
And there's this common problem that a theme is an abstraction that's hard to explain without giving a concrete example.
So I entered by saying, you know, themes are some pattern that came up out of the incident where I heard from more than one person the same kind of topic. Same, as a word, is carrying a lot
(13:03):
of weight.
Like, what makes these things the same?
Well, I heard about saturation from Alex in one context. I heard saturation from Sarah in another context. I saw them skillfully navigate their mutual saturation at this most tense point in the incident.
So this was a specific theme I drew out, and it is, it is
(13:25):
inevitable that as you're trying to talk about the abstraction, you have to give an example, because the abstraction doesn't make any sense.
And once you're looking at the example, you get lost in the weeds of the example, because that's the nature of, like, the example is, if you get a good theme, it is so compelling you can't not talk about it.
(13:45):
So one of the themes I was hoping we would discuss, and it's sort of out of order for the agenda that I had planned for the retro, but totally fine.
I'm happy, like, we've pretty thoroughly covered the theme of saturation and seen it from many different views.
This is so much more valuable than the things we might fix in
(14:06):
the air conditioning in the data center for this specific incident.
So if we are inside the simulated company, we don't even have control over the air conditioners.
They're in a vendor's hands.
We can't work those levers, except maybe changing our support contract or setting different expectations with the vendor.
Like, we could pursue contractual avenues, but we
(14:30):
don't have our hands on the air conditioner or the maintenance schedule or any of it, right?
And certainly none of us, even the vendor, can control the weather that is certainly contributing to the air conditioning problems.
But what we can do is learn how to coordinate and learn how important saturation is in any incident.
(14:51):
But in this one in particular.
And it came up so specifically and in so many deep ways in this. This is the kind of gold that's available in an incident if you look a little more closely than, how do we fix it?
How do we prevent the next one?

Courtney (15:05):
Yeah.

Eric Dobbs (15:05):
so

Courtney (15:06):
Here.

Eric Dobbs (15:08):
That's the thing we're looking for when we're trying to draw out themes.
Let me finish through the list.
This item came up in a few places for both Sarah and Alex in particular.
I enter this as an analyst saying, you know, this looks like a pretty unusual situation.
This is, you know, sales are down, you've got the executive,
(15:29):
you're expecting a big sale, unusual loads of traffic.
And both of them were like, yeah, there is nothing unusual about this.
This is business as usual when you're an incident commander.
And we can possibly dig into that.
A fourth item of interest, and this is sort of the most
(15:51):
anchored and specific for this incident, is, once we know that the source of the problem, the trigger of the problem, the thing that's going to give us the most leverage to get out of the problem, is about air conditioning.
Once we know that's the problem, we have two paths that we're pursuing: waiting for the vendor to fix the air
(16:13):
conditioning or getting out of that data center.
Both of those paths have risk.
And there's a debate in the midst of the incident. This is probably, narrowly, if we were looking for fixes, the place that we would get lost as a team trying to learn, in the weeds of this trade-off
(16:40):
decision about wait for the air conditioning to get fixed or get out of the data center.
And so those are the sort of four items that were the most obvious.
But I need to briefly, just for visual impact: I dropped in the bottom a list of what looks to me as a skim of
(17:01):
maybe 25 that are well substantiated with evidence.
They didn't come up from everybody.
The four that I named are the ones where everybody I talked to and everything I saw reinforced those four things.
But any of these 25, I think there's plenty of evidence in the material we have of rich themes that could be worth
(17:24):
talking about.
So I'm gonna back off of those 25.
I'm gonna use the navigation to get back up to the themes.
What I would normally do, if I hadn't let us get derailed digging deeply into saturation, is invite some discussion from the
(17:46):
group about which of these we wanna talk about most.
The group already sort of decided that saturation was important to talk about.
I need to hit this one other detail.
It's in the insights.
I had already sort of scrolled it into view while we were talking about it.
This is particularly great because it's happening while the
(18:09):
debate about, and the parallel tracks about, waiting for the data center to get the air conditioning fixed or executing the business continuity plan.
Sarah asks Alex, this is one of the places where she fluently has too much on her plate.
She's delegating a thing to Alex, but the way she
(18:31):
delegates it is

Sarah Butt (18:33):
Alex, there's a BCP doc.
Do you have bandwidth to read it?
They sent it over to me.
It's this one.

Alex Elman (18:40):
Yeah.
Once I get this CAN out, I'll read it.
I'm almost done.

Eric Dobbs (18:43):
Now, the backstory here is that she's asked about four times in Slack for somebody to tell her, like, what's the story with this plan?
She's already been pursuing it.
Nobody's answered.

Sarah Butt (18:55):
I'm sitting there and I'm like, nobody is acknowledging. Like, I want someone to say, like, ack, I've got it.
I'll be there in five minutes.

Eric Dobbs (19:01):
She's finally like, Alex, we need details about the plan.
Have you got the bandwidth to deal with it?
So she's delegating with deep awareness that Alex needs to be not saturated in order to take in the importance of it.
And she's checking that first. Alex responds fluidly: almost
(19:25):
there.
I need to finish the thing I'm doing and then I'll get on it.
So this is the sort of, it's so subtle, you could completely miss it.
Sarah checks before saying, dammit, I need this thing.
Although, like, Sarah, I'm projecting into your head, you've asked several times.
I think at the point you're handing it to Alex, you're
(19:46):
feeling impatient for an answer about the thing.

Sarah Butt (19:50):
Yeah.
I mean, yes.
Like, we needed to be moving in that direction then, and I was sort of, I think, if I remember right, I used this sort of typical strategy of, like, I push on a person one, two, maybe three times, and then I start to round robin a little bit and pull in people who might be useful.
And that's when the disagreement had started between Tanya and
(20:13):
Hamed, which I will just add a slight bit of interesting commentary of not knowing the org chart.
I thought that Hamed was, like, the chief customer officer, which is part of why I didn't necessarily want to lean into his recommendation to enact the BCP right away, because I didn't think he was on the sharp end of the system.
I didn't realize he was actually Tanya's leader and deep into the
(20:35):
platform side.
But regardless, um, that starts happening.
And as soon as that happens, it's like, I want to hear that information from them, because I wanna understand from the sharp end why they're nervous about this BCP, but someone's gotta give me the freaking BCP. Like, someone's gotta tell me what we're doing.
Just basic steps.
How long is it gonna take?
Like, I just need information.
(20:56):
And so I look around, and again, it's me reaching for that deputy, of, like, where's the trusted unit of adaptive capacity I can grab at?
And so I grab for Alex.

Eric Dobbs (21:06):
So there's this reciprocal example of it from Alex back to Sarah.
Moments after Sarah's asked, Alex is saying, I'm working on the CAN.
I'm almost done.
There's a moment when Sarah is sort of voicing and typing a set of pretty complicated questions, trying to manage the parallel
(21:27):
tasks.
And there's sort of parallel questions in this complicated blurb that she's typing into Slack, and Alex is impatient to deliver on the request she made of him.
He knows how important the business continuity plan is, but he's savvy. He recognizes she's overloaded.
He doesn't just drop it on her, but he does at some point voice,
(21:53):
I've got a thing for you when you're ready, kind of thing.
And she says, you know, hold while I get these questions typed.
So it's exactly the same thing.
Alex is signaling he's got a thing for her, just enough of a signal that it's not going to blow the stack that she's managing,
but also enough to get in there so that he's
(22:15):
elbowing his way in with this thing she asked for.
It's such a savvy skill in the coordination to have just the right amount of interruption there.
Sarah's skilled at, can't handle it right now, and then when she gets through typing the question, she's like,

Sarah Butt (22:34):
Alex, sorry, I put you in a buffer.
What was that about the BCP?

Alex Elman (22:39):
Yeah.
We have a BCP.

Eric Dobbs (22:41):
The fluency with which they are negotiating each other's overload, the fluency with which they are aware of each other's overload and signaling.
They can handle it.
They're both recruiting each other as resources to try to manage the complexity of this high-tense, high-pressure moment in the process.

(23:02):
And the expertise on display in just this little exchange is extraordinary.
And y'all, this was a drill.

Alex Elman (23:15):
I wanna stop for a moment to point out the sort of artful facilitation that Eric is engaging in, in this retrospective.
So Eric introduced us all to the four themes that he identified in his analysis, but he invited that he certainly hasn't identified all the themes.
And then he showed us about 25 observations that he made.

(23:37):
They're not themes yet.
They're observations.
Some of them we can call proto-themes.
And then he let us sit with that for a bit.
And then he zeroed in on an important observation and went into detail on it.
And I think what Eric's doing here is he's trying to lead us to another theme, not telling us what that theme is.
(24:00):
Creating themes is very difficult.
It's something that takes a lot of time and experience.
And so it can be easy to look at leaning on a deputy that she has and trying to use a counterfactual and saying, well, without the deputy, Sarah would be underwater.
So that's a thing.
We can't use counterfactuals.
We have to talk about what happened.
(24:21):
And so Eric, what are, you know, four different ways that Sarah can deal with that saturation normally, and how did she use a deputy with that?

Eric Dobbs (24:34):
Oh, thank you, Alex.
And thank you for the craft of your own question.
I'm not gonna do meta on what Alex just did about what I'm doing as meta,
'cause we can go all the way down.
We can keep going with

Sarah Butt (24:51):
I love incident.

Eric Dobbs (24:52):
The specific, so good.
So the specific: there are four responses to overload.
And these apply to, the pattern of overload is everywhere.
It's in bacteria, it's in the way humans communicate.
It's in our individual experience of a to-do list.
It's everywhere.
(25:13):
It's in a Kafka queue, which

Courtney (25:17):
I'm having, like, parenting PTSD while you guys are talking about this, of, like, where I have to be like, no, hold the, oh, you shut up over there, politely and kindly, to my 11-year-old while I'm talking to my husband.
While my thir, I'm like, it is everywhere, you know?

Eric Dobbs (25:33):
Overload is everywhere.
So the pattern of overload has four mechanisms, four responses.
There's four ways that you can adapt to overload.
Let me, we maybe pair, there is a model of overload that suggests there's these four things.
It's possible the model's wrong, but I think it's pretty good.
Two of them are urgent things that you do without thinking,

(25:59):
two of them are more effective, but require some anticipation.
So the first two, that are like the default ways that overload gets handled, are reducing thoroughness on the collection of things you're trying to manage, and just dropping things from getting done.

(26:19):
My new hypothesis, this isn't in the research I've read, but my claim is what humans do by default is reduce thoroughness without even thinking.
Before we even start shedding load, we reduce thoroughness and try as long as we can to manage the to-do list that's in our
(26:40):
brain.
And at some point our brain has lost capacity and we just forget the list of things we were trying to work on.
That's where we were dropping load.
But the default mode is reducing thoroughness. Shedding load is the other one that happens.
Both of those tend to lead to suboptimal outcomes.

(27:03):
The other two that are sort of better, I'm going to throw a judgmental language on that, are to recruit more resources, get help from other people or other systems, or to defer work until later.
So those four strategies are what you do when you're
(27:24):
overloaded.
The beautiful thing is this example of saturation.
What we have on display in this moment of communications back and forth between Alex and Sarah when they are at saturation is we can see both of them deferring and recruiting resources.

(27:46):
And it is precisely the engagement in those more effective strategies that is the expertise they're demonstrating.
They're not just dropping it.
They're not just reducing thoroughness.
They're communicating and signaling, saying, Hey, I need your help.
Hey, you asked for my help.
And in an incredibly savvy way, and mutually managing
(28:12):
overload that's really too much for both of them, but successfully coping with it.
This right here is actually it for every incident everywhere.
Managing overload is the thing that is common to every incident everywhere.
There's a paper, we can put this into show notes.

(28:32):
It's the theory of graceful extensibility.
And one of the core assertions in the theory of graceful extensibility is that it is precisely managing the capacity to adapt, managing the risk of saturation.
Those are inverses.
There's a risk of saturation, and there's an adaptive capacity.

(28:56):
And managing those things is how you do graceful extensibility.
That's as much as I wanna say about the paper, because that's a whole deep dive.

Courtney (29:07):
We'll put a pin in it, and yes, we'll put all of that in the show notes.
And I mean, the other thing I wanna highlight about this for folks listening or watching along is none of those themes are technical things.
And you're already saying this, but I'm gonna reiterate it one more time for everyone at home.
And I mean, we have some candidates for recommendations

(29:29):
below, right?
Like what people would more likely consider to be action items to come out of a retro.
But the themes have nothing to do with the technical stuff.
And even if you go and do all of those action items and you change your BCP or you, you know, have different contracts
(29:49):
with your vendors, this will still be there.
And that's what you're saying.
And it's like, so if you have this laser-narrow focus on action items, you're missing the bigger picture.

Eric Dobbs (30:02):
Absolutely, and if I can just reinforce that around the specific content of the multi-party dilemma.
The problem we're in as an industry is that we just spent a decade tying our systems together technically across company boundaries where we are separated. And so the
(30:26):
communications between the humans that keep all the software running are inhibited, or held up, by the fact that we've gotta go through support communication channels to get to the air conditioner, in the case of this specific incident.
That's happening everywhere.
So this notion of coordination across company boundaries is the
(30:51):
hidden problem that all software has right now, unless you're in the rare situation of being completely in control of your hardware and software.
And I actually think that's close to zero in the business world.
So I think that the topic we're on, the specific subject of the drill, is one of the most important things, and
(31:14):
how to coordinate is another one of the most important things.
And it is really hard to do when we've got these legal entities
slowing down our communications when it matters most
technically.

Sarah Butt (31:28):
Eric, let me jump off of that and just mention two
things off of the multi-party dilemma work that Alex and I
did, which we should probably also link in the show notes.
There's a paper and a presentation, but this comes out of
the presentation that Alex and I did.
Two things.
One is a lot of the research and the case studies and such that
Alex and I looked at for the MPD was around external

(31:49):
relationships.
But when we look at companies that do have a significant
amount of this in-house, we see this internally as well,
normally between teams.
As the systems got more complex, as we needed different bits of
expertise, it became easier to have people specialize.
And so we created boundaries within our own companies,

(32:10):
whether that's there's a networking team or there's a
security team.
So these dynamics don't necessarily just happen
externally. They also happen internally, sometimes in
different forms and flavors, and sometimes you have different
ways to potentially address them.
The other thing that we talk about in the paper that's
interesting is we talk about nested or hidden dependencies

(32:30):
where your vendor has a vendor that you had no idea about until
something happens.
And so, like in this case, I am sure that the HVAC vendor is
probably not directly the colo, it's probably someone else.
We see this commonly with, just the, there's like really, really
big vendors that a lot of people tend to build on.
And I won't go through and try and name them, but you might sit

(32:53):
there and go, no, no, we use this vendor, not that vendor.
And it's like, but this vendor uses this vendor that uses this
vendor that uses this vendor.
And so sometimes you see these incidents and I think we're
seeing more and more of them now.
And um, like CrowdStrike is probably a good example, where
we're right around the one year anniversary of that, where a
bunch of people would've potentially said,
no, we're, we're not CrowdStrike.

(33:14):
We use someone else.
We use this and that, but their vendors used.
And so we end up in these very tangled complex systems, and
it's not necessarily a bad thing, like the systems were
going to become complex.
We had to start to outsource expertise and break it up.
I know we talk about this piece in the paper and why this sort
of became this way. My goodness, does it make it complicated.
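
[Editor's sketch] The nested or hidden dependency idea Sarah describes, where your vendor has a vendor you did not know about, can be pictured as walking a vendor graph. The snippet below is only an illustration under stated assumptions: the vendor names, the `DIRECT_VENDORS` map, and the `hidden_dependencies` helper are all hypothetical and do not come from the paper, the drill, or any real tool.

```python
from collections import deque

# Hypothetical map of who contracts directly with whom (invented for illustration).
DIRECT_VENDORS = {
    "our-company": ["colo-provider", "auth-vendor"],
    "colo-provider": ["hvac-contractor", "power-utility"],
    "auth-vendor": ["cloud-platform"],
    "cloud-platform": ["dns-provider"],
}


def hidden_dependencies(org, direct=DIRECT_VENDORS):
    """Return {vendor: chain} for vendors that org reaches only through another vendor."""
    known = set(direct.get(org, []))  # vendors org has a contract with
    chains = {}
    seen = {org}
    queue = deque([(org, [org])])
    while queue:
        current, path = queue.popleft()
        for vendor in direct.get(current, []):
            if vendor in seen:
                continue  # already visited, also guards against cycles
            seen.add(vendor)
            chain = path + [vendor]
            if vendor not in known:
                chains[vendor] = chain  # reached via someone else's contract
            queue.append((vendor, chain))
    return chains


if __name__ == "__main__":
    for vendor, chain in hidden_dependencies("our-company").items():
        print(vendor, "via", " -> ".join(chain))
```

In this made-up graph the HVAC contractor shows up only behind the colo provider, which mirrors the situation in the drill: the people who can fix the problem sit one or more contracts away from the responders.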

Courtney (33:37):
Yeah, it's a third party turtles all the way down.
so

Eric Dobbs (33:42):
And I think we're, we're only weeks away from a
pretty high profile incident that, that was sort of shared
between Google and, uh, and CloudFlare.

Courtney (33:52):
Yep.

Eric Dobbs (33:53):
Everybody uses CloudFlare.
So there were cascading failures for all of the brand names.
All of the 500

Courtney (34:00):
Yeah.
And all of us were like, wait,

Eric Dobbs (34:03):
that.

Courtney (34:03):
CloudFlare used we were all like, what?
Nobody, nobody knew.
Like, yeah.
It was

Eric Dobbs (34:08):
Yeah.

Courtney (34:09):
Interesting.

Eric Dobbs (34:10):
And not in none of these cases, not to single out
any of the companies that we've named there.
It's, this is endemic, this is the entire industry.
And it, we can name specific companies where there have been
incidents and their name got out, you know, so, but it, but
this is happening everywhere.

Courtney (34:29):
Yeah.
Agreed.

Eric Dobbs (34:31):
Much smaller impacts.
I'm conscious of the time.
Are

Courtney (34:35):
Yeah.

Eric Dobbs (34:35):
Close to the end of our recorded time together?

Courtney (34:38):
We're getting close.
But, like maybe lay out the rest of the structure and let's pick
what we're gonna choose for the, our remaining time here.

Eric Dobbs (34:45):
Yeah.
So a, a quick sketch of the document.
We hit the themes.
There's clearly, there's more
here, other of these themes of interest, and the one I have the
most evidence already in the document to talk about is
fixation.
So maybe that's a candidate because it's available, but also
fixation might be well covered.

(35:06):
The doc doesn't have the thing about unusual pressure or about
the disagreement.
Does anybody want to, and, and in fact, this is the one that's
most narrowly focused for the incident.
Is there anybody in the room who wants us to get to that part?
What do people think?

Sarah Butt (35:24):
I think that one's gonna be the most interesting
because I know, and this is maybe just my personal interest,
but like Alex and I have been able to talk, Eric and I have
been able to talk. The perspectives that we've not been
able to hear yet,
much like how you get to a retro and you haven't heard people's
perspectives, are the people that Hamed is currently representing.
And those were the folks that had very big opinions on this,
to the point where Hamed was playing both sides, I think.

(35:46):
So I'd love to hear more about that piece.

Hamed (35:49):
So the, the argument between Hamed and Tanya, should
we do the failover or should we not?
For me, that was the most interesting part because I was
playing both of them.
I was Hamed and I was Tanya essentially.
I was like arguing with myself, but it felt so real.

(36:09):
Each time I was Hamed, I was genuinely like, was thinking and
pushing for it.
And I was Tanya,
almost like I chopped off and brought another personality and
very passionately like, no, this is risky not to do it.
And I could, I could think why Tanya would be, would be against
it.
So from, from that perspective is like very interesting,

(36:30):
conflicting experience.
Tanya being a, kind of a more hands-on person,
she had worries that all the time we do BCP exercises,
it's mainly as a checkbox to tick compliance.
Everything in months we have todo and we need to submit a

(36:52):
report that we have done it.
It has taken this amount of time, and there was a desire in
business, and by the way, this is all coming from a real life
experience that I lived.
I'm not gonna mention which company it was, the companies I
worked for, that, okay, we have to do this BCP exercise, data

(37:14):
center failover.
It was called Flip to Report to regulators, but we don't want to
disrupt the business.
We don't want it to be going beyond certain time.
So it was like a soft BCP exercise and a hard BCP exercise
and the soft one was very controlled.

(37:35):
So Tanya's worry was everything they've done so far was soft.
And we dunno what's gonna happen in like where one of the data
centers is like really not accessible.
Can data move around?
Can it not move around?
What's gonna happen the, to, a to, to the syncing of the data?

(37:56):
So that was one of her worries.
And then the other worry was about, out of last three
times, only one of them was done in the timeframes that we were
talking about.
Hamed's perspective, being Tanya's manager, orchestrator of,
basically he was responsible for this execution of BCPs,
was, was he, he was a little bit conflicted because.

(38:18):
He runs this BCP, so business can rely on it.
So he knew that Tanya might have a point, but it was also his
responsibility to make sure that business have a credible BCP
plan.
So he naturally overlooks the, that we had in the past and it
tends to more on now.

(38:42):
We have done it at least once in the timeframe we practice.
Even last time took a lot longer, but we fixed the issues
so we should be able to do it.
And covering all of that is our vendor looks,
it doesn't look like trustworthy at the moment.
I can't believe what it says.
So that was like two arguments I was living in the moment.

Eric Dobbs (39:06):
I, I love these perspectives.
And, there's a, the piece that's jumping out at me, I guess I'm
just gonna have to ask them as to the persona.
I am that in the pressure of an incident, Tanya feels safe,

(39:26):
challenging her boss.
That's that, that suggests extraordinary psychological
safety.
And I, I wonder if you can comment on that from, from
either or both of their points of view.

Hamed (39:45):
So I believe that moment so I can, I can talk about
experience, yes and no.
So, it could be psychological safety, but it could be that
Tanya knows that this is the path that is gonna be to
failure.
Leading this path,
I'm going to be in a lot of trouble.

Eric Dobbs (40:05):
Uh, okay.

Courtney (40:07):
Uh, it was ruinous to go.

Hamed (40:09):
Speak.

Courtney (40:12):
So she was willing to gamble though, because the risk
to her of the other avenue felt like it would be worse for her
outcome and her team's outcome.
Is that accurate

Hamed (40:26):
Yeah,

Courtney (40:28):
or not?
Gamble?
Actually, she wanted to not gamble.
You wanted to gamble.
I did the other way.
Yeah.
So the,

Hamed (40:34):
but Hamed was the

Courtney (40:35):
yeah.

Hamed (40:36):
Who was going to do it.
And if he failed, would be under huge amount of stress to work
out

Courtney (40:42):
Yeah.

Hamed (40:42):
and fix it.
And again, probably she will seem responsible, oh, why this
happened, why that happened, why this is correct.
From her perspective, like no way to win.
I might as well just really say what I think.

Courtney (40:57):
Be safer.
Yeah.

Sarah Butt (41:00):
I think it's interesting to hear, from both
personas, the responsibility that they feel in this time of
uncertainty, because I think back on it and I know that I felt
responsible as well as the person that ultimately was, was
like at the end of the day, making the decision, for lack of
a better word.
And that's something I, I often feel in, in, I'll reference

(41:22):
Richard Cook and How Complex Systems Fail here, where he
talks about how practitioner actions at the sharp end in
these novel and unknown situations, like they're always
gambles, and it's really easy to look at this.
I mean, this is the thing that we miss in retros so often.
And it's really easy to look at this because it ended up working
out and go, oh yeah, that's a great decision.
It would've, it could have been a bad decision where we got

(41:44):
lucky.
It could have been a great decision where things went well.
A good decision or a bad decision looked at in
retrospect,
it, there's really no way to do that.
Like you do the best you can.
I, I, I talk a lot about judgment being a timestamp
decision because you don't know what happens in the future.
And that's where I think these retrospectives get so
interesting because we've got three people on this call

(42:06):
functionally all saying,
I feel responsible for the entire incident if this goes south.
But every single one of us wanted to do the right thing,
but we were always going to be in a gamble.
And it's so easy if you are just reading a document after the
fact or coming in after the fact, or not living the place of

(42:27):
having to make that decision, to not understand that.
I mean, you really are making a gamble and hoping that you've
made the right choice, and our practitioners carry that.

Courtney (42:38):
Yeah.
I think, if there were to be a meta, meta theme of all
incidents, it is the principle of local rationality, right?
And that you, everyone who's involved in this is, A, trying to
do the best that they can.
No one's showing up to an incident trying to make it
worse.
Even the executives, when it feels like they might be. And
secondly, everyone is making the best possible decisions they can

(42:59):
make given the information they have at hand at the time.
And, and that is the best, right?
And the information you have at the time at hand is never
all the information.
And, and then the third party multi-party dilemma piece
amplifies that significantly.

Eric Dobbs (43:16):
So like we have so much evidence about Sarah and
Alex deftly hedging throughout the incident on the gambles that
are there.
That's extraordinary.
And, the, the, the critical, one of the other critical insights,
why bother learning from incidents, is that because the,

(43:39):
that sense of responsibility that Sarah points to, all of the
people involved have a, an urgent sense of responsibility,
during the incident is where we get to see the lived
prioritization
of what the people in the room think is the most important
thing for us to be dealing with, or the most important fear for
us to be dealing with.

(44:00):
So one of the things that, one of the reasons incidents are
such a powerful lens to understand how the business
really works is that it is literally where the rubber meets
the road.
In our business, we have all these ideas.
We have policies and procedures, we have trainings, we have
quarterly expectations and OKRs and everything else, but it's

(44:21):
during the incident where people really make those prioritization
decisions for the business.
And so inspecting them is a place to see what really was most
important according to the people closest to the, to
keeping the business running.
What is that thing?

Courtney (44:36):
Alex has to make his own prioritization right now and
escape us to go to another meeting.

Alex Elman (44:41):
This was so much fun.

Courtney (44:44):
Thanks, Alex.

Sarah Butt (44:45):
Bye, Alex.

Alex Elman (44:47):
Bye.
Yeah.

Sarah Butt (44:48):
Can I ask, uh, Eric a question?

Eric Dobbs (44:51):
Yeah.

Sarah Butt (44:52):
Eric, I'm curious, like, let's say you're
facilitating this retro. If we take out all of the, whoops, we
lost power and people lost internet connection and had
migraines and all, you know, all of, all of the things that we've
had happen, we're still probably well over an hour. We're, it
feels like, just barely scratching the surface.
If you were facilitating this at, at, you know, in, in quote

(45:14):
unquote real life as a retro, what would you do here?
Like, would you keep going?
Is there multiple retro meetings?
What's your advice to people who want to try and start some of
this format and are trying to figure out how you time box it
with these really rich discussions?

Courtney (45:29):
It is such a good

Eric Dobbs (45:30):
I mean, that's incredibly, there's an
incredibly difficult, how do you stop an interesting
conversation?
Richard Cook has a, a really beautiful metaphor he uses in
many of his talks.
At the end of his talk, he says, my,
my psychiatry fr, as my psychiatry friends like to say,
I see we are reaching our time together.

(45:52):
We are reaching the end of our time together.
So, if I were a doctor, I could refer to my psychiatry friends.
You know, maybe I could, maybe I could make, I, my, as my
therapist would say, I see we're reaching our time together.
This is a, so that's a grace that, that's a graceful,
entertaining way to, to just say, look, we have to draw the

(46:15):
line somewhere.
I know everybody's got a busy schedule.
So, you know, thank you so much for your time, and what I hope
you take away from our time together is that there's a ton of
value to be gotten by having exactly this kind of
conversation about the messy details of the incident.
We, because of the way I managed it, I'm sorry we didn't get to

(46:38):
the action items and I'm sure that's gonna irritate some of my
people in the room.
If we're in a real incident, I will be happy to schedule
another half an hour for us to talk about that, or we could
possibly take it up asynchronously.
and I you to,

Courtney (46:51):
or the folks who already know this stuff are
gonna go do it anyways.

Eric Dobbs (46:55):
This is, this is the thing.
You can't stop engineers from fixing stuff.
The, the, the most important fixes that I've seen in the
three or four years that I've been doing incident analysis,
the most important fixes happen before you even get to the
retro.

Courtney (47:11):
Yeah.

Eric Dobbs (47:12):
That as, as people are cleaning up the mess, they
do the thing that is most urgent.
And that doesn't even show up in your action items
'cause it was already done by the time you got to the retro.
So the, the, the lowest hanging fruit that's probab, and it's,
hmm.
Yeah, I insert Lorin's law.

(47:33):
Some of that stuff that you do in those minutes becomes the
source of your next incident.
Sorry, Courtney.
Another, another thing to add to the, the
Courtney (47:42):
is.
It's going in the notebook.
Here it is.

Eric Dobbs (47:45):
For the leaders in the room who are worried that we
haven't left with action items, rest assured that there will be
action items. Your teams can't help themselves.
And, nevertheless, I'm happy to take on the invitation for
another 30 minutes to talk about action items, if that's really
important for folks.

Courtney (48:04):
We're not gonna do that for this podcast.
It's okay.
But I, I would, if I can, if I can be so indulged, I wanna ask
the final question of Hamed.
You as the executive of this company that this incident
happened of, and you were given this report, what do you do with
this?

(48:31):
Or what do you expect to happen out of this from the team or
your teams that were involved?

Hamed (48:39):
First of all, I'm really overwhelmed by this
conversation.
Like the amount of learning that I got out of this conversation,
I just dunno how to react for it.
With my engineering background on top of it,
so hard for me to think what, what would I do as an executive?
But it's, I would suggest everyone read this report and
(49:05):
watch this conversation.
There was so much learning.

Courtney (49:08):
Everyone at the company.

Hamed (49:09):
in the company, me,

Courtney (49:12):
Okay.

Hamed (49:13):
and I think it goes, it goes beyond engineering teams as
well because if you think about the points that, the learnings
that I had from this conversation about
saturations and how to be aware of it and how to deal with
it,
that doesn't just apply to tech.
The learning about how, how make, how make executives in a

(49:36):
stressful situation of incident a useful, and I think by being
useful, they will feel a lot more comfortable.
That's, again, very, very rich. Yeah.
I'm just thinking is there any other way to learn so much about

(50:00):
what is going on in an incident
in
such a short amount of time than, than what we did here.
Like I was just thinking if, if I was going to watch this
podcast, I can't think of any other way to learn so much in
this short amount of time.

Courtney (50:17):
I wanna put a, I wanna put a pin in that too, because a
lot of pushback I hear at higher levels or in organizations is
like, oh, this, this incident analyst stuff, this stuff, it's
like, it's so time consuming, it's so whatever, whatever.
And for the actual analyst, I will argue yes it is, but that's

(50:38):
their job, right?
Like if you have somebody who is a dedicated incident analyst,
indeed they will be spending the vast majority of their time
doing that, and that's what they should be doing.
And that is great and well worthwhile. To the other people
involved,
okay,
we're, we've cracked a lot of jokes about how long this could
go on, but for the other people involved, you've got one, maybe

(50:58):
two hours, maybe, you know, we did a lot of meta talking and,
you know, had a power outage and, but, so maybe it's an.
How valuable is that
90 minutes to your company, right?
What other activity, like you said, could you possibly do

(51:19):
that would deliver that much value in 90 minutes of four
engineers' time?
And, and to me, I just can't even unsee that. That math makes
perfect sense.
But you know, I'm not an executive at a large tech
company, so, but I, I just wanna underscore that point that
you're making is, there is no other way that is this effective

(51:39):
and this efficient to have the kinds of insights that come out
of this that could dramatically change the way your teams manage
incidents, handle them, prepare for them, all of that.
There is nothing better.

Hamed (51:53):
Yeah, exactly.
But, so I will ask last, last question from Eric,

Courtney (51:59):
Okay.

Hamed (51:59):
Uh, Eric, analyzing this incident,
did it make any difference that
you went through this drill yourself, like you lived it, you
knew what happened?
Did it have any bearing on how you approach analysis?

Eric Dobbs (52:16):
You know, I have one brain, so I can't completely
separate.
I can't rewind the universe and run the experiment to see what I
do if I hadn't run the drill myself.
There were enough differences between what I understood when I was
doing it myself and what I saw Alex and Sarah doing that it was
pretty easy to focus on their experience.
Although I have access to the recordings of my own, I haven't

(52:37):
watched them.
So, the one place that I was, uh, trying to stay faithful to
the idea that I haven't, I don't really know what's going on,
I'm, I'm more from the outside, is that I really immersed myself
in the data and evidence from Sarah and Alex's, uh, running of
it.
And I haven't, I'm going to, now, now that we've gotten here,

(52:59):
I can go back and look at my own and, and, and genuinely compare
myself to the two of them.
That'll be humbling.

Sarah Butt (53:07):
I would say I think it's apples to oranges though,
because you ran it solo. Running this thing solo is a completely
different beast than running this thing as part of a team.

Courtney (53:20):
but that's kind of exactly the point

Eric Dobbs (53:22):
Yeah.

Courtney (53:22):
is we always say no two incidents are the same.
You will never have the same incident again, you know, yada
yada.
But the exact same set of technical details, procedural
details, business details,
and given one person is completely different than two
people. If you run this exact same incident with some other

(53:44):
two people, you're gonna have a completely different set of
outcomes because every person brings their own individual set
of skills and experiences and everything to it.
And that's the other piece that we tend to not see about
incidents, is we see it as this black box in which physics
happens and, and expect same incomes, you know, inputs and
outputs and outcomes.

(54:05):
And you are not gonna have the same people on call, you're not
gonna have the same responders.
So I think it also really does highlight that fact of how, even
when the details are so similar in, in your organization, like,
I've seen this before, but you haven't seen exactly that before
with the same people and, and so there's, yeah, there's a whole

(54:25):
other path to go down about that.

Eric Dobbs (54:27):
Yeah.

Courtney (54:27):
will be curious, Eric, if you go back and look.

Sarah Butt (54:29):
down, and I think it's interesting because that
was actually done for Incident Fest because we had three pairs
of people run the identical drill.
It was myself and Alex, and then two other pairs.
And I think it, it's like that's probably a whole separate
podcast or discussion or something, but like, how cool is
it that we had the opportunity to do that?
Because I've certainly never been able to watch peers that I

(54:53):
really respect an industry alsorun same thing that I saw.
And I think the learning that wecan have from how we all handle
it and the things we've pickedup along the way and the
expertise we bring is amazing.

Courtney (55:04):
And next thing you know, you're gonna be, uh,
Uptime Labs on Twitch.
That'd be great

Sarah Butt (55:08):
That would be awesome.

Courtney (55:12):
if you actually do that.
I just like put me in a web credit somewhere that's always
don't do it.
Okay.

Hamed (55:19):
hard.
They considered it.
They looked, looked to it, but.

Courtney (55:23):
I know some nerds who'd watch it, so, you know.

Sarah Butt (55:26):
I would,

Courtney (55:26):
Um.

Sarah Butt (55:26):
would watch that all the time.
I, I listen to like, broadcast dispatch and stuff because I
think it's so like soothing.
'Cause I, I just, I love hearing the details.
I would, I would do that all the time.
I'm a huge nerd.
I, Courtney is gonna need to erase that from the podcast.
I'm a huge nerd.

Courtney (55:42):
No, we're all huge nerds.
And everyone who's listening to this is a giant nerd too.
So, uh, we're all here together, thank goodness.
All right.
Thank you all.
This is the weirdest and most wonderful podcast I've ever
done, and I hope we get to convene on some of these topics
again someday.
For those listening, there's all kinds of stuff in the show notes
we've talked about a lot, and, so thank you all very much.

(56:04):
I wish you all a happy Friday without power outages or
migraines or any other incidents.