Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:01):
Welcome to The Operative Word,
a podcast brought to you by the
Journal of the American College of
Surgeons.
I'm Dr. Jamie Coleman, and
throughout this series, Dr. Dante
Yeh and I will speak with recently
published authors about the
motivation behind their latest
research and the clinical
implications it has for the
practicing surgeon.
(00:22):
The opinions expressed in this podcast are those of the participants and not necessarily those of the American College of Surgeons.
Welcome to The Operative Word,
a podcast from the Journal of the
American College of Surgeons.
I'm Dr. Dante Yeh, one of your
co-hosts for this series.
In this episode we will be taking an
in-depth look into the current
(00:42):
article, "Quantifying the Prognostic
Value of Preoperative Surgeon
Intuition (00:46):
Comparing Surgeon
Intuition and Clinical Risk
Prediction as Derived from the
American College of Surgeons NSQIP
Risk Calculator." I'm
honored to be joined by the first
author, Jayson Marwaha,
MD, MBI and senior
author Gabriel Brat, MD,
mPH. from Beth Israel Deaconess
Medical Center.
Doctors Marwaha and Brat, thank
(01:08):
you for joining me today.
Before we begin, do you have any
potential conflicts of interest to
disclose?
I have one brief disclosure, and it's that during the course of the study, I was funded by an NIH National Library of Medicine T15 training grant.
I'd just like to disclose that I have ongoing funding from the
(01:29):
National Institutes of Health.
Great. Thank you very much.
Dr. Marwaha, can you give us
a brief summary of your study design
and describe to us your main
findings?
Yes, absolutely. And Dr. Yeh, thanks so much for having us on the podcast. We're both really honored to be here. So, again, as you mentioned, the title of our study was "Quantifying
(01:49):
the Prognostic Value of Preoperative Surgeon Intuition." It was published a few months ago in the Journal of the American College of Surgeons.
The objective of our study was to build out a quantitative understanding of surgeon intuition, particularly
(02:09):
as it relates to the surgeon's ability to predict what will happen to their patients after surgery. The motivation, or the inspiration, for the study actually came largely from the NSQIP risk calculator developed by the American College of Surgeons that we consult
(02:30):
very routinely when making clinical decisions about patients.
And I'm sure many of our listeners are familiar with the prediction screen on the NSQIP risk calculator, where in the bottom right of the screen there is a little widget called the Surgeon Adjustment of Risks
(02:52):
widget. That is essentially the dropdown menu that allows a surgeon's intuition to modify the output of the risk calculator.
So if, for example, you're sitting in front of a patient and you want to estimate their risks of postoperative morbidity and mortality, and
(03:13):
your intuition tells you that the patient's risks of postoperative morbidity and mortality are somewhat higher or significantly higher than what the risk calculator predicts based on the quantitative data that you give it, then you can modify its output accordingly.
And the reason why this
(03:34):
function in the risk calculator served as a source of inspiration is because we wondered: is there a scientific, precise, or quantitative way to actually adjust for surgeon intuition when predicting what will happen to patients after surgery? When we did a deep dive on how this
(03:55):
Surgeon Adjustment of Risks function works, we found that it actually adjusts for surgeon intuition somewhat arbitrarily. When you look under the hood of the risk calculator, it just bumps up the patient's risk
(04:17):
of postoperative complications by one standard deviation.
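To make that mechanism concrete, the following is a minimal sketch in Python of what a one-standard-deviation bump might look like; the cohort values and the function name are hypothetical illustrations, not the calculator's actual code.

```python
import numpy as np

# Hypothetical predicted risks (probabilities) for patients undergoing
# the same procedure; illustrative values, not real calculator output.
cohort_risks = np.array([0.05, 0.08, 0.12, 0.20, 0.35])

def adjust_for_intuition(predicted_risk: float, bump_sds: float = 1.0) -> float:
    """Raise a predicted risk by a multiple of the cohort standard deviation,
    mimicking the 'bump by one SD' behavior described above
    (a sketch of the idea, not the NSQIP implementation)."""
    sd = cohort_risks.std()
    # Cap at 1.0 so the adjusted value remains a valid probability.
    return min(predicted_risk + bump_sds * sd, 1.0)

print(adjust_for_intuition(0.12))  # the 0.12 risk raised by one cohort SD
```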
Now, does one standard deviation of risk truly, properly adjust for how a surgeon thinks about a patient? Unclear. And so that's the question we set out to ask: is
(04:38):
there a way to precisely adjust for surgeon intuition when predicting what will happen to patients after surgery? And so in order to do that,
we conducted a retrospective cohort study and we collected two sources of data. One was NSQIP data from our institutional registry at
(04:59):
the Beth Israel Deaconess Medical Center in Boston, Massachusetts. Over the course of a number of years, we've collected retrospective data on NSQIP variables for patients undergoing surgery at our institution. That was one of our datasets.
The other source of data was
(05:20):
prospectively collected, and that was actually surgeon intuition data. The way we measured that is that for those same patients we were collecting NSQIP data on, we also sent a one-question text-message questionnaire to surgeons at our institution right before they were about to operate on
(05:40):
that patient. And we essentially asked them the same question that the NSQIP risk calculator's Surgeon Adjustment of Risks widget asks, which is, "How would you estimate this patient's postoperative risk of morbidity and mortality? Would you say it's about average for patients undergoing this procedure?
(06:01):
Would you say it's significantly higher than average? Would you say it's higher than average, or would you say it's lower than average?" And based on their response to that one question, we got a rough understanding of what their intuition was about this particular patient.
And then once we had collected those two sources of data,
(06:22):
the retrospective NSQIP data and the prospective intuition data, we combined them in various combinations to train various prediction models on postoperative outcomes for these patients.
And our main findings were that, unsurprisingly, when we trained a
(06:44):
logistic regression model on NSQIP variables alone, we developed a model with predictive performance roughly similar to that of the original NSQIP risk calculator as reported when it was originally developed by the American College of Surgeons. The AUC, or area under the curve, which is a measure of predictive performance, was 0.83,
(07:05):
which is roughly similar to the 0.82 predictive performance that the NSQIP risk calculator reports.
Now, when we trained a model to predict patient postoperative outcomes based on surgeon intuition alone,
(07:26):
without any NSQIP data, we found that intuition was an independent risk factor for patient postoperative morbidity and mortality; the area under the curve was about 0.7. And what's interesting about those two results is that we found
(07:47):
that the NSQIP-data-alone model predicted postoperative outcomes statistically significantly better than the model trained on surgeon intuition alone.
And then the third model we built was one that combined the two
(08:07):
sources of data. So we brought together the NSQIP risk calculator variables and combined them with our surgeon intuition variables. And interestingly, we found that the predictive performance of this combined model, which included both quantitative data about the patient as well as the surgeon's intuition, actually
(08:28):
did no better than the model that was trained on the NSQIP data alone. And so those were our main findings. And we'll go into the implications and our takeaways from the study from there.
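As a rough illustration of that modeling setup, the sketch below uses synthetic data to stand in for the study's NSQIP and intuition variables and shows how one might train the three logistic regression models and compare their AUCs; the feature counts, coefficients, and function names are assumptions for illustration, not the authors' actual pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins: rows are patients.
n = 1000
X_nsqip = rng.normal(size=(n, 10))           # hypothetical NSQIP preoperative variables
intuition = rng.integers(0, 3, size=(n, 1))  # 0 = lower, 1 = average, 2 = higher risk

# Make the outcome depend on a few NSQIP variables and, weakly, on intuition,
# so the three models have something to learn.
logits = X_nsqip[:, 0] + 0.5 * X_nsqip[:, 1] + 0.3 * intuition[:, 0] - 1.5
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

def test_auc(X):
    """Fit a logistic regression on a training split; report AUC on the held-out split."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

print("NSQIP variables only:", test_auc(X_nsqip))
print("Intuition only:      ", test_auc(intuition.astype(float)))
print("Combined:            ", test_auc(np.hstack([X_nsqip, intuition])))
```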
(08:50):
Great. Thank you.
All right. So I'm going to try and summarize this as best I can understand it. You developed a model to predict postoperative complications, and then you compared it to surgeon intuition, as predicted before the operation
(09:12):
began. And what you found was that surgeons were okay, right? An AUC of like 0.7, which is not bad, not great, but not bad. The model that you created did better than the surgeons, and it didn't do any better when adding in the surgeons'
(09:33):
intuition.
So, okay, actually I'd like to ask you about the primary outcome, the endpoint of complications after the operation. This was a composite, right? Do you have a list? Can you tell us what sort of complications you were looking for?
So overall, we
(09:55):
basically used the NSQIP outcome list. We used both the mortality and morbidity complication lists that are used for the NSQIP data. So functionally, what we did is collect the NSQIP outcome information for all the patients that were enrolled in the study.
All right.
(10:15):
Got it. And so I have it in front of me, it looks like. We are looking at, for example, superficial surgical site infection, deep incisional surgical site infection, organ/space infection, wound disruption, pneumonia, unplanned intubation, pulmonary embolism, mechanical ventilation requirement greater than
(10:35):
48 hours, and a whole slew of others.
I was thinking about something while you were describing the study, Dr. Marwaha. So I'm asked, as the surgeon: is this patient average risk, greater than average risk, or lower than average risk? And I'm trying to think in my mind,
(10:56):
like, when would I ever say, oh, this is a lower than average risk patient? How common was it in your study for a surgeon to say, oh yeah, this is going to be super easy, and you know, this patient has lower than average risk?
Yeah. So in our data, we found
(11:17):
that the most common response across all respondents was average risk. About 45% of respondents, when responding to our questionnaire preoperatively, found that the patient was average risk when compared to other
(11:38):
patients undergoing the procedure. The second most common was higher than average risk, and that represented 40% of responses. And then, as your intuition correctly predicted, the least common response was lower than average
(11:58):
risk. About 15% of respondents said that the patient sitting in front of them, whom they were about to operate on, was at lower than average risk for morbidity and mortality. Now, if I had to posit an explanation for why this pattern emerges, it's probably
(12:19):
because the majority of patients in our dataset were collected from the Emergency General Surgery Service at Beth Israel Deaconess Medical Center, where patients generally are quite critically ill.
I'd imagine, and we're working on this now, that as we expand to other specialties and other pathologies, the distribution
(12:40):
of responses may change.
And also, the surgeon intuition may be different if your sample was predominantly a certain subset of surgeons with a certain shared mindset. You know, you may find that elective endocrine surgeons may be better or worse than acute care surgeons.
(13:01):
I think that's actually a really great point: one of the clear characteristics of this study is that it was heavy on EGS/ACS surgeons, who see a certain type of patient who is at high risk for having a post-surgical complication, and other specialties
(13:22):
may have different ways of evaluating risk. That was not captured as part of the study. I completely agree with that.
You also surveyed surgeons at various stages of experience. Can you tell us a little bit more about that?
Yeah, absolutely.
(13:42):
That's another very interesting aspect of our study. One of the subgroup analyses that we performed examined how the intuition of attending surgeons compared to the intuition of resident surgeons when
(14:03):
they were responding to this questionnaire. To get a better understanding of that, we built two separate models: one that used attending surgeon intuition only to predict postoperative outcomes, and a similar one that used resident surgeon intuition only to predict postoperative outcomes.
And we actually did find that
(14:25):
the attending-surgeon-only model did significantly outperform the resident model at predicting whether a patient would experience any postoperative morbidity and mortality. Now, while that was an interesting finding, the next step in our analysis
(14:46):
was to see whether the combined model that included intuition and NSQIP data significantly improved when attending-only intuition was incorporated, and unfortunately it didn't. So a model that was trained on both NSQIP data as well as attending
(15:06):
surgeon intuition data only still did not outperform a model that was trained on NSQIP data alone.
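The exact significance test isn't described in this conversation; one common way to compare two models' AUCs is a bootstrap over resampled patients, sketched below with synthetic predictions standing in for the attending and resident models.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)

# Synthetic stand-ins: outcomes plus predicted risks from two models,
# e.g. attending-intuition-based vs. resident-intuition-based.
y = rng.integers(0, 2, size=300)
pred_attending = np.clip(0.3 * y + 0.7 * rng.random(300), 0, 1)  # more informative
pred_resident = rng.random(300)                                   # near-random

# Bootstrap the AUC difference over resampled patients.
diffs = []
for _ in range(2000):
    idx = rng.integers(0, len(y), size=len(y))
    if len(np.unique(y[idx])) < 2:  # AUC needs both classes present
        continue
    diffs.append(roc_auc_score(y[idx], pred_attending[idx])
                 - roc_auc_score(y[idx], pred_resident[idx]))

lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"95% bootstrap CI for AUC difference: ({lo:.3f}, {hi:.3f})")
# A confidence interval that excludes 0 supports a real gap between the models.
```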
Great. Thank you.
So in reading your
manuscript, I was
intrigued by a comment that you made
in the introduction.
You mentioned how several common
cognitive biases and
(15:28):
counterproductive heuristics
affect clinical decision making.
Can you give us some examples?
In terms of the overall push for this study: there is an increasing amount of data, especially in the last few years, showing that the way in which we make decisions
(15:49):
can in some cases be affected by information that is external to the patient. So instead of taking every patient individually, what we do is fall back on experiences we had previously, and we use those to make decisions. And sometimes that can actually lead us in a
(16:11):
poor direction. One great example of this is something called recency bias.
There is a really nice paper that was published in Science a couple of years ago showing that the likelihood of a patient having a C-section was related to whether the OB-GYN had a complication with the previous patient.
(16:33):
And we all know this to be true: our recent experiences affect our ability to be objective in the way that we evaluate any given patient. So there are multitudes of examples, recency bias being one, where we're not fully using the
(16:54):
information that's in front of us; we're affected by our experiences in the past.
Yeah. Thank you for that example. I definitely recognize when I am affected by recency bias, and I'm sure that affects all of us.
Jayson, do you want to mention another great example,
(17:14):
if you're okay with it? I think this is such an important aspect of understanding: we sometimes overlook these things that we all know inherently, and yet when they're made explicit, they seem really clear to us. So, Jayson, do you have another example you want to bring up?
(17:36):
Yeah, absolutely. I think there are lots of really interesting examples. And, Dr. Brat, your example of recency bias is one that is, I think, very relatable to surgeons and surgical decision making. To take a step back: when we were first starting out this study and reading up on
(17:56):
other people's work on intuition more broadly, we found, as Dr. Brat just briefly mentioned, that for human intuition more broadly, not specific to surgeons, one big element of how we make sense of the world is these cognitive biases and heuristics. Essentially, they are
(18:17):
mental shortcuts that help us grapple with large amounts of data when we're confronted with it. And sometimes those shortcuts are very helpful. Sometimes they can be counterproductive. They're not always counterproductive, as the recency bias example might seem to suggest.
(18:37):
One very useful example of where heuristics are incredibly valuable is amongst trauma surgeons. When you're in a data-poor environment such as a trauma activation and you need to fall back on common patterns and
(18:58):
quick decision making, these types of mental shortcuts can be incredibly valuable and in some cases lifesaving. However, there are lots of documented examples that other people have studied in other fields where these kinds of mental shortcuts can potentially be counterproductive.
One additional example,
(19:18):
in addition to the recency bias that Dr. Brat just mentioned with obstetricians, is this idea of left-digit bias. There was a really beautiful example of this that was published in the New England Journal of Medicine a couple of years ago. What they wanted to expose was cognitive bias in surgical
(19:39):
decision making in patient selection for patients undergoing a CABG. And what they found is that the numerical age of the patient has a disproportionately large influence on patient
(20:01):
selection for a recommendation to undergo a CABG versus not undergo a CABG.
To add a little bit more detail to that: what they found was that patients who were 79, just about two weeks away from turning 80 years old, were significantly more likely to be recommended to undergo a CABG
(20:22):
than patients who had just turned 80 two weeks earlier. So this idea of being in your seventies versus being in your eighties, even though these two cohorts of patients were only four weeks apart in age and therefore probably not very physiologically different, played an undue role in a surgeon's ultimate recommendation to undergo
(20:43):
the surgery or not. And so that's another example of how these mental shortcuts that we fall back on when confronted with lots of data can sometimes influence how we make decisions, even when we're not making decisions based on the most relevant pieces of information about the patient sitting in front of us.
Yeah, I think the marketing people
(21:05):
have that all figured out, which is
why I go buy the avocado for
$4.99, but not when it's $5.
Right.
Exactly.
Great. Well, thanks.
Thanks for that wonderful example.
Yeah, I can see now that we're plagued by biases all around, and something like this risk
(21:26):
calculator is very useful to us to help overcome that.
I have a question about the methods. I know a little bit about statistics, but there was an unfamiliar term that I encountered in your methods. Can you explain to me: what is a multivariate lasso regression, and how does this differ
(21:48):
from a logistic or a linear regression?
So the idea here is that in certain situations we have a lot of variables and not a huge number of samples. In this case, we had to collect this data prospectively from
(22:09):
colleagues across the medical center, and so it was very difficult to get a large dataset, yet we have all the variables that exist in NSQIP that we can apply to this analysis. In those scenarios, where you don't have a huge number of samples but do have a large number of variables, you
(22:29):
have to have a way to filter down to the relevant variables. Lasso is a form of penalized logistic regression that essentially allows us to better isolate the subset of most important and relevant variables that are
(22:50):
related to the outcome, without dramatically overfitting the model.
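To show what that looks like in practice, here is a minimal sketch of L1-penalized (lasso) logistic regression using scikit-learn on synthetic data; the dimensions and penalty strength are illustrative assumptions, not the study's actual settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Many candidate variables, relatively few samples: the setting where
# lasso's variable selection guards against overfitting.
n_samples, n_features = 200, 60
X = rng.normal(size=(n_samples, n_features))

# The outcome truly depends on only the first three variables.
logits = X[:, 0] + 0.8 * X[:, 1] - 0.6 * X[:, 2]
y = (rng.random(n_samples) < 1 / (1 + np.exp(-logits))).astype(int)

# The L1 penalty drives most coefficients exactly to zero; a smaller C
# means a stronger penalty and a sparser model.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
selected = np.flatnonzero(lasso.coef_[0])
print("Variables kept by lasso:", selected)
```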
So yeah, actually, I encounter this a lot when I read papers. With what little statistics knowledge I have, I seem to remember that if you don't have enough events per variable, then you are at risk of what we call overfitting the model, meaning
(23:12):
that your model is only specific to your particular sample and is less generalizable outside it. So whenever I read a paper where they say, oh, well, we looked at these 30 variables and plugged them into our regression analysis, and yet there are only a total of, for example, 50 or 60 events, that makes me really, really
(23:34):
worried that the results are not generalizable.
So you're saying that lasso is a way
that we can sort of prune through
all of these potential
candidate variables and focus on
the ones that are the highest yield?
That's right. So it really is a method that has more applicability when you have a
(23:54):
very large number of dimensions, in other words a very large number of variables, and you don't have a huge number of outcomes, so that you get down to the smallest number of variables that still represent all the variance and all the effect that exists in the model.
(24:15):
Great. Thank you. Well, I'm going to start using lasso on all of my regressions from now on.
So there are, obviously, downsides to using any statistical method, and there are alternatives that don't have some of the downsides of lasso. The thing about lasso, and you don't necessarily need
(24:36):
to go deep into detail, but the downside of these methods is that they might, for example, discard one variable in favor of another if the two of them have equivalent value. So if your goal is to identify all the variables that are relevant, lasso may not be the most valuable,
(24:56):
but if your goal is just to understand what the performance of the model will be without overfitting it, then lasso is an effective way of doing that.
Great. Thank you. So I want to ask you: how do you envision the results of your study being applied clinically
(25:16):
by future surgeons? And if it's not intended to be used by surgeons in the clinic or at the bedside, how do you intend it to be used?
Thanks. Yeah, I think that's a really important question. When we took a step back and thought about the results of our study, I think one of the most important
(25:38):
takeaways comes from the finding that when you combine the intuition data with the NSQIP data, it's no better than the NSQIP data alone. I think one of the biggest clinically actionable takeaways from this finding is that
(25:59):
in the national NSQIP risk calculator, as far as it relates to the surgeon adjustment of risks, our study seems to suggest that there is actually no statistical value to that Surgeon Adjustment of Risks
(26:20):
tool. Maybe there is some psychological value, such as better surgeon buy-in when surgeons are able to exert some influence over what the model's output is. But I think our study seems to suggest that there is minimal to no statistical value to that actual adjustment.
And so what does that
(26:42):
mean, then, clinically, for surgeons when they're making decisions about their patients? I think one of the most important clinical implications is that, quite simply, the NSQIP risk calculator is a very, very powerful tool that we should be consulting often to augment our decision making,
(27:02):
as far as it relates to the patients that we're taking to the operating room. It shouldn't be viewed as a tool that supplants our decision making in any form. Instead, it should be viewed as a tool that can be used to augment our decision making and essentially free
(27:23):
us to use our judgment in scenarios where our intuition is more powerful.
Dr. Brat, any thoughts?
Any additional thoughts?
Yeah, I think Jayson addressed a really important element here, which is the fact that
(27:43):
we need to think of these tools as what they are, which is tools. There are scenarios where we've been comfortable adopting tools because we know they make our lives better: we don't divide our tissue using our fingers anymore; we use a pair of scissors, because we recognize that that tool is better at that
(28:04):
specific task.
But we have difficulty applying
that same kind of methodology
to digital health solutions
because we believe that somehow
it's undermining our ability to
do the thing that we're really good
at, which is to think about
our patients and make difficult
decisions about them.
And the reality is that
(28:26):
these tools don't do that at all.
They help us
make better decisions and
apply our
skills, our judgment and our
understanding in the scenarios
that really matter and then allow
these algorithms to do what they do
really well, which is take a huge
amount of data and organize it into
(28:46):
patterns.
So I think the implication of this work, generally, is that at least in the preoperative period, we should continue to use surgical risk calculators. They really have significant value in that respect, because
(29:07):
they're more likely to be accurate, certainly for the populations that we evaluated in this study.
And then the one other thing that I would say is that I think this work naturally leads to the next question, which is: where does surgeon intuition matter the most? That's work that we're
(29:29):
currently looking at. Not only are we looking at differences between specialties, but are there different patient populations that are better served by surgeon intuition over a risk calculator? And then what we've also been looking at, which I think has significant impact, is the
(29:50):
analysis of what the risk of a patient is after the surgeon has been inside of their belly. Does a surgeon have an increased understanding of the risk profile of a patient once they've actually operated? Certainly our preliminary results suggest that that's the case.
So I think that overall, to answer
(30:11):
your question, the importance of this type of work is to both understand the value of these digital health tools and understand their use cases, and then be confident in saying that the use of these tools doesn't undermine our ability as surgeons. It actually augments it.
Great. Thank you very much.
(30:32):
I really appreciate the time spent speaking with Doctors Marwaha and Brat today. I encourage everyone to read this excellent paper, which was first published online several months ago but is now in print in the June 2023 issue of the Journal of the American College of Surgeons.
Thank you for listening to The
(30:52):
Operative Word.
Please send us any feedback at postmaster@FACS.org.
Thank you for listening to the
Journal of the American College of
Surgeons Operative Word Podcast.
If you've enjoyed today's episode,
spread the word on social media by
using the hashtag
#JACSOperativeWord.
(31:13):
Subscribe to The Operative Word
wherever podcasts are available
or listen on the American College of
Surgeons website at
FACS.org/Podcast.