Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:02):
Welcome back to Safe and Sound AI.
You know, we're all about making sure those AI models of
yours are behaving themselves.
And, uh, today we're diving deep into this, uh, this debate that's been
brewing in the world of AI observability.
Yeah, a really interesting one.
Absolutely.
It's all about two main approaches: observing metrics
(00:22):
versus observing inferences.
Mm, big question is...
Which one is better for your specific AI models?
Right, which one's right for you?
We're gonna break it all down, get into the nitty gritty, look
at what the experts are saying.
Exactly.
So, um, I think the thing that's fascinating about this is both approaches,
you know, they have the same goal: make sure your models are reliable,
(00:43):
they're doing what they're supposed to do, but the way they go about it,
that's where things get interesting.
So let's just jump right into it.
What's the core of these methods?
Okay, so, when we talk about observing metrics, think about it like you're
getting, like, an executive summary.
Like the TLDR.
Exactly.
Of how healthy your model is.
Right.
It's all about looking at this, um, pre-aggregated data.
(01:05):
So we're talking things like accuracy, drift, performance, all this is being
calculated, like, right next to the model.
Got it.
And just that summary.
Those results get sent to your platform, so you're not getting all the nitty
gritty details, just the overall picture.
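To make that concrete, here's a minimal sketch of the metrics-based pattern: raw prediction/label pairs get reduced to a small summary near the model, and only that summary would be shipped to the platform. The record schema and field names here are made up for illustration, not any particular platform's API.

```python
# Hypothetical sketch: metrics-based observability. Raw predictions stay
# on the model host; only a small aggregate leaves it. The "pred"/"label"
# record schema is illustrative.

def aggregate_metrics(records):
    """Reduce raw (prediction, label) pairs to a privacy-friendly summary."""
    n = len(records)
    correct = sum(1 for r in records if r["pred"] == r["label"])
    return {
        "count": n,               # volume only, no content
        "accuracy": correct / n,  # one number instead of n raw rows
    }

# Only this small dict would be shipped to the observability platform;
# the raw inputs (and any PII in them) never leave the model host.
summary = aggregate_metrics([
    {"pred": 1, "label": 1},
    {"pred": 0, "label": 1},
    {"pred": 1, "label": 1},
    {"pred": 0, "label": 0},
])
print(summary)  # {'count': 4, 'accuracy': 0.75}
```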
So it's like you're looking at your car's dashboard, getting a quick read
on speed, fuel level, temperature, but you're not seeing, like, how the
(01:28):
engine's actually working, right?
Exactly.
Perfect analogy.
What are the upsides to this approach?
Why would someone choose to just get that high-level view?
Well, there are a couple of really big advantages, and I think this is
especially true for organizations where security is a top priority.
Okay.
So first off, think about it.
Because that data is being aggregated before it ever hits the platform,
(01:49):
you know, you're drastically reducing the risk of exposing any
PII, any sensitive information.
That's huge, right?
Especially in today's world where data privacy is everything.
Exactly.
So big win there.
What's the second advantage?
Cost savings.
Straight up.
Makes sense.
You're transferring less data, you're storing less data, and
that often translates to just lower software costs overall.
(02:11):
I like that.
Bonus points.
It's a good bonus.
So we've got the dashboardview with metrics.
Is observing inferences more like popping the hood, getting your hands dirty?
Yes, you got it.
With inference-based observability, we're sending all the raw stuff, the inputs, the
outputs, every prediction the model makes, straight to the platform to be analyzed.
(02:32):
Okay, so
lots of data.
It can seem overwhelming, but that raw data, that's the key.
That's the treasure trove.
It is.
You unlock true root cause analysis.
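As a rough sketch of the contrast with the aggregated approach, inference-based observability captures each raw request and response rather than a summary. The logging function, event schema, and field names here are hypothetical stand-ins for a real platform's ingest API.

```python
import json
import time

# Hypothetical sketch: instead of a local aggregate, every raw
# inference event is captured for later analysis on the platform.
inference_log = []  # stand-in for a real observability platform's ingest API

def log_inference(features, prediction, model_version):
    """Record one raw inference: full inputs, full output, and metadata."""
    event = {
        "ts": time.time(),               # when it happened
        "model_version": model_version,  # which model produced it
        "features": features,            # the raw inputs, not a summary
        "prediction": prediction,        # the raw output
    }
    inference_log.append(json.dumps(event))

log_inference({"credit_score": 580, "savings": 45000}, "approve", "v1.2")
log_inference({"credit_score": 720, "savings": 1200}, "deny", "v1.2")

# Because the raw events survive, you can later ask *why* something
# happened, e.g. filter to the exact inputs behind a bad stretch.
print(len(inference_log))  # 2 raw events available for root cause analysis
```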
Now we're talking.
See, this is where things get really interesting for me, because we all know
model performance can drift over time.
But figuring out why, that's a holy grail, and that's what
(02:53):
inferences give you, right?
Exactly.
So let's say your model's accuracy just takes a nosedive.
Okay.
Metrics.
They'll tell you it happened.
That's it.
But with inferences, you can zoom in and figure out why.
Was there a certain type of input that messed things up?
Or is performance tanking for, like, a certain group of users?
(03:15):
These are the questions inference-based observability lets you answer.
So it's not just being reactive, it's proactive.
You're getting to know your model at a much deeper level, and you
can actually go in and improve it.
Absolutely.
I'm guessing this also saves data scientists a ton of time.
It does.
Think about it.
It's like trying to debug some super complex software, but you
only have a couple error codes.
(03:35):
Nightmare.
Total nightmare.
Right.
With inferences, you've got that full picture so you can
pinpoint the root cause fast.
No more wasting days sifting through data in a million different notebooks.
And that time, that is precious.
That's time they can spend developing new models, refining existing ones, you know,
working on all those high value tasks.
What else can we do with this approach?
(03:56):
Well, I mean, we've talked about root cause analysis, time
savings, but there's a lot more.
Like, you can do segmentation monitoring, which means you're analyzing performance
across specific groups, cohorts.
This is especially helpful if you think certain segments might be harder
for your model to handle accurately.
Can you give us an example of that?
Like, in the real world, how's that used?
(04:18):
Sure.
Imagine you've got a model predicting, let's say, loan repayment likelihood.
Yeah.
Now, some applicants, they might have low credit scores, but they
have a lot of money in the bank.
Right.
Those are tricky.
Makes sense.
Your model might struggle with those, but inference-based observability, it lets
you zoom in on how that specific segment is performing, compare it to others.
(04:41):
You can even set alerts if accuracy drops below a certain level.
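A tiny sketch of that segmentation-plus-alerting idea, using the loan example from the discussion. The records, the segment rule, and the 0.80 alert threshold are all made up for illustration.

```python
from collections import defaultdict

def segment_accuracy(records, segment_fn):
    """Group raw inference records into cohorts and compute per-cohort accuracy."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for r in records:
        seg = segment_fn(r)
        totals[seg] += 1
        hits[seg] += int(r["pred"] == r["label"])
    return {seg: hits[seg] / totals[seg] for seg in totals}

def alerts(per_segment, threshold=0.80):
    """Return the cohorts whose accuracy fell below the alert threshold."""
    return [seg for seg, acc in per_segment.items() if acc < threshold]

# The tricky cohort from the example: low credit score but high savings.
def segment_fn(r):
    if r["credit_score"] < 650 and r["savings"] > 20000:
        return "low_score_high_savings"
    return "other"

records = [
    {"credit_score": 580, "savings": 45000, "pred": 1, "label": 0},
    {"credit_score": 590, "savings": 60000, "pred": 1, "label": 1},
    {"credit_score": 720, "savings": 1200,  "pred": 1, "label": 1},
    {"credit_score": 750, "savings": 900,   "pred": 0, "label": 0},
]

per_segment = segment_accuracy(records, segment_fn)
print(per_segment)          # {'low_score_high_savings': 0.5, 'other': 1.0}
print(alerts(per_segment))  # ['low_score_high_savings']
```

Note this only works because the raw, per-record inferences were kept: an aggregate accuracy over all four records (0.75) would have hidden the struggling cohort entirely.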
Wow.
So it's not just that bird's-eye view, it's getting really granular.
Exactly.
I imagine that's really valuable for, you know, applications where fairness
and accuracy are super critical.
Yeah.
Anything else?
Oh, we're just getting started.
It also enables techniques like, um, Shapley values, LIME.
(05:02):
These are used to get what we call local explanations.
What does that mean?
It means you're not just understanding how the model behaves overall, you're
understanding why it made a specific prediction for a specific person.
Hold on.
Are we saying we can actually peek inside that black
box and understand why it approved that loan or recommended a particular product?
(05:23):
Exactly.
You're seeing the inner workings.
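The real techniques (Shapley values, LIME) are far more principled than this, but a crude sketch conveys the idea of a local explanation: take one prediction, swap each feature toward a baseline value, and see how much the score moves. The toy scoring model, the applicant, and the baseline values are all invented for illustration.

```python
def score(features):
    """Toy stand-in for a black-box loan model's approval score."""
    return (0.5 * features["credit_score"] / 850
            + 0.5 * min(features["savings"] / 50000, 1.0))

def explain(features, baseline):
    """Attribute one prediction to its features by replacing each feature
    with a baseline value and measuring how much the score changes."""
    base_score = score(features)
    contributions = {}
    for name in features:
        perturbed = dict(features, **{name: baseline[name]})
        contributions[name] = base_score - score(perturbed)
    return contributions

# One specific applicant: weak credit score, strong savings.
applicant = {"credit_score": 580, "savings": 45000}
baseline = {"credit_score": 650, "savings": 5000}  # a "typical" applicant

contrib = explain(applicant, baseline)
# For *this* person, the high savings pushed the score up (+0.40) and the
# low credit score pulled it down a little: the "why" behind one decision.
print(contrib)
```

Libraries like `shap` and `lime` do this properly (averaging over feature coalitions, or fitting a local surrogate model), but the output has the same shape: a per-feature contribution for one individual prediction.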
Wow.
That's incredible.
And that level of transparency, it's not just about performance.
It's about trust.
Right.
It's about trust with users, regulatory compliance.
Yeah.
It's about demystifying AI.
It's about helping us grasp that why behind those decisions
that impact people's lives.
Exactly.
Okay.
This is powerful stuff.
(05:43):
It really seems like observing inferences, it takes model
monitoring to a whole new level.
But there's got to be a trade off, right?
All this extra data, the analysis, it can't be cheap.
You're right.
Platforms that use inferences, they do tend to be more complex, potentially
more expensive up front compared to those metrics-based solutions.
But you have to think about it.
What's the cost of not having this level of insight?
(06:04):
Yeah, that's a good point.
Yeah.
I mean, think about what could happen if a model starts misbehaving
and nobody notices it.
You could have financial losses, reputational damage, even legal issues.
Exactly.
All those things.
Suddenly that upfront investment in a more powerful observability
platform seems like a bargain.
Exactly.
And remember, you know, it's not just about
mitigating risk, it's about maximizing the good stuff, the
(06:27):
opportunity, the insights you get from observing inferences, they can lead to
better performance, which means higher revenue, more efficiency, a stronger
advantage over your competitors.
It's about moving from just monitoring to actively optimizing your AI.
Okay, so we've got these two approaches.
Metrics and inferences, both have their pros and cons.
(06:48):
I mean, it's kind of like choosing between, I don't know, a Swiss Army knife
and a, like, specialized surgical tool.
Yeah, I like that.
Both super useful, but for different things, right?
But how do our listeners figure out which one is right for them?
That's the million dollar question, isn't it?
It is.
What questions should they be asking themselves?
(07:09):
Well, I think, first and foremost, you gotta think about the data.
How sensitive is the information your models are handling?
If you're dealing with, like, really sensitive stuff, financial
data, that kind of thing.
Right.
Then those privacy advantages of the metrics-based approach,
that might be super appealing.
Yeah, I mean, you're aggregating that data before it even gets to the platform.
(07:29):
So that's an extra layer of protection.
Exactly.
Especially with all the data privacy regulations these days.
For sure.
But what if you're in an industry where understanding the why behind your
model's decisions is mission critical?
Like, let's say healthcare, where you've got a model that's helping diagnose
patients or recommending treatment plans.
Right, high stakes situations.
Exactly.
(07:50):
Wouldn't you need that deep dive that inferences provide in those cases?
Absolutely.
When the stakes are that high, being able to do root cause analysis, identify
those tricky cohorts, understand individual predictions, that's not
just a nice-to-have, it's a must-have.
And that's where inference-based observability really shines.
So the level of risk associated with your AI application.
(08:12):
That's a major factor.
The higher the stakes, the more likely you need those granular
insights from inferences.
Exactly.
But we got to talk about cost too, right?
And we mentioned it earlier, but for companies that are, you know,
working with tight budgets, wouldn't that lower cost of a metrics-based
solution be a big advantage?
It's a good point.
Cost is always a consideration, but I think it's about looking at the
(08:34):
big picture, weighing those upfront savings against the potential cost of,
you know, not having enough visibility into your models down the line.
That's a really good point.
Like, imagine your model starts making biased decisions, right?
And that leads to unfair outcomes for certain groups of customers.
Oof.
That's not good.
Right.
The damage to your reputation, the potential legal issues, that could end up
(08:55):
costing way more than any initial savings you got on your monitoring platform.
Yeah.
It's about thinking long term.
It's easy to just look at that initial price tag, but we have to remember that
AI observability, it's an investment.
It's about the long term success and sustainability of all your AI initiatives.
Exactly.
And at the end of the day, the choice really comes down to your
(09:17):
organization, you know: where are you in your AI journey?
What are your goals?
If you're just getting started with AI, you know, you're focused on
simpler use cases, a metrics-based approach might be all you need.
That's a good starting point.
Yeah.
But as your AI footprint grows, you start tackling those more complex
challenges, you're going to need the power and flexibility
of inference-based observability.
(09:38):
So let's say a company decides, okay, inference-based observability,
that's the way to go for us.
What are some real world examples of how this approach is being
used to solve problems and, you know, make things better?
Oh, there are tons.
We've seen companies using it to, like, really improve
their fraud detection models.
Yeah.
They're able to pinpoint those specific transaction patterns that
(10:01):
were slipping through the cracks.
Interesting.
Yeah, or, you know, optimizing those recommendation engines by really
understanding how individual users interact with different types of
content, and even, uh, we've seen it used to improve medical diagnosis
models, you know, by identifying subtle biases that were leading to inaccurate
predictions for certain patient groups.
(10:21):
So it's really making a difference across industries.
It's great to see those real world results, but it can't all
be sunshine and roses, right?
Are there any downsides to observing inferences that we should be aware of?
Well, I mean, of course, nothing's perfect.
One potential downside is that, you know, storing and processing
all that raw inference data, it can be a lot, especially for
(10:42):
applications with high volume.
Yeah, lots of data to manage.
It could mean higher storage costs, you know, put a bigger strain on your
platform's computational resources.
Right.
Okay, so what level of transparency and understanding
do we need to feel comfortable and confident in AI's decisions?
Are we okay with just seeing the what, or do we need to understand the why?
(11:05):
You know, that's a really good question.
And, uh, I think it really speaks to the bigger picture here.
It's about trust.
Absolutely.
If we want people to embrace AI, to really see its potential, we
need to make it understandable.
Not just for the data scientists, but for everyone.
Yeah.
It's about bridging that gap, you know, between those complex inner workings
of AI and the people who are actually being impacted by its decisions.
(11:28):
Exactly.
I mean, think about it.
If people can see that our models are making decisions in a way that's clear
and makes sense, they're going to be much more comfortable using them.
Right.
It's about empowering users, regulators, developers.
Everybody should be able to ask those tough questions, demand explanations,
and really shape the future of AI in a way that aligns with our values.
I completely agree.
(11:49):
So while this whole debate between observing metrics and observing
inferences might seem like it's just a technical choice, it actually
has much bigger implications.
It does.
It's about how we build AI, how we deploy it, how we govern it in the years ahead.
It's about choosing a philosophy for AI development, one that puts
transparency, accountability, and explainability front and center.
(12:11):
This has been a truly fascinating deep dive into the world of AI observability.
Hopefully you're all walking away with some new insights and knowledge
to help you make informed decisions about how you're monitoring and
managing your own AI models.
And remember, this is just the beginning.
As AI keeps evolving, so will the tools and techniques we use to understand it.
It's an ongoing journey.
(12:31):
Stay curious, stay informed, and stay engaged in this important
conversation about how we can harness the power of AI for good.
I couldn't have said it better myself.
This podcast was brought to you by Fiddler AI.
For more on monitoring LLMOps performance, see the article in the description.