May 30, 2025 • 9 mins

In this episode of Safe and Sound AI, we dive into the challenge of drift in machine learning models. We break down the key differences between concept and data drift (including feature and label drift), explaining how each affects ML model performance over time. Learn practical detection methods using statistical tools, discover how to identify root causes, and explore strategies for maintaining model accuracy.

Read the article by Fiddler AI and explore additional resources on how AI Observability can help build trust into LLMs and ML models.


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:01):
Welcome to Safe and Sound AI.
Today we're tackling something, uh, really critical if you're working with
ML models in production (00:07):
model drift.
Yeah, absolutely.
It's something everyone eventually runs into.
We wanna dig into why even models that seem super accurate at first can, well, degrade over time.
It impacts things like fraud detection, loan approvals, even ad targeting.
Right.
And it can cause some serious issues if you're not watching for it.
So think of this as getting a handle on how to keep those models performing well.

(00:31):
Exactly.
It's basically because the data coming in starts to look different from what the model originally learned on.
Precisely.
It's not usually a flaw in the original training.
More like the world changes, you know, the data patterns shift.
So our goal today is to kind of unpack the different ways drift shows up,
And how to spot it, how to detect it effectively,
And crucially, what you can actually do about it.

(00:53):
Right.
Let's get into it.
Okay.
So drift isn't just one single thing.
We usually talk about two main categories, right?
Concept drift, and then data drift, which includes feature drift and label drift.
That's the main breakdown.
Yeah, and it's worth remembering, they're not always separate.
You can definitely have both happening at once.
Good point.
So concept drift.

(01:13):
It's really about a change in the actual relationship between the inputs, the features, and what you're trying to predict: the outcome.
Let's use a loan application example.
A model looks at income, credit score, age, stuff like that to predict risk.
Right.
And it might work perfectly fine when the economy is stable.
But then, uh-oh, maybe a big recession hits.

(01:35):
Suddenly that same income level, that same credit score, it might mean something different in terms of risk.
The underlying concept of creditworthiness has shifted because of the economy.
So the model's learned rules aren't quite right anymore.
The boundary it drew between approve and reject is now in the wrong place for the new reality.
Precisely.
Even if the applicant's details look similar on paper to ones approved

(01:57):
before, the economic context changes the outcome. That's concept drift.
It's like the definition of risky has changed.
You got it.
The model becomes outdated even if the input data looks superficially similar.
Okay, that makes sense.
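To make that loan example concrete, here is a purely synthetic sketch (not from the episode) in Python: the applicant features keep the same distribution, but the relationship between those features and default risk, P(y given x), shifts in a simulated recession, so a model trained on the old relationship loses accuracy. All numbers are invented for illustration.

```python
# Toy illustration of concept drift: P(x) stays the same, but P(y | x) changes,
# so a model trained before the shift degrades afterward. All data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def sample_applicants(n):
    # Same feature distribution in both periods: income (k$) and credit score.
    income = rng.normal(60, 15, n)
    score = rng.normal(680, 50, n)
    return np.column_stack([income, score])

def label_default(X, recession=False):
    # The *relationship* between features and default changes in a recession:
    # the same income and score now carry more risk (the offset shifts).
    risk = -0.03 * X[:, 0] - 0.01 * X[:, 1]
    offset = 9.5 if recession else 8.0
    p_default = 1 / (1 + np.exp(-(risk + offset)))
    return (rng.uniform(size=len(X)) < p_default).astype(int)

X_train = sample_applicants(5000)
model = LogisticRegression(max_iter=1000).fit(X_train, label_default(X_train))

X_new = sample_applicants(5000)
y_stable = label_default(X_new, recession=False)     # old concept
y_recession = label_default(X_new, recession=True)   # new concept

print("accuracy, stable economy:  ", accuracy_score(y_stable, model.predict(X_new)))
print("accuracy, after recession: ", accuracy_score(y_recession, model.predict(X_new)))
```

The inputs look the same in both runs; only the labeling rule changes, which is exactly why the model's old decision boundary stops working.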
Now, what about data drift?
Data drift is, uh, a bit more general.
It just means the statistical properties of the data your model sees in production

(02:20):
are different from the training data.
So the data distribution changes.
Exactly.
And the key difference is this change in distribution might or might not actually affect that core relationship we talked about with concept drift.
Okay.
And you mentioned two main types, feature drift and label drift.
Yep.
Let's take feature drift first.
This is about changes in the distribution of the model's inputs.

(02:41):
So P(X), the probability of seeing certain input features.
Imagine your bank runs a big marketing campaign in, say, Texas.
Suddenly you get way more applications from Texas than you used to.
Ah, so the distribution of the state or region feature has changed dramatically.
Exactly.
Even if the income or credit scores within Texas are similar to your training

(03:02):
data, the mix of inputs has shifted.
That's feature drift.
Got it.
That makes sense.
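As a rough sketch of how you might watch for that kind of feature drift in practice (details are illustrative, not from the episode): compare the production share of each category of an input, such as the applicant's state, against the training baseline. Here that comparison is summarized with the population stability index, a common choice, though not one named in the episode; the feature values, counts, and threshold are all made up.

```python
# Rough sketch: detect feature drift in a categorical input (e.g. applicant state)
# by comparing its production distribution to the training baseline.
# Feature values, sample sizes, and the alert threshold are illustrative.
import numpy as np

def category_proportions(values, categories):
    counts = np.array([(values == c).sum() for c in categories], dtype=float)
    return counts / counts.sum()

def population_stability_index(p_train, p_prod, eps=1e-6):
    # PSI = sum (prod - train) * ln(prod / train); larger means a bigger shift.
    p_train = np.clip(p_train, eps, None)
    p_prod = np.clip(p_prod, eps, None)
    return float(np.sum((p_prod - p_train) * np.log(p_prod / p_train)))

categories = np.array(["TX", "CA", "NY", "other"])

# Training data: applications spread fairly evenly across regions.
train_state = np.random.default_rng(1).choice(categories, size=10_000, p=[0.2, 0.3, 0.2, 0.3])
# Production after a Texas marketing campaign: the Texas share jumps.
prod_state = np.random.default_rng(2).choice(categories, size=2_000, p=[0.55, 0.2, 0.1, 0.15])

psi = population_stability_index(
    category_proportions(train_state, categories),
    category_proportions(prod_state, categories),
)
print(f"PSI for 'state' feature: {psi:.3f}")
if psi > 0.2:  # often-quoted rule of thumb; tune for your own setting
    print("Feature drift alert: the mix of incoming applications has shifted.")
```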
So that's one type.
What's the other main kind of data drift?
The other key type is label drift.
This focuses on the distribution of the model's predictions, P(Y).
So if our loan model suddenly starts predicting approve much more often than it used to

(03:22):
That could be label drift.
You're seeing a shift in the proportion of predicted outcomes, maybe more approvals, maybe more rejections compared to the baseline.
And that could be caused by feature drift, like if suddenly only super-qualified people started applying.
It could be, yes, feature drift can definitely lead to label drift, but label drift is worth monitoring on its own, as it can sometimes signal other issues too.

(03:45):
It tells you something has changed in the output pattern.
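And a minimal sketch of what label drift monitoring could look like, assuming you log the model's predictions: track the share of approve predictions in a rolling window against a baseline rate measured at validation time. The baseline rate, window size, and alert threshold below are invented for illustration.

```python
# Minimal sketch: monitor label drift by comparing the share of "approve"
# predictions in a recent window to a baseline share from validation time.
# All numbers and the threshold are illustrative.
from collections import deque

BASELINE_APPROVE_RATE = 0.62   # measured once, e.g. on the validation set
WINDOW = deque(maxlen=5_000)   # rolling window of recent predictions
ALERT_DELTA = 0.10             # flag if the rate moves by more than 10 points

def record_prediction(predicted_label: str) -> None:
    WINDOW.append(1 if predicted_label == "approve" else 0)

def check_label_drift() -> None:
    if len(WINDOW) < WINDOW.maxlen:
        return  # wait until the window is full
    recent_rate = sum(WINDOW) / len(WINDOW)
    if abs(recent_rate - BASELINE_APPROVE_RATE) > ALERT_DELTA:
        print(f"Label drift alert: approve rate {recent_rate:.2f} "
              f"vs baseline {BASELINE_APPROVE_RATE:.2f}")

# In serving code you would call record_prediction(...) per request
# and check_label_drift() on a schedule.
```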
Okay, so drift happens.
Concept drift, data drift. How do we actually spot this in a live system?
Does it happen slowly or all at once?
It really varies.
Sometimes it's abrupt.
Think about, um, the start of the COVID-19 pandemic.
Consumer behavior changed almost overnight.
That caused sudden massive drift for many models.

(04:08):
Right?
I remember reading about that huge impact.
But other times it's really gradual, a slow creep over months.
Or it could even be cyclical, maybe seasonal patterns.
Which means we need to be constantly looking out for it.
Absolutely. Continuous monitoring is key, and how we detect it often boils down to whether we have ground truth labels for the new data.
Okay.
So if we do have labels, we eventually find out if the loan was

(04:31):
actually good or bad, for instance.
Then you can rely on standard performance metrics, track your accuracy, precision, maybe false positive rate, AUC, whatever makes sense for your model.
And if those metrics start to consistently drop, that's a big red flag for drift.
Definitely.
You might even build specific models just to detect that kind

(04:51):
of performance degradation.
It's like a supervised learning problem on top of your main model.
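A rough sketch of that supervised approach, assuming delayed ground-truth labels eventually get joined back to the model's scores: compute the metric over time windows and alert when it falls a set margin below the level measured at deployment. The metric choice (AUC), window size, and thresholds are assumptions for illustration.

```python
# Rough sketch: supervised drift detection once ground-truth labels arrive.
# Compute AUC per time window and alert when it drops well below the level
# measured at deployment. Metric, window size, and thresholds are illustrative.
import pandas as pd
from sklearn.metrics import roc_auc_score

BASELINE_AUC = 0.86   # AUC on the holdout set at deployment time (example value)
ALERT_MARGIN = 0.05   # alert if a window falls more than this below baseline

def windowed_auc(df: pd.DataFrame, freq: str = "7D") -> pd.Series:
    """df needs columns: timestamp, score (model output), label (ground truth)."""
    aucs = {}
    for window_start, window in df.groupby(pd.Grouper(key="timestamp", freq=freq)):
        if window["label"].nunique() == 2:  # AUC needs both classes present
            aucs[window_start] = roc_auc_score(window["label"], window["score"])
    return pd.Series(aucs)

def check_performance_drift(df: pd.DataFrame) -> None:
    for window_start, auc in windowed_auc(df).items():
        if auc < BASELINE_AUC - ALERT_MARGIN:
            print(f"Possible drift: AUC {auc:.3f} in week starting {window_start}")
```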
But what if we don't have labels right away, like in real-time fraud detection?
That's common.
In that case, you have to shift focus to the data distributions themselves.
You compare the stats of your incoming data to your original training data.
And there are statistical tools for this.
You can use distance metrics like Kullback-Leibler divergence, or

(05:13):
Jensen-Shannon divergence to see how different two distributions are,
or statistical tests.
Yep.
Things like the Kolmogorov-Smirnov test.
It helps tell you if two samples likely came from the same underlying distribution.
Each test has different strengths and assumptions, so you choose what fits.
You could even potentially build unsupervised models just to flag these

(05:34):
distributional shifts, I suppose.
For sure.
That's another approach.
The main goal is just detecting that something has changed in the data stream compared to what the model was trained on.
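For the statistical checks just mentioned, here is a minimal SciPy sketch comparing one numeric feature between a training sample and recent production traffic; the data and alert thresholds are synthetic. Note that SciPy's jensenshannon returns the Jensen-Shannon distance, the square root of the divergence.

```python
# Minimal sketch: compare one numeric feature's training vs. production
# distribution using the Kolmogorov-Smirnov test and Jensen-Shannon distance.
# Data and thresholds are synthetic / illustrative.
import numpy as np
from scipy.stats import ks_2samp
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(0)
train_income = rng.normal(60, 15, 10_000)   # feature values seen at training time
prod_income = rng.normal(52, 18, 2_000)     # production traffic has shifted

# Kolmogorov-Smirnov: a small p-value means the samples are unlikely
# to come from the same underlying distribution.
stat, p_value = ks_2samp(train_income, prod_income)
print(f"KS statistic {stat:.3f}, p-value {p_value:.2e}")

# Jensen-Shannon: bin both samples on a shared grid, then compare the histograms.
bins = np.histogram_bin_edges(np.concatenate([train_income, prod_income]), bins=30)
p_train, _ = np.histogram(train_income, bins=bins, density=True)
p_prod, _ = np.histogram(prod_income, bins=bins, density=True)
js_distance = jensenshannon(p_train, p_prod, base=2)  # sqrt of JS divergence, in [0, 1]
print(f"Jensen-Shannon distance {js_distance:.3f}")

if p_value < 0.01 or js_distance > 0.1:  # illustrative thresholds, tune per feature
    print("Distribution drift detected for 'income'.")
```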
Okay, so we've detected drift.
Alarm bells are ringing.
But just knowing it's happening isn't enough, is it?
Not at all.
That's really just the first step. The critical next step, and often

(05:55):
the harder one is figuring out why.
What's the root cause?
Because the solution depends entirely on the cause.
Absolutely.
And the causes can be varied.
It might be a genuine change in the real world, like that economic shift causing concept drift.
Or maybe just changes in how people are using the product
leading to feature drift.
That happens too.
But crucially, we also need to consider data integrity issues.

(06:17):
Ah, like bugs, yeah. Things breaking.
A bug in the frontend data capture, an error in how data is transformed in the backend, maybe an API change broke something, or just general pipeline degradation.
Any of those can look like drift.
But it's not really reflecting a change in the underlying patterns, just bad data getting through.
Right.

(06:38):
So when you detect drift, the first thing you should probably do is talk to your engineering team.
Check for recent code changes, product updates, known
issues in the data pipeline.
Mm-hmm.
That kind of stuff.
Yeah.
Rule out the infrastructure problems first.
That's often the, uh, the low-hanging fruit.
Okay.
Let's say it's not an obvious bug.
What then?
Then you need to dive deeper into the model analytics.

(06:58):
When did the drift start?
Which features are most affected?
Is it concept drift, feature drift, or label drift?
Use those statistical tools we mentioned.
And then the fix depends on that root cause.
Exactly.
If it was a data integrity bug, fix the bug.
If it was, say, feature drift due to a known product change, but the underlying concept is stable, maybe just updating some data processing or

(07:22):
refreshing feature stats is enough.
But if you've confirmed it's concept drift
And usually you need to retrain the model. The old relationships it learned are no longer valid.
You need fresh, representative data that reflects the new reality.
So the big takeaway here seems to be that drift is, well, inevitable.
Pretty much.
Any model deployed in the real world is going to face this eventually.

(07:44):
The data just doesn't stand still.
Which means awareness, continuous monitoring, and having the right tools and processes in place are just essential.
It's not a nice-to-have.
It's fundamental, especially if you think about responsible AI principles.
Maintaining model accuracy and reliability over time is responsible AI.
You can't just deploy and forget.

(08:04):
Okay, so just to wrap up our deep dive today.
We've looked at model drift, breaking down into concept drift, where the underlying meaning changes, and data drift, shifts in the data distributions, like feature or label drift.
We talked about how to detect it using performance metrics if you have labels, or statistical distribution comparisons if you don't.
And really emphasized that finding the root cause is key.

(08:25):
Is it the real world changing, product usage shifting, or maybe just a data pipeline issue?
Because the solution, whether it's a bug fix, a pipeline update, or a full model retrain for concept drift, depends entirely on that why.
It really underscores that ML in production is a dynamic, ongoing process.
Constant vigilance required.

(08:46):
So maybe a final thought for you to consider: how dynamic is the data your models rely on, and are your current processes really equipped to handle the inevitable evolution of those patterns over time?
This podcast is brought to you by Fiddler AI.
For more on monitoring ML drift, or more details on the concepts we discussed, see the article in the description.