
November 7, 2024 38 mins

Kevin Werbach speaks with Krishna Gade, founder and CEO of Fiddler AI, on the state of explainability for AI models. One of the big challenges of contemporary AI is understanding just why a system generated a certain output. Fiddler is one of the startups offering tools that help developers and deployers of AI understand what exactly is going on. In the conversation, Kevin and Krishna explore the importance of explainability in building trust with consumers, companies, and developers, and then dive into the mechanics of Fiddler's approach to the problem. The conversation covers current and potential regulations that mandate or incentivize explainability, and the prospects for AI explainability standards as AI models grow in complexity. Krishna distinguishes explainability from the broader process of observability, including the necessity of maintaining model accuracy across different times and contexts. Finally, Kevin and Krishna discuss the need for proactive AI model monitoring to mitigate business risks and engage stakeholders.

Krishna Gade is the founder and CEO of Fiddler AI, an AI Observability startup, which focuses on monitoring, explainability, fairness, and governance for predictive and generative models. An entrepreneur and engineering leader with strong technical experience in creating scalable platforms and delightful products, Krishna previously held senior engineering leadership roles at Facebook, Pinterest, Twitter, and Microsoft. At Facebook, Krishna led the News Feed Ranking Platform that created the infrastructure for ranking content in News Feed and powered use cases like Facebook Stories and user recommendations.

 

Fiddler.ai

How Explainable AI Keeps Decision-Making Algorithms Understandable, Efficient, and Trustworthy - Krishna Gade x Intelligent Automation Radio

 

 


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
kevin (00:09):
Welcome to the Road to Accountable AI.
I'm Kevin Werbach, professor of Legal
Studies and Business Ethics at the Wharton
School of the University of Pennsylvania.
For decades, I've studied emerging
technologies from broadband to blockchain.
Today, artificial intelligence promises to
transform our world, but AI won't reach its
potential without accountability mechanisms

(00:29):
to ensure it's deployed in responsible,
safe and trustworthy ways.
On this podcast, I speak with the experts
leading the charge for accountable AI.
If AI is going to be accountable, then we
need to understand what the AI is doing.
That is the area of AI explainability.

(00:50):
I speak with Krishna Gade, founder and CEO
of Fiddler, one of the leading companies in
the AI explainability space.
We talk about how it's possible to provide
useful explanations about machine learning
systems, what changes in the generative AI
case, and going forward, how it's possible
for explanations to keep up with the
increasing complexity of AI systems.

(01:11):
We also talk about how explainability means
different things in different contexts, how
organizations need to think about what's
necessary in terms of explainability, how
regulation relates to the need for
explainability and how explainability fits
into the larger paradigm of AI
observability.
I hope you'll enjoy my conversation with
Krishna Gade.

(01:33):
Krishna, welcome.

krishna (01:35):
Hey Kevin, Thanks for having me on the
podcast.

kevin (01:38):
Why is it important to be able to explain
the outputs of AI systems?

krishna (01:46):
It's a great question.
I think it probably comes from the human
desire to build trust, and I think the way
we have scaled as a species seems like we
are able to propagate trust better than any
other organism on the planet and we work

(02:07):
better in abstractions.
You can have a billion people in a country
ruled by one single individual.
You can have 100,000 people go to a
football stadium, watch a soccer game and
come back peacefully, because there is
implicit trust, even with strangers.
I think a lot of that germinates from the
fact that we sort of understand

(02:29):
enough about each other to trust each other,
at different levels, professional
life, personal life, you know, in
general, day-to-day life, right? And so
when it comes to machines, how does
explanation promote trust?

kevin (02:42):
I know you worked at Meta or Facebook
developing explanations for things like why
did I see a certain ad?
So how does that promote trust if I know a
certain factor that was involved?

krishna (02:53):
When we understand why people are doing
what they're doing, when we try to understand where they're
coming from, we develop trust with the
person, right?
Imagine someone who is like a complete
black box, where we cannot read that person
and they're making critical decisions for
us. Would we trust them?

(03:15):
It's very hard for humans to be able to
relate to them and trust them, right?
So I think it comes from that, right?
So if the machine is making decisions in
a black box and we cannot really find out
why it is doing what it is doing, and we
cannot develop intuition in terms of how
it's working, it's pretty hard. Even think
about the regular machines that we use in
our day-to-day life.
You know, we kind of reason about a

(03:37):
machine, how it works, by trying and
failing, and trying and failing, and we try
to develop a mental model.
You know, how does a washing machine work?
How does a microwave oven work,
right?
How does a coffee maker work?
Now, same thing with the computer, where,
like, an AI system could be making, maybe,
a decision in terms of, like you know,
you're going to be accepted for this loan

(03:58):
or this is what the mortgage rate is going
to be for you, or your resume was
selected for the screening process, or this
is the clinical diagnosis, based on the
conditions that you have.
How do you take those things at face value?
It's pretty hard for humans to trust that
machine at that point and you have
different questions.

(04:18):
Is it working fairly for me?
Is it discriminating?
What kind of factors did it look at?
All of those things come to question, and I
think that's where explainability of
machine learning and AI models becomes very
important, because it fills that gap.
Is it explaining everything?

(04:40):
No, it's not.
Just like a human can never explain
everything in terms of how they do, what
they do, in terms of their actions, it's
pretty hard to explain the entirety of a
very complex AI model.
But the more you can actually ask questions,
the more you can interrogate it, the more
you can get answers from it, the more human
intuition you can develop around it.

kevin (05:01):
So what kind of explanation then can you
actually provide with Fiddler?

krishna (05:06):
So Fiddler basically takes the approach of
treating the model like a black box and
asking counterfactual questions of it.
So we believe that the whole explainability
problem is also a human-computer
interaction problem, not necessarily just a
technical, understanding-the-neural-network

(05:27):
type of problem, right?
So that's why we've developed
technologies that can probe the models
behind the scenes and ask different
counterfactual questions, basically probing
different lines of questioning.
Let's say a customer applies
for a loan and there are three, four

(05:49):
factors in their loan application, let's
say their current salary, their previous
debt and the occupation that they're in and
based on that, let's say, the AI model is
making a credit risk decision.
Now what Fiddler tries to do is basically
probe these factors by
slightly perturbing each of them,
increasing their salary a

(06:10):
little bit, decreasing it a little bit, you
know, increasing the previous debt and
whatnot, and then trying to see how the
model reacts to that change scenario or a
counterfactual scenario.
And based on that it derives the
importance of each of these variables (we
call them features in the machine learning
world) to the model's decision, and

(06:32):
that tells me OK, you know, if I'm getting
a low credit risk score or a high credit
risk score, what are the different factors,
you know that are contributing to it?
Is my salary not good enough, or is my
previous debt too high?
What's the relationship between my occupation

(06:53):
and the type of loan that I'm applying for?
All of those things can be made available.
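
To make the perturbation idea concrete, here is a minimal sketch in Python. The toy credit_model and the applicant's values are hypothetical stand-ins, not Fiddler's implementation; the point is only how nudging each factor up and down reveals how sensitive a black-box score is to it.

```python
# Hypothetical sketch of perturbation-based probing of a black-box model.
import numpy as np

def credit_model(salary, debt, occupation_risk):
    """Toy stand-in for a black-box credit-risk model (higher score = riskier)."""
    return 1 / (1 + np.exp(-(2e-5 * debt - 3e-5 * salary + 1.5 * occupation_risk)))

applicant = {"salary": 90_000, "debt": 25_000, "occupation_risk": 0.4}
base_score = credit_model(**applicant)

# Perturb each factor down and up by 10% and record how the score moves.
for feature, value in applicant.items():
    shifts = []
    for factor in (0.9, 1.1):
        perturbed = dict(applicant, **{feature: value * factor})
        shifts.append(credit_model(**perturbed) - base_score)
    print(f"{feature}: score change {shifts[0]:+.4f} (-10%) / {shifts[1]:+.4f} (+10%)")
```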

kevin (06:58):
How do you determine what counterfactuals
to ask?
Because I would think, for example, in the
loan example, you could do a series of
counterfactuals about income and it might
show that it's sensitive to income, but if
you don't ask about race, then you don't
know whether that's a factor.

krishna (07:10):
Correct.
So I think that's a great question, right?
So, first of all, from a computational
perspective, you run into, you know,
exponential complexity, right?
So you have too many things to ask.
Your salary can
be anywhere from $0 to, like,
a million dollars, right?
So there are obviously things that don't
matter, right?

(07:32):
So what you try to pick is the
range of questions that make sense for that
inference, in this case an individual
loan application.
So what Fiddler tries to do is we call it
the reference baseline.
So it basically plucks the counterfactuals
from a reference baseline, and the
reference baseline comes from, typically,
the data that was being used to train the

(07:52):
model.
So you're kind of trying to look at,
okay, I'm trying to build a credit scoring
model and I have a set of loan
applications that people have submitted
to my bank in the past few years.
I know which loans were granted, which ones
were denied.
So I have like the ground truth.
So now, from that training data, I've
curated the training data, I'm building the
model, right, so that becomes the reference

(08:14):
set for us as well, for Fiddler as well.
So when we basically try to take a
counterfactual, we're not drawing it from
a vacuum, we're trying to sort of pick
from a reference set.
So of course the explanations are dependent
on the reference set.
So this is what we call contrastive
explanation.
So Fiddler's Shapley explanation algorithm

(08:35):
actually depends upon the reference set.
So you can actually choose different
reference sets, so you can ask a
contrastive explanation saying, okay,
compared to people like me (say I'm a
software engineer in San Francisco),
compared to other software engineers living
in a similar zip code, how is my
explanation in terms of a loan acceptance
or loan denial? So you can actually do that
type of contrastive analysis.

(08:57):
It is heavily inspired by Kahneman and
Miller's norm theory.
We call these norms, so you can have
different reference norms against which you
compare the single point that you're
explaining, and get explanations that give
you a 360-degree view of why the model
might be deciding the
way it is deciding.
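
As a rough illustration of how the choice of reference set shapes a contrastive explanation, here is a hypothetical sketch: swap each of the applicant's features, one at a time, with values drawn from a chosen reference cohort and see how far the score moves. This is only a toy illustration of baseline dependence, not Fiddler's Shapley-based algorithm; the data and scoring function are invented.

```python
# Hypothetical illustration of baseline dependence: the same applicant,
# contrasted against a chosen reference cohort (e.g., rows of training data).
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
reference = pd.DataFrame({            # synthetic stand-in for a reference set
    "salary": rng.normal(140_000, 20_000, 500),
    "debt": rng.normal(20_000, 8_000, 500),
})

def score(df):
    """Toy black-box risk score (higher = riskier)."""
    return 1 / (1 + np.exp(-(df["debt"] * 4e-5 - df["salary"] * 1e-5)))

applicant = pd.DataFrame({"salary": [90_000], "debt": [45_000]})
own_score = score(applicant).iloc[0]

# Replace one feature at a time with reference values; a big average move
# means that feature explains much of the gap versus "people like" the cohort.
for feature in applicant.columns:
    swapped = pd.concat([applicant] * len(reference), ignore_index=True)
    swapped[feature] = reference[feature].to_numpy()
    print(f"{feature}: mean score with reference values = {score(swapped).mean():.3f} "
          f"(applicant's own score = {own_score:.3f})")
```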

kevin (09:16):
How precise can the explanation be?

krishna (09:19):
See, I think this is a good question, right?
So of course, for the explanations we are
basically constructing a game-theoretic
formulation to do this.
So it goes back to the axioms
that stand behind these algorithms.
So it turns out, the algorithms that most

(09:40):
people are using to explain machine
learning models come from this person,
Lloyd Shapley, who invented the Shapley
value in the 1950s and he was trying to
solve this cooperative game theory problem,
where the problem was the following:
Imagine three people come together to start
a business and the business starts making
profits or revenues.
Now you want to do a fair and appropriate

(10:02):
allocation of who's going to get how much
profit out of this business, right?
So how do you do that?
So he constructed this game-theoretic
formulation, a simulation where
person A joins the business first and the
business makes $100 in revenue, versus
person B joining first and the business
making $50, versus C, and so on.
So you can construct these various game

(10:22):
theoretic formulations and all the
combinations.
And he came up with this axiomatic kind of
theorem, the Shapley value, which he
proved out, where he said, if you
take the marginal contribution of each of
these individuals across all those possible
combinations, he called that the Shapley
value, and that is basically the

(10:43):
fair and appropriate value
that you can assign to that person, right?
And he said basically that this Shapley
value satisfies several different properties.
It's basically symmetric,
in the sense that if
two people have the same contribution, then they get the

(11:05):
same Shapley value.
If someone has no contribution,
they will get a zero Shapley value.
So it basically follows, like some of these
axioms that hold true.
Now, that is how all of the explainability
algorithms are.
You know, they're sort of built upon
those axiomatic principles, right, because
otherwise it's harder to prove them
out without having human

(11:27):
verification that you know, is this
explanation good enough for you or not,
right?
And that changes from scenario to
scenario.
So most of explainability, I feel,
stands upon the
shoulders of the Shapley algorithm.
There are various other variants of it.
You know, there's SHAP, there's the
Aumann-Shapley value that people use for
debugging deep learning models and

(11:48):
other types of models.
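
A small worked example of the Shapley formula he is describing, using the three-founder story. The coalition payoffs below are made-up numbers; the point is the averaging of marginal contributions over every join order, which is what the feature-attribution variants (SHAP and friends) generalize.

```python
# Worked example of the Shapley value for three founders A, B, C.
# The coalition revenues are hypothetical.
from itertools import permutations

payoff = {
    frozenset(): 0,
    frozenset("A"): 100, frozenset("B"): 50, frozenset("C"): 30,
    frozenset("AB"): 180, frozenset("AC"): 150, frozenset("BC"): 90,
    frozenset("ABC"): 240,
}
players = ["A", "B", "C"]
shapley = {p: 0.0 for p in players}

# Average each player's marginal contribution across all join orders.
for order in permutations(players):
    coalition = frozenset()
    for p in order:
        shapley[p] += payoff[coalition | {p}] - payoff[coalition]
        coalition = coalition | {p}
shapley = {p: v / 6 for p, v in shapley.items()}  # 3! = 6 orderings

print(shapley)  # {'A': 125.0, 'B': 70.0, 'C': 45.0}; sums to the full payoff of 240
```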

kevin (11:52):
And does anything change as we move from
predictive models and supervised learning
to generative AI?

krishna (11:58):
Of course, a lot of things change, right?
So now, with the case of generative AI,
first of all, the model is not
generating one single prediction or a set
of predictions.
It's actually generating a piece of content.
I mean, in some ways it's predicting the
next word, but then the next word is

(12:19):
dependent on the words that were already
predicted, and it's a pretty complex
mechanism that's working underneath.
So explaining generative AI through these
means is interesting.
The questions are also
slightly different in the case of
generative AI, right?
(12:41):
you like?
The key questions people ask are like, okay,
you're generating this content for me.
How did you generate this content?
Most people are interested in,
where did you come up with this
content?
Right, because it's kind of like this,
right?
So if I'm asking you an interview question,
or you're asking an interview question, if
I'm trying to say some BS, if you cannot,

(13:02):
like, really figure out where I'm talking
from, or you don't have prior knowledge, or
you at least cannot mentally map it to
something that you already know, it's very
hard for you to believe what I'm saying,
and so people want to be able to have this
sort of attribution in terms of what
documents it's actually providing this
content from.

(13:22):
So one way to formulate explainability in
generative AI is to think about, instead of
feature impact, more of a document impact.
Right, especially in a RAG-like setting
where people are saying, okay, I would
scope my generative AI model only to these
documents.
It could be my set of policy documents if

(13:43):
I'm in customer support, or it could be
my internal wiki pages if I'm building a
search engine for my company, and so in
such a scenario, what I'm most interested
in is how did you get that response, which
document has contributed to that response
and how much did it contribute?
So you can construct a similar type of
game-theoretic formulation there, where you

(14:05):
can say, out of like 10 or 100 documents
that the model is looking at, you can try
to figure out the impact of each of these
documents by, you know, using the similar
mechanism that we talked about for feature
impact, and so that's something that we are
exploring.
Of course, there's the other notion of
white-box explainability, where people want to go

(14:26):
deep into the neural network and expose the
neurological pathway and what is happening
at each layer of the neural network as the
inference is being made.
That's interesting, but we've not seen any
practical applications towards that until
now.
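
One way to sketch the "document impact" idea in a RAG setting is leave-one-out ablation: regenerate the answer without each retrieved document and measure how much the answer changes. The generate and similarity callables below are placeholders for whatever LLM and embedding model you use; this is an assumed sketch, not Fiddler's product code.

```python
# Hypothetical leave-one-out sketch of document impact for a RAG answer.
from typing import Callable, Dict, List

def document_impact(
    question: str,
    documents: List[str],
    generate: Callable[[str, List[str]], str],   # LLM call: (question, docs) -> answer
    similarity: Callable[[str, str], float],     # e.g., cosine similarity of embeddings
) -> Dict[str, float]:
    full_answer = generate(question, documents)
    impact = {}
    for i in range(len(documents)):
        ablated = documents[:i] + documents[i + 1:]
        ablated_answer = generate(question, ablated)
        # The less the ablated answer resembles the full answer,
        # the more that document contributed.
        impact[f"doc_{i}"] = 1.0 - similarity(full_answer, ablated_answer)
    return impact
```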

kevin (14:41):
Until now, yeah, and I was going to ask you
about that at the beginning.
It would seem like, intuitively, the best
way to write an explanation is to know
everything about the internals of the model
and build up from there.
So I'm curious why you focused on the
black box approach.

krishna (14:55):
See, I think, if you think about where the
whole notion of explainability in machine
learning comes from, it actually comes from
the financial services industry, right?
So if you think about financial
services,
previously they used to build a lot of
these statistical models.
They still build these statistical
models, so you know, being able to explain

(15:16):
the model in a closed form manner, right?
So where you can write down the equation or
the formula and break it down and actually
be able to reproduce it, it's very
important from a risk management
perspective, right, and they've been doing
this for years on.
And now I mean the human, the sort of the
desire is can you, can you do this for
neural networks?
Can you actually, you know, take the whole

(15:38):
sort of deep neural network and decompose
it into some sort of a closed-form
equation? And that's
actually hard to do, right?
How do you come up with
something as simple as that, right?
So I think that's where the
rise of black box explainability comes from.
People call the other one
interpretability.

(15:59):
Interpretability is being able to
write it down and break it
down into constituent parts and actually
see it.
Explainability is more
around, can I explain it to a human?
Can I build trust with it?
Can I build trust with it?
So people have developed different types of
techniques, and one is the Shapley value
technique.
The other one is like being able to know
the relationships between the inputs and

(16:19):
outputs.
So a lot of the time what you want to be
able to know is, even in the classical
machine learning world, a regulator might
want to know, if a customer is paying the
interest on time,
is the model actually keeping their credit
score flat or moving it up, right?
They want to also make sure that the model

(16:42):
is capturing the relationships in an
appropriate manner.
So if you have a non-linear model
and the explanations are not rational,
then that won't pass the regulator's sniff
test.
So you have these partial dependence
plots, ICE (individual conditional expectation) plots, and all of
these different ways of exposing the model.
Even if it's a model that you
cannot write down as an equation, you can
(17:02):
cannot write down as an equation, you can
expose that this is
the relation between the inputs and outputs
under these conditions.
That has still been accepted
in the industry as
a possible explanation. Right now,
could you do that with more and more
complex models?
That's been the journey of explainable AI.

(17:23):
Could you do that with very complex deep
learning models, let's say a computer vision
model that's detecting objects, a language
model, and now generative AI?
So that is where the whole sort of the need
for explainability, to sort of explain
these complex models, is coming from.
And people are developing various different
techniques here.

(17:45):
But the more complex the model is, the
harder it becomes to make it interpretable,
and the harder it becomes to make it
explainable.
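
Here is a rough sketch of the partial-dependence idea he mentions: even for a model you cannot write down as an equation, you can sweep one input over a grid, average the predictions, and expose the input-output relationship a regulator would want to see. The data and model below are synthetic stand-ins.

```python
# Hypothetical partial-dependence sketch for a model with no closed form.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3))   # stand-ins for scaled salary, debt, tenure
y = (X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 0.3, 1000) > 0).astype(int)
model = GradientBoostingClassifier().fit(X, y)

# Partial dependence for feature 0: fix it at each grid value for every row,
# then average the predicted probabilities.
grid = np.linspace(X[:, 0].min(), X[:, 0].max(), 10)
for value in grid:
    X_mod = X.copy()
    X_mod[:, 0] = value
    avg = model.predict_proba(X_mod)[:, 1].mean()
    print(f"feature_0 = {value:+.2f} -> average predicted probability = {avg:.3f}")
```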

kevin (17:53):
So is a financial services company that's
had to deal with model risk management or
those other kinds of compliance
requirements better situated to understand
what's necessary for explainability and
trust as we get to deep learning and
generative models?
Or is it companies that are not necessarily
stuck in, you know, having dealt with these simpler

(18:14):
models?

krishna (18:15):
See, I think, if you think about
all the regulations that are being drafted
across various different continents,
most of them are taking inspiration from
existing regulations, right?
So if you look at the Biden
executive order, it talks about model
monitoring, drift, and
deterioration of models.
The EU AI Act talks about explainability and
bias detection in models, and so these

(18:38):
things have already existed.
In financial services, they have this thing
called SR 11-7, which is a model risk
management guideline document that talks
about explainability of models, and it's
been around since after the 2008
financial crisis, right?
So a lot of the inspiration for the model
governance work that the rest of the
industry is adopting comes from the FSIs,

(18:59):
to be honest, right? And so now when people
talk about explainability, they're not just
talking about FSIs, although
they are the ones that have a lot
stricter regulations or, you know,
processes to get there.
But now you have healthcare companies,
pharma companies, you have a lot of other
HR companies that are basically trying to

(19:23):
get to that.
So the question is, what level of
explainability is good enough?
That also depends on the consumer of
explanations.
So what we have found out is explanations
that are important for a developer that is
debugging or testing a model are different
from explanations that a regulator might
need to approve a model or explanations
that a business person might need to

(19:44):
consume a model output, or explanations
that a business leader who wants
to know the higher-level
picture of the models might need.
So explainability itself differs from
person to person, from user to user, right?
So it's almost sort of a vague concept,
to be honest.

(20:05):
So you make your own definition of
explainability.
But I think more or less what we have
solved at Fiddler is we have taken an
opinionated approach that here is what a
developer would need to test their models.
You know here, like, for example, if I can
look at feature importance and compare it
between training and test time, if I'm

(20:26):
looking at feature importance at different
segments of data, where I can see if my
features are being stable across all
segments of data, let's say, if my feature
importance are not vastly changing across
different segments of data.
How is my feature importance between
production time and development time?
All of those things help me to ensure that
my model is performing the same way and

(20:47):
it's stable.
From a compliance perspective, it could be
different.
It could be that I want to know a few
examples of why did you deny this loan?
What are the similar users that were denied
for this loan?
Show me how the model is doing on different
protected segments.
Show me the variables that are impacting
the denial of these loan applicants for

(21:09):
this particular protected group.
So I might want to poke holes into things.
I might want to have a report that I can
consume so that, as a
regulator or the compliance person, I can
bless this model.
This is good enough, right?
A business user might be
different.
So, let's say, I'm like on a customer
support team and I'm trying to get the
customer to not, you know, churn from my

(21:31):
company.
I'm using this churn
prediction model and it's telling me this
customer is likely to churn with 90%.
Now what I want to know is what are the
factors that are making them churn and how
can I intervene?
Can I intervene?
Can I give them some more discount?
Can I add them one more product from our
portfolio to their basket?

(21:53):
You know, what is it that is needed?
Is our service not good enough?
So I can actually see what it is that is
important, right?
So, for all of these, we at Fiddler have
built a product
that can solve for multiple of these
personas.
You know, and that's what we've found
is the North Star for explainability: to

(22:13):
solve for these things.
So you're trying to build trust within the
organization, you're trying to make
everyone comfortable with the transparency.
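
As a hypothetical sketch of the developer-facing stability check he describes (comparing feature importances between training time and production, and across data segments), the numbers below are invented; the pattern is simply flagging features whose importance has moved more than a chosen threshold.

```python
# Hypothetical stability check: compare segment-level feature importances
# against the training-time baseline and flag large shifts.
import pandas as pd

training_importance = {"salary": 0.45, "debt": 0.35, "occupation": 0.20}

segment_importance = pd.DataFrame({
    "salary":     {"prod_all": 0.44, "segment_a": 0.20, "segment_b": 0.47},
    "debt":       {"prod_all": 0.36, "segment_a": 0.60, "segment_b": 0.33},
    "occupation": {"prod_all": 0.20, "segment_a": 0.20, "segment_b": 0.20},
})

THRESHOLD = 0.10   # flag importance shifts larger than 10 percentage points
for segment, row in segment_importance.iterrows():
    for feature, value in row.items():
        shift = value - training_importance[feature]
        if abs(shift) > THRESHOLD:
            print(f"[check] {segment}: '{feature}' importance moved {shift:+.2f} vs. training")
```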

kevin (22:20):
Do you have to help your customers
understand what questions they should be
asking?

krishna (22:26):
So we have made it easier for them to ask
these questions.
So, like we have created this interface, we
call it fiddling with the model, hence
the name Fiddler.
So you can actually ask a lot of these
counterfactual questions, because a lot of
the times people want to dial up and down,
people want to probe the model and see,
like, okay, if this person has three more

(22:47):
products in their product portfolio, would
they be churning with the same likelihood, or if
they're coming from a different protected
segment, would they be getting the same
credit score.
So I think those are the ways that can
help them build that transparency.
So the interfaces are there for everybody.

(23:07):
Some interfaces may be more
technical, for developers to use.
Some interfaces are a little bit less
technical, so that a compliance or
business person could use them.

kevin (23:17):
You've been talking more recently about
observability.
How is observability different from
explainability?

krishna (23:25):
So when explainability talks about being
able to understand the model at a micro and
macro level, it does not really talk about
the continuous monitoring aspect of it,
because one of the things that models
suffer from, apart from being a black box,
is that they are also these stochastic entities,
right?
So on the software side, we're used to

(23:46):
writing code and seeing it stand the test
of time, right?
Now, for example, I used to work as a software
engineer at Twitter.
You know, I built some services in Twitter's
search engine.
Some of this might still be running there,
right, you know, like, here's some...
I thought they tore it all out when
Elon came in.
Of course, you know, yeah,
it probably has happened, you

(24:07):
know.
But I think the thing is, with software,
of course software gets rewritten
all the time.
But the fact of the matter is that you know
you have this sort of deterministic aspect
of software, right, you know.
But when it comes to models, because they

(24:27):
are highly dependent on the data that they
were trained with, their shelf life is not,
you know, very long, right? And
the data changes.
Something else happens in the world.
You know, for example, a lot of supply
chain models suddenly became
inaccurate when COVID hit, because
all the inventory management of like stores
and everything vanished, right.
So people suddenly started seeing a lot of

(24:49):
people were shopping for, like, you know,
sort of toilet paper and things like that,
right.
And then when interest rates go up,
you see unemployment go up,
and that affects your other pricing
models, your credit scoring models.
So there are a lot of external factors that
determine this sort of model drift, this

(25:13):
sort of change in accuracy of the models.
It could also be your own internal
infrastructure, because models are so
highly data dependent.
Data pipelines are the ones that are
feeding these models with the feature data,
right?
You're collecting the data from various
different sources and feeding them to the
models for their inferencing or their

(25:34):
training time.
If those data pipelines are broken for
various different reasons, system errors or
other human errors, bad data can be fed to
the model by accident and that would also
cause model performance degradation.
So this model performance degradation
warrants monitoring of models by necessity,

(25:54):
right?
Essentially, if you don't monitor the
models, all bets
are off. There's no guarantee the model
that you trained with 85% accuracy, 90%
accuracy, is going to run the same way.
So that notion of monitoring those
models now has a fancy term for it,
which is observability, which is essentially

(26:14):
derived from the software engineering world,
where you observe these databases and
applications for errors and performance
issues.
You're trying to bring that again to the
models.
But one of the aspects of observability
where explainability and observability meet
is when you're trying to debug.
So let's say, if I'm trying to debug a

(26:35):
model performance, let's say I know the
model is having some issues.
I got an alert, it's having a drift.
If I can know which features, which
data elements that the model
is consuming, are having the most impact,
or which data elements are contributing to
the data drift the most, then it helps me a
lot as a developer to dig
into the problem very quickly and

(26:55):
isolate it very fast. So
root cause analysis, powered by
explainability, can be very important for
observability.
And that's where Fiddler combines those two
things very nicely.
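
For the monitoring side, here is one common way to quantify the kind of drift he describes, the Population Stability Index between a training-time feature distribution and a production window. The data is synthetic and PSI is just one example metric; Fiddler's own drift metrics may differ.

```python
# Hypothetical drift check: Population Stability Index (PSI) for one feature.
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """PSI between two samples; values above ~0.2 are often read as major drift."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_counts, _ = np.histogram(expected, bins=edges)
    a_counts, _ = np.histogram(actual, bins=edges)
    e_frac = np.clip(e_counts / e_counts.sum(), 1e-6, None)
    a_frac = np.clip(a_counts / a_counts.sum(), 1e-6, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

rng = np.random.default_rng(2)
training_salary = rng.normal(100_000, 15_000, 5_000)
production_salary = rng.normal(85_000, 25_000, 5_000)  # e.g., after an economic shock
print(f"salary PSI: {psi(training_salary, production_salary):.3f}")
```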

kevin (27:09):
So, ultimately, is observability a distinct
function from model governance or MLOps or
bias detection, or do all these things
ultimately come together?

krishna (27:20):
So it's all kind of the same thing, right?
It's kind of which level of view
you're looking at.
So, from a purely operational perspective,
we just talked about models.
I'm an ML engineer.
I created a model, I deployed it a month

(27:41):
later.
Is it going to perform the same way?
I don't know.
So that's why I need monitoring, right, and
I'm an ML engineer.
I want to debug what's going on with my model.
I want to test it properly.
Or someone is asking me to constantly, you
know, explain it or give
some examples of how it runs and whatnot.
So that's where I need explainability in particular.
But then there's this aspect of, I'm not working on
one model; my
organization is building tens of models,

(28:03):
hundreds of models, thousands of models,
and there's no single place to track them,
no single place to
keep all the reports around them, keep all
the metrics around them, keep all the
artifacts around them.
That's where governance comes in right.
You're trying to create a process to
control this.
You just don't want to have all kinds of
models.
You don't want to have team members

(28:23):
building whatever tool that they are using
to build their models, running amok.
Then you won't be able to control the
output of it, the outcome of it.
You won't be able to control the risks
if a model goes wrong, so
governance comes into it.
And then there's a particular aspect of model
performance: if a model is underperforming
on a protected segment, let's say you're an HR

(28:46):
company or a
hospitality company, if your algorithms are
biased against certain protected groups,
then your company might be in trouble
because of some sort of PR issue or
litigation.
So all of these are looking at model

(29:07):
performance from various different angles,
but the cornerstone we believe is AI
observability. And the way
we define AI observability is that by being
able to monitor and explain these models, by
being able to report on the models,
by being able to track the inventory of
these models, you can actually achieve
governance.
You can make sure that you have
controls in terms of fairness and you can
make sure your operational performance is

(29:29):
there, and you have the ability to root cause
issues and build better models for your
company.
So that's kind of how we define AI
observability.

kevin (29:38):
I would assume in some ways the
proliferation of AI regulation is a good
thing for you, because companies are having
these problems they need to address for
compliance.
But on the other hand, there's now all
these different requirements.
Some of them are vague, and now we're in
the legal realm.
Does that make it harder to provide
solutions?

krishna (29:55):
Yeah, so I think there's a carrot versus
stick here, right?
So essentially, the companies that identify
that, hey, you know, we don't want to be
that other company that didn't
monitor their models and incurred
hundreds of millions of dollars of
business risk, or got into some sort of
reputation damage, they go
ahead and try to make sure that they
put these sorts of practices in place

(30:17):
right now.
Some of them could be laggards, you
know.
They might say, okay, I don't want to do
this until there's some regulation
out there that makes me do it.
So I think regulations are important.
I mean, we are dealing with a technology
here with AI which is very
new, and it's

(30:38):
developing at a rapid pace, and you want to
make sure that you have some controls in
place, that
you're not letting it take its own
course, right?
So, you know, we're not sort of against

(30:59):
innovation.
We want people to build AI.
We think AI has a lot of benefits,
and I've worked in companies, you know,
I've worked at Bing, Twitter, Pinterest.
I've been working on machine learning and
AI for a long time.
I've seen amazing benefits that it can add
to your business.
But we think that it has to be done in a

(31:21):
responsible manner, a trustworthy manner,
and, whether regulations exist or not, I
think AI observability, explainability
these things will help you build that
trustworthy AI application.

kevin (31:32):
What's the hardest challenge that you're
facing right now, at least from a technical
standpoint, and you've got a lot of
challenges scaling up the business.

krishna (31:42):
I think from a technical standpoint we
already talked about, as model complexity
rises, being able to monitor and make sense
of these models is an ongoing problem for
us.
We solved the problem at the machine learning
level, and then deep learning models, and
then different types of deep learning
models, and then there's large language
models.
Now we're trying to solve the problem for
large language models and even those models
are becoming more complex.

(32:03):
Now people are talking about agents and
multi-model workflows, and so the
complexity of models keeps increasing.
So that's one technical side.
On the business side, I feel like the
biggest challenge for us is like just this
aspect of educating the customers, right?
So essentially, you know it's still like an
electric car right now.

(32:24):
You know, people feel like, okay, let me
get it when I need it, type of
thing.
It's not like a hair-on-fire type of
problem for many customers, but
if you look at them, a lot of these
customers are just sitting on a
ticking time bomb, right?
We speak to customers that are like, oh,
we're running a hundred models in

(32:45):
production, and we ask them, do you have any
monitoring for these models?
They're just like, no.
Or they do it, like, once in a
while, you know,
manual checks, right?
So a lot of people don't realize the
gravity of the situation, right, and now,
with the rise of generative AI, they're
dealing with even more complex models.

(33:05):
So I think a lot of the business
challenge is being able to educate the
customers, being able to let them know the
challenges here.

kevin (33:20):
That's part of the reason why I do
these podcasts, it's also towards that.
So on the first point, the technical point:
with the increasing complexity of models, I
assume that you're optimistic that in this
arms race, at least the
explainability side can hold its own.
But what gives you that confidence?

krishna (33:37):
I think the things that we have developed
are model agnostic.
We've done a few things which have helped
us in the long run.
One is staying model agnostic, staying
cloud agnostic, staying this neutral,
Switzerland-for-AI type of player, where we
have customers that leverage us

(33:57):
in combination with Google Cloud,
Azure AI, open source, and Amazon. And
because of the way that we built the
product, it mostly relies on the
inputs and outputs you're feeding to the
model to give you a lot more insights.
So with a very loose
integration, we can provide a lot more

(34:22):
insights and understanding about the model.
And so, with the large language models, the
good part about it is the inputs and
outputs are unstructured, you know, so you
can sort of, like you know, consume them
quite easily and in some ways, like for

(34:42):
structured models, we have to build a lot
of infrastructure for data ingestion and
whatnot.
For unstructured, it seems quite easy.
And now this notion of
being able to operate at the
embeddings layer, is quite helpful.
So one of the things that we've done for
unstructured ML models, like deep learning
models, was we convert the images or
natural language into embeddings and to

(35:04):
track drift or to be able to understand the
model.
We deal in the embedding space, right? So
you can cluster a bunch of embeddings as
vectors on a 2D space and you can group
them as objects, you can color code them,
these are good, these are bad, and
you can kind of look at them, right?
And so a lot of that work has
benefited us

(35:25):
while we build the LLM observability aspects
of it, where we can consume these inputs
and outputs, we can visualize them and say,
okay, here are the prompts that are having
hallucinatory behavior, or here are the
prompts that are having PII issues, or
here are the prompts that might be leaking
toxic content to your users, and so you can
actually go and figure it out yourself.

(35:48):
And so that aspect of it has worked out.
And so now we basically can fit in with any
of these LLM providers quite easily, and
even as customers move towards multi-model
workflows and agentic workflows, this
approach seems to work fine.
Of course, there are nuances here.
The metrics are not easy to compute:

(36:09):
what is a hallucination, what is PII, what
is toxic?
It's pretty hard to determine.
So there's a lot of work that we are building.
Recently we launched something called
Fiddler Trust Service, which basically is a
new approach that we've taken which uses

(36:31):
other language models to determine if a
particular prompt and response are
involved in some kind of
malicious activity, right, so if the prompt
has a jailbreaking attempt or the response
is not relevant to the prompt.
I think this is a very new approach that we
are working on and that seems to give good
results, because it's better than putting a
lot of humans to go and label which
response is hallucinatory, which response
is toxic.
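
To illustrate the embedding-space monitoring he describes for unstructured inputs and outputs, here is a hypothetical sketch that compares production text traffic against a baseline in embedding space; the embed callable is a placeholder for whatever embedding model you use, and centroid cosine distance is just one simple drift signal, not Fiddler's implementation.

```python
# Hypothetical embedding-space drift signal for unstructured text traffic.
from typing import Callable, List
import numpy as np

def embedding_drift(
    baseline_texts: List[str],
    production_texts: List[str],
    embed: Callable[[List[str]], np.ndarray],   # returns an (n, d) embedding matrix
) -> float:
    baseline = embed(baseline_texts)
    production = embed(production_texts)
    b_centroid = baseline.mean(axis=0)
    p_centroid = production.mean(axis=0)
    # Cosine distance between centroids: near 0 means similar traffic,
    # larger values suggest the production distribution has drifted.
    cosine = np.dot(b_centroid, p_centroid) / (
        np.linalg.norm(b_centroid) * np.linalg.norm(p_centroid)
    )
    return float(1.0 - cosine)
```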

kevin (36:52):
So as the technology gets more complex, you
can actually take advantage of it.
Correct, correct, exactly, absolutely.
Well, fantastic, Krishna, this has been really
fascinating and I appreciate your spending
the time going through it with me.

krishna (37:05):
Awesome.
Thank you so much, Kevin.
Lots of great questions and it was fun
talking to you.

kevin (37:11):
This has been the Road to Accountable AI.
If you like what you're hearing, please
give us a good review and check out my
Substack for more insights on AI
accountability.
Thank you for listening.

(37:33):
This is Kevin.
We're back.
If you want to go deeper on AI governance,
trust and responsibility with me and other
distinguished faculty of the world's top
business school, sign up for the next
cohort of Wharton's Strategies for
Accountable AI online executive education
program, featuring live interaction with
faculty, expert interviews and
custom-designed asynchronous content.
Join fellow business leaders to learn
valuable skills you can put to work in your

(37:53):
organization.
Visit execed.wharton.upenn.edu/ACAI for
full details.
I hope to see you there.