Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
Or, 85% of AI use cases are being evaled by the engineer who built it saying, yep, seemed to work pretty well. If you're going to build a system that's going to be critical to the business, where it's going to be important that it gets it right, then you can't do that without evals.
Welcome to Analyse Asia, the premier podcast dedicated to
(00:22):
dissecting the pulse of business, technology and media
in Asia. I'm Bernard Leong, and today
we're diving into the forefront of enterprise AI, from model development to deployment at scale.
With me today is Craig Wiley, Senior Director of Product
Management at Databricks, who leads Mosaic AI, the company's
unified platform for production-grade generative AI.
(00:43):
So Craig, I'm a fan. Welcome to the show.
Thank you very much. Yes.
So very quickly, we always start by talking about origin stories. How did you start your career, and what drew you into the world of AI and product leadership?
Yeah, I was working at Amazon in the consumer business and found that, you know, while the
(01:05):
company was data-driven, there was still a lot of opportunity, and we just started diving deep on questions that were asked or on problems that we had. And very quickly I realized that the business analysts I worked with, and what have you, had questions they couldn't answer. So we started hiring economists, and then what we found was there were some questions the
(01:27):
economists couldn't answer. And so we started hiring this, at the time, brand new kind of class, which was, you know, data scientists and machine learning engineers. And this was back, you know, in 2012, 2013. And you know, when those guys got started, they
(01:48):
often weren't moving nearly as quickly as the business needed them to. And so it really became a, you know, hey, on that last project we worked on, what could have helped you go faster? And as soon as they would tell me, I'd grab a team of engineers, we'd see if we could help solve that problem and then just drive the business faster. And so that's really how I got started.
(02:08):
And then when AWS decided they wanted to get into machine learning, they came through to the consumer business to look and see who was doing this. And I was there. And they said, you know, hey, do you want to come build SageMaker? And literally at the time, SageMaker was one sentence in a document, you know, kind of thing.
(02:29):
So one curious part of your career is that you have led AI product teams at both Google Cloud and Amazon Web Services. So what are some of the pivotal lessons that you have learned building AI platforms at scale across multiple cloud providers? Because why am I a fan of your work? Because of ML operations, and the ability to build, train, and
(02:49):
deploy models at scale. So can you talk about that experience?
Yeah, you know, I think I've
had this amazing opportunity both to follow, and at times to really help influence, the industry. And you know, when we first started, it was really just hard for people to train models. You know, and SageMaker really went a long way just helping people train models on large
(03:09):
amounts of data. Then, you know, when I was at Google, we really focused on model ops and how to help people build and deploy models. And what we found there was that customers were having a lot of success really getting their cycle time down and going much, much faster.
And that's when I realized that if I really wanted to influence
the industry, the next opportunity was to figure out
(03:33):
how to take advantage of this connection between the AI layer and the data layer. And if we could take advantage of that connection, we could probably get another order of magnitude of acceleration on those development cycles. And once I realized that, I think I started at Databricks 6 to 8 weeks later, kind of
(03:54):
thing.
So that's the real reason for joining Databricks, because you want to have the intersection of data and AI together.
So one thing for sure, right? Databricks acquired Mosaic AI two years back, and now you're leading that team. So what is your current role and responsibility there, and how has the scope of work changed after the acquisition, especially on the product side?
Yeah, so I had joined
(04:14):
Databricks back in early 2022, you know, pre-ChatGPT, and I
thought I was coming to build the most performant tabular machine learning system in the world, kind of thing. And I still think Databricks has that. But I was here to kind of work on that problem. And then GenAI hit, and everybody went crazy for this
(04:36):
technology. And all of a sudden... I used to joke that, you know, I used to have meetings with the head of data science, and then I got to instead have meetings with the C-suite, because they needed to understand this technology. And so we started helping them, and we built a vector database and started building out all those components necessary, and then acquired Mosaic as part of
(04:58):
that goal of trying to help our customers be able to kind of build and deploy any of these. You know, at the time we didn't use the word agents; we used the words, you know, GenAI solutions. And so, you know, we acquired Mosaic to kind of go after that problem. And so I've been leading product management both before and after the acquisition.
(05:21):
How was the transition, let's say, from MLOps before to LLMOps now? Are the problems similar or different? Because I think now you're dealing with structured and unstructured data all at the same time.
You know, at some level
they're very similar. I've got data, I've got some models I need to interact with that data, and I need to figure
(05:41):
out how to deploy that in production in a stable, successful, and economical way.
But in other ways, they're very different. You know, with classical machine learning, we had statistics that would tell us precisely how accurate these systems were. Whereas with GenAI and agents, really, you know, the vast majority of folks don't
(06:02):
have the ability to generate kind of a statistically valid way of understanding the accuracy or performance of the system. And so, you know, helping
companies figure out not just the mechanics of deploying this, but also, hey, how can you build these systems such that you can trust them, has really been, I think, one of the primary
(06:26):
challenges. To be honest, it's one of the challenges I'm most excited about, because I think we're really leading the charge in the industry on that.
So I would like to lay the
groundwork here, right? What is Mosaic AI, and how does it fit into, say, Databricks' broader AI platform strategy?
Yeah.
So Mosaic AI is the brand name we use to describe all of
(06:48):
our machine learning and AI products, right, just to simplify it for customers. But fundamentally, you know, Databricks is an end-to-end platform. And you know, you'll hear it from others, kind of, you know, hey, our AI platform is end-to-end. But they often mean, well, we start at training and we end at deployment. For Databricks, we start at data ingestion and
(07:11):
we go through data transformations, the, you know, the analytics on top of that, the exploratory work that has to be done. And then also the kind of more technical machine learning or AI capabilities. And so Mosaic is really that set of capabilities around model training and vector database and, you know, text-to-SQL and these kinds of things that give us the tools we need to be able
(07:34):
to build and deploy the agents that the customers want to have.
So do you think that now, given
that you also cover the ETL side, the extract, transform and load side, and then with the data ingestion coming into the data platform, to the model training, all the way to model deployment, what does it mean in practice, say, for
(07:54):
customers thinking about Mosaic AI?
Yeah. So, you know, the best way to think about this is maybe with a simpler, classical machine learning example. You know, the biggest problem that we have seen for years in classical machine learning is the fact that often the data that someone trains their model on ends up being different. You know, they train their model on a set of data warehouse
(08:17):
tables and then they go deploy it on production data. And the data in production ends up looking much different than the data in the data warehouse tables. And so, you know, with Databricks, if you go to that data warehouse table, you click on the lineage tab and you immediately see the DAG that takes you back to production.
And so now, all of a sudden, this concept of training-serving
(08:37):
skew is just gone, because I can choose as a data scientist: do I want to use that downstream, you know, data warehouse table, or do I want to use that upstream table that might look more like production and might behave more like production? And so, you know, whether that's classical machine learning or GenAI, having the ability to understand where that data set
(08:59):
came from and how that data set was created really is critically important in building machine learning and AI.
And it also reduces overfitting, right?
That's right.
Because you also want your training data and the production data aligned, so that when the model is being trained there's no data drift, and then you can actually deploy as quickly as
(09:19):
possible, even with the different lineages.
That's right. You know, at most companies today, the model gets built by the data scientist, then handed to the ML engineer, who often has to rebuild the model for production. And so, you know, if we can help eliminate that step, and allow the data scientist to ensure they're using the right data and then just make it a couple of clicks to deploy, then we're in a much better place, with a much
(09:42):
faster cycle time for these customers.
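To make the skew point concrete, here is a minimal sketch of a training-serving skew check in plain pandas; the column names and the two-standard-deviation threshold are hypothetical illustrations, not a Databricks feature:

```python
import pandas as pd

def skew_report(train: pd.DataFrame, prod: pd.DataFrame, cols: list[str]) -> pd.DataFrame:
    """Compare summary statistics of training vs. production data per column."""
    rows = []
    for col in cols:
        t, p = train[col].dropna(), prod[col].dropna()
        rows.append({
            "column": col,
            "train_mean": t.mean(),
            "prod_mean": p.mean(),
            "train_std": t.std(),
            "prod_std": p.std(),
            # Flag columns whose production mean drifts past 2 training std devs.
            "skew_flag": abs(p.mean() - t.mean()) > 2 * t.std(),
        })
    return pd.DataFrame(rows)

# Hypothetical usage: one frame sampled from the warehouse training table,
# one from the production stream.
# report = skew_report(train_df, prod_df, ["age", "income", "session_length"])
# print(report[report.skew_flag])
```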
So how does Mosaic AI simplify that process now for, say, enterprises that want to build and deploy large language models? Is the process slightly different, because there is also the alignment and fine-tuning piece that is actually making the process a little bit more art than science?
Yeah, no, it is, you know, and
(10:05):
really helping customers transform that from art to science has been one of our objectives. But, you know, helping companies figure out how to do this effectively; you know, many of these models are very large, and it's very complex to actually run them, to serve them. And so, you know, whether we help companies serve the models
(10:27):
themselves, or whether they use some of our pre-served models that they can just use on a kind of, you know, per-question or per-prompt basis, we want to make it just as easy as possible for them to be able to do that, so that they can get those agents to market as quickly as possible.
So, because I teach retrieval-augmented generation and fine-tuning to all the engineers in
(10:47):
Singapore working in the government of Singapore, one of the questions I probably want to ask you is: between both approaches, where is the value, and how should enterprises think about choosing one over the other?
Yeah. You know, I'll say, in the last year we've really seen... a year ago it
(11:10):
was really a split that we saw, between whether fine-tuning was the right answer, or whether using RAG or something like that was the right answer. In the last year or so, with these newer, much smaller, much more economical models, the need to fine-tune has become much
(11:30):
less frequent. And so we see RAG used a lot more commonly, whether it's for building agents or chatbots or what have you. Where we do continue to see fine-tuning is, you know, many times folks are building what I kind of think of as industrial use cases or agents: hey, it's going to do the same task thousands or millions of times. And you know, if you're trying to do that same task millions of
(11:53):
times, then fine-tuning the model to do specifically that task can both have the advantage of increasing the accuracy, but also you can often fine-tune a much, much smaller, much more economical model. So now, instead of having to hit a big model to do this task millions of times, maybe you can hit a model that, you know, costs
(12:14):
kind of, you know, $0.10 on the dollar, kind of thing, in order to go after that.
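As a back-of-the-envelope illustration of that economics argument, here is a small sketch; the per-token prices, call volume, and token counts are all made up for the example:

```python
# Hypothetical per-1M-token prices; real prices vary by provider and model.
LARGE_MODEL_PRICE = 10.00    # $ per 1M tokens for a large general model
SMALL_MODEL_PRICE = 1.00     # $ per 1M tokens for a small fine-tuned model

CALLS = 10_000_000           # the same industrial task, ten million times
TOKENS_PER_CALL = 500        # prompt + completion, assumed constant

def total_cost(price_per_million: float) -> float:
    return CALLS * TOKENS_PER_CALL / 1_000_000 * price_per_million

print(f"large model: ${total_cost(LARGE_MODEL_PRICE):,.0f}")  # $50,000
print(f"small model: ${total_cost(SMALL_MODEL_PRICE):,.0f}")  # $5,000 -- the '$0.10 on the dollar' effect
```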
Do you see more customers thinking about doing more transfer learning, given that you can now shrink the models, context sizes are even longer, and then specifically fine-tuning will actually become much more economical?
Yeah, certainly some of our sophisticated customers are
(12:36):
doing those kinds of things, whether it's transfer learning, or whether it's kind of fine-tuning, or, you know, quantization and distillation of models. But you know, we also just see that with, you know, some of the kind of flash models or the mini models that are out there, a lot of those techniques
(12:56):
that were so important for the economics of this in the past are not always as necessary. You know, they're important when you want to move from hundreds of thousands to tens of millions. But if you're only in the hundreds of thousands, maybe you could just go with the existing model that's out there.
So I think one question: because
we're talking about enterprise AI, it's very important that the
(13:17):
words security and governance come into play. How does Mosaic AI help enterprises deploy AI responsibly? I think it's the appropriate controls, audits, and even the safety mechanisms that come into place, right?
Yeah, this is huge, right?
And you know, for a lot of our customers, this is one of their primary concerns, particularly in regulated
(13:38):
industries. You know, the great thing about Databricks is that Databricks is a unified platform, from data ingestion all the way through to GenAI. And so, you know, there's a common governance scheme. And we don't just govern tables. We govern, you know, unstructured data.
(13:58):
We govern, you know, embeddings and the vector DB. We also govern access to the models. And when you're building an agent on Databricks, at each step you have to explicitly give that agent permission to this specific piece of data or this specific tool. It's not just a kind of, you
(14:18):
know...
Right. So you have access control down to that kind of level.
Absolutely, because that's the level of control, you know, our banks and our hospitals need in order to, to your point, do this kind of thing with confidence and responsibility.
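A minimal sketch of that kind of explicit, per-resource grant model; the class and method names here are hypothetical stand-ins, not the Databricks API:

```python
from dataclasses import dataclass, field

@dataclass
class AgentPermissions:
    """Explicit allowlists: an agent can touch only what it was granted."""
    tables: set[str] = field(default_factory=set)
    tools: set[str] = field(default_factory=set)

    def grant_table(self, name: str) -> None:
        self.tables.add(name)

    def grant_tool(self, name: str) -> None:
        self.tools.add(name)

    def check(self, kind: str, name: str) -> None:
        allowed = self.tables if kind == "table" else self.tools
        if name not in allowed:
            raise PermissionError(f"agent has no grant for {kind} {name!r}")

# Hypothetical usage: nothing is accessible until explicitly granted.
perms = AgentPermissions()
perms.grant_table("sales.transactions")
perms.grant_tool("send_email")
perms.check("table", "sales.transactions")  # ok
# perms.check("table", "hr.salaries")       # raises PermissionError
```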
Yeah, that's basically most enterprises; I think it's at the top of their minds as well. Can you talk about, say, currently within the Databricks
(14:41):
platform, what are some of the core innovations or unique agents that you think Mosaic AI can actually offer?
Yeah.
Well, I mean, certainly the thing I'm most excited about is the recent announcement we had earlier this summer around Agent Bricks. Agent Bricks is a way of helping folks build agents, but it really turns this problem on its
(15:04):
head. Most of the vendors out there who are kind of, you know, giving you these more accessible or easier ways to build agents are really focused on simplicity. And while we wanted to make it simple, we weren't willing to make it simple at the cost of accuracy. And so the difference with Agent Bricks is it's really very evaluation-centric.
(15:25):
The goal is really to kind of lure the customer into an evaluation loop, so that they can continue to help drive the performance of the system. And so, you know, you're not just saying, hey, give me a RAG system on this corpus of data. You're saying, hey, give me a RAG system on this, and then let's go through some examples and let's see whether
(15:45):
or not those examples are good, or are up to expectation or not. And if they're not, we'll learn why not, we'll adjust the system, we'll move forward, and then continue to do that until we can get to a level of accuracy that the customer is comfortable
with.
Like a reinforcement learning mechanism, but for the enterprise too.
Exactly that. I mean, you know, some would say
(16:08):
it's been reinforcement learning all along, you know, since the beginning. But certainly, you know, this idea of using reinforcement learning as part of an evaluation loop is a way of really driving accuracy beyond what anyone is currently able to do.
As a former CIO myself, I look at all the different tools, so I
(16:28):
know that Mosaic AI supports both open and proprietary models. How important now is model flexibility for enterprise customers, from your perspective?
I think it's critical, and I think that, you know, the more models you have access to, the better. And
(16:49):
the reason I say that is, we so frequently see, you know, in this evaluation platform that we've got, it's very easy to swap out the model: build an agent, swap out the model, and just see if the other model performs better or not. And, you know, all too often what we see is, we may have an intuition about which model is best, but until you test them, you just don't know which one is going to be best
(17:10):
for this. And so, you know, having access to all of the models you possibly can is really a massive differentiator for both speed to market and accuracy. And so it's really one of these things I constantly advocate with customers: you know, hey, I get that you may have a favorite. We all have a favorite model. But, you
(17:33):
know, make sure you're working and testing on all of the models.
Now, say, in the Databricks AI
platform, that allows me to, say, A/B test this particular use case with different models, maybe a proprietary one, maybe an open-source one. We just want to know what the evaluation looks like.
Absolutely.
If you take an agent and you run it
(17:54):
through our evaluation system, you'll get a score and you'll see how many of the prompts and responses passed versus failed the LLM judges we have set up. And then literally, you put a new model into that agent, run it again, and you'll see a diff of: here's the ones that were green before and are red now, and here's the ones that were red before and are green now. And so, you
(18:16):
know, it was funny, when new models get launched, we often hear about it. But even if we don't hear about it, we immediately see it in the usage of our evaluation, because customers immediately come and start hammering to see if this new model that's been released is better than the one they're using.
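A minimal sketch of that green/red diff between two models; `run_agent` and `judge` are hypothetical stand-ins rather than the Mosaic AI evaluation API:

```python
from typing import Callable

# Hypothetical stand-ins: run_agent(model, prompt) -> response,
# judge(prompt, response) -> bool (True = passed the LLM judge).
def eval_diff(
    prompts: list[str],
    run_agent: Callable[[str, str], str],
    judge: Callable[[str, str], bool],
    model_a: str,
    model_b: str,
) -> None:
    regressions, improvements = [], []
    for p in prompts:
        passed_a = judge(p, run_agent(model_a, p))
        passed_b = judge(p, run_agent(model_b, p))
        if passed_a and not passed_b:
            regressions.append(p)    # green before, red now
        elif not passed_a and passed_b:
            improvements.append(p)   # red before, green now
    print(f"{model_a} -> {model_b}: {len(improvements)} fixed, {len(regressions)} broken")
```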
So, any interesting customer stories or real-world use cases, I think globally, all right,
(18:37):
where Mosaic AI is actually making a real, tangible impact?
Yeah, I mean, you know, companies like Suncorp use, you know, Mosaic AI and the ability to kind of choose which models they want to use, and what have you, to provide kind of a SunGPT experience to really, you know, accelerate the productivity of their employees. Or, on the other hand, we see
(18:59):
companies like AstraZeneca, right? The pharmaceutical company, who had 400,000 documents that described all of their clinical trials and all of the challenges they were having. And they were able to bring these in and parse them and kind of make sense of them, and make it so that they were able to chat with them and use them systematically and, you
(19:21):
know, run queries and analysis over them in ways they just never had been able to before. So, being able to derive kind of, you know, critical net-new insights from this unstructured data that, to them, before was kind of locked away in PDFs and other such things.
So I think, since you are now
in town in the Asia Pacific region, what's your sense of, say,
(19:44):
the current state of enterprise AI adoption?
I'll say it's really been exciting to see here locally just how both excited and, you know, let's be honest, aggressive the companies that I've gotten to meet with here are, whether it's around building out
(20:04):
platforms for their employees to be able to kind of build and deploy more of these systems, or whether it's, you know, building business-critical use cases on top of this. It's been really exciting to get to see and hear some of those stories. And I'll say, you know, it's a real kind of breath of fresh air
(20:25):
to get to see that this is not, you know... this is happening globally, this is happening in every corner of the world.
The problems are the same everywhere you go.
Right, exactly.
And the solutions also. You know, this idea of, you know, hey, do I have the ability to run an evaluation that really can help me, you know, gain confidence. You know, the CFO of your company is never going to let
(20:48):
you deploy an agent into a critical kind of, you know, workflow or something like that unless they know that it's going to work effectively. And so, you know, being able to help the customers here use those tools, so that they can start really pushing these systems into production, has been really exciting.
So, one interesting question,
(21:11):
right? So now a lot of companies are moving from experimentation to production. What kind of advice would you probably give them, say, on thinking through the foundation model and the architecture for the right use case? Because it's always tricky for the business owner to work out all these
(21:32):
different sorts of choices and constraints as well.
Yeah, you know, it's been a challenge. You know, they say it takes 10,000 hours to become an expert in something, and we've only had agents, in their current form, for about a year and a half, kind of thing, right? And so, you know, no one has 10,000 hours. And so building these things, you know, they have lots
(21:54):
of knobs and dials. You know, people think, like, RAG: oh, just put some text in a vector DB and I'm done. But no, there's different search strategies you might use, there's different chunking strategies you might use, there's different embedding strategies, all of these different things. And so this is why we've built Agent Bricks. The reason for Agent Bricks is so that, you know, you can come and describe the problem you're
(22:18):
trying to solve. And instead of having to be an expert in, you know, all these different chunking and parsing strategies, you can simply come describe the problem, bring in the data, or, you know, point the system at the data. And then we'll take care of the rest of that, and then we'll optimize it depending on your feedback over time.
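A minimal sketch of those knobs gathered into one RAG configuration, with the naive fixed-size chunker as a single example strategy; every name and default here is hypothetical:

```python
from dataclasses import dataclass

@dataclass
class RagConfig:
    # Each field is one of the "knobs and dials": none has a universally right value.
    chunk_size: int = 512            # characters (or tokens) per chunk
    chunk_overlap: int = 64          # overlap between consecutive chunks
    embedding_model: str = "some-embedding-model"  # placeholder name
    search_strategy: str = "hybrid"  # e.g. "dense", "sparse", or "hybrid"
    top_k: int = 5                   # number of chunks retrieved per query

def fixed_size_chunks(text: str, cfg: RagConfig) -> list[str]:
    """The naive fixed-window chunker; real systems may instead split on
    sentences, headings, or semantic boundaries."""
    step = cfg.chunk_size - cfg.chunk_overlap
    return [text[i:i + cfg.chunk_size] for i in range(0, len(text), step)]

doc = "a long document " * 200
print(len(fixed_size_chunks(doc, RagConfig())))  # how many chunks this config yields
```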
(22:40):
So the next couple of questions are going to be fast ones, and I'm going to start off with the first question: what's the one thing you know about building enterprise-grade generative AI platforms that very few people do?
Yeah.
I mean, you know, I hate to sound like a broken record here, but it's evals, right? 85% of companies today, or 85% of AI use cases, are being evaled by the engineer who built
(23:04):
it saying, yep, seemed to work pretty well.
That's interesting you mention that, because I always tell the business owners: if you get your engineers to build this, you better have the correct evaluation tests for them.
Right. And so, you know, realizing the importance... I feel like I'm out there and my job is really educating folks on the
(23:24):
importance of this. Hey, listen, if all you're building is a system for you and your buddies to keep track of the documents you wrote last year, use Databricks, use anybody; you know, we can help you do it, others can help you do it. But if you're going to build a system that's going to be critical to the business, where it's going to be important that it gets it right, then you can't do that without evals.
(23:46):
And that's the thing, you know; my hope is that if 2025 was the year of agents, the back half of '25 and '26 is the year of evals, of evals and quality.
Surprisingly, I get asked a lot about that question by the engineers that I teach on the course. So what's the one question you
(24:08):
wish more people would ask you, about AI infrastructure, Mosaic AI, or anything related to building enterprise-grade AI architecture?
You know, I think you asked it before, and it's around models and model usage. You know, the vast majority of enterprises I work with, you walk in and they'll say, oh yeah, we're a company that uses, you know, model A, and we use model A and
(24:32):
we only use model A. And, you know, whether that's OpenAI or Gemini or Claude or Llama or what have you, they'll tell you till they're blue in the face why that's the only model they can use. And you know, I wish more of them would say, hey, what models should I be using? Because my answer would be easy: whichever models work best for the use case you're building.
(24:52):
Or maybe have a selection of models, so that you can switch them out for better cost-benefit analysis as well.
That's right. That's exactly right.
So what does great look like for Mosaic AI within Databricks for the next three to five years?
Yeah, I mean, you know, we have our work cut out for us. You know, I often joke that in this space right now, we often don't even know
(25:14):
next year's nouns. You know, if you think back to last summer, if we had been having this conversation then, we would have been talking about RAG, not agents.
True. And now we're talking about
agents. And, you know, hopefully we're
talking about agents again next year.
But if not, we're going to be talking about something new and
exciting.
But do you think now, with what Mosaic is building with Agent Bricks, you can do a lot of orchestration work that you
(25:35):
previously couldn't do?
Absolutely, right. And not only can you do it; you know, you could do it before, you just had to build it yourself. You had to get out LangChain or the orchestrator of choice that you prefer. And we support all the orchestration systems; I think we support 18 different orchestrators or something like this. So you had to get out the orchestrator and you had to kind
(25:55):
of, you know, build that yourself. With Agent Bricks, the goal is really just to take that complexity away, simplify it, but continuing to keep accuracy as the primary objective, kind of thing. And, you know, I think, regardless of whether it's agents or solutions or systems or what have you, our goal is going to continue to be: hey, how can we help companies steer these systems to exactly
(26:19):
how they want them to behave?
So Craig, many thanks for coming
on the show again, and for giving me this valuable time with you, so that I can ask you a lot about MLOps, LLMs, and everything else. So in closing, I have two quick questions. Any recommendations that have inspired you recently?
Yeah. You know, one of the
things that inspired me recently, and it was a huge
(26:40):
surprise, was actually a conversation with a customer who's using our ability to throttle the number of tokens that their employees can use in models as lifecycle management for their GenAI, you know, production. So they could say, like, oh, everybody gets this many
(27:01):
tokens or this many prompts. If you want to build an agent, you know, come to us and we'll give you a larger budget. And if you want to move that agent to production, show us your evals and we'll give you even more. And the reason I say that is, we didn't build those governance capabilities for that. We built them for security and
(27:21):
tracking and monitoring. But I always love the creativity... you know, I'm most inspired by the creativity of our customers, when they use what we built to do something really exciting that we didn't imagine they would use it for.
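A minimal sketch of that token-budget-as-lifecycle-gate idea; the tiers and numbers are invented for illustration:

```python
# Hypothetical budget tiers: promotion through the lifecycle raises the cap.
TIER_LIMITS = {
    "experiment": 100_000,        # tokens everyone gets by default
    "approved_agent": 5_000_000,  # granted after asking for a bigger budget
    "production": 50_000_000,     # granted after evals are shown
}

class TokenBudget:
    def __init__(self, tier: str = "experiment"):
        self.tier, self.used = tier, 0

    def spend(self, tokens: int) -> None:
        if self.used + tokens > TIER_LIMITS[self.tier]:
            raise RuntimeError(
                f"over {self.tier} budget; request a higher tier (e.g. by showing evals)"
            )
        self.used += tokens

budget = TokenBudget()
budget.spend(90_000)      # fine under the experiment tier
# budget.spend(20_000)    # would raise: time to ask for a bigger budget
```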
for. So in building technology, it's
always about the people there. That's right.
That's absolutely. Right.
How do my audience find you and how do they keep themselves up
to date with Mosaic AI and and your work at data breaks?
(27:43):
Yeah. So certainly find me on LinkedIn, and I would love to kind of connect with folks there. You know, pay attention to Databricks' LinkedIn, as well as Databricks on other socials. Certainly follow our CEO, Ali Ghodsi; he always has a lot to say on the topic as well. And I think, you know, between those, they'll have a broad source of great insights into how to build production agents.
(28:06):
So many thanks for coming on the show, and I look forward to having another conversation again.
I look forward to it as well. Thank you.
Thank you.