Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
If you're going to be running a very elite research institution,
you have to have the best people.
And to have the best people, you have to trust the people and
empower them. You can't hire a world expert in
some area and then tell them what to do.
They know more than you do. They're smarter than you are in
their area, right? So you've got to trust your
people. And so one of our really
foundational commitments, as I said before, to our people is we
(00:21):
trust you. We're going to work to empower
you. Go do the thing that you need to
do. And so if somebody in the labs,
you know, wants to spend 5, 10, 15 years working on something they
think is really important, they're empowered to do that.
Welcome to Analyze Asia, the premier podcast dedicated to
dissecting the pulse of business, technology and media
(00:41):
in Asia. I'm Bernard Leong.
As AI reshapes our digital future, infrastructure becomes
the critical differentiator. With me today is Doug Burger,
Technical Fellow and Corporate Vice President, Microsoft
Research, to dive into how the innovations in AI infrastructure
from custom hardware to regionally grounded research
are now redefining what's possible.
(01:03):
Doug, welcome to the show. And first, congratulations on the
launch of Microsoft Research Asia in Singapore and welcome to
Singapore. Yes, thank you very much.
I'm I'm really, really happy to be here.
Opening the lab here in Singapore is exciting to me
because it's my first trip to the region and now I'll have an
excuse to come back many times. Great.
And before we start, I've also done some research on your
(01:25):
background, but your career actually sits in the midst
of academia and bringing research to real-life application.
So maybe we start with your personal journey.
How did you get started in computer architecture and then
subsequently AI research? Like many, many people, I think
the field I chose was an accident.
I went to Graduate School for my PhD in computer science, and I
(01:46):
took a course from Guri Sohi, who was a professor there, who was
teaching computer architecture, and I just loved it.
And so that was it. It was one of these happenstance
things. I went and got my PhD in that
area and then moved to form a research group at the University
of Texas at Austin. Then eventually.
How did you end up at Microsoft then?
With my colleague Steve Keckler,
(02:08):
we spent about seven years building a, for academia, very crazy new
processing chip, and it was funded by the US government.
It was a ton of work, but it had been a really fun project.
But after that, you know, you're coming off the sugar rush of the
big project and wondering what's next.
And Microsoft came calling about the time I was wondering what I
(02:29):
was going to do next. So it was just good timing and
they recruited me up to the corporate headquarters in
Redmond. So I think one interesting
question I probably have: you went from academia, like a
professor to now a corporate vice president at Microsoft,
right? What are the key decisions that
shaped your trajectory, moving from academia to real-world,
applications-type research? I think the biggest thing for me
(02:53):
was just always wanting to feel challenged.
Like whenever a job has felt a little bit too easy, I get
restless. And this is career advice I give
to junior people. You know, if you're, if you're
not growing, if you're starting to feel like you're turning the
crank, move and do something hard.
I think that's probably the biggest factor.
And I think that's great life advice.
So I want us to quickly dive into one of the
(03:17):
subjects of the day, but it's actually regarding Microsoft
Research now in Singapore. It's also going to be a regional
research and development frontier.
So it is the first lab in Southeast Asia.
What motivated this expansion now, and why Singapore?
If you look at, so MSR has been around for over 3 decades.
We're by industrial standards a very old lab, but we've actually
(03:39):
grown quite a bit in the number of regional labs we have
over the last five years. And part of the reason is that
with this AI transition, it's really important that we tap
into local pools of talent, local culture, different ways of
operating, because we really want AI to work for everyone.
And so I think some of the unique characteristics of
(04:02):
Singapore, certainly the great local universities, the talent,
make it a very good place to put a lab and really innovate in AI in
a way that is appropriate to Singapore.
And so we, we will learn from all of this.
We have labs in, you know, Africa, we have labs in Europe,
we have labs in North America, of course, and other places in
Asia. But that Geo diversity is
(04:23):
allowing us, I think, to build better AI that works for
everyone, not just, you know, the companies that trained in
one location. So what would be the lab's
strategic direction and how would it tie into, say,
Microsoft's broader research objectives?
Initially the lab is going to be looking at, of course,
foundational AI, you know, embodied or physical AI, you
(04:44):
know, health and AI and health. And so there'll be a few
initial focus areas, in part because of the local interests
and local collaborations and in part just because of the talent
that we hire. And you know, that's the
expertise that we have right now.
As we grow, of course, we'll move into different areas.
One of the things that I'm really excited about, though,
(05:04):
is when we start a new lab, those researchers,
as we hire more talent, get plugged into the global
Microsoft Research network and they just start learning from
all of these incredible people we have across many areas around
the globe. I'm actually a beneficiary
of the Microsoft Research Centre in Cambridge, when I was working
(05:26):
first in cosmology and then went into the Human Genome
Project. And Chris Bishop from Microsoft
Research Cambridge was one of the academics I learned a
lot from; I actually looked at a lot of his work on Bayesian
networks, which actually helped me with unsupervised learning.
So I know that there is actually even an ecosystem that's been
built up from the lab to the external environment around
(05:47):
them. Right.
And when we, you know, have interns come here
or PhD fellows and collaborations with the
universities, and then full-time employees, they're part of that
global network and they can tap into people like Chris.
And Chris, by the way, is just such an amazing person.
I count myself very lucky to be working with him in MSR.
I'm probably going to catch you when I visit Cambridge in this
(06:08):
coming winter, so I will catch up on that later.
But one of the interesting
things that really excites me is the lab's healthcare initiatives; they
stood out to me. I think using AI to
detect seizures or epilepsy in children and also applying
computer vision to assess motor function in Parkinson's
patients. How is Microsoft Research
advancing these technologies, and what does this signal about
(06:30):
the potential for AI to support early diagnosis or personalized
treatment in healthcare? Well, I think you know as well
as I do that AI is really good at finding patterns in large
amounts of data that we don't see.
And so, you know, the, the MSRA teams have already found some of
those patterns that will help people.
And you mentioned some of them, Parkinson's, Alzheimer's, you
(06:52):
know, cleft palate. We have others; in Africa, we have an
initiative looking for retinopathy of prematurity in babies
and preventing blindness. And so I think there's just so
much opportunity as we, as our AI advanced technologies advance
to find more and more of these patterns and help people with
preventative diagnostics. And I think the partnerships
here in Singapore, in particular with SingHealth and the
(07:14):
government and the pools of data will provide a way to do
that and really help people. And then when we do that here,
another really important factor is when we come up with
something here, we can span it across the globe.
So discoveries that help the people in Singapore through
these collaborations eventually go global.
Yeah, I think specific for the cleft palate case because my
eldest daughter is actually a cleft patient.
(07:35):
I think she was blessed because Singapore has one of the best
cleft centers. I think even there was a
Berkeley genomics specialist, whom I know and who is very well known,
who was actually getting genomics data here from them.
And I think there is Operation Smile, where they need
to go to all the surrounding countries.
It's actually one in 70 Asians has a chance of getting
(07:56):
cleft palate, which is very unfortunate, right?
And I think being able to help with that kind of detection, what
you're doing at Microsoft Research, will be pretty good,
not just for here, but also for the rest of the world.
Yeah, that's when these stories become personal.
They become very moving. So I'm just, I'm actually really
happy to hear that this is important to you.
I think there is also now a strong emphasis on thinking
(08:17):
about regional relevance, because Southeast Asia
has a diverse set of languages and a diverse set of cultures, and
because we're talking about foundational AI that your
teams are going to be interested in and also
responsible for. How are you embedding Southeast
Asia's societal norms and valuesinto the Research Center?
So if you look globally at MSR, we actually are doing this in
(08:39):
many places now. So we have initiatives in
Africa, we have initiatives in India that are looking at
building models that work well with uncommon languages to make
them accessible to everybody. We have work looking at adhering
to cultural norms and when images are generated, making
sure that they're representative of the local cultures and not
just the data sets that they were trained on.
(09:01):
And so, as I was saying before, those technologies that we're
developing will be able to be applied here, but now we can
also use them with the local culture, the local languages.
And I think this is part of our,our goal to, to really make AI
accessible to everybody in a way that's, you know, that's
equitable and, and sort of not coming in from the outside with
(09:22):
something that doesn't work for the culture.
And it also aligns with Microsoft's responsible AI. I know
Brad Smith wrote a book on this subject, talking about how
to embed responsible AI into the realm of all the
technologies that you are building out.
That's right. And when I, when I read through,
you know, the summary of your, you know, Singapore National AI
(09:43):
Strategy 2.0, there was a big focus on responsible AI and
privacy and safety and accountability.
And, and those for us are also very important elements.
And so in MSR across the globe, we actually have many teams
working on all of those problemsto really make this safe for
everybody. That's that's very interesting.
One key thing before we get intomuch more interesting subject
(10:06):
would be Microsoft is actually partnering with institutions
like Sing Health, the National University of Singapore and
Nanyang Technological University and Singapore Management
University. Can you just elaborate on some
of these collaborations and the early impact?
Well, what I would say, maybe I'll start at the very high
level. So I think many of our, your
listeners may not know that much about MSR before this, but we
(10:28):
are fundamentally an open research organization.
About 80% of the papers that we publish are actually done
jointly with academics around the world.
And so we are, we are very open,we are very collaborative.
You know, we, we publish most of the work that we do.
We open source much of the, the work that we produce and, and
our researchers are also given huge amounts of, of autonomy to
(10:52):
go and do the fundamental research that they think is most
important. So one of the great, great and
hard things about my job is I can't tell anyone what to do,
right. And, and, and for Peter, you
know, my boss, it's the same thing.
So, you know, we, we really, I mean, I can have an opinion and
I can share it, but I can't tell a researcher you have to go do
this. That's part of our contract.
(11:13):
And so, and so everyone is really asked to do the most
important things that they that they can do, whether it's some
fundamental research or some curiosity driven thing or taking
a technology that we've developed or breakthrough and
then applying it to, you know, to Microsoft's products or to
advanced society. And so now a lot of our
(11:33):
collaborations with the local universities will be of that
flavor, you know, joint investigations and really
fundamental research. And our collaborations with Sing
Health will be looking at taking some of, in some cases taking
the things we've developed and using it to help them and help
their patients. In other cases, partnering with
them on the really important problems that they're facing.
(11:54):
I think a lot of people don't appreciate that with the
innovations of deep learning that came out in 2010, a lot of
the work before that was mainly done in Microsoft Research.
I think things like Bayesian optimization, how we think about
hyperparameters, and a lot of how we work
with neural networks came from the speech
processing algorithms that were pioneered out of the lab itself.
(12:17):
Well, I'm very proud of the early work that MSR did in AI,
but I also, you know, have to acknowledge decades of
work in academia and, you know, the giants in the field,
and of course, Geoff Hinton, who persevered through
multiple AI winters to get us to where we are.
Yeah, I think that's that's where we get to now.
(12:38):
The pace of generative AI is moving so fast.
I mean, practically every week something comes out.
How do you see this affecting, say, academic researchers and
maybe even like in the infrastructure providers, be it
any of the tech companies like Microsoft?
It's such a great question because you know, with, with one
(12:58):
of the things that's I think happening now is that the, the,
the tools that we're building accelerate the tools that we're
building. And so now you can, you can
build a solution or investigate some, some new optimization that
will help AI, but you can use AI to build those tools faster.
That's right. And So what we're seeing is a,
is a, an acceleration of the turning of the technological and
(13:21):
innovation crank. And then the, the other thing
that I think is really striking is how general these models are.
I mean, they pull structure out of unstructured data.
And so it doesn't matter what field you're in, if there's
structure in there and you have the data, the
models can go and find it and then you can use it.
(13:43):
And so it's not just impacting or affecting the field of AI
research, it's affecting most fields.
One, one example that I love, you know, that's near and dear
to my heart as a computer architect, is that if you
look at the historical stack, and you know this as well as I do,
you know, you have the CPUs and they have an instruction set.
And then there's, you know, there's a, a compiler and an
(14:05):
assembler and a linker and a loader and then a programming
language. And there's all these layers
that we've built to be human interpretable.
And then we and then we use tools to translate from one
layer down to the other. But all those are kind of ad hoc
things that we've built by hand over many decades.
And I think AI is going to be able to define just completely
(14:26):
new interfaces and just bust through them and, and just
completely change that whole hierarchy that,
you know, I got trained on and I spent 30 years trying to master.
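To make the layered stack described above concrete, here is a minimal, purely illustrative Python sketch (not Microsoft code; all names are invented) that lowers a tiny "source" expression through a hand-built compiler layer into pseudo-assembly and runs it on a toy stack machine:

```python
# Illustrative toy of the hand-built lowering layers described above
# (language -> compiler -> assembly -> machine). All names are invented.

# "Source language": nested tuples, e.g. ("add", ("mul", 2, 3), 4)
def compile_expr(expr):
    """'Compiler' layer: lower an expression tree to stack-machine assembly."""
    if isinstance(expr, (int, float)):
        return [("PUSH", expr)]
    op, lhs, rhs = expr
    return compile_expr(lhs) + compile_expr(rhs) + [(op.upper(), None)]

def run(asm):
    """'Machine' layer: execute the pseudo-assembly on a tiny stack machine."""
    stack = []
    ops = {"ADD": lambda a, b: a + b, "MUL": lambda a, b: a * b}
    for instr, arg in asm:
        if instr == "PUSH":
            stack.append(arg)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(ops[instr](a, b))
    return stack.pop()

program = ("add", ("mul", 2, 3), 4)   # the "source code"
assembly = compile_expr(program)      # human-designed intermediate layer
print(assembly)                       # [('PUSH', 2), ('PUSH', 3), ('MUL', None), ('PUSH', 4), ('ADD', None)]
print(run(assembly))                  # 10
```

Every interface in this toy, the expression format, the instruction names, the stack discipline, is a human-designed contract; the point above is that AI may learn to define and cross such interfaces directly rather than relying on these hand-built layers.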
Yeah, it's, it's something like, you know, like the first time I
started using Cursor and then trying to use prompt engineering
to code. But it actually upends all the
things I've done in the past; I mean, I went through normal COBOL
(14:48):
programming to object oriented programming.
And then after that we have containers, you know, all the
all that different works and practices.
And then we have GitHub Actions and everything programmed properly.
And now we've got to upend it again.
So I, I have a great example of that.
So this, this past weekend, my wife was on a, a trip with a few
of my kids. And so one of my, my third son
(15:10):
and I sat down and decided to build a software project
together for a project we were doing.
And, and we, it was pretty complex and, and we spent about
four hours building it. And he doesn't code, but you
know, he, he's 13. But we were just having, you
know, good, good family bonding. But the crazy thing was I was
(15:31):
using GitHub Copilot. I was using, you know, Sonnet 4, I
was using the agent mode. And it was the first time I've
ever built a complex software solution where I did not look at
a single line of code. And it hit me then.
And of course, vibe coding is now getting big.
But, you know, lines of code are the new assembly language
(15:51):
instructions. You know, after the 70s, no one
looked at assembly unless you were a specialist.
We had compilers for that. And now lines of code are
becoming like assembly. It's just a whole different
world. And then we're going to be
coding in English. That's going to be another layer
of complexity. Natural language intent.
That's right. So then there's the other part
of it, right within the foundation models, there is the
open source models alongside with the commercial closed ones.
(16:15):
How do you view the open versus closed debate now in the
generative AI landscape? It's a big debate in the
community, but one of the nice things
about being at Microsoft is that, you know, we, we are very
open source friendly. I mean, MSR publishes tons of
open source stuff. From a product perspective, we
(16:35):
just want to be the platform that empowers all of our
customers and users the most. And so we're going to deploy the
best open source models, we'll deploy the best closed source
models that we have access to, you know, so from a, from a
product perspective, we just want our customers to be happy.
And so, you know, we, we partner with other companies on closed
source models. We ingest open source models and
(16:56):
make them available. I think the one thing, you know,
that I see, the one advantage in the closed source models, is that
it's easier to guarantee safety, you know, because these
models are getting very powerful and you can
monitor or at least restrict how they're being used.
But we, but Microsoft in general, like we bought GitHub,
(17:16):
like we, we really believe in open source.
So we're just like everyone else.
We're just trying to navigate that.
But I don't think we have a, you know, we're not one or the
other. We're just it has to work for
our customers. So I guess given this open
versus closed, then what about, say, for example, when we think
about multimodal models and now reasoning agents, for example,
(17:37):
how do you, what are the trends that you're closely watching?
I mean, this year they, they, everybody says the year of the
agentic AI and we're starting to see different variations of how
agents can go into systems right there.
So I think reasoning, it's got a lot of momentum.
It seems really important. I think it's actually more
important than even the momentum would suggest.
(18:00):
I think these reasoning models are going to be incredibly
foundational. And and the reason, no pun
intended, is that they can take steps and and look at what
they've done and backtrack and go forward.
Now, and maybe this is me with my researcher hat on.
I think where that gets incredibly powerful is when you
(18:21):
start to have formal structures that you can reason
over and make correctness guarantees.
So obviously now the models are doing that with code.
They, they catch themselves making a mistake and fix it.
But but there's so many other areas where we should be able to
project those other areas into formal spaces and then reason
(18:42):
over those formal spaces and provide a strong guarantee of
correctness and then and then, you know, and then execute it.
So I think that the the combination of formal logic
structures and formal verification and reasoning
models will allow us to do many, many more domains safely in a way
that we really can't today with sort of the one-shot
probabilistic models.
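As a rough, hypothetical illustration of the pattern described above, a probabilistic proposer whose candidates are only accepted once a formal checker verifies them, here is a toy Python loop; `propose_candidate` and `verify` are invented stand-ins, not real model or Microsoft APIs:

```python
# Hypothetical sketch: probabilistic proposal + formal verification + backtracking.
import random

def propose_candidate(problem, feedback):
    """Stand-in for a reasoning model: here it just guesses an integer.
    A real system would call a model and feed the verifier's error back to it."""
    return random.randint(-10, 10)

def verify(problem, candidate):
    """Stand-in for a formal checker (type checker, SMT solver, proof assistant).
    The 'spec' here is simply candidate * candidate == problem."""
    if candidate * candidate == problem:
        return True, None
    return False, f"{candidate}^2 = {candidate * candidate}, expected {problem}"

def solve_with_guarantee(problem, max_attempts=1000):
    feedback = None
    for _ in range(max_attempts):
        candidate = propose_candidate(problem, feedback)  # probabilistic step
        ok, feedback = verify(problem, candidate)         # formal, exact step
        if ok:
            return candidate  # accepted only with a correctness guarantee
    raise RuntimeError("no verified candidate found")

print(solve_with_guarantee(49))  # prints 7 or -7; either is formally valid
```

In a real system the proposer would be a reasoning model and the checker a proof assistant or verifier, with the checker's counterexamples driving the model's backtracking.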
Do you think that, you know,
(19:04):
they're always talking about scaling laws and the more data
you put in, are we actually hitting the wall?
Or actually there's so many dimensions like for example,
test time compute, for example, right?
And all the different new metrics that's being drawn out
in multimodal, that really we are just
barely scratching the surface rather than hitting
a wall. You know, as you know, I used to
(19:27):
work in quantization and we thought very hard about scaling
laws and what is the scaling law for quantization?
And how does it affect, you know, the main, you know, the
economic scaling laws. What I'm seeing in, in research
now is that it's not just one scaling law or
two. It's, it's like there are 10
dimensions of scaling and we're scratching the surface in all of
them. So I, I, I think I don't know
(19:49):
that there are hard limits until you run out of some resource;
like, if you run out of data, you know, then
you're not going to be able to scale.
But these models seem so general
that as long as you have more of that resource, you'll be able to
keep scaling.
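For readers new to the scaling-law vocabulary used here, one widely cited empirical form (the Chinchilla formulation, shown only for context and not necessarily the exact one used in the quantization work mentioned above) writes loss as additive power laws in parameters and data, with precision, test-time compute and other factors acting as further axes:

```latex
% One common empirical scaling law (Hoffmann et al., "Chinchilla"), for context:
% L = loss, N = parameter count, D = training tokens; E, A, B, \alpha, \beta are fitted.
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}
```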
In fact, I think they're now hiring PhD students to help them look at complex academic
(20:10):
problems, so that the reinforcement learning is
actually becoming more and more intense.
That's right. That's right.
We're now, we're now getting better and better able to steer
these models in interesting directions, which by the way, I
think I, I'm also excited about because in, you know, in the
labs, we're starting to, to ask the question, how should we
steer these models in a way that's good for human health and
(20:30):
society as opposed to just, you know, making the model more
capable. So I think that's going to be a
really promising direction. I want to ask this question: you
led Project Catapult and Project Brainwave.
What are the key design choices that drove their success?
I think, maybe I'll answer at a high level and then a
(20:53):
more specific level. I mean, for, for me as a
researcher, it's always been about getting out into a new
space and struggling and then waiting until you have clarity.
And once you have clarity, then you just go and you're kind of
relentless. Totally agree with you.
That's how I felt when I was doing my PhD.
That's right. That's right.
And, and it's hard to be out there because very often
(21:15):
when you're doing something thatis going to be important and
new, by definition, it shouldn't make sense to everyone else.
So everyone kind of looks at you and scratches their head
and says, why are you doing what you're doing?
And, and so when we started on the Catapult journey, what I
was really setting out to do, we were looking at CPU
trends and saw that, you know, per core per CPU performance was
levelling off. And, and, and we, we convinced
(21:39):
ourselves that, you know, specialized hardware for
different applications was goingto be the future.
And so we started building customized hardware, trying to
make a general platform for Microsoft's cloud that, that you
could program different applications onto.
And so, you know, we decided to take a bet on FPGAs
'cause you could program them. And it was a first step towards,
(22:01):
you know, specialized, you know, a whole fleet of specialized
ASICs once the applications were big enough, but we kept getting
the design wrong. Like we started off with a board
that had six FPGAs on it. And then if your application
only needed one, then five were stranded.
If it needed seven, then you were screwed.
So we, we networked them into a configurable
fabric that hung off the back of a rack, 48 of them, and that was
(22:21):
great, but then, you know, the company wanted us to converge
the infrastructure for Bing and Azure and it was designed just
for Bing. And so then the third
generation, we came up with this idea of putting the chip as
a bump in the wire on the network path.
And so then it could go out and talk to other chips.
And so what we kind of built was a configurable plane of hardware
acceleration. That's how Bing was using it.
(22:43):
Azure ended up using that just to accelerate their networking
stack, which they're still doing today.
Which also explains how your Azure computing platform is so
ubiquitous among the large companies because you have the
kind of acceleration of the networks talking to each other
at a very low latency situation. I would love to think that's the
reason, but the real reason is that the Azure team worked
really hard and the sales team's working really hard, and
(23:05):
everyone was just working really hard.
OK. Yeah, but I want to get to
quantum computing, right. Recently you introduced the
Majorana 1 quantum chip, actually using topological qubits.
Yep, something that I've been monitoring for a very long time
because I'm a theoretical physicist by training.
So there are a lot of long-standing challenges in this.
So I should be interviewing you.
I want to know a lot. No, I'm, I'm naturally curious,
(23:27):
right, because from your vantage point now leading global
research, how is this shaping towards the future of quantum
computing? And what are the milestones that
we need to cross in order for us to reach maybe practical fault
tolerant type of quantum systems?
Are we a few, maybe a few years away?
Or maybe are we a decade or maybe 3 decades away?
(23:48):
Well, I certainly don't think we're three decades away and the
team has made great progress. One thing I want to really call
out with that team and I, you know, they were in MSR and then
they, they, you know, graduated out into the business groups and
they've just been in the Azure team and they've been
pushing, you know, on that since then, is the courage they had to
go after this crazy topological thing.
(24:09):
I mean, they've been at it for 17 years and people, you know,
and taking lots of slings and arrows because after 10 years,
you know, people, your, your peers will just say, are you
still doing that thing? Why haven't you given up yet?
But the reason that they stuck with it so long and, and now
they're finally, you know, showing that it works and
there's lots of work left to do. But I'm, I'm just so proud of
(24:31):
them is because the topological qubits are just inherently much
more stable because of the separation of the Cooper pairs.
That's right.
And so like if you're trying to do a more traditional qubit and,
and do, you know, lots of error correction as it decoheres, you
need exponential numbers of error correction bits.
(24:52):
It doesn't scale or you're making them very big and they
have to be very cold and that doesn't scale.
And so really what they've done is is pushed down a path for,
you know, so many years towards what I think could ultimately be
the quantum transistor, and I mean the field effect transistor
because there were point-contact and junction transistors and vacuum tubes before it.
We're going to have a quantum chip as
(25:12):
well.
That's right, that's right.
chip they have that they've built, in theory, it can hold
millions of logical qubits. Now they have to scale it.
And so there's still a lot of work to be done.
But in theory, you know that that paradigm is so stable that
if they get it to work at scale,you know you it'll solve all
(25:34):
these other problems that all that the other quantum
approaches also have to solve. And they're going to have to
solve it a different. Way I'm pretty impressed because
it solves the decoherence problem; I usually ask a lot
of people working in quantum computing how they are going to
solve that, right, right. But I think that 17 year journey
is the much more interesting part of the story
that I wanted to ask. How does Microsoft research
(25:56):
balance this kind of like blue sky foundational research
against like more application driven initiatives?
Like how did this team actually have the morale to
persist to get to this stage? Well, there are three answers,
with a caveat. So the first answer is that, you
know, if you're going to be running a very elite research
(26:18):
institution, you have to have the best people.
And to have the best people, you have to trust the people and
empower them. You can't hire a world expert in
some area and then tell them what to do.
They know more than you do. They're smarter than you are in
their area, right? So you've got to trust your
people. And so one of our really
foundational commitments, as I said before, to our people is we
trust you. We're going to work to empower
you, Go do the thing that you need to do.
(26:40):
And so if somebody in the labs, you know, wants to spend 5, 10, 15
years working on something they think is really important, you
know, they're empowered to do that.
Now, of course, I, you know, we, Peter, who runs, you know, MSR
globally and I and, and my peers, you know, we review
things, we give feedback, right? You know, because you don't want
to, but, but if they think, no, you're wrong, I'm going to keep
(27:02):
going. You know, that's what they're
supposed to be doing. So now on, on the quantum thing,
of course, if you're spending a lot of money doing that personal
vision of yours, the company's going to ask you, all right,
what are we getting for the money?
But one great, one great thing about Microsoft that I, I, I
really love about the company is, is how deeply intellectual
(27:23):
it is. And this really comes from Bill
and the founding. And you know, Bill is still
involved. We do reviews with him and MSR
regularly. He gives great feedback.
Satya is very curious, you know. And so you go to those
leaders and you ask them about our topological qubit program
and they understand, you know, the, the, the strategy, they
understand why we're doing what we're doing.
(27:43):
You know, he asks really hard questions; Bill will dive deep.
But but that team, you know, just would show the progress and
say, This is why we're doing what we're doing.
You know, we're going to addressthe decoherence problem.
If we can get there, it's going to be really stable.
Yes, it's a lot of money, but you know, this would be a real
boon for humanity. And so the company gave them the
support to keep going. Yeah, I think because of that
(28:04):
this year, I think at Davos, a lot of people are talking about quantum
computing and I'm getting calls from venture capital firms that
I advise about. Can you give me a one hour
rundown on quantum computing? Likewise, this is
getting. But I think given that's the
case, right, we are approaching, say, the
limits of Moore's Law, and what would a post-von Neumann
type of computing world look like?
(28:25):
I mean, we, we live in the logic gates, you know, all these
binary things that have all been invented in the 1920s,
1930s, right? Right, we're now in all the Bell
Labs. Yeah, all the Bell Labs and now
we are in the next phase of that.
So maybe this higher level framing might be helpful.
I, I think we're sitting on three fundamental modes or
(28:46):
paradigms or pillars, I don't know what the right
word is, of computing. You know, we have logic, you
know, you do an operation on a zero or one, like an AND gate.
You have probability, you know, and that's what the AI really
is. Now you know, I'm going to
predict from some distribution: am I going to get a 0 or a 1?
And then you have quantum where you have superimposed.
(29:07):
Superimposed probabilities. Superimposed, it's still
probabilities, but they're superimposed and they have very,
that's right. It's a really important call.
And so there's really these three modes and they're very
complementary and synergistic. And I think the really
interesting thing is going to be some of the intersections
between those fields, those 3 modes.
Like if we get the big enough quantum computer working, or I
(29:27):
should say when, you'll be able to generate all sorts of very
accurate molecular data to train probabilistic models to, to
produce new materials even better than they can today.
And then you can synthesize those new materials in factories
being driven by the logic stack.
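A toy illustration of the three modes distinguished above, deterministic logic, probabilistic prediction, and superposed quantum amplitudes, purely for intuition (the quantum part just simulates one qubit's measurement statistics with NumPy):

```python
# Illustrative only: the three "pillars" acting on a single bit.
import numpy as np

# 1. Logic: a deterministic operation on definite 0/1 values (an AND gate).
def and_gate(a: int, b: int) -> int:
    return a & b

# 2. Probability: predict a 0 or 1 by sampling from a distribution.
def sample_bit(p_one: float) -> int:
    return int(np.random.random() < p_one)

# 3. Quantum: a qubit holds superposed amplitudes; measurement yields 0 or 1
#    with probabilities given by the squared amplitude magnitudes.
def measure_qubit(amplitudes: np.ndarray) -> int:
    probs = np.abs(amplitudes) ** 2
    probs = probs / probs.sum()  # normalize against rounding error
    return int(np.random.choice([0, 1], p=probs))

print(and_gate(1, 1))                                 # always 1
print(sample_bit(0.3))                                # 1 about 30% of the time
print(measure_qubit(np.array([1, 1]) / np.sqrt(2)))   # 0 or 1, 50/50
```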
Yeah, and I think they also have a lot of profound implications for cryptography.
(29:48):
That's right. Specifically quantum resistant
codes. We are talking, that's right,
companies now talking about these quantum-resistant codes
that need to be introduced into our password encryption.
A lot of things are going to change; another major change is
coming. Right.
That's right. And so to come back to your
question, I think I mean, we've talked about the quantum pillar
and the probabilistic pillar, you know, which is AI, LLMs, all of
(30:11):
that. I think it's just
accelerating so fast. I mean, model costs are dropping
10X per year. We're just seeing, you know,
continued algorithmic and silicon co-innovation.
I think we've got decades of acceleration there.
And then logic is, you know, slowing down.
And I think you know, you know you, I think you were asking me
about the popular version of Moore's Law, which is
(30:32):
technological progress. There's, of course, the
formal version, which was Gordon Moore's 1965 paper, which talked
about the rate at which transistors on a chip double.
But then Moore's Law also would undergo a reformulation.
Yes. Well, yes, it has.
But remember, I'm a computer architect and I read his paper
in grad school. And so when people ask me about
Moore's Law, it's like dangling catnip in front
(30:55):
of a cat. I have to answer precisely.
I totally understand that because you have to appreciate
what comes before and then you have to build on top of that and
then you have to take it to the next stage.
That's on it. I, I want to just switch course
and talk about like the concept of global research leadership.
From where you are as the global head of Microsoft
Research, how do you balance a long term exploration with
(31:15):
practical near term impact? OK, well, I have to offer one
correction: you know, Peter Lee, who
I work for, is the global head of MSR. I don't
want to be claiming that that's me.
But you know, MSR has been around since the early 90s.
We've had lots of crazy adventures, but one thing we
figured out is that you have to be in exploration mode with
(31:40):
people you trust doing risky things that may not look like
they make sense to discover new stuff.
And it's always surprising and unpredictable.
And then once you have that breakthrough, you then think
about how to how to apply it, how to exploit it, how to use
it. And we've actually gotten very
good over the past three decades in taking those discoveries and
(32:02):
getting them to scale. And we have lots of motions.
You know, we will take a research team and graduate them
into the product groups. We will work with NGOs to solve
some problem. You know, we will have the
product groups fund a center. You know, we'll, we'll create an
open consortium for some standard.
You know, we will, we'll create V teams where people work
(32:25):
together. So we, you know, we'll create a
mission lab to go chase after some important problem.
So we've just learned all these different patterns and, and
playbooks over the years, and we're now pretty, pretty good
at saying, OK, we've had this breakthrough, what do we do with
it? Then I mean, given that it's
distributed and also you, you will still give autonomy to
people to work on problems that matter to them most.
(32:47):
And these are all very smart people.
How do you get the innovation to synthesize across such a large
and distributed network of labs? They have all different
expertise. But then, that's right,
there must be, there must be something, or maybe some
high level principles within the lab, to say, hey, maybe if we take
this piece and this piece, we put it together, maybe this is
something that is of value. That's right.
(33:07):
So, so the framing that I think we like to use with our teams is
we want everyone to do the most important work that they can.
And sometimes that's powering into some new space or something
you're really curious about and desperately want to understand.
But sometimes it's taking something that you've invented
or one of your colleagues has invented and taking it to solve
a real problem or help people. And so like we, you know, we
(33:31):
want to advance science and that's the exploration.
And we want to help, you know, our customers and we want to
help society. And for any researcher at
different times, you know, the most important thing will be in
those different buckets. Another thing that that, you
know, Peter and I and the rest of the MSR leadership do, we
spend a lot of time going aroundto the labs making connections,
(33:51):
you know, creating a culture where people can share things
and work together and find those pieces.
Sometimes, you know, we, we see something and stitch them
together. Sometimes, you know, we just
help create the environment and then people do that naturally.
And, and so a lot of, a lot of the management leadership's job
is making connections and, you know, helping people see some of
(34:11):
the ways they might go and take stuff to scale.
Because really, at the end of the day, you know, researchers
care about deep science and deep understanding and then also
about, you know, having a lot of those things become real.
And so I've, I've always found that our people are highly
incentivized, you know, to make stuff real and matter to people
and help people, like, you know, the cleft palate or the
(34:34):
Parkinson's work. You know, if you can help
millions of people with something that you're doing,
you'll probably do that for a while rather than just going on
to the next discovery. So tell me, what's the one thing
you know about building advanced AI infrastructure,
or maybe leading global research, that
very few people do? What a great question.
(34:55):
I think for leading research, it's the more curious you are
and the more you know about different fields, like the more
connections you can make. And Peter's very good at this.
And so just that curiosity helps you to see this and this, where
there's two people working in different domains and say, oh,
(35:17):
there's a connection here. This is something, by the way,
that Bill Gates is, you know, just phenomenal at just because
of the amount he's seen over the course of his career.
I don't think it's something unique.
I think it's something that research leaders learn that just
it's, it's that, it's that, oh, this and this
make something, make something. It reminds me of the Santa Fe Institute or
(35:38):
multidisciplinary research; consilience is the word, to draw
different fields together and put it together.
That's right. By the way, I like to say the tallest oaks
grow from cracks in between the fields.
That's a valid one. So what is the one question that
you wish more people would ask you about AI research, but they
they don't? I don't know how to answer that.
(36:03):
I I feel, I think for me, the one thing I really want to know
maybe is a slightly different take on it.
Yeah. I just have this feeling in my
gut that there is some mathematical theory underlying
the the compression of data thatyou need to make a prediction
(36:26):
that's analogous to what Shannon did for, you know, digital codes
and communication channels. Right, Yeah, Yeah, you're
talking about the unstructured data, the ability for the
foundation AI. Yeah, I think a lot of people
haven't really done a lot of deeper architectural work into
the transformer model. What we are seeing is
representations of that model in the forms of BERT.
(36:49):
That's right. GPT and BART.
There's a deeper underlying, well, I, I, I think that there's
a deeper underlying mathematicaltheory there.
And I have, and again, this is just my gut as a researcher, and
I'm probably totally wrong that biology and evolution have
figured out how to extract it and encode it in something that
(37:10):
can run in 10 watts like our neocortex.
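The Shannon analogy can at least be stated precisely for today's models: training minimizes cross-entropy, which is exactly the average code length in bits per token when data from p is coded using the model q, so a better predictor is a better compressor; the open question being raised here is a deeper theory of the representations that achieve it. The standard identity (a textbook fact, not a new result):

```latex
% Cross-entropy = average code length when coding data from p with model q;
% it exceeds the true entropy H(p) by exactly the KL divergence.
H(p, q) = -\sum_{x} p(x)\,\log_2 q(x) = H(p) + D_{\mathrm{KL}}(p \,\|\, q) \ge H(p)
```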
I agree with you. And that and that.
When we finally understand this, we'll find out that all the
hyperparameters in our brains have been set to be near that
optimal theory. Yeah, like the way how our
neurons are connected, because it's the connectivity of the
neurons that actually allow us to process thinking.
Yes, but they make connections and then the connections get
(37:32):
pruned, and the number of connections getting pruned every
day is massive. So you're constantly sampling
the information coming in and locking connections when there's
something consistent that you need to make a successful
prediction. Interesting.
That's a very good one, because I haven't
talked about it. I think we don't really
(37:52):
understand the Transformer architecture.
I teach it as a course, but sometimes I'm also wondering
some parts of it in my mind, that there are some nonlinear
properties that may be giving rise to these.
What you are alluding to is a
deeper mathematical language that embeds within them, but
you're not sure how to extract it out.
(38:15):
That's right, into a set of general rules, basically.
They're they're the Transformers.
And I mean, we understand why back prop works.
You know, we understand they're compressors, right?
Like we understand the pieces. Yeah, but we, I don't think we
understand how we put it all together.
That's right. There's a deeper theory of the
underlying representation that they're learning that's not tied
to, you know, 1000 dimensional continuous spaces that the
(38:35):
Transformers are learning. Interesting, yeah, because the
math is quite reminiscent of what I used to do in theoretical
physics like string theory, and there's a lot of embedded
structures already hidden in theoretical physics
with those models. Except the Transformers have
more dimensions than string theory.
Yes. Sorry, that's a nerdy point.
Yeah, No, no, no, yes, yes, but it's linear.
(38:55):
But it's linear.
Yeah, no, it's great. I haven't had a chance to talk
to someone so deep into the
transformer architecture. But I have a traditional closing
question. What does great look like for
Microsoft Research, especially in Asia, over the next five
years? So great to me from a, from a
(39:17):
management perspective, says that, you know, great to me
looks like we have amazing people who feel empowered and
energized to do work that affects their societies and
makes a real difference for people.
I mean, of course Microsoft's business, but the larger
society, like I really want thisAI revolution to be a big jump
(39:40):
and a boon for humanity. And so great for me is that our,
our labs are filled with people that have a sense of purpose and
passion and pride and that are helping to do that.
I, I have another secret fantasy, which is I think maybe more of
an, an ego thing, but I would love my secret dream, which is
now not so secret is, you know, 50 or 100 years from now.
(40:02):
People look back at MSR today like the early days of Bell
Labs, but for AI. I actually thought that you
have done one step beyond Bell Labs by taking research
global, because Bell Labs used to be only in one part of
the US, right?
Right. But but this, I think we have to
(40:24):
because this is so important forall of humanity.
Yeah. You know, we can't.
We can't let one region decide, given the power of this AI
stuff, what it means for everybody else.
We have to do this together. Many thanks, Doug, for coming on
the show, and I, I think this is probably one of the most geeky
conversations I have. So in closing up, two quick
questions. It was a lot of fun, thank you.
Any, anything that has inspired you recently, recently,
(40:47):
that you can share: a book, a movie, anything else?
Anything that's inspired me? I'm, I'm thinking of lots of research
projects. So that's, that's where my,
that's where my head is. Well, not too long ago, and this
is maybe topical, I read The Idea Factory which is about Bell
Labs. Bell Labs.
(41:07):
I read the book too. And, and I found that that book
incredibly inspiring, you know, but that was a little bit while
ago. I let me talk about one piece of
work that I found inspiring. We have a team in MSR that's
looking at using reinforcement learning to steer models, having
a multi turn conversation so that the the conversation can be
(41:28):
more successful. And what happened in that review
is we started talking about all the ways we could steer models
to have conversations that helped people's mental health or
were engaging without tipping over into addiction or could
challenge them so they would do more critical thinking.
And I came away from that meeting with my head spinning,
saying, you know, we have to steer these things for good in
(41:50):
ways that are good for people. And and so it's not a book or a
movie, but that. Was something, but that's,
that's interesting. I, I think that's where the
responsible AI PS comes from. It's very important, instead of
trying to get you into addiction, into challenging you,
to rethink your life and do something more positive.
That's right, steering for good,maybe.
Yes. So how would my audience be able
(42:10):
to follow the work on Microsoft Research latest developments?
Well, we have a website, research.microsoft.com and you
can go there and of course all our people are there and we
have, you know, we, we do podcasts, Peter just did a
podcast series. We have blogs we published, we
put papers. I mean, it's, it's a pretty
active feed. You know, we have well over 1000
(42:32):
really smart people working really hard, so it's a lot, but
just go to the website and you can find everything there.
I'll definitely point all the resources there in the
transcript and you can definitely find a podcast
anywhere, everywhere. So subscribe to us and of
course, we'll be happy to take any feedback.
So, many thanks for sharing this quality time
with me, and I look forward to speaking to you again soon.
(42:52):
It was a real pleasure. Thank you.