Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:05):
Welcome to Chain of Thought, the podcast for developers and
leaders navigating the AI revolution.
We are broadcasting live today from AMD's Advancing AI 2025
event here in San Jose. There's a palpable energy in the
air around improving the AI developer experience for an open AI
ecosystem. I am your host, Conor Bronsdon,
Head of Developer Awareness at Galileo.
(00:25):
And we have a very special guest joining us today.
We have a second special guest as well, but I'll tease that later. Directly from the heart of the action, we have Anush Elangovan,
VP of AI Software at AMD. Anush is at the forefront of building the software ecosystem to power the next generation of AI applications on AMD hardware, and we're delighted to have him here today. Anush, thank you for making the
(00:46):
time to sit down with us, and thanks for joining us here on Chain of Thought.
Thank you for having me. I'm super excited to be here and talk about everything we announced today: the AMD Instinct MI350s, and the hardware innovation and the software innovation that go with it.
For folks who are watching on YouTube, I am so excited about this because we just came off this incredible keynote with Anush, Lisa Su, Sam Altman, and OpenAI.
(01:09):
So many folks came and shared the incredible announcements that AMD has, some of the incredible partnerships they're going after, and some of the amazing investments they're making in infrastructure, as AMD has transitioned into being not just a hardware company, but one that is shipping software all the time. There is so much going on, and
(01:30):
you and the team here at AMD are right in the middle of it all. So let's dive directly in. Let's talk about scaling. Let's talk about performance. Let's talk about those chips.
AMD has made major announcements across the board, as you said: the new MI350 GPU series with incredible performance, continued customer momentum, and so much more, such as the Developer Cloud and the new ROCm 7, all of which is
(01:54):
extended and aligned within a core vision of an open developer ecosystem. What is the single takeaway that you want developers and leaders to have coming out of this event?
The speed and innovation with an
open ecosystem is unmatched by any proprietary, closed system, and it's not one vendor trying to sell you what they have to
(02:14):
move AI forward here. Everyone is welcome, both at the hardware layer (networking, CPUs, GPUs) and in the software layer, right? And we partner with everyone to make sure that, overall, there's a good experience for the end customer, you know, deploying AI.
I love that idea of open partnerships, because we've seen
(02:35):
open source win in so many cases, including pre-AI, and we're seeing a lot of success within AI as well, obviously.
But powering all of that is the hardware layer, and that's what AMD is probably best known for. The MI350 GPU series that was announced today represents a significant leap, with up to 20 petaflops of FP4 performance.
(02:56):
Can you walk us through what makes this new architecture fundamentally different, and why developers should be considering AMD hardware as their choice going forward?
So at the hardware level, there are, you know, the speeds and feeds, right? Like, 20 petaflops of FP4 is mind blowing. You're talking petaflops. And there have been innovations
(03:20):
at every layer, even in the hardware: at the microarchitecture level, in how things are brought together in the interconnects, in the way the system and the chip are put together, in power efficiency across chips. Because we have a very unique chiplet architecture that's very good for, you know, a distributed software system,
(03:40):
right? So you can do NUMA load balancing, et cetera. And it's got power benefits built in, because you can turn off chiplets that you don't use; you can power on only what you need.
And so the MI350 really brings the CDNA 4 architecture to the forefront. And that's decades in
(04:01):
the making, right? It's experience that has been built over the last few decades. But now it's come to a point where AMD is able to provide the cadence required to keep up with AI, right? So we want to be able to deliver hardware every year. So we did the MI300, the MI325; now we have the 350 series.
(04:25):
And like Lisa mentioned, the 400 series is right around the corner. That's, you know, less than 12 months out. Executing on that hardware cadence is a machine that you need to keep well oiled and buttoned up. So I'm incredibly proud of what the hardware team has been able to pull off in executing on that hardware mission.
(04:48):
And I think what's particularly exciting is to see AMD also start to have, you know, a biweekly software cadence that is extremely well oiled as well. And we're going to talk a lot more about that. But before we do, I want to ask about this CDNA 4 architecture and what makes it special and important as it fuels AMD's advancements around GPUs.
Yeah. So CDNA 4 brings new data
(05:09):
types that are key for AI workloads. We have FP4, and we also have FP6. And one of the key innovations in the CDNA 4 architecture is that the FP6 throughput is almost the same as that of the FP4 data type.
So what that means is you can do FP4 or FP6, which gives data
(05:32):
scientists the ability to go from FP8 to FP6 as an intermediary before you go to FP4. And so you can move parts of your model and your training algorithms to FP6 before you go to FP4. But, you know, FP4 is the future in terms of where things are headed. And I think CDNA 4 positions
(05:54):
us really well in terms of supporting these advanced data types, and memory capacity and memory bandwidth.
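To make that FP8 to FP6 to FP4 progression concrete, here is a rough, hypothetical Python sketch of staged post-training quantization. Real FP8, FP6, and FP4 are floating-point formats with non-uniform grids; the uniform grids below are a stand-in that only illustrates the "step down gradually and check the error" workflow, and none of this is AMD's actual tooling.

```python
# Illustrative sketch (not AMD's implementation): stage weights down
# FP8 -> FP6 -> FP4, measuring quantization error at each step before
# committing, as described in the interview.

def quantize(values, num_levels):
    """Uniform symmetric quantization to `num_levels` representable values."""
    scale = max(abs(v) for v in values) or 1.0
    step = 2 * scale / (num_levels - 1)
    return [round(v / step) * step for v in values]

weights = [0.91, -0.42, 0.07, -0.88, 0.33]

# Rough stand-ins for each format's grid density; real FP formats are
# non-uniform, so this only shows that coarser formats cost accuracy.
for name, levels in [("fp8-ish", 256), ("fp6-ish", 64), ("fp4-ish", 16)]:
    q = quantize(weights, levels)
    err = max(abs(a - b) for a, b in zip(weights, q))
    print(f"{name}: max abs error {err:.4f}")
```

The point of the intermediate FP6 step is visible here: each coarser grid raises the worst-case error, so moving in stages lets you check which parts of a model tolerate the loss before dropping all the way to FP4.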
These are two cornerstones of AMD's portfolio, and we continue to dominate in that area, right, both in the 300 series and the 350 series. And then when we go to the 400
(06:14):
series, we have a very clear advantage over, you know, our competitor's roadmap. Getting to the 350 series at 288 gigabytes, you're starting to get into 500-billion-parameter models that can run on one GPU, right? And then with eight of them, or when you deploy at scale, you're getting to very large deployments.
(06:36):
And like Sam and others on stage alluded to, we have deep partnerships with seven of the top 10 AI companies. And it's really exciting to see the generational investments that have been made now come to fruition, where it just clicks. And now we're like, oh, we've got
(06:57):
the hardware, and obviously we'll talk about the software, which, you know, I'm super passionate about too.
Yeah. I do think it's interesting, though, to talk about this infrastructure investment as laying the groundwork for all the software innovation that is occurring. And I'll say, I'm losing track of the numbers already. I know I need to have a better sense of them, but hearing, you know, Oracle be on stage with y'all and say "zettascale," I'm kind of
(07:18):
like, OK, wait, remind me what that means. So I guess my question would be: what do you think will happen with this next layer of infrastructure? What is it going to unlock for developers that isn't possible today?
Right. So I view AI as very transformational. It's like
(07:39):
electricity. When we first had electricity and the first transformers were put up on your street corners, people were like, oh, I don't know, what will you do with that? You just had your, you know, oil lamps, and you were thinking of replacing a bulb. But then you realize it's transformational, because entire industries move to it and entire workflows get, you know,
(08:03):
automated or differentiated, and humans can do something else, right? To the point where even electric cars are recent, right? So the innovation from the investment in infrastructure and the AI, you know, impact will take time, and it'll be a few generations before, you know, the entire impact of
(08:26):
all it could affect is felt. But we should view it as transformational, like electricity.
And part of what's going to fuel everyone to build upon this transformational layer of new architectures, new GPUs, and massive scale is going to be ROCm 7. It's going to be AMD's Developer Cloud.
(08:46):
Two big announcements coming out of the conference today. And it clearly follows along this same route that AMD is charting for itself, which is: hey, we're going to be an open-source-focused company, we're going to ship regularly, and we're going to engage developers around the world to help improve and speed up our innovation.
What does day-0 support for leading models like Llama 4,
(09:09):
GPT, DeepSeek, and others mean? How does that change the developer experience for this open source layer, for this AMD Developer Cloud?
Yep, yep, that's a very good
question. The way I look at it is, you know, even a few years ago, ROCm support was always a port to the platform. Someone would launch a model, and then you'd go in and try to make it work, and someone's
(09:30):
there fixing it. All the models that were launched this year and last year, all the frontier models, DeepSeek, Llama, Qwen, all of them: day 0. They're fully supported natively, as much as they are on the competitor's platform. What this means for developers is that they can rest assured that they can use their developer flows on the latest models.
(09:52):
What it means for customers is that they're not left behind in the AI revolution that, you know, AMD invested in, the customers invested in, the developers invested in, the model builders invested in. And one of the pieces we didn't have good, you know, coverage on was cloud access to
(10:13):
AMD GPUs, which is why we launched the AMD Developer Cloud. And it's really very easy: you just use your GitHub ID, you get in, spin up an instance, and we even have 25 hours of free credits for anyone attending Advancing AI. And if you don't have them, there's a little "request credits" option,
(10:35):
and we will give them to you. If not, tag me on X and I'll make sure you get some credits.
It also gives you a good life cycle of trying it out, getting familiar, and then you can even deploy there, right? It's a real instance. We just want to make ROCm
(11:01):
available everywhere and for everyone.
Speaking of X, it was interesting to hear from xAI on stage during the keynote as well, about how AMD is helping fuel everything happening with Grok and so many of these other incredible companies we've mentioned already. I'll say, if you're someone at xAI and you want to come on the show, you let us know.
(11:22):
What does it take to shift the culture of a legendary hardware company like AMD from "hey, we're only fueling this hardware layer" to, I mean, everything from hardware through the software layer? There's so much going on. It gets extremely complicated. There are very different shipping schedules, very different
(11:42):
concerns. Some are higher consequence, some are lower consequence, depending on where things are being created. What has it meant to make
that cultural shift? So the way I look at it is: there were two things. One, you know, when AMD was acquiring Nod.ai, which is how I came to AMD, Lisa called me aside and said, "Anush, think of it as Nod acquiring
(12:05):
AMD, not AMD acquiring Nod." And to this day, you know, the principles of how I ran the startup for 10 years are how I'm doing it at AMD, and it is resonating with developers. We're moving fast, and, you know, working hand in hand with the xAI folks, I'd seen that
(12:27):
deployment from go to live, and we worked really fast, really quick. The deliveries were, like, instantaneous. And a lot of those software delivery mechanisms I built in Chrome and Chrome OS, when I was working on the Chrome OS team in the early days of Chrome
(12:50):
OS, from 2010 to 2013. That's when we were like, hey, mainline has to be shippable. You ship every night and every week, you make some, you know, updates to it, and your test and quality bar increases as you deploy at scale. And then you get to the GA candidates.
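The nightly-to-GA flow described here can be sketched as a simple promotion pipeline. The channel names and pass-rate thresholds below are illustrative assumptions, not Chrome's or AMD's actual release criteria.

```python
# Illustrative sketch of a Chrome-style promotion pipeline: mainline
# ships every night, and a build is promoted toward GA only as its
# quality bar (here, a test pass rate) rises. Thresholds are made up.

CHANNELS = [("nightly", 0.90), ("weekly", 0.95), ("ga-candidate", 0.99)]

def highest_channel(pass_rate):
    """Return the most stable channel this build qualifies for."""
    qualified = None
    for name, bar in CHANNELS:
        if pass_rate >= bar:
            qualified = name
        else:
            break
    return qualified or "unshippable"

print(highest_channel(0.97))   # clears nightly and weekly bars -> "weekly"
print(highest_channel(0.85))   # below every bar: mainline must be fixed
```

The design choice this illustrates is the inversion he describes: instead of a long waterfall ending in a release, every build is shippable, and stability is earned incrementally by climbing the channel ladder.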
It's not the other way around, where you're tied up in waterfall. So it's very interesting
(13:15):
to move the culture from one to the other. But once you get to the other side, it's fast moving and everything is data-driven, right? So it's: oh, that thing failed? You pull that out and ship the rest, right? So, you know, it becomes very responsive. And the way I think of how
(13:38):
AMD is looking at software now is: software as a product. Until now it was like, oh, the MI200 or the MI300 is the product, and the software came from the lineage of BSPs, right? Like, hey, here's a hardware piece, and here's a software piece that goes with it; go do what you must with it. But now we're thinking of it as ROCm. And the other piece that we
(14:00):
launched today was ROCm Enterprise AI, which sits on top of ROCm 7. It's got cluster management capabilities, it's got ML operations capabilities, and then you've got vertical integrations, you know, into verticals like health sciences, et cetera, putting all of them together. In the end, if you don't give the customer a solution, it doesn't really matter; you're building some
(14:21):
parts of the puzzle, and it doesn't matter. So now we're taking a holistic view that covers the entire stack, and we want to bring AI to the footsteps of the end
user. And part of that strategy is an intentionally open approach: building on open standards like OCP designs, Ultra Ethernet, and Ultra Accelerator Link.
(14:43):
What does this mean for AMD's vision of the future of AI, with its focus on open source and so on?
So open source software is one piece, but going to open ecosystems is the next level up, where we are not making an announcement saying, hey, everything is
(15:05):
open, but the chips will be built by us and you can, you know, connect your thing at the periphery or something like that. When we say open, we truly mean open. And so we have chip companies that are building switches, switch companies that are building NICs, NIC companies that are building, you know, unique connector technologies. We want the innovation to happen
(15:25):
at every layer of the stack. We're not trying to stifle that innovation to, you know, pad our bottom line in any particular case, though obviously everyone's in this to make money and to be successful. But we want it to be a holistic, open ecosystem approach, rather than: it's our way to move
(15:50):
the industry forward, and you have to follow our way
or you're out of the system.
So if I'm a developer and I'm building with ROCm, can I then contribute to this open source repo and say, hey look, that's me, this is going to end up in something that actually ships with AMD?
Yep, 100%. So one of the key things that we've done in the past few weeks is that
(16:12):
the internal source repositories and external source repositories are exactly the same. It is all external, right? So what that means is that as an external developer, you can contribute any code changes you want. And we actually take that seriously and merge it in.
For example, we launched the Strix Halo laptop, and our team
(16:34):
was trying to get the Windows build of it ready. But then we had a couple of external contributors contribute Triton on Windows and PyTorch on Windows, to the point where it accelerated our ability to get PyTorch on Windows ready, because these external developers had just bought a Strix Halo laptop and said, I just want to do this.
(16:55):
And that's the power of open source. If you had flipped the script, it would be like, oh, we've got a plan for 10 engineers who are going to sit in a corner and try to make this work. But here's one engineer, half time, who's just like, oh, I just want Windows Triton to work and I'm going to do whatever it takes. And over the weekend he fixed all of that, and we're like, great, now everyone's happy.
OK. What's your PR review process for this?
(17:17):
Very good question. So the PR: I do "pull request" and "press," yeah. But wearing my engineer hat, and reading it as "pull request": we're moving all of that review process externally.
(17:37):
So every developer who's working on that area actually just reviews it. It can be an internal developer or an external developer.
Yeah.
And so that gives us incredible velocity, and we haven't even fully unlocked that potential; we're just getting the foundational layers in with ROCm 7. But once ROCm 7 hits, you should see ROCm 8, 9, 10 on something like a six-week release cadence,
(18:00):
right? It's like Chrome: you don't care whether you're running Chrome 138. It's just Chrome. You get the best, you get the fastest, and then with 139 something happens, and it happens at night, and you're like, great, I'll take that, right? So we want to get ROCm to that point.
I love this comparison you made
(18:20):
earlier, and this advice it sounds like Lisa Su gave you, saying: hey, look, we're not acquiring you, you're acquiring us. You need to bring your DNA here and change our company. And it's clear just from this discussion that you've made such an impact on how AMD is thinking, from a product velocity standpoint and from a philosophy standpoint. And something I've heard you say
(18:43):
before is that software is a product that far outlives any single generation of hardware, frankly because it evolves, right? You know, Chrome is now whatever number it is. And that AMD needs a software plan for the next decade, not just a hardware plan. How does this philosophy of software longevity shape your
(19:05):
open ecosystem strategy with ROCm and the other investments that you're making?
Yeah, very good question.
So imagine you're building something for the next 10 or 15 years, right? The investment required to do it closed, like, hey, only we're going to do it, would be like funding the
(19:27):
high-speed rail in California, right? Nobody else can do it; only the government can do it. But what if we said: hey, you can build this part of the track, you can build this, you can build this. As long as, you know, the CI/CD (the train) can keep running safely, we're fine. You just keep building it as long as you want, and, you know, build it where you want to go. It really does unlock the
(19:49):
ability to build at scale, but build for longevity, right? Because you want the platform to far outlive generations of compute. So, MI350, yeah, it's the new hotness. Two, three, four years down the road, people will be like, yeah, the MI350 is
(20:11):
historic, and that's fine. But the infrastructure that you've built on it will continue to hum along, right? And yeah, it evolves, and you want to make sure you've got backwards compatibility and forwards compatibility, so people are investing in ROCm as a product.
I love this idea and this philosophy that you're bringing,
(20:33):
because I think it's so interesting to see how these different strategies are being approached by various companies in this new AI era, where we're shipping faster, hardware does become historic faster, and we're all simply in the midst of this insane revolution. You mentioned electricity as an
(20:55):
example; folks compare it to the early Internet. How are you going to foster this community-led innovation that you see unlocking the next level of velocity and success for AMD around software?
Yeah, again, a very good question. You know, it kind of goes to how people say, hey, the
(21:15):
only constant is change, right? But I would take it a step further and say that the rate of change is going to keep increasing, too. So it's not just that, hey, it's going to change. Of course it's going to change; that's not the important part. My philosophical
(21:35):
view is that the rate of change is going to change too, and not the way you think it is. It's going to accelerate, and you want to be prepared to address that speed and the velocity at which you're going to be accelerating toward it. You want to be prepared to maneuver. And that maneuverability comes
(21:58):
from an open ecosystem, because you alone cannot drive that train that fast. You're going to need everyone: a rising tide lifting all boats, right? So that's the general philosophy of how I think the tip of the spear should be.
And I think anyone you talk to in the AI space today
(22:21):
will at least mention agents, like I'm now going to force in here, because we all see them as a huge part of what that future looks like, at least for the next couple of years, right? There may be a paradigm change, there may be a change in how we interface with these AI tools. But for now, agents are what everyone is starting to build or is already building. And we're building multi-agent architectures.
(22:42):
We're building massive groups of agents that, you know, solve problems together, do cohesive tasks, and solve strategic challenges for businesses. So being able to address and improve and align to this agentic future is really important for most AI companies today. So of course it was part of
(23:03):
AMD's keynote earlier. Your benchmarks now show a 3.8x generational improvement for AI agents and up to a 4.2x improvement for summarization tasks when leveraging AMD infrastructure. How are you achieving those gains, and what does this mean for the performance per dollar that AMD
(23:25):
is looking to deliver for customers and partners?
Yeah, I think let's start with the performance per dollar, right? With the 350 series against the competitor's latest Blackwell platform, you're looking at a 40% tokens-per-dollar savings, right? 40%. That is just huge, and 40% adds up pretty quickly when
(23:47):
you're doing 250 million tokens a day or a billion tokens a day. You know, that translates to significant savings, and it can be backed by anything: it can be on-prem deployments, it can be CSP deployments.
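Here is a back-of-the-envelope sketch of what a 40% tokens-per-dollar advantage means at those volumes. The baseline price is a made-up assumption for illustration; only the 40% figure and the daily token counts come from the conversation.

```python
# Toy arithmetic for the "tokens per dollar" claim from the interview.
# BASELINE_COST_PER_M_TOKENS is hypothetical; 40% more tokens per
# dollar means each token costs 1 / 1.4 as much.

BASELINE_COST_PER_M_TOKENS = 2.00   # assumed $/1M tokens on a rival platform
ADVANTAGE = 0.40                    # 40% more tokens per dollar (from the interview)

def daily_cost(tokens_per_day, cost_per_m):
    return tokens_per_day / 1_000_000 * cost_per_m

amd_cost_per_m = BASELINE_COST_PER_M_TOKENS / (1 + ADVANTAGE)

for tokens in (250_000_000, 1_000_000_000):
    base = daily_cost(tokens, BASELINE_COST_PER_M_TOKENS)
    amd = daily_cost(tokens, amd_cost_per_m)
    print(f"{tokens:>13,} tokens/day: ${base:,.0f} vs ${amd:,.0f} "
          f"(save ${base - amd:,.0f}/day)")
```

Whatever the real per-token price, the relative saving is the same: roughly 28.6% off the bill (0.4 / 1.4), which compounds quickly at hundreds of millions of tokens a day.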
But coming back to your question on the agentic future, I think the agentic way of framing the problem is more about us
(24:13):
understanding it, in a way, because an agent is autonomous in some way; it's a way of doing something. But if you peel all of that back, it's how you're building intelligent autonomous systems, which could take the form of physical robots or virtual robots, which is, you know, the agents.
(24:36):
And we're starting to see that future where, you know, you don't want to be sitting on Kayak clicking, you know, plus or minus three days. Just tell an operator agent to go do it, or have the agent watching Kayak through your MCP server, and say: hey, whenever it goes below this, do this, right?
(24:56):
And then you add a natural language translation layer, a voice translation layer, and you're just interacting with this
agent.
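The "agent watching Kayak through your MCP server" idea might look something like the hypothetical sketch below. `fetch_fare` stands in for an MCP tool call; it is not a real Kayak or MCP API, and the route and threshold are made up.

```python
# Hypothetical sketch of a price-watching agent. In a real system,
# fetch_fare would be an MCP tool exposed by a server; here it is a
# random stub so the loop is runnable.

import random

def fetch_fare(route):
    """Stub for an MCP tool call returning today's fare in dollars."""
    return random.uniform(300, 500)

def watch_fare(route, threshold, checks=10):
    """Poll the fare and trigger an action when it drops below threshold."""
    for _ in range(checks):
        fare = fetch_fare(route)
        if fare < threshold:
            return f"Booking {route} at ${fare:.0f} (below ${threshold})"
    return f"No fare below ${threshold} after {checks} checks"

print(watch_fare("SJC->AUS", threshold=350))
```

A voice layer on top of this, as described in the conversation, would just translate "book it when it drops below 350" into the `threshold` parameter and read the result back.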
into the physical thing with a robot, then, you know, it starts
blurring the line of like, what is, what exactly is this agent,
right? And, and so it's, it's a, it's
an exciting future, but what allof that come down to is immense
(25:23):
compute infrastructure that AMD is investing in significantly
right now and the immense software infrastructure required
on top of it, which is also something that we're, you know,
doubling down on. I wish we had time for another
hour of conversation because you've shared so many great
insights and there's a lot of exciting things happening with
(25:44):
AMD. But I think we have a perfect
question to close on here, givenwhat you just said about
enabling software and infrastructure investments.
AMD announced today a landmark $10 billion agreement with
Humane Saudi Arabia's new AI enterprise to deploy 500
megawatts of AI compute capacityover five years, spanning from
(26:05):
Saudi Arabia to the United States.
How does this fit into AM DS sovereign AI strategy?
And what does building, as I believe I said on stage, the
world's most open AI infrastructure mean in practical
terms for global AI development and deployment, especially
aligning to these deep open infrastructure investments that
(26:26):
we've been talking about? Yeah.
So I think the philosophy of like the deep open
infrastructure investments are we actually are bringing a
consortium of innovators and companies that have the ability
to execute on different parts ofthe stack.
But then we validate everything together to make sure that you
(26:47):
do have the ability to execute the end vision of what it is that you're trying to stand up, right? And then the investment shows a long-range plan, right? Because you're not just saying, hey, I'm buying some GPUs; you're investing in infrastructure. An infrastructure build-out takes time. An infrastructure build-out
(27:07):
affects people's lives, right? Similarly, going back to the electricity investments: you know, do we do AC transmission lines or DC transmission lines? Yes, and in the first 20 years it took something like 10% of GDP to put up all the transmission lines. Well, it's the same with the new electricity investments we have to make for this AI infrastructure, for that matter.
Yeah, now it goes back; it's full circle. It's like, oh, I've got to go back
(27:28):
to my original thesis of, like, how am I going to generate this power? Maybe we do need whale oil lamps after all.
Well, I'm sure that when we push human creativity in terms of solving problems, we do come through. And so, you know, we see the value that AI unlocks, and we know that we need more of it.
(27:50):
And if power is what we need, we'll, you know, figure out new ways to get it: new geothermal, new power sources, whatever. Whichever way we're going to push the envelope and find that power, we will, and then, you know, tie it into forward progress with innovation in AI.
Absolutely. It's a very exciting time, and,
(28:11):
Anush, I really appreciate you taking time out of your busy schedule and out of this incredible event to join us here on Chain of Thought. We appreciate you and the team at AMD giving us a look behind the curtain here at Advancing AI 2025 and sharing your vision for the future.
Thank you for tuning into this special episode of Chain of Thought. We'll have more for you very soon; you may hear it a couple of
(28:31):
minutes after this interview wraps. Anush, thanks again, and thank you for hosting us at Advancing AI.
We're broadcasting live again from AMD's Advancing AI 2025 event, where there is so much energy about improving the experience for developers with AI and about the open ecosystem. I'm your host, Conor Bronsdon, Head of Developer Awareness at Galileo, and we have a very
(28:52):
special guest joining us today. It was a fun surprise for us, actually. Directly from the heart of the action, we have Sharon Zhou. Sharon is the former CEO and co-founder of Lamini and now Vice President of Artificial Intelligence at AMD. Thank you so much for joining us on Chain of Thought, Sharon.
Thanks so much for having me.
It was so cool seeing you come out during the keynote earlier.
(29:13):
I was like, wait a second, I'm pretty sure I follow you on LinkedIn. And I know the AMD team is particularly excited to have you joining. The knowledge that you bring from teaching over a million people about AI is so important to the DNA of a company like AMD that is going so deep into the
(29:34):
open source ecosystem, so deep into what is going to be a lot of education, engagement, and community work. What are you going to focus on as you dive deeper into your time here at AMD?
Yeah. So it's a combination of AI research and teaching, which is kind of what we were doing at Lamini as well. I think the thing I'm really excited about on the teaching
(29:55):
front is making, you know, ROCm and all the technology, all the software that we've been building at AMD, much more available and accessible to developers. And I think that's going to be a combination of showing, hey, all the latest AI stuff, whether it's vibe coding agents or reinforcement learning, all of that runs on AMD just fine. And not just fine; maybe it's optimized on it, right?
(30:18):
So I'm really excited to show that, and to do that with some of the biggest names that we've been working with already, like Andrew Ng at DeepLearning.AI.
Yeah, if you have not had an opportunity to hear about what AMD is up to with deep learning, and what Sharon is up to with deep learning, there's a lot more to come there. And maybe she'll tease a bit of that here. We'll see.
(30:39):
But if nothing else, this focus on teaching, on research, and on understanding developers, other AI builders, and data scientists and what they need is so important to fueling the increased product development velocity that this intentional open source strategy AMD is taking is meant to fuel.
As Anush shared with me in an earlier conversation, we
(31:04):
have developers who are already contributing to the repos that are driving AMD's ROCm forward. What does it mean, though, to actually enable the community, make it easy for them to contribute, have them feel a desire to do so, and see it pay off in AMD's actual product?
Right.
(31:25):
I mean, I think the first step is listening: first listening to the community and hearing what they want, and then, on the other side, helping build out what we call a happy path, kind of a "these are the three steps to succeed on AMD," so that people can see that success, or aha moment, really quickly.
And I think that's really important, right, to be able to
(31:45):
see something working immediately in AI. I actually think attention spans have gone down quite a bit with AI; I think we just want a prompt. I think my Cursor prompt is literally the minimum number of tokens, like a question mark. And, you know, that's it.
The question mark. Yeah, exactly.
So I think, you know, we're a little bit less patient
(32:06):
about that. So being able to roll out the red carpet, or show the Yellow Brick Road to follow, I think that's really critical. Yeah. So that's what we'll be doing.
Moment-to-wow is so important, because it's really easy, and I know I've been guilty of this, and I'm sure everyone listening has been too, to go, oh, this product sounds cool, let me go try it out. And then, like, I don't actually want
(32:27):
to spend 30 minutes setting this up; I just need to try it, I need to get going. So you jump off and do something else. You maybe forget about it; you maybe pick a competitor that's easier to jump into. And so I love that you're starting with listening and with, I mean, qualitative research into what your users and developers are building. I actually want to ask:
(32:48):
how will you approach that listening tour?
Oh, so many different ways. I think there's so many
different formats to listen. One is through, you know,
through talks. I had given over 50 keynotes
last year through Lam and I yes.So I think talks actually afford
the ability to then you know, asyou get off stage talking to
people, people reacting to what you have to say, asking
(33:12):
clarifying questions. I think teaching very much gives
that dialogue, that opportunity to have that dialogue as well.
So these are kind of the different avenues.
And of course through the repositories you can open up
issues, et cetera. But what we're really focused on
is almost, you know, this framework of pre-believers
versus pre-buyers. So, you know, first you have
top-of-the-funnel pre-believers, people who don't yet believe, and
(33:36):
getting them to become a believer.
And then once they're a believer, they can
become a pre-buyer, give you a shot, and then become a buyer,
become a customer. So really focus on the
pre-believers and listening to them, hearing what it will take
for them to have that bit flip and switch, and just say,
you know, I'm going to give this a shot because this will make a
very big difference to my business, my workflow, whatever
it may be. And something I'm really excited
(33:58):
about is that AMD is actually very differentiated in the
market, not only from the open standpoint, right?
So engaging with the open source community enables this
whole new strategy of, you know, accelerating their ability
to catch up, but also having a different, you know,
heterogeneous compute fabric between GPU and CPU.
(34:20):
And as we, you know, build out more tools, for example, a lot
of people are using agents these days. These agents, right,
they are LLM calls, they run on GPU, but the tools they use
through MCP, the Model Context Protocol, for example, many of
the tools they use are actually running on CPU.
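The split Sharon describes can be sketched in a few lines. This is a purely illustrative toy, not a real MCP client: `fake_llm`, `run_agent`, and the `TOOLS` registry are all invented names standing in for the GPU-bound inference call and the CPU-bound tool dispatch she mentions.

```python
# A minimal sketch of the agent pattern described above: the "reasoning"
# step stands in for an LLM call (GPU-bound in a real system), while the
# tools it selects from an MCP-style registry run as ordinary functions
# (CPU-bound). All names here are illustrative, not a real MCP library.

from typing import Callable, Dict

# MCP-style tool registry: tool name -> callable that executes on CPU.
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
    "uppercase": lambda text: text.upper(),
}

def fake_llm(prompt: str) -> str:
    """Stand-in for the GPU-bound LLM call: picks a tool by keyword."""
    if "count" in prompt:
        return "word_count"
    return "uppercase"

def run_agent(prompt: str, payload: str) -> str:
    """One agent step: the LLM chooses a tool (GPU), the tool runs (CPU)."""
    tool_name = fake_llm(prompt)   # would be an inference call on GPU
    tool = TOOLS[tool_name]        # dispatched over MCP in a real stack
    return tool(payload)           # plain function call, CPU-bound

print(run_agent("count the words", "open ecosystems move fast"))  # -> 4
```

The load-balancing question in the conversation is exactly about these two halves: every `fake_llm` call saturates the accelerator while every `tool` call lands on the host cores.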
So how do we actually balance those loads effectively moving
(34:40):
forward? And I think AMD is in a really
interesting position to balance that effectively because they
own and can do a lot of the integration work
between those types of compute. Yeah, that vertical integration
opportunity is so interesting and it's such a unique
perspective to have in the space.
I'm curious to see how the open source contribution side of
(35:03):
things factors into this integration with GPUs and CPU's
and the customization offered there.
Are you already seeing the benefits of opening up the
AMD software stack to the open source community and developers
around the world? I mean, 100%. You know,
first things first, from the courses perspective, this makes
it a lot easier for people to even learn or even show a
(35:26):
demonstration of how are you going to even learn GPU
programming if there's nothing open to look at?
You're kind of like touching around a black box and
not really learning what's going on.
So even just understanding what's going on inside of it and
getting people curious about this technology.
I think that is number one.
That's something that's on my mind at least. Yeah.
(35:47):
Let's dive in there a bit more. I'd love to understand your
thought process or strategy around how do you drive that
curiosity? How do you enable that
curiosity? And as you brought up earlier,
how do you, you know, lay out that yellow brick road for them
so they can learn and, you know,build their first agents?
Yeah. No, I mean, I think it comes
from understanding what trends there are today in terms of what
(36:09):
developers are building, but also, you know, what those trends
will evolve into, what workloads they will realistically
evolve into. And then kind of matching those
internally with where we've been able to shine as AMD,
right, from the hardware and software perspective.
So right now there's a lot of focus on inference and being
(36:30):
able to actually make that workload really effective and
reliable and efficient. And so how do we actually engage
developers on that specifically today, right?
It's more ready than training right now.
So how do we engage them on that, where they will be more
likely to succeed, and go from there?
It doesn't have to be, you know, boiling the whole ocean all at
once, but finding like where you're going to see that aha
(36:51):
moment, that wow moment, soonest. And do you have a thought
process so far on kind of the key areas to drive that aha
moment, or do you still feel it's fairly nascent in your
research there? Yeah.
So I have a few different thoughts around what will help
drive it, but I think right now it's, you know, a combination of
(37:15):
the three different audiences that we're looking into.
One is AI developers. That's probably number one.
The second is AI researchers. AI researchers are helpful
because they're a little bit lower level and more
willing to try something experimental, right?
So maybe that's where we're touching, you know, training
workloads, for example. And then the third is what I
(37:36):
call AI leaders. Anyone who's kind of
thinking about it, maybe it's someone within an enterprise
leading AI: how are they thinking about their
infrastructure, budget, costs, et cetera?
So these are the three audiences, these personas, that I
really think about: how do we
serve them? Because ultimately, they're
going to be the ones making decisions on compute, and
(37:57):
they're going to be making decisions at every single
layer that will impact what kind of compute should be built to
serve them. What does differentiated
teaching and learning about AMD's open infrastructure, open
software ecosystem look like forthose different audiences?
Yeah, so the developer one is very easy to talk about because we're
(38:21):
already working with Andrew and his team.
We already have been over the past year.
In fact, at Lamini we were running on over 300 AMD GPUs,
and we're actually serving multiple courses, three of them with
Andrew and one of them in partnership with Meta, that were,
you know, being served up with those GPUs, and that
was both inference and training, actually.
(38:42):
So that's been really cool to see, and that's tens of thousands
of developers already hitting AMD GPUs over the past
year. So that's been really exciting
to see. So it's doing more of that with
the things that are new today, probably around agents,
maybe around MCP. The second audience, around
researchers, it's engaging largely with
(39:04):
the different labs, whether they be commercial or university, to
be able to start testing maybe new hardware, et cetera, and
starting to run their more nascent workloads there, or even
their experimental ones. You know, maybe a new model
architecture, for example. I'd like to see, for example, a
new model architecture be invented on AMD that would be
(39:25):
really cool, and taking advantage of the benefits and the
differences of AMD hardware, for example, larger
HBM, I don't know. So that's another thing.
And then for AI leaders, at Lamini we
had a B2B enterprise business. So we talked to a lot of
Fortune 500 executives, and as a result we have a lot of those
(39:47):
relationships. We have a lot of relationships
with companies of a size that are kind of in between.
We have relationships with different platforms there.
And so I can't speak to some of the brands there yet, but
they're big, and we're working with them as
well. So.
Sounds like the base advice, though, for developers:
go to deeplearning.ai and check out these courses with
(40:07):
AMD. That is the
base lesson, OK. That's a good base lesson.
You mentioned MCP and it's obviously kind of the new
hotness, right? Yeah.
Anthropic released it to not a lot of fanfare
end of last year.
Not in the beginning, no. But by, you know, March, April,
we really started to see this momentum and you know, now mid
(40:28):
June, it feels like it's on everyone's lips.
Do you think MCP is going to winout as one of the frameworks of
choice in the next year or two? So one, I think it's really,
really important to highlight that MCP is an open protocol, right,
and therefore it can be a standard.
So the benefit of open in general is that it can be a
standard. When we had worked a lot with
(40:48):
Meta at Lamini, that was their big thing.
They're like, this is so that we can actually set a standard for
the community. So I think that's one thing that
MCP has been able to show a glimmer of.
And I think because it's taken off in the sense that the
community was itching for a standard since things were so
customized, it did take off. And OpenAI also has, you know,
(41:11):
kind of endorsed it. And I think that's a really big
deal with the big model players endorsing it, it being open so
that everyone can contribute to it and see into it.
If there's only a closed option, you don't really feel like you
can see into it at all.
It feels locked inside of a certain company.
And then I think the second thing is, I know developers were
(41:33):
talking about like, well, this enables us to really customize
it for different security needs,for example, that will emerge
and change and evolve over time as AI continues to grow and
evolve as well. So I think it's really important
that there is an open standard, and I think this is a
really good first shot at one.
I completely agree. It's going to be really exciting
to see which one wins out. But I agree it has to be an
(41:56):
open standard to truly succeed.
And to me that speaks to this philosophy that AMD is taking
in saying, we're going to be very open too, and we're going to
align to these open opportunities.
And it also speaks to the partnership development work
that has happened. There are a lot of incredible
partners that were on stage earlier today with these huge
(42:16):
keynote announcements. Even seven of the 10 largest AI
companies in the world are working with AMD.
And I'm curious if there's a particular partnership that you
think aligns best to this philosophy of listening to
developers, enabling developers, and moving forward together in
(42:37):
an open standard? That's such a good question.
So I think for the big foundation model labs, that's
largely aligned with the AI researcher persona.
For the developer persona, I think it might be closer to some
of the AI native startups out there and what they're building
and how they're scaling things up.
So maybe they're not doing crazy pre-training workloads, but they
(42:58):
are doing substantial workloads that will eventually I think
affect GDP quite substantially together.
Yeah. So I think that'll be a really
important set of folks who are building at the bleeding edge
and they define what the next trend is too of how to even use
these models based on what's easy for them to build and what
(43:20):
the needs are in the market.
So I think it's really critical to be listening to them and
building for them there. And that's where I think that
alignment with Llama and with Meta is really interesting as an
opportunity. But there's so much more that
was talked about as you look towards 2026 and beyond, what
are the bottlenecks that we may experience while we're trying to
build this incredible open infrastructure?
(43:42):
So one of the big challenges of building in the open is that,
you know, when I said happy path, you want there to be a
happy path, something prescriptive.
so people are actually doing the thing that gets them to success,
versus frantically trying a bunch of things, just seeing
documentation that might not even be up to date, et cetera,
getting a little intimidated and not
even knowing where to start. So it's just knowing what the
(44:04):
right path is really
Now, the challenge of open is that you're inviting the whole
community to come, come contribute.
And as a result, you can create almost a monstrosity of like way
too many features tacked on. I'm in trying to go in too many
directions and not having One Direction that's opinionated and
correct. And I think that can be
challenging for generally for open.
I think we've seen that with AI projects in particular because
(44:25):
things can move very, very quickly in the space.
So I think that's one of the challenges candidly.
And that will be one for us to monitor and to balance out with
this idea of happy path and making sure we articulate that
very well and kind of usher people towards a happy path or
learn what the happy path should be and then usher people there.
Totally. And Sharon, you've obviously
been a big part of the AI research ecosystem for several
(44:48):
years now. You've been a big part of the
open source ecosystem for several years now and obviously
teaching as well, which we've talked extensively about.
Are there predictions that you have from your position as
somebody who's embedded within the industry and also a thought
leader, for what the next year or two of AI development and change
(45:09):
will look like? Yes, I've seen a lot of glimmers
of this in very different ways.
Some are research papers showing, you know, a smarter
model, maybe, you know, being distilled down to a smaller
model, but teaching the other model things.
At Lamini, we really care about this mission of self-improving
AI. So getting these models to
(45:29):
improve themselves, edit their own training data and improve
themselves. And so I think there's this
growing trend of that process being more and more automated.
So getting these models to actually improve themselves over
time and getting that flywheel going based on, you know,
what direction or objective we want them to go towards.
(45:49):
So I'm really excited about that.
And I think that can happen at every layer of the stack.
And with AMD, that could happen even at the lower layers, right?
Being able to optimize kernels, for example, being able to
optimize all those different things to make the model itself
more efficient and more efficiently use its own compute.
And I find that really, really exciting because I think that
can take us to the next generation much more quickly
(46:10):
than if we were just developing on our own with a limited number
of AI researchers out in the world.
I get excited about that idea of, you know, continuous
learning loops and self improvement opportunities, but I
think it's also something that maybe makes some skeptics
nervous and probably not a lot of them are listening to this
podcast, but I know there are a couple.
(46:31):
What would you say to the folks who hear, hey, we're
creating self-learning AI, and get nervous about that idea of,
like, oh, we want human direction?
What would you tell them? Oh, I see.
I would say that I think, as these models get better at
listening to us and what we need, which we can already see,
like the way you prompt a model can be more and more casual,
(46:52):
right? The question mark works, for
example, or yeah, something way more casual works.
Before, you couldn't misspell things, actually.
And even before ChatGPT, with just GPT-3, you had
to be like question, answer, question.
You had to format it a certain way.
But they're much more malleable now.
And I think as long as we kind of keep that as a UX, the user
(47:12):
experience, the interface, it provides this interesting
opportunity to give what I like to call vibes-based feedback.
So not just vibe coding, but can we, like, vibe-tune, vibe-train
these models with vibes-based feedback, so that we can actually
give our natural language feedback and direction in a way
that we know is not as strict, in a way that's much
closer to how we teach each other different things, or how we
(47:35):
direct each other to different things as humans, but also to
these models. So I feel confident that we'll
find a way to nudge those models in the right way, where it
won't go off on flywheels, like you can actually intercept
it, just like a person who's learning, and redirect it
a bit, and do so with that prompt, with that natural
language. I think your point about natural
language is important to understand here because we need
(47:58):
both qualitative and quantitative measures around
this. And I love that you're thinking
in both directions. Sharon, I wish we had more time.
It's been so much fun chatting with you and I know our
listeners would love to know where they can follow your
work and continue to watch what you're up to in the AI space.
Where can they follow you? You could follow me on X or on
LinkedIn, or tune into some of our courses with Andrew.
(48:22):
Fantastic. Well, we will certainly link
those in the description for the episode.
Thank you so much for joining us.
Thank you to AMD for having us. Once again, sorry to cut you
off. I apologize.
No, it's OK. It's been a ton of fun being
here at AMD Advancing AI 2025. We're excited to see what's next
and to see this continued open source ecosystem develop.
If you are a developer who's tuning in, we'd love to hear
(48:44):
from you. What are the pieces of the
ecosystem that you want to see more open?
What do you want to contribute to?
I know AMD would love to know. I know Sharon would love to know
as she continues her listening tour.
And obviously, we love hearing from our folks who are listening
to the podcast. So let us know what you're
thinking. And if you enjoyed this episode,
share it with a friend. It's been a pleasure, Sharon.
So thanks so much y'all.