Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:02):
Bloomberg Audio Studios, podcasts, radio news.
Speaker 2 (00:07):
Now, let's narrow our focus from the broader markets to
one single stock: Amazon. The tech giant is hosting its annual
Amazon Web Services re:Invent conference down in Las Vegas this week.
The cloud-focused confab draws developers, engineers, and other thought
leaders in tech to explore the latest cloud and AI
projects happening under Amazon's roof, including a new AI chip.
(00:29):
Let's go live now, where Bloomberg Tech co-host Ed
Ludlow is joined by a special guest. Ed, take it away. Yeah.
Speaker 3 (00:39):
Three pieces of news moved markets this morning: a new
generation of frontier model from AWS, new agentic tools, and
then a very quickly released, installed, and now ramping generation
of in-house custom accelerator, which is Trainium 3. All
points of discussion for Matt Garman, AWS CEO. You know, the
(00:59):
base point with Trainium 3, and you've moved quickly to
bring it to the real world, is cost-performance efficiency
over the prior generation, but also over Nvidia GPUs,
over Google TPUs. I think what people are trying to
understand is that ramp part I was talking about, when
real-world customers use it beyond this anchor customer of Anthropic,
(01:22):
which relies on it currently.
Speaker 1 (01:24):
Yeah. Well, look, we're quite excited about Trainium, and Trainium
3 in particular, as you mentioned, excited to get it
into customers' hands. And part of where we have a
benefit that we can bring to bear is, as you mentioned,
getting it into market quickly, and it's because we control
that full stack. We control the silicon development, we control
the data centers that it all lands in. We know that
full environment, and we can land that in very large
(01:45):
clusters for people to take advantage of, and the
performance that we're seeing out of it is quite incredible,
and so we're anxious and excited to get more and
more people using it.
Speaker 3 (01:54):
I've been able to go inside Annapurna Labs
and look at the engineering work between the first generation
of Trainium and the second. It wasn't just the accelerator;
it was at the server level as well.
Speaker 1 (02:04):
That's right.
Speaker 3 (02:05):
But part of the surprise of today is this:
you appear to be committing to an annual cadence, a
new generation of Trainium. How do you keep that up?
Speaker 1 (02:14):
Well, the key thing that we're focused on is making
sure that we can iterate on the technology as fast
as possible. The desire and the hunger out there for
more power and more compute is almost insatiable. And so
the more we can take an existing power footprint, an
existing set of capabilities and bring more and more compute
(02:34):
into that for customers to build cool applications and cool
environments and to get value from that, that's what we're
focused on. And so we're going to be pushing that envelope
as fast as we possibly can to get those new
capabilities out to customers.
Speaker 3 (02:48):
The pitch for Trainium in both the training and inference
use cases is that it's a great deal, you know,
cost-effective performance. At the same time, you went on
stage and said AWS is, quote, by far the best
place to run Nvidia GPUs. How are both possible?
Speaker 1 (03:05):
Well, I mean, both are possible because it
is a great environment to run accelerators and compute in.
And so we've been working for fifteen-plus years with
the Nvidia team, with Jensen and team, to deliver
outstanding capabilities for our customers. And when you're running
a large cluster of Nvidia GPUs, people will tell you
(03:26):
AWS is the best place. You get the best performance,
the most stable cluster, the best capabilities out there at
broad scale, and it's why folks like OpenAI and others
are running in AWS. And we have that choice. And
so for others that want to be able to take
advantage of Trainium, there are some use cases that are
best for Trainium, and there are other use cases where Nvidia
GPUs are going to be your best option. We want
(03:47):
to have all of those available, and so we think
that if we can continue to push the envelope on
what Trainium can deliver for customers and make sure that
we are supporting the latest and greatest from everything that
the awesome team at Nvidia is delivering, that's going to
be the best outcome for our customers.
Speaker 3 (04:02):
The plan for AWS is to basically double capacity by
the end of twenty twenty-seven to around eight gigawatts.
Do you have a sense of how you apportion that
capacity, in silicon and server designs, to Trainium versus
Nvidia GPUs?
Speaker 1 (04:18):
We're just going to keep pushing as fast as we
can and we'll see where customer demands drives us as
we go. And as you said, we're massively adding capacity.
In the last year alone, we've added three point eight
gigawatts of capacity, and we'll continue to add more and
more over the next couple of years, and we'll
let customer demands drive us a little bit on what
they're looking for and what they want, and that's what
(04:38):
we always listen to and that's what we'll continue to
listen to.
Speaker 3 (04:41):
The focus with Trainium, in the time I've been able
to interact with you and talk about it, is, again, not
just the accelerator but the server design level; there are
a lot of benefits to the customer. When does that benefit
start accruing to AWS in terms of profitability? If
it's such a good financial proposition, you must be able
to say soon when you're making a lot of money on this.
Speaker 1 (05:01):
Yeah. Well, you're already seeing some of the benefits
accrue. You see things like Bedrock growing really, really rapidly,
and you see Trainium powering that under the covers, and
we announced today that more than half of all tokens
and inference done in Bedrock are done on Trainium 2
servers under the covers, and so you're already seeing that
benefit come. You see the models that we're building in
Nova and Nova 2 start to get better and better
(05:23):
over time and be accelerated by Trainium, and so we
really think that there's a whole bunch of dimensions on
which our customers, our partners, and our own products
are all going to get accelerated by Trainium.
Speaker 3 (05:34):
Every time you come onto the program, I always offer
the audience the opportunity to pose a question to you. There's
a lot of interest in AWS, right; many of your
customers span global technology. Actually, most of the questions were
about Anthropic. There wasn't much said on stage. I think
people are trying to understand what is the benefit and
advantage AWS offers to Anthropic while they are ramping Trainium
(05:57):
through Project Rainier, but also ramping their TPU allocations as well.
Speaker 1 (06:01):
Well, look, our partnership with Anthropic
is incredibly strong and it's never been stronger, and
we do a ton of collaboration with them. And as
I mentioned, through Project Rainier, it's a huge collaboration
there to go build their current-generation models, and all
their models run today and launch on day one on
top of Trainium and on top of AWS, which we're
incredibly excited about, and we'll continue that partnership for
(06:23):
a long time. I think, for them, they have a
huge demand for compute, and so they'll go to other
places where it makes sense to round out their compute
needs, because they just have such massive needs for compute,
and they have customers in other clouds as well. But
we're definitely their primary cloud provider and closest partner,
for sure.
Speaker 3 (06:42):
Supply constraints. So Anthropic is supply constrained; they can't
get the compute they need. We've talked about the ramp
of Nvidia GPUs and in-house silicon. Is there a
supply-constraint element with AWS? Are you able to get
the chips that you need?
Speaker 1 (06:57):
Yeah, I think, anytime you see an industry
that's growing as fast as this is right now, when
you think about AI and model development and chips, there
are going to be constraints. No matter what, there is
more demand than there is supply. Sometimes it's in chips,
sometimes it's in power and data centers, sometimes it's in,
you know, different parts of that. At some points it's,
(07:19):
you know, networking equipment. At some point it's transistors,
resistors, or whatever it is, and you look at the
entire supply chain that is needed to ramp up at
such a massive rate. Right? Never before has the technology
industry ramped at the rate that we are right now,
and so there are always constraints. And it's not
necessarily that there is one constraint where it's like, wow,
(07:40):
I can't get Nvidia chips. We can get Nvidia
chips. And actually, Jensen and team have been incredibly supportive
and great partners in helping us get capacity there. It's
not that you can't get power; we're getting power all
over the place. But we're just ramping all of
these places at such rapid rates that there's always a
constraint in that system, and it'll change every month you
ask me what the current one is.
Speaker 3 (08:00):
Throughout the day, we were speaking with your team
about the idea that we're moving from AI assistants to AI
coworkers. You know, a particular focus on the agentic offerings
that you've done. You're in the camp of people, if
you don't mind me saying, that sees basically ninety percent
of the value in enterprise coming from agentic technology. Do
you have any data or evidence to support that all
of your customers are ready for that?
Speaker 1 (08:21):
Yeah, I don't think all of our customers are
ready for that, but they're excited about it. So,
you know, I think it'd definitely be an overstatement to
say everybody's ready for it. And part of that is
because it is going to take change. Right? People are
going to have to change how they think about work.
They're going to have to change their process flows, they're
going to have to change some of the things about how
they get work done. It's not just going to be
a magic wand that's going to come in and magically
get them value. But almost everyone that I
(08:44):
talk to definitely sees that that's the path. The
power of agents is what
allows customers to actually get that work done. And when
they see that efficiency gain, they see themselves able to
accomplish things they weren't able to do before. That is
when it's worth it to go make these changes. And
so there's going to be work for people, and it's
going to take some time. Right? We're twenty
(09:04):
years into the cloud journey and still only a fraction
of workloads have moved to the cloud. So it's going
to take time. It's not like people are going
to magically switch. And I think it's going to be really...
Speaker 3 (09:12):
Fair. We just have sixty seconds. Twenty years into
the cloud journey. When I touched down in Vegas, everyone
accepts AWS is number one in terms of scale, infrastructure. The
question is: is AWS number one in AI? Just in the thirty
seconds we have left. Yeah, I think I'll give it.
Speaker 1 (09:28):
It's a question that we got a lot two years
ago, and not that much a year ago, and today
I don't think we get it nearly as much. It's
just people that are kind of playing the same tapes.
We have a huge choice of models. We see when
customers are actually moving their workloads to production, they want
to run those AI workloads on AWS, and that to
me is the biggest signal. When we see our customers,
they say, I ran proofs of concept in a lot
of places; when I want to move to production, I
(09:50):
want to run on AWS. And that's the thing that
we hear over and over again, which makes me think
we're actually in a great position.
Speaker 3 (09:55):
Matt Garman, AWS CEO, with the full-stack AI company
pitch here in Vegas at re:Invent.