November 6, 2024 8 mins

Picture a future where AI not only writes a quarter of Google's new code but also reshapes entire industries with unparalleled cost-efficiency. This is not a distant possibility but a current reality, as we unpack the transformative power of AI in our latest episode. We promise a comprehensive understanding of the staggering $200 billion investments by tech giants in AI data centers, a move redefining infrastructure landscapes. Discover how the might of GPUs is propelling AI models to new heights, amid challenges of chip shortages and the complex dance of data availability. Delve into the intricacies of tokenization and its critical role in processing data for these modern AI systems.

But what does this mean for the future of energy and data centers? We explore the looming energy challenges, pondering the fate of investments if tech efficiency outpaces demand by 2030. Could relentless advancements, like Nvidia's impressive GPU energy reductions, lead to stranded assets? As data centers grapple with these energy demands, we speculate on how future innovations might reshape their power needs. Join us as we examine the evolving strategies for energy acquisition and set the stage for further discussions on the interplay of AI, computing capabilities, and the energy landscape.

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
In our first session, we reviewed some of the staggering projections for data center load growth. Today we're going to talk about chips, compute and power draw. But before we get started, let's quickly address four interesting AI-related factoids that have come out in just the past week since our last video. First, Exelon's CEO says it has seen what it terms high

(00:21):
probability data center load jump from 6 to 11 gigawatts this year. Second, Google's CEO indicated that over 25% of new code at the company is being generated by AI and then reviewed by engineers. Many observers ask, where's the money in AI? Well, Google eating its own dog food in this coding case is a pretty solid use case.

(00:41):
Programmers don't come cheap, and programming is probably not a good industry to be in these days. It won't be the only one where AI substitutes for human labor, and that's a huge part of the AI value proposition. Third, the Financial Times estimates that spending on AI data centers for the big four, Alphabet, Amazon, Meta and Microsoft, will exceed $200 billion this year and be similar

(01:04):
next year. And fourth, the Federal Energy Regulatory Commission just rejected a request from Amazon Web Services to expand a contract involving co-location of a data center, and consumption of power for that data center, directly connected to Talen Energy's Susquehanna nuclear power plant in Pennsylvania. More on that in another video.

(01:25):
Okay, with that out of the way, let's talk compute. I noted in our last session that compute capabilities for ChatGPT have soared by four orders of magnitude, that's 10,000 times, in just five years.
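As a quick back-of-envelope sketch (the arithmetic is mine, not from the episode), here is what a 10,000x jump over five years implies as a compound annual rate, and how it compares with the roughly 4x-per-year pace discussed later in this session:

```python
# Back-of-envelope check on the compute growth figures quoted in this
# session: 10,000x over five years, and roughly 4x per year.

# A 10,000x jump over 5 years implies this compound annual factor:
annual_factor = 10_000 ** (1 / 5)
print(f"implied annual growth: {annual_factor:.2f}x")  # about 6.31x

# Conversely, sustaining 4x per year for 5 years compounds to:
five_year_growth = 4 ** 5
print(f"4x/year compounded over 5 years: {five_year_growth:,}x")  # 1,024x
```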
Why and how did that happen, and why are these guys burning through hundreds of billions of dollars in cash? Well, the key goal here for the competitors in the space is to

(01:47):
improve the quality of their large language models so that they can deal with more complex logic and increase overall accuracy. The models do this by training on data, as much as they can get, with machines as powerful as they can make them: so-called scaling. One of the best independent analysts, Epoch AI, indicates that in recent years, the compute capability has been

(02:08):
growing at a rate of around 4x annually, a growth rate that vastly outstrips other technological surges such as cell phone adoption. The natural question to ask here is whether this growth will continue at this torrid pace, and also what the implications are for our power grids. Epoch AI looks at this issue by examining each of four
Epic AI looks at this issue byexamining each of four

(02:29):
underlying factors. First, power availability, our sweet spot, which we'll talk about a lot more later. Second, global chip manufacturing capacity and availability. Third, what Epoch calls the quote latency wall unquote, the limit resulting from delays in AI training computations. And fourth, the availability of data to train on.

(02:50):
Let's have a brief look at items two through four, since we'll deal with power in its own session. First, let's discuss the chips. They are in high demand these days. By the way, these are graphics processing units, GPUs, rather than the typical central processing units, CPUs, that have been used in most data centers in the past.

(03:11):
GPUs are used for games because they're powerful. Games require computers to render millions of pixels simultaneously and make thousands of calculations in parallel. GPU power and speed also have helped game makers avoid blurring of frames.
That kind of parallel processing changed the AI game since, like game makers, AI data

(03:39):
centers need a machine that can process huge amounts of data in parallel, performing highly complex calculations at rapid speeds and far more efficiently than CPUs ever could.
Those chips also keep getting better, but they're in high demand. They're also expensive, both to develop and to buy. Leading chipmaker NVIDIA's newest Blackwell chip cost the company about $10 billion to develop and create, and buyers

(04:02):
are anteing up $30,000 to $40,000 per chip. Yep, that's right: $30,000 to $40,000. And that same Blackwell chip draws between 700 watts and 1.2 kilowatts, depending on the configuration and cooling strategy. Yep, that's right as well. NVIDIA currently has about 80% of the GPU market share,

(04:23):
followed by AMD, and right now the industry cannot keep up with demand. But Google, Amazon, Meta and Microsoft are all at work developing their own chips, so that supply strain may eventually ease.
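To put the per-chip power numbers in grid terms, here's a rough sizing sketch using the 700 W to 1.2 kW Blackwell range quoted above; the 100,000-GPU cluster size and the PUE value are illustrative assumptions, not figures from the episode:

```python
# Rough facility-load sketch from per-GPU power draw. The Blackwell
# range (700 W to 1.2 kW) is quoted in this session; the cluster size
# and PUE below are assumed for illustration only.

gpus = 100_000            # assumed cluster size (hypothetical)
watts_per_gpu = 1_200     # top of the quoted per-chip range

it_load_mw = gpus * watts_per_gpu / 1e6
print(f"IT load: {it_load_mw:.0f} MW")  # 120 MW before overhead

# Cooling and other overhead are commonly folded in via PUE
# (power usage effectiveness); 1.3 is an assumed, plausible value.
pue = 1.3
print(f"facility load: {it_load_mw * pue:.0f} MW")  # 156 MW
```

That one hypothetical cluster already lands in the range of a small power plant's output, which is why the grid implications dominate later sessions.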
Next, let's look at this thing called the latency wall. This area is way out of my strike zone, but I'll try my best. It takes a certain amount of time, or latency, for an AI model to

(04:46):
process each data point, and that latency increases as model size and complexity grow. Models train by separating data into batches that can be addressed in parallel, but the batches can only be so large. At a minimum, each AI training run takes as long as is needed to process a batch. The more batches you have to process, the longer the run

(05:09):
takes. Today's latencies aren't that big, and batches can be processed quickly, but at some time in the future, as training runs get longer and models get bigger, this becomes an issue and efficiencies may fall off. So the latency wall could limit future growth rates.
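The batch argument above can be sketched as a toy model: a run must process every batch, and each batch has a minimum latency, so total run time scales with batch count. Every number here is invented for illustration:

```python
# Toy model of the "latency wall": total run time grows with the
# number of batches when batches are processed one after another.
# All inputs below are hypothetical.

def run_time_hours(total_tokens: int, batch_tokens: int,
                   seconds_per_batch: float) -> float:
    """Hours for one pass over the data at a fixed per-batch latency."""
    num_batches = total_tokens // batch_tokens
    return num_batches * seconds_per_batch / 3600

# A bigger model trained on more data, with higher per-batch latency:
small = run_time_hours(10**12, 4 * 10**6, 0.5)      # ~35 hours
large = run_time_hours(4 * 10**12, 4 * 10**6, 1.0)  # ~278 hours
print(f"small: {small:.0f} h, large: {large:.0f} h")
```

Scaling data 4x while per-batch latency doubles makes the run 8x longer in this sketch, which is the kind of compounding that eventually bites.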
Finally, let's look at the availability and potential

(05:29):
limitations of data itself. AI data centers train on data: everything we've ever posted to LinkedIn, Facebook or Insta, YouTube videos, scientific papers, movies, TV shows, stupid clips on TikTok, all of it.
To understand what's going on here, we have to understand the concept of a token. That's the smallest element into which text data can be

(05:52):
broken down in order for AI models to process it. So, for example, Juliet's plaintive query "wherefore art thou Romeo" would be represented by seven tokens, or at least that's what Perplexity AI told me when I asked it. It broke "wherefore" into two tokens, but did allow for the possibility that it might be tokenized into six elements

(06:14):
instead: four tokens for the words "wherefore art thou Romeo", plus the comma and the question mark. For images, audio clips or videos, computers typically break these into smaller patches for tokenization. One picture or one second of video might represent 30 tokens. Get it? Now, these tokens, which essentially serve as the link between our

(06:36):
languages and images and the formats accessible to computers, can be processed at lightning speed, but computers can only handle so many at a time, so models need to be continuously optimized.
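To make the token idea concrete, here's a toy word-and-punctuation tokenizer. Real models use learned subword schemes such as byte-pair encoding (which is why "wherefore" can split into two tokens), so this sketch is only a rough approximation of the counting:

```python
# Toy tokenizer: splits text into word and punctuation tokens.
# Real LLM tokenizers use learned subword vocabularies; this just
# illustrates how a short sentence becomes a handful of tokens.
import re

def toy_tokenize(text: str) -> list[str]:
    """Return word and punctuation tokens in order of appearance."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = toy_tokenize("Wherefore art thou, Romeo?")
print(tokens)       # ['Wherefore', 'art', 'thou', ',', 'Romeo', '?']
print(len(tokens))  # 6; a subword tokenizer might count 7
```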
It's estimated that the web holds about 500 trillion words of unique text, which may grow by 50% between now and 2050.

(06:57):
Add in images, audio and video, and you might triple that. So as much as 20 quadrillion tokens might exist for computer training by the end of the decade. That's a lot of raw material, but projections are that, with ever faster computers and more efficient algorithms, we might actually run out of data to train on as soon as 2026.

(07:21):
What then?
Well, the current thinking is that the machines will then learn to generate their own synthetic data. How? Well, for example, some machines have learned to play games or solve mathematical problems by training on data they themselves have created. Or the machines could find other ways to learn. This uncertainty leads to a critical question for utilities.

(07:41):
What if we build all this infrastructure and then, by 2030, there's less to do with it? The phrase stranded assets should certainly come to mind. Or what if chips become increasingly efficient, so that they require less electricity, both for processing and for addressing the waste heat they generate? NVIDIA claims that its GPUs used in training have seen a

(08:03):
2,000-fold reduction in energy use over the past decade. To date, those gains have simply allowed data centers to do more, and their appetite appears endless. But what if future gains continue? How does that affect future data center power requirements?
Nobody really knows. What we do know today is that the power grab continues unabated for now, and data centers are looking at all kinds of supply strategies to get the juice wherever they can. And that's the topic we'll focus on in the next session.