October 20, 2025 · 9 mins

Karpathy is confusing LLM limitations with AI system limitations, and that makes all the difference. 

Become a Member: https://danielmiessler.com/upgrade

Episode Transcript

S1 (00:00):
Andrej Karpathy came on the Dwarkesh Podcast recently, and I have a number of thoughts. The consensus seems to be that Karpathy thinks AGI is ten years away, and therefore Gary Marcus is right, and people like myself and Sholto and all the other people saying AGI is within a few years have just basically lost the war. It's a

(00:21):
compelling narrative, but that's not really what happened. He did, however,
say that he thinks AGI is ten years out. But
the AGI debate has always hinged on definitions, and I
think the one that Karpathy is using is the reason
he's wrong. It came from back when he was at OpenAI,
and it basically goes like this: an AI that can

(00:43):
do any economically valuable work as well as or better than a human. And again, that goes all the way back, like, I don't know, to whenever Karpathy was at OpenAI. This is
over five years ago. I simply don't think this is
the best definition to use. I quite like it as
a pure definition or as a computer science definition, but

(01:07):
I think we should use one that focuses more on
practically and directly helping humans and avoiding bad outcomes for humans,
as opposed to talking about what's interesting and valuable to
AI people like us. I'm worried about human worker replacement,
specifically human knowledge work, and that's why I've been using

(01:29):
this definition since 2023. And Dwarkesh is now using this
definition as well, which is an AI system that can
replace an average knowledge worker. For me, this is a
better definition for two reasons. One, it focuses on the
fact that it's an AI system and not one particular
component of a system, like a model. Two, it provides

(01:52):
a more direct benchmark for the thing we care about,
which is: are companies actually replacing workers with the system, yes or no? And this system part is extremely key.
I have no reason or even ability to disagree with
Karpathy on the limitations of pure LLMs. He recently wrote yet another LLM from scratch by hand, a thousand lines

(02:17):
of code. He is the actual sensei here. I know like 0.0017% of what he knows about LLMs. The problem is, AI systems aren't just the LLMs themselves; they're not naked
neural nets. When you go to ChatGPT and you're talking
with GPT-5, you're not talking to a base neural net,

(02:40):
you're talking to an AI system. You're talking to the
result of that initial LLM being shaped and molded with
colossal amounts of extra scaffolding and engineering to be the
best possible system it can be for doing that particular task.
In this case, being a chatbot or an assistant. This

(03:00):
distinction is crucial because replacing human jobs will also be
done through composite, stitched-together systems that are many times
more powerful than their parts. To replace a project manager
or an executive assistant, the companies building human worker replacements aren't going to wait for GPT-9 or Gemini 7.5

(03:20):
to maybe solve their problems. Human worker replacement will happen through AI products and systems that work around the limitations of pure LLMs and of individual model intelligence: RAG, expanding context windows, context management, things like that.
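To make that concrete, here's a minimal sketch of the kind of scaffolding I'm describing. Everything in it is hypothetical: call_model stands in for whatever LLM API a real system uses, and the retrieval and trimming are the crudest possible versions of RAG and context management.

```python
# A minimal, hypothetical sketch of "system > model": wrapping a
# fixed-context LLM with retrieval (RAG) and context management.
# call_model() is a stand-in, not any real provider's API.

def call_model(prompt: str) -> str:
    # Stand-in for a real LLM API call.
    return f"[model response based on {len(prompt)} chars of context]"

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    # Crude RAG: rank documents by word overlap with the query.
    q = set(query.lower().split())
    return sorted(corpus, key=lambda d: len(q & set(d.lower().split())), reverse=True)[:k]

def trim_to_budget(chunks: list[str], budget_chars: int) -> list[str]:
    # Crude context management: keep chunks until the budget is spent.
    kept, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > budget_chars:
            break
        kept.append(chunk)
        used += len(chunk)
    return kept

def answer(question: str, company_docs: list[str]) -> str:
    # The "AI system": retrieval plus context management around one model call.
    context = trim_to_budget(retrieve(question, company_docs), budget_chars=8000)
    return call_model("Context:\n" + "\n---\n".join(context) + "\n\nQuestion: " + question)
```

None of that makes the model smarter. It makes the system more capable, and that's the distinction that matters here.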
The best example of this is actually Claude Code. It's just

(03:43):
a brilliant example. Just throwing out estimates: when Claude Code launched in March of 2025, it was like five times better than Opus, which was its best model at the time, for doing coding tasks and stuff like that. Well,

(04:04):
it's less than ten months later and it's already gotten many times better than that. It's like a night and day difference. Yes, the models got better, but that's not what made the difference. It was constant iterative improvements, grinding away at how the AI talks to itself and how humans interact with the AI: coordination, context management, context engineering.

(04:31):
And just now they added Skills, which takes the whole thing to a completely different tier.
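As I understand Anthropic's design, the core mechanic of Skills is progressive disclosure: the model always sees each skill's short name and description, and the full instructions only get loaded into context when a skill is actually relevant. Here's a rough, hypothetical sketch of that mechanic, not Anthropic's actual implementation; the quarterly-report skill is made up.

```python
# Hypothetical sketch of the Skills mechanic (progressive disclosure):
# the model always sees name + description; full instructions load on demand.
from dataclasses import dataclass

@dataclass
class Skill:
    name: str
    description: str   # always in context, cheap
    instructions: str  # loaded only when the skill is invoked

SKILLS = [
    Skill(
        name="quarterly-report",  # made-up example skill
        description="Drafts a quarterly report from raw metrics.",
        instructions="1. Pull the metrics. 2. Follow the house template. 3. ...",
    ),
]

def skill_menu(skills: list[Skill]) -> str:
    # What the model sees by default: one cheap line per skill.
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)

def load_skill(name: str, skills: list[Skill]) -> str:
    # Expanded into context only when the model decides the skill applies.
    for s in skills:
        if s.name == name:
            return s.instructions
    raise KeyError(f"no skill named {name}")
```

The design point is context economy: a hundred skills cost almost nothing until one of them is needed.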
This is exactly the type of efficiency ratchet that will apply to human worker replacement. Where we don't have enough context window to read all the company's docs, companies will have or invent systems to

(04:51):
do that. Whether or not those systems are general enough to match human flexibility, they'll just add so many great use cases and capabilities, based roughly around things like the Agent Skills that Anthropic just released, that we eventually won't notice, because it'll cover most use cases. The part that concerns me
most about the speed of progress towards AI replacing human

(05:13):
knowledge workers is not just the speed of the AI system improvement.
It's also the fact that the bar is so low.
A good portion of our culture's comedy is based on
the utter incompetence of, like, half of our workforce. We're
talking about the worst possible customer service, people bragging about
how little work they do, making a sport of doing

(05:35):
the bare minimum, showing up the bare minimum amount of time,
hardly doing any work and getting away with it
and getting paid. People absolutely detesting their jobs. Even decent
workers just mindlessly punch in and out a lot of
the time. Mediocrity is the baseline, almost by definition. That

(05:56):
is what multibillion-dollar human worker replacement startups are competing with, not the top 10% performers that, you know, a lot of us know, at least for now. Think of it this way: in the time that we went from Claude Code not existing, to getting really, really good, to now having shareable work-task-replacement skills, the bottom 50% of

(06:22):
knowledge workers improved by how much? Zero. Or take the time since ChatGPT came out, late 2022, so over three years now. We're talking about a stark difference in AI between then and now. Three

(06:47):
full years go by, the bottom 50% of knowledge workers
improved their capabilities by how much? Again, 0%. The bar
for human work replacement is not moving, while the capabilities
of AI systems are going absolutely apeshit. Now, you might
push back saying this is only for the people not

(07:09):
trying very hard or who aren't that smart or whatever. True.
But it doesn't matter. You and me and Dwarkesh and
Karpathy are going to be fine. So what? I'm worried
about everyone else. If AI only eats the absolute worst
bottom 50% of knowledge workers in the next 5 or
10 years, we're still talking about hundreds of millions of jobs,

(07:34):
or even at 25%. I just did a bunch of research on this, and the total number of knowledge workers worldwide is right around a billion. 1 billion knowledge workers. So half is a big percentage: that's 500 million people. But let's just say it's 10%.

(07:57):
Or let's just say it's 25%. And we've already established that
these are the least competent people at the job. So no,
they won't be pivoting easily to another knowledge work position.
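The back-of-the-envelope math on that, using the roughly 1 billion figure from above:

```python
# Rough math on the ~1 billion knowledge workers cited above.
knowledge_workers = 1_000_000_000

for rate in (0.10, 0.25, 0.50):
    print(f"{rate:.0%} replaced -> {int(knowledge_workers * rate):,} jobs")

# 10% replaced -> 100,000,000 jobs
# 25% replaced -> 250,000,000 jobs
# 50% replaced -> 500,000,000 jobs
```

Even the low-end 10% case is 100 million jobs.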
This is why I disagree with Karpathy on AGI. It's
not because he's wrong about LLMs having severe limitations. He's not,

(08:18):
but he's focused on the wrong thing. If the thing
we care about is AI's near-term and practical impact on humanity,
the thing to watch is not the pure LLM tech
or the specific technical limitations of RL for achieving continuous learning.
It's the trillions of dollars being invested in replacing the
worst-performing human workers, who will likely never get better

(08:41):
than they already are. Those trillions are being spent on
scaffolding workarounds to LLM limitations that provide us just-general-enough AGI to start replacing people, and from there it will only improve. Given what we've seen in systems like Claude Code, Cursor, and Codex that dramatically magnify model capability, while

(09:06):
the models continue to improve along their own axis as well,
do you really want to bet that good enough generality
won't be hit in the next couple of years? I
wouldn't take that bet. And this is why I think
AGI will arrive before 2028. Like a 70% chance, a rough guess, who really knows. And before 2030, I'm guessing 95%.

(09:32):
Not because all the stuff Karpathy is talking about will
be solved by then, but because it won't matter if it's solved. With trillions of dollars in funding and trillions of dollars in market opportunity, we're almost guaranteed to Claude-Code our way past a very low bar of millions of barely-there employees.