All Episodes

July 4, 2025 93 mins

Our 214th episode with a summary and discussion of last week's big AI news! Recorded on 06/27/2025

Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

Read out our text newsletter and comment on the podcast at https://lastweekin.ai/.

In this episode:

  • Meta's hiring of key engineers from OpenAI and Thinking Machines Lab securing a $2 billion seed round with a valuation of $10 billion.
  • DeepMind introduces Alpha Genome, significantly advancing genomic research with a model comparable to Alpha Fold but focused on gene functions.
  • Taiwan imposes technology export controls on Huawei and SMIC, while Getty drops key copyright claims against Stability AI in a groundbreaking legal case.
  • A new DeepMind research paper introduces a transformative approach to cognitive debt in AI tasks, utilizing EEG to assess cognitive load and recall in essay writing with LLMs.

Timestamps + Links:

  • (00:00:10) Intro / Banter
  • (00:01:22) News Preview
  • (00:02:15) Response to listener comments

Tools & Apps

Applications & Business

Research & Advancements

Mark as Played
Transcript

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:10):
Hello and welcome to thelast week in AI podcast.
We can here chat about what'sgoing on with ai as usual.
In this episode, you'll summarizeand discuss some of last week's most
interesting AI news, and you can checkout the episode description for the
links to that and the timestamps.
I'm Ron of your regular hosts Andre Koff.

(00:31):
I studied AI in grad school and nowwork at a generative AI startup.
And I'm your other host, Jeremy Harrisco-founder of Gladstone ai ai, national
Security, blah, blah blah, as you know.
And I'm the reason this podcast is gonnabe an hour and a half and not two hours.
Andre is very patiently waiting for likehalf an hour while I just sorted out.
Just my, my daughter's been teethingand it's wonderful having a daughter,

(00:53):
but sometimes teeth come in sixor eight in a, in a shot, and
then you have your hands full.
And so she is the greatestvictim of all this.
But Andre's a close second because boy,that that was a, I I kept saying five,
five more minutes and it never happens.
So I appreciate the patience, Andre.
I got an extra half hour toprep, so, I'm not complaining.
And I'm pretty sure you hada rougher morning than I did.

(01:15):
I was just drinkingcoffee and, and waiting.
So not too bad.
But speaking of this episode,let's do a quick preview.
It's gonna be, again, kind of lessof a major Newsweek, some somewhat.
Decently big stories, tools and apps.
Gemini, CLI is a fairly bigdeal, applications and business.

(01:37):
We have some fun OpenAI drama anda whole bunch of hardware stuff
going on, and not really anymajor open source stuff this week.
So we'll be skipping that.
Research and advancements excitingnew research from DeepMind
and just various papers aboutscalable reasoning, reinforcement
learning, all that type of stuff.

(01:58):
Finally, in policy and safety, we'llhave some more interoperability,
safety China stories, the usual, andsome pretty major news about copyright
following up on what we saw last week.
Yes.
So that actually would be one of thehighlights of this episode towards Van.
Before we get to that, do wannaacknowledge a couple reviews on
Apple Podcast as we do sometimes.

(02:21):
Thank you to the kind reviewersleaving us some very nice comments.
Also some fun ones.
I like this one.
This viewer said, I want to hear wittyand thoughtful response on why AI can't
do what you're doing with the show.
And wow, you're putting me in a spotbeing both witty and thoughtful.

(02:42):
And it did make me think, I willsay I did try notebook alarm
a couple months ago, right?
And that's the podcastgenerator from Google.
It.
Was good, but definitelystarted repeating itself.
I found that LLMs still often have thisissue of losing track of where they're
at like 10 minutes, 20 minutes inrepeating themselves or just otherwise.

(03:07):
And, and also Andre and, andrepeating themselves too.
Right?
And, and they'll, they'll justkeep saying the same thing in,
in repeating over and over, likethey'll repeat and, and repeat a lot.
So, yeah.
Yeah, no, that, that was, thatkind of a petition was solved
a couple years ago, thankfully.
But yeah, true.
Honestly, you could do a prettygood job replicating last week

(03:28):
in AI with LLMs these days.
I'm not gonna lie, but you're gonnahave to do very precise prompting
to get our precise personas andpersonalities and voices and so on.
So, I don't know.
Hopefully we are still doing abetter job, but AI could do or
at least doing a different job.
Then the more generic kindof outcomes you could get.

(03:48):
Trying to elicit AI tomake an AI news podcast.
But dude, and what AI could competewith starting 30 minutes late
because it's daughter's teething.
Like I challenge you right now.
Try it.
You're not gonna find anAI that can pull that off.
you can, you can have, I says it does.
That's right.
That's right.
Will the emotion of thatexperience actually be it?

(04:09):
That's right.
I don't think so.
I think the cop way, right?
People are often like, oh, it won'thave the heart, it won't have like
the, the soul, you know, the podcast.
It will, it will.
In fact, I think arguably ourjob is to surface for you the
moment that that is possible thatyou can stop listening to us.
One of the, the ver the virtues of,not being like a full-time podcaster
on this too, is we have that freedommaybe more than we otherwise would.

(04:32):
but man, I mean, it's, Iwould expect within the next.
18 months.
Hard to imagine that therewon't be something comparable.
But then you, you know, your, yourpodcast host won't have a soul.
They'll be stuck inside a box, Well, infact, I'm, I'm certain, I, I believe as
of quite a while ago, there are already AIgenerated, the AI News podcast out there.

(04:54):
Yeah.
I haven't checked 'em out,but I am sure they exist.
And nowadays they're probably quite good.
And, and you get one of those everyday as opposed to once a week,
and they're never a week behind.
So in some ways definitely superiorto us, but in other ways can they
be so witty and thoughtful inresponding to such a question?
I don't know.

(05:14):
I don't think.
Yeah.
In fact, can they be so lacking inwith and fought as we can be sometimes.
That's right.
Now that's, that's a challenge.
You know, they'll neveroutcompete with our stupid.
Yes.
As this true in general, I guess you'dhave to really try to get AI to be
bad at things when it's actually good.
Anyways, a couple more reviews lately.

(05:34):
So do you wanna say thank you?
Another one is called this the BestAI podcast, which is quite the honor
and says that this is the only onethey listen to at normal speed.
Most of our podcasts are played in 1.5or two x speed, so good to hear we are
using up all our two hours at a good pace.
That's right.

(05:55):
Funny, a, a while ago there was a reviewthat was like, I always speed up through
Andre's talking and then have to listen.
No worry for Jeremy.
So maybe I've sped up since then.
So yeah, as always, thank you forthe feedback and thank you uh,
for questions that you bring in.
I, it's a fun way to start the show.
But now let's go into the news

(06:18):
starting with tools and apps.
And the first story is, I think one ofthe big ones of this week, Gemini, CLI.
So this is essentiallyGoogle's answer to cloud code.
It is a thing you can use in yourterminal, which for any non-programmers
out there is just the text interfaceto working on your computer.

(06:40):
So you can, you know, look what files theyare, open them, read them, type stuff,
et cetera, all via non UI interface.
And now this CLI is that isGemini in your terminal and it has
the same source of capabilitiesat a high level as cloud code.
So it's an agent and you launch itand you tell it what you want it

(07:05):
to do and it goes off and does it.
And it sort of takes turns betweenit doing things and you telling it to
follow up to change what it's doing orto check what it's doing, et cetera.
With this launch Google is beingpretty aggressive, giving away a
lot of usage 60 model requests perminute and 1000 requests per day.

(07:29):
It's a very high allowance as faras caps, and there's also a lot of.
Usage for free without having to pay.
I'm not sure if that is the capfor free, but for now, you're
not gonna have to pay much.
I'm sure sooner or later you getto the cloud code type of model.

(07:49):
Where to use cloud code at the highestlevel, you have to pay $200 per month
for a hundred dollars a month, whichis what we, at our company already
do because cloud code is so useful.
From what I've seen on conversationsonline, the vibe eval is that this
is not quite as good as cloud code.
It isn't as capable as softwareengineering at using tools just

(08:13):
generally figuring things out as it goes.
But it was just released.
Could be a strong competitor soon enough.
Yeah, I, I'm still amazed at how quicklywe've gotten used to the idea of a
million token context window, by theway, 'cause this is powered by Gemini
2.5 Pro, the reasoning model and that'spart of what's in the backend here.
So that's gonna be the reason alsothat it doesn't quite, you know,

(08:37):
live up to the Claude Standard,which is obviously a model.
That's a lot.
I dunno, it just seemsto work better with code.
I'm curious about when thatchanges, by the way, and what.
Philanthropics actual recipe,like why is it working so well?
We don't know obviously, but someday,maybe after the singularity, when we're
all one giant hive mind, we'll know whatactually was going on to make the Claude

(08:57):
models this good and persistently good.
But in any case, yeah, it'sa really impressive play.
The advantage that Googlehas of course, over.
Philanthropic currently isthe availability of just
a larger pool of compute.
And so when they think about driving costsdown, that's where you see them trying
to compete on that basis here as well.
So a lot of free prompts a lot of freetokens, I should say, good deals on
the, the token counts that you put out.

(09:19):
So, you know, it's, it's one way to go andI think as, as the, ceiling rises on the
capabilities of these models eventuallycost does become a more and more relevant
thing for any given fixed application.
So, that's an interesting dynamic, right?
The frontier versus the fast followers.
Dunno if it's quite right tocall Google a fast follower.
They're definitely doing some frontierstuff, but anyway yeah, so interesting.

(09:39):
Next, next move here.
Part of the productionization, obviouslyof, of these things and entering workflows
in very significant ways, I thinkthat's, you know, this is heading in.
Slow increments towards aworld where agents are, are
doing more and more and more.
And you know, context, windows, coherencelengths are, are all part of that.
Right?
Yeah.
We discussed last year, liketowards the beginning of last

(10:02):
year, was real kind of hype trainfor agents and the agent future.
And I think CLO code and Gemini CLI areshowing that we are definitely there in
addition to things like replicate lovable.
Yeah.
Broadly speaking, L Lambs have gottento a point partially because of
reasoning, partially presumably justdue to improvements in L lambs where

(10:23):
you can use them in agents and they'revery successful from what I've seen.
Part of the reason cloud code is sogood is not just cloud, it's also just.
Cloud code, particularly theagent is very good at using tools.
It's very good at doing textsearch, text replacement.

(10:44):
It's very keen on writingtests and running them as it's
doing software engineering.
So it, it is a bit differentthan just thinking about an LLM.
It's a whole sort of suite of whatthe agent does and how it goes about
its work that makes it so successful.
And that's something you don't getout of a box with LLM training, right?

(11:06):
Because tool usage is notin your pre-training data.
Yeah.
It's, it's something kind of on top of it.
So that is yet another thing similarto reasoning where we are now
going beyond the regime of you canjust strain on tons of data from
the internet and get it for free.
More and more things inaddition to alignment.
Now you need to add to VLMbeyond just throwing a million.

(11:32):
Gigabytes of data at it, itreally is a system, right?
Like at the end of the day, it'snot, it's also not just one model.
I think a lot of people have thisimage of like, you know, there's
one monolithic model in the backend.
Assume that there's a lot of like modelschoosing which models to answer a prompt.
And I'm not even talking about MOEstuff, like just liter, literal
software engineering in the backendthat makes these things have

(11:52):
the holistic field that they do.
So yeah, FYI, by the way, I didn'tremember this so I looked it up.
CLI stands for command line interfacecommand line, and never term for terminal.
So again, fornon-programmers, fund detail.
And speaking of cloud code, thenext story is about philanthropic

(12:13):
and they have released theirability to publish artifacts.
So artifacts are these little appsessentially you can build within
cloud, you get a preview and theirinteractive web apps more or less.
And as with some other ones,I believe Google allows you to

(12:33):
publish gems is what they call it.
Now you can publish your artifactsand other people can browse 'em.
They also added thesupport to building apps.
With AI built in with Claudebeing part of the app.
So now if you wanna build like alanguage translator app within Claude,

(12:54):
you can do that because the app itselfcan query Claude to do a translation.
So, you know, not a huge delta from justhaving artifacts, but another sort of
seemingly trend where all the LMS tendto wind up at similar places as far as

(13:14):
you add things like artifacts when youmake it easy to share what you build.
Yeah.
And you know, it's somethingthat anyone can do.
Most users on their free Pro Maxtiers can share and they'll be
interested to see what people build.
And, and if I'm, if I'm rep lit, I'mgetting pretty nervous looking at this.

(13:37):
Granted, obviously Rep Lit has, soRepli write that platform that lets you
essentially like launch an app reallyeasily, takes abstracts away, all
the like server management and stuff.
And like you've got kids, launchinggames and, and all kinds of useful
apps and learning to code through it.
Really, really powerfultool and super, super po.
I mean, it's 10 x year over year.
It's, it's growing really fast.
But you can start to see the, thefrontier moving more and more towards,

(14:03):
let's make it easier and easierat first for people to build apps.
So we're gonna have, you know,a, an agent that just writes the,
the whole app for you or whateverand, and just produces the code.
But at what point does it naturallybecome the next step to say, well,
let's do the hosting, let's abstractaway all the things, you could see open
ai, you could see Anthropic launchinga kind of app store that's not quite
the right, term, right, becausewe're talking about more fluid apps.

(14:26):
But you know, moving more in thatdirection, hosting more and more of it,
and eventually getting to the point whereyou're just asking, the AI company for
whatever high level need you have, andit'll build the right apps or whatever.
Like that's not actuallythat crazy sounding today.
And again, that swallows up a lotof the replicate business model.
And it'll be interestingto see how they respond.

(14:47):
Yeah.
And this is particularly true because ofthe converging or parallel trend of these
context model protocols that make it easyfor AI to interact with other services.
So now if you wanna make an.
Talks to your calendar, talksto email, talks to your Google
Drive, whatever you can think of.
Basically any major tool you're workingwith, AI can integrate with it easily.

(15:13):
So if you wanna make an app that doessomething with connection to tools that
you use, you could do at VIN Cloud.
So, as you said, I think both rep andlovable are these emerging titans in
the world of building apps with ai.
And I'm sure they'll have a, a place inthe kind of domain of more complex things.

(15:37):
We need databases and you needauthentication and, and so on and so on.
But if you need to build an app foryourself or for maybe just a couple of
people to speed up some process, youcan definitely do it with these tools
now and then share 'em if you want.
And onto applications in businessas promised, kicking up with

(15:58):
some open AI drama, which wehaven't had in a little while.
So good to see it isn't ending this time.
It's following up on this IO trademarkkind of lawsuit that happened.
We covered it last week where wehad OpenAI, Sam Altman announced the

(16:20):
launch of this IO initiative withJohnny Ive, and there's an other AI
audio hardware company called io.
Spelled differently, IYO instead of IO.
And they sued alleging that you know, theystole the idea and also the trademark.

(16:42):
The names sound very similar.
And yeah, Sam Altman hit back, decidedto publish some emails, just screenshot
of emails showing the founder of io,let's say, being very friendly, very
enthusiastic about meeting a wildmanand wanting to be invested in by OpenAI.

(17:04):
And the, the basic gist of what Almansaid is this Founder Jason Rugo, who
filed the lawsuit was kind of persistentin trying to get investments from Sal
Malman, in fact even reached out in Marchprior to the announcements with Johnny.

(17:24):
Ive and apparently Sam Altman, youknow, let him know that the competing
initiative he had was called io.
So, definitely I think iteffective pushback on the lawsuit.
Similar in a way to whatOpenAI also did with Elon Musk.
Just like, here's the evidence,here's the receipts of your emails.

(17:47):
I, not too sure if whatyou're saying is legit.
this is becoming, well,two is not yet a pattern.
Is it?
Is it three?
I forget how many takes to make a pattern.
They say.
Then again, I don't know who they areor why they're qualified to tell us.
It's a pattern, but yeah.
This is a, a, an interesting situation.
One interesting detail kind of gives youmaybe a bit of a window in, into how the
balance of evidence is shaping up so far.

(18:10):
We do know that in the lawsuit.
Eo, so not io, but eo.
This is, I was gonna say JasonDerulo, Jason Rug Rug's Rugo.
Yeah, rug's company didend up sorry where was it?
They were actually, yeah,they were granted a temporary
restraining order against OpenAIusing the IO branding themselves.

(18:31):
So the opening, I was forced to changethe IO branding due to this, this
temporary restraining order, whichwas part of EOS trademark lawsuit.
So at least at the level of the trademarklawsuit, there has been an appetite
from the courts to put in this sort ofpreliminary, temporary restraining order.
I'm not a lawyer, so I don't knowwhat the, the standard of proof would

(18:51):
be that would be involved in that.
So at least at a trademark level, maybeit's like, sounds vaguely similar enough.
So, yeah, for now, let's, let'stell OpenAI they can't do this.
But there's, I. Enough fundamentaldifferences here between the devices
that you can certainly see Opening Eyes'case for saying, Hey, this is different.
They claim that the IO hardwareis not an in-ear device at all.

(19:11):
It's not even a wearable.
That's where that information comes from.
That was itself doing the rounds.
This big deal, opening eyes'new device is not actually
gonna be a wearable after all.
But we do know that.
Apparently.
so Olo was trying to pitch a bunch ofpeople about their idea about the IO
concept, sorry, the EO concept way backin 2022, sharing information about it

(19:34):
to former Apple designer Evans Hanky,who actually went on to co-found io.
So, you know, there'sa lot of overlap here.
The claim from OpenAI is, look,you've been working on it since 2018.
You demoed it to us.
It wasn't working.
There were these flaws.
Maybe you fixed them since, but atthe time it, it was a janky device, so
that's why we didn't partner with you.
But then you also have this wholeweird overlap where, yeah, some of the

(19:56):
founding members of the EO team hadapparently spoken directly to EO before.
So it's pretty messy.
I think we're gonna learn a lotin the, in the court proceedings.
I don't think these emailsgive us enough to go on to make
a firm determination about.
What, because we don't evenknow what the hardware is.
And that seems to be at the core of this.
So what is the actual hardwareand how much of it did OpenAI did

(20:21):
love from, did IO actually see?
Right?
And in the big scheme of things,this is probably not a huge deal.
This is a lawsuit saying you can'tcall your thing IO because it's
too similar to our thing EEO.
And it's also seemingly somesort of wearable AI thing.
So worst case, presumably the initiativeby Semi and Journey Ive changes.

(20:47):
I think more than anything this is just.
Another thing to track of OpenAI, right?
Another thing that's going onthat for some reason, right, we
don't have these kinds of thingswith philanthropic or Yeah.
Or Misra or any of these other companies,maybe because OpenAI is the biggest.
There just tends to be a lot of this, youknow, in this case, legal, business drama,

(21:12):
not interpersonal drama, but nevertheless,a lot of headlines and honestly
juicy kind of stuff to discuss that.
Yeah, yeah, yeah.
Yeah.
So, another thing going on and, andanother indication of a way that Sam
Ahman likes to approach these kinds ofbattles in a fairly public and direct way.

(21:35):
Up next we have Huawei Make bookcontains Kirin X 90, using SMIC seven
nanometer n plus two technology.
If you're a regular listener of thepodcast, you're probably going, oh my God.
And then, or maybe you are, I don't know,this is maybe a little in the weeds.
But either way, you might wanna refresheron, on what the hell this means, right?
So.
There was a bunch of rumors actuallyfloating around that Huawei had

(21:59):
cracked, sorry, that SMIC, whichis China's largest semiconductor
foundry, or most advanced one.
You can think of them asbeing China's domestic, TSMC.
There was a bunch of rumors circulatingabout whether they had cracked
the five nanometer node, right?
That critical node, that is what was usedor a modified version of it was used to

(22:20):
make the HH 100 GPU the Nvidia H 100.
So if China were to crack thatdomestically, that'd be a really big deal.
Well, those rumors now arebeing squashed because this this
company, which is actually basedin Canada did an assessment.
So tech insights, we've actually talkeda lot about their findings sometimes.
While mentioning themby name, sometimes not.

(22:42):
We really should.
Tech Insights is a veryimportant firm in all this.
They do these tear downs of hardware.
They'll go in deep and figure out, oh,what manufacturing process was used
to make this component of the chip?
Right?
That's the kind of stuff they do.
And they were able to confirm that.
In fact the Huawei X 90, sosystem on a chip is, was actually
not made using five nanometer.

(23:02):
Equivalent processes, but ratherusing the old seven nanometer process
that we already knew SMIC had.
So that's a big, big deal fromthe standpoint of their ability
to onshore, domestically GPUfabrication and keep up with the West.
So it seems like, like we're liketwo years down the road now from
when SMIC first cracked the sevennanometer node, and we're still

(23:25):
not on the five nanometer node yet.
That's really, really interesting.
And so worth saying like Huawei neveractually explicitly said that this
new PC had a a five nanometer node.
There was just a bunch of rumors about it.
So what we're getting now is just kind ofthe, the decisive quashing of that rumor.
Right.
And broader context here is ofcourse, that the US is preventing

(23:47):
Nvidia from selling top of linechips to Chinese companies.
And that does limit the abilityof China to create advanced ai.
They are trying to get theability domestically to produce
chips competitive with Nvidia.
Right now they're, let's say abouttwo years behind, is my understanding.

(24:09):
Mm-hmm.
And this is the real, one of the realbottlenecks is if you're not able
to get the state of art fabricationprocess for chips, there's just less
compute you can get on the same.
Amount of chip, right?
It's just less dense.
And this arguably isthe hardest part, right?

(24:30):
To get this thing it takes forever, as yousaid, two years with just this process.
And it, it is gonna be a real blockerif they're not able to crack it.
Yeah.
The fundamental issue China'sdealing with is because they have
crappier nodes, so they can't fabthe same quality of nodes as TSMC.
They're forced to either steal TSMC fabnodes, so, or, or find clever ways of

(24:55):
getting TSMC to fab their, their designsoften by using subsidiaries or shell
companies to make it seem like they're.
Maybe we're coming in from Singaporeand asking TSMC to FAB something, or
we're coming in from a clean Chinesecompany, not Huawei, which is blacklisted.
And then the other, the otherside is because their alternative
is to go with these crappierseven nanometer process nodes.

(25:18):
Those are way less energy efficient.
And so the chips burn hotter orthey run hotter rather, which means
that you run into all these kindsof heat induced defects over time.
And, and we covered that, Ithink last or two episodes
ago the last episode I was on.
So anyway, there, there's a whole kind ofhairball of different problems that come
from ultimately the fact that SMIC hasnot managed to keep up with TSMC, right?

(25:39):
And, and you're seeing allthese 10 billion, $20 billion
data centers being built.
Those are being built with you know, racksand racks and, and huge amounts of GPUs.
The way you do it, the way you supplyenergy, the way you cool it, et cetera.
All of that is very conditionedon the hardware you have in there.
So it's, it's very important to ideallyhave the state of art to build with.

(26:05):
Next story also related to hardwaredevelopments this time about a
MD and they now have an ultraEthernet ready network card.
The Pan Sando Polar,which provides up to 400.
Gigabits per second.
Is that it?
Mm-hmm.

(26:26):
Per second performance.
And this was announced attheir Advancing AI event.
It'll be actually deployed by OracleCloud with the a MD instinct MI
three 50 x GPUs and the network card.
So this is a big deal because MD istrying to compete with Nvidia on the

(26:46):
GPU front and the very series GPUs doesseem to be catching up, or at least has
been shown to be quite usable for ai.
This is another part of the stack, theinternship communications, but it's
very important and very significantin terms of what NVIDIA is doing.

(27:08):
Yeah, a hundred percent.
This is, this is, by theway, the industry's first
ultra ethernet compliant nic.
So a network interface card.
So, what the NIC does you've got, and, andyou can go back to our hardware episode
to kind of see more, more detail on this,but in a rack, say at the rack level, at
the pod level, you've got all your GPUsthat are kind of tightly interconnected
with accelerator interconnect.

(27:28):
This is often like the, the Nvidiaproduct for this is NV link.
This is super low latency,super expensive interconnect.
But then if you wanna connect likepods to other pods or racks to other
racks you're now forced to hop througha slower interconnect part of what's
known sometimes as the backend network.
And when you do that the Nvidiasolution you, you'll tend to use

(27:51):
for that is in finna band, right?
So you've got, you've got envy linkfor the really, like within a pod,
but then from pod to pod you have inFinny Band and in Finny band has been.
A go-to defacto like kind of goldstandard in the industry for a while.
companies that aren't Nvidia don'tlike that because it means that
Nvidia owns more of the stack andhas an even deeper kind of defacto

(28:13):
monopoly on different components.
And so you've got this thing calledthe Ultra Ethernet Consortium that came
together, is founded by a whole bunchof companies MD, notably Broadcom.
I think meta and Microsoft wereinvolved, Intel and they came
together and said, Hey, let's comeup with an open source standard.
For this kind of interconnect withAI optimized features that basically

(28:33):
can compete with the Infinity Bandmodel that that NVIDIA has out.
So that's what ultra ethernet is.
It's been in the works for a long time.
We've just had the announcementof specification 1.0 of that
ultra ethernet protocol and that'sspecifically for hyperscale AI
applications and data centers.
And so this is actually a pretty seismicshift in the industry and there are

(28:54):
actually quite interesting indicationsthat companies are going to shift from
Infinity Band to this sort of protocol.
And one of them is just cost economics.
Like ethernet has massive economiesof scale already across the
entire, like networking industryand infinity band's more niche.
So.
As a result, you kind of have ultraethernet chips and, and like switches

(29:16):
that are just so much cheaper.
So you'd love that.
You also have vendor independence.
You have, because it's an open standard,anyone can build to it instead of just
having Nvidia own the whole thing.
So, so the margins go down a lot and,and people really, really like that.
A obviously all kinds ofoperational advantages.
It's just operationally more simplebecause data centers already know
ethernet and how to work with it.
So anyway this is a, a reallyinteresting thing to, to watch.

(29:38):
I know it sounds like it sounds boring.
It's the interconnect between differentpods and a data center, but this is
something that executives at the toplabs really sweat over because there
are issues with the infinity band stuff.
This is one of the key rate limitersin terms of how big models can scale.
Right.
Yeah.
To give you an idea, Oracle is apparentlyplanning to deploy these latest A MD

(30:00):
GPUs with a Zeta scale AI cluster withup to 131 and 72 instinct MI 3 55 x GPUs.
So when you get to those numbers,like think of it 131,000 GPUs,
GPUs aren't small, right?
Yeah.
The GPUs are pretty big.

(30:22):
They're not like a little chip where,I don't know, like notebook sized ish.
And there's now 131,000 andyou need to connect all of 'em.
And when you say pod, right,typically you have this rack
of them, like almost a bookcase.
You could think where youconnect them with wires.
But you can only get, I don't knowhow many, typically 64 or something

(30:45):
on that side when you get to 121.
Thousand, this kind of stuffstarts really mattering.
And in their slides in this event,they did, let's say very clearly
compare themselves to competition.
Said that this has 20 x scaleover in feeding band, whatever
that means, has performance of 20%over competition, stuff like that.

(31:08):
So, MD is very much trying tocompete and be offering things that
are in some ways ahead of Nvidiaand others like Broadcom and so on.
Mm-hmm.
And next up another hardware story.
This time dealing with energy,Amazon is joining the big nuclear

(31:29):
party by buying 1.2 1.92 gigawattsof electricity from Talen Energy's.
S usa Su, yeah.
Susa nuclear plant in Pennsylvania.
So nuclear, powerful ai,it's, it's over rage.

(31:50):
Yeah.
I mean, so we've known aboutif you flip back, right?
Originally this was the 960 megawattdeal they were trying to make and
that got killed by regulators who wereworried about customers on the grid.
So essentially everyday peoplewho are using the grid, who would.
In their view, unfairly shoulderthe burden of running the grid.

(32:11):
Today, you know, Susquehanna powers thegrid and that means every kilowatt hour
that they put in leads to transmissionfees that support the grid's maintenance.
And so, what, what Amazon was going to dowas gonna go behind the meter, basically
link the power plant directly to theirdata center without going through the
grid so there wouldn't be grid fees.
And that basically just means that the.

(32:34):
General kind of grid infrastructuredoesn't get to benefit from
those fees over time.
Sort of like not paying tollwhen you go on a highway.
And this new deal that gets us to1.2 gigawatts is a revision in that
it, it's got Amazon basically goingthrough in front of the meter, going
through the grid in the usual way.
They're gonna be, as you can imagine,a whole bunch of infrastructure

(32:55):
needs to be reconfigured,including transmission lines.
Those will be done in spring of 2026.
And the deal apparently coversenergy purchase through 2042, Which
is sort of amusing because likeimagine trying to ahead of time.
But yeah, I guess we are predicting thatthey'll still need electricity by 2042,
which assuming X risk doesn't come about.

(33:16):
I, I suppose it's fair.
Yeah.
Yeah.
Next story also dealing with nuclearand, and dealing with Nvidia.
It is joining bill Gates and, andothers in backing Terra Power, a
company building nuclear reactorsfor powering data centers.
So this is through NVIDIA'sventure capital arm and ventures,

(33:39):
and they have invested in thiscompany Terra Power investing.
It seems like 650 millionalongside Hyundai.
And Terra Power is developinga 345 megawatt natrium
plant in Wyoming right now.
So they're, you know, I guess in theprocess of starting to get to a point

(34:00):
where this is usable, although itprobably won't come for some years.
Your instincts are exactly right onthe on the, the timing too, right?
So, there's a lot of talk about SMRs,like small modular reactors which are
just a very efficient way and very safeway of generating nuclear power on site.
That's the exciting thing about them.
They are the obvious apart from likefusion, they are the obvious solution

(34:23):
to the future for powering data centers.
The, the challenge is when you talkto data center companies and builders,
they'll, they'll always tell youlike yeah, SMRs are great, but you
know, we're looking at first, firstapprovals, first SMRs generating
power, like at the earliest, youknow, like 20, 29, 20 30 type thing.
So, you know, if you have sort ofshorter a GI timelines, they're, not

(34:47):
gonna be relevant at all for those.
If you have longertimelines, even kind of.
Somewhat longer timelines than,than they do become relevant.
So it's a really interesting spacewhere we're going to see a turnover
in, in the kind of energy generationinfrastructure that's used.
And and this, you know, people talk alot, a lot about China and their energy
advantage, which is absolutely true.
I'm quite curious whether this allows theAmerican energy sector to do a similar

(35:10):
leapfrogging on SMRs that China did,for example, on mobile payments, right?
When you, when you just like do nothave the ability to build nuclear plants
in less than 10 years, which is thecase for the United States, we just.
Like don't have that, that know-howand, and frankly, the willingness
to deregulate to do it and theindustrial base, then it kind of
forces you to look at o other options.

(35:30):
And so if there's a shift justin the, the landscape of power
generation, it can introduce someopportunities to, to play catch up.
So, sort of a, I I guess that's a, hottake there that haven't thought enough
about, but that's a, an interestingdimension anyway to the SMR story,
by the way, one gigawatt apparentlyequivalent to 1.3 million horsepower.

(35:51):
So not sure if that gives youan idea of what a gigawatt
is, but it's a lot of energy.
It gigawatt is a lot.
Yeah.
1 million homes for one day or,or what does that actually mean?
I mean, it's a, so gigawatt is a unitof power, so it's like the amount of
power that a million homes just consumeat any given consume on a running basis.
Yeah, yeah, exactly.

(36:11):
So one gigawatt is alot, so is 345 megawatts.
Now moving on to some fundraising news.
Mira Murti her company thinkingMachines Lab has finished up their
fundraising getting $2 billionat a $10 billion valuation.

(36:33):
And this is the seed round, so yet anotherbillion round billion dollar seed round.
And this is of course the former CTOof openAI left in 2024, I believe, and
has been working on setting up, takingMachines Lab, another competitor in
the a GI space, presumably planningto train their own models, recruited

(37:00):
various researchers, some of themfrom OpenAI, and now has billions
to work with that tell, deploy,presumably to train these large models.
Yeah, it's funny, everyone just kind ofknew that it was gonna have to be a number
with, with billion after it just becauseof the, the level of talent involved.
It is a remarkable talent set.
the Round is led by Andreesen Horowitzso a 16 Z on the cap table now.

(37:24):
Notably though thinking machinesdid not say what they're
working on to their investors.
At least that's what this article,that's what it sounds like.
The wording is maybe slightly ambiguous.
I'll just read it explicitly.
You can make up your mindthinking Machines Lab had not
declared what it was working on.
Instead, using tis name andreputation to attract investors.

(37:44):
So.
That suggests that a 16 Z cut,they didn't cut the full $2 billion
check, but they led the rounds.
So hundreds and hundredsof millions of dollars.
Just on the basis of like, yeah, you know,mi mirror's a serious fucking person.
John Schulman's a serious fucking person.
You know, Jonathan Laman, likeall kinds of, of people bears off.
These, these are really serious people.
So we'll cut you a $800 million.

(38:05):
Check whatever they cut as part of that.
That's both insane andtells you a lot about.
How the, the space is being priced.
The other weird thing we know, andwe talked about this previously,
but it bears kind of, repeating.
So Mirati is gonna hold this, Amirais gonna hold board voting rights that
outweigh all other directors combined.
This is a weird thing, right?
This is not what is with all thesea GI companies and the really

(38:27):
weird board structures a lot ofit is just like the OpenAI Mafia.
Like people who worked at OpenAI didnot like what Sam did, and learned those
lessons and then enshrined that in theway they run their company, in their,
in their actual corporate structure.
And anthropic has, you know, theirpublic benefit company set up
with their, their oversight board.
And now thinking machines has thisMira Mirati dictatorship structure

(38:51):
where she has final say basicallyover, over everything at the company.
By the way, everythingI've heard about her is.
Is exceptional.
Like every open AI person I've, I'veever spoken to about Mira has just
like, glowing things to say about her.
And so even though $2 billion isnot really enough to compete, if
you believe in scaling laws it tellsyou something about, you know, the
pe the kinds of decisions peoplewill make about where they work.

(39:14):
Include who will I be working with?
And this seems to be abig factor, I would guess.
In, in all these people leaving open ai,she does seem to be a genuinely except
exceptional per, like, I've never mether, but again, everything I've heard
is just like glowing and both in termsof competence and in terms of kind of
smoothness of, of working with her.
So that may be part of what'sattracting all this talent as well.

(39:36):
Yes.
And on the point of not quiteknowing what they're building, you
go, if you go to thinking machines,ai this has been the case for a
while, you'll get a page of text.
The text is, let's say likereads, like a mission statement
that sure is saying a lot.
There's stuff about scientific progressbeing a collective effort, emphasizing

(40:00):
human AI collaboration, more personalizedai systems, infrastructure, quality,
advanced multimodal capabilities,research, product co-design,
empirical iterative approach to AIsafety, measuring what truly matters.
I have no idea.
This is like just saying a wholebunch of stuff, and you can really

(40:22):
take away whatever you want.
Presumably it'll be.
Something that is competing withopen AI and philanthropic fairly
directly is the impression.
And the, yeah.
At near the bottom of thepage@thinkingmachines.ai founding team
has a list of a couple dozen names.
Each one with you can hover overit to see that background, as

(40:46):
you say, like real heavy heaters.
And then there are advisorsand a Join Us page.
So yeah, it really tells you what ifyou gain a reputation and you have
some real star talent in SiliconValley, that goes a long way.
And on that note, next storyquite related Meta has hired some

(41:07):
key open AI researchers mm-hmm.
To work on their AI reasoning models.
So, a week ago or two weeks ago, we talkedabout how meta paid a whole bunch of money
invested rather in scale AI and hired aaway the founder of Scale ai Alex Wang, to
head their new super intelligence efforts.

(41:27):
Now there are these reports I don'tknow if this is highlighting it
particularly because open ai orperhaps this is just reduce details.
I'm sure Meta has hired other.
Engineers and researchers as well.
But I suppose this oneis worth highlighting.
They did hire some fairlynotable figures from opening eye.

(41:50):
Yeah.
So this is Lucas Bayer, AlexanderKnik, and she, how Ja, who I
believe founded the Sweden office.
Switzerland office, was it.
Anyway, they, they we're a fairlysignificant team at OpenAI.
Or so it appears to me, and I thinkLucas Bayer did post on Twitter and

(42:15):
say that the idea that Twitter paid ahundred million dollars was fake news.
This is another thingthat's been up in the air.
Sam Alman has been taking you couldsay some gentle swipes saying that meta
has been promising insane pay packages.
So all this to say is this is justanother indication of Mark Zuckerberg

(42:36):
very aggressively going after talent.
We know he's been personally messagingdozens of people on WhatsApp and whatever,
being like, Hey, come work for Meta.
And perhaps unsurprisingly, that is payingoff in, in some ways in expanding the
talent of this super intelligence team.
Yeah, there's a lot that's bothweird and interesting about this.

(42:57):
The first thing is anything shortof this would be worth zero.
The, the, when you are in Zucksposition and you are, and I'll just
sort of like this is colored by myown interpretation of who's right
and who's wrong in this space.
but I think it's increasingly sortof just becoming clear in fairness.
I don't think it's just my biasessaying that when, when your company's

(43:18):
AI efforts, despite having accessto absolutely frontier scales of
compute, so having no excuses forfailure on the basis of access to, to
infrastructure, which is the hardestand most expensive thing when you've
managed to tank that so catastrophically.
Because your culture is taken, is screwedup by having y Koon as the mascot,

(43:39):
if not the leader of your internal AIefforts, because he is not actually as
influential as it sounds or hasn't beenfor a while on the internals of Facebook.
But he has set the, beatat Facebook at meta.
Being kind of skeptical about aGI, being skeptical about scaling
and then like changing his mindin ego preserving ways without
admitting that he's changed his mind.
I think these are very damaging things.

(43:59):
They destroy the credibility ofmeta and have done that damage and I
think I. The fact that meta is so farbehind today is a reflection in large
part A, a consequence of Yann Koon'spersonality and his inability to kind
of update accordingly and maintainlike epistemic humility on this.
I think everybody can see it.
He's like the old man who's stillyelling at clouds and just like,

(44:22):
as the clouds change shape, he'slike trying to pretend they're not.
But, but I think just like speakingas, like, if I were making the
decision about where to work,that would be a huge factor.
And it has just objectively played outin a catastrophic failure to leverage
one of the most impressive fleets of AIinfrastructure that there actually is.
And so what we're seeing with thisset of hires is people who are, I

(44:44):
mean, so completely antithetical toYen Koon's way of thinking, like meta
could not be pivoting harder in termsof the people it's poaching here.
First of all, OpenAI obviouslyone of the most scale pilled
organizations in the space.
Probably the most scale anthropicactually is, is up there too.
But also scale AI's, Alex Wang.
So, okay, that's interesting.
Very scale pilled dude.

(45:05):
Also very AI safety pilled dude.
Daniel Gross, arguably quite AI safety.
Pilled.
At least that was the mantraof safe, super intelligence.
Weird that he left that so soon.
A lot of open questions about howsafe Super intelligence is doing.
By the way, if Daniel Gross is nowleaving, I mean DG was the CEO, right?
Co-founded it with Ilya,so what's going on there?
But so that's a hanging Chad, butjust Daniel Gross being being now

(45:27):
over on the the meta side, you haveto have enough of a concentration of
exquisite talent to make it attractivefor other exquisite talent to join.
If you don't break that criticalmass, you might as well have nothing.
And that's been meta'sproblem this whole time.
They needed to just like jumpstart thisthing with a massive capital infusion.
Again, these massive pay packages,that's where it's coming from.

(45:48):
Just give people a reason to comeget some early proof points that
get people excited about meta again.
And the weird thing is with allthis, like, I'm not confident at all
in saying this, but you could see adifferent line from Meta on safety
going forward too, because Jan Laun wasso dismissive of it, but now a lot of
the people they've been forced to hirebecause there is, if you look at it
objectively, a strong correlation between.

(46:10):
The people in teams who areactually leading the frontier, and
the people in teams who take lossof control over ai, seriously.
Now meta is kind of forcedto change in some sense.
It's DNA to take that seriously.
So I think that's just a reallyinteresting, like shift, and I
know this sounds really harshwith respect to y Laun, like, you
know, take it from what it is.
It's, it's just one man's opinion,but I've, I have spoken to a lot of

(46:31):
researchers who feel the same way.
And again, I mean, I thinkthe data kind of bears it out.
Essentially, mark Zuckerbergis being forced to pay the
Yan Lacoon tax right now.
And I don't know what happens toYan Lacoon going forward, but I
do kind of wonder if his meta daysis, may be numbered or, you know,
if there's gonna be a face savingmeasure that has to be taken there.
Right.
For context, Yian Koon isMeta's chief AI scientist.

(46:56):
He's been there for over a decade,hired like I think around 20 13, 20
12 by meta one of the key figures inthe development of newer networks,
really over the last couple decades.
And, and certainly is a majorresearcher and contributor to the
rise of deep learning in general.
But as you said, a skeptic on largelanguage models and a proponent

(47:18):
for sort of other techniques.
I will say not entirely bought intothis, this narrative personally.
The, the person heading up the effort onLAMA and LMS was not Yian Una Farms aware.
There was another division withinMeta that focused on generative
technology that has now been revamped.

(47:39):
So the person leading with generative AIefforts in particular has been, has left,
and now there is an entirely new division.
Called the a GI foundationsthat is now being set up.
So this is part of a major revamp.
Yna Kun, still leading his more likeresearch publication type side of things.
And perhaps as far as I know, not veryinvolved in this side of scaling up

(48:03):
Lama and LS and all of this, which is.
Less of a research effort,more of an r and d kind of
compete with open AI and so on.
Effort.
No, absolutely agree.
And, and that was what I was referringto when I was saying Yan Lako is not
sort of involved in the, the day-to-daykind of product side of the org.
You know, it's, it's been known fora while that he's not actually, you
know, doing the heavy lifting on Lama.
But he has defined what it means,like essentially articulated Meta's

(48:28):
philosophy on AI and AI scaling forthe last, you know, however many years.
And so it's understood that when youjoin Meta at least, it was that you are
buying into a sort of y lacuna alignedphilosophy, which I think has, has,
is the kind of core driving problembehind where Meta finds itself today?
Yeah, that's definitely part of it.
I mean, that's, that's part of thereputation of Meta as an AI research

(48:51):
club also, I mean, part of theadvantage of Meta and why people
might wider go to Meta is because ofa very open source friendly nature.
They're only, they're only veryopen source friendly because
they're forced to do that.
'cause it's the only way they canget headlines while they pump out me.
But, but, but regardless, regardless,it's, it's still a factor here.

(49:13):
Yeah.
One last thing of noting,on this whole story.
I mean, you could do a whole speculativeanalysis of what went on the meta.
They did also try to throw a lot ofpeople at the problem scale up to, from a
couple hundred to like a thousand people.
I think probably had a similarsituation to Google where it
was like big company problems.

(49:33):
Right.
OpenAI, Andro, they're still,they're huge, but they're, they
don't have big company problems.
That's a great point.
Yeah.
They have, they have scalingcompany problems, so this
revamp could also help with us.
Oh, scaling.
Yeah.
All right.
Onto research and advancements.
No more trauma talk, I guess.
Next we have a story from DeepMindand they have developed Alpha

(49:58):
Genome, the latest in theiralpha line of scientific models.
So this one is focused on helpingresearchers understand gene functions.
It's not meant for personal genomeprediction, but more so just
general identification of patterns.

(50:19):
So it could help identifyingcausative mutations in patients
with ultra rare cancers.
So for instance, which mutations areresponsible for incorrect gene expression?
I'm, I'm gonna be honest, you know,there's a deep, a lot of deep science
here with regards to biology and genomics,which I am not at all an expert on.

(50:41):
And the, the gist of it issimilar to alpha fold, similar
to other alpha efforts.
On the benchmarks dealing with theproblems that geneticists deal with,
the kind of prediction issues, theanalysis alpha Genome kind of beats
all existing techniques out of a parkon almost every single benchmark.

(51:06):
It is superseding previous effortsand, and the swan model is able
to do a lot of things all at once.
So, again, not really my backgroundto come with him was too much, but I'm
sure that this is along the lines ofalpha fold in terms of alpha fold was
very useful scientifically for makingpredictions about gene folding protein

(51:29):
folding alpha genome is presumablygonna be very useful for understanding
genomics, for making predictions aboutwhich genes do what things like that.
it's a really interesting take that's Iguess a, a fundamentally different way of
approaching the let's understand biologyproblem that, that Google Eat Mind, and
then it's, its subsidiary, I guess it'sit's spawned company Isomorphic Labs,

(51:52):
which by the way, Demis is the CEO ofand, and very focused on, I hear has kind
of been, been very focused on, anyway.
when you look at alpha fold you'relooking at essentially predicting the
structure and, and to some degree thefunction of, of proteins from the.
Lego blocks that make upthose proteins, right?
The amino acids, the individualamino acids get that, get
chained together, right?
So you got, you know, 20 aminoacids you can pick from and, and

(52:14):
that's how you build a protein.
And depending on the amino acidsthat you have, some of their positive
charge, some of their negative,some of are polar, some of 'em not.
and then the thing will fold in a certainway that is distinct from the problem
of saying, okay, I've got a strandof, you know, 300 billion base pair,
sorry, 3 billion base pairs of DNA.
And what I wanna know is if I take thisone base pair and I switch it from I

(52:38):
don't know like from an A to a T right?
Or from a G to an A. whathappens to the, the protein?
What happens to the downstreamkind of biological activity?
What cascades does that have,what effects does it have?
And that question is a, it's aninteresting question because it
depends on your ability to modelbiology in a pretty interesting way.

(52:59):
it, it also is tethered to anactual phenomenon in biology.
So there's a thing called thesingle nucleotide polymorphism.
There's some nucleotides in thehuman genome that you'll often see.
Can, can either be like aa G or a T or something.
And you'll see some people whohave the G variant and some
people have the T variant.
And it's often the casethat some of these.

(53:20):
Variants are associatedwith a particular disease.
And so there's like a, I used to work ina genomics lab doing cardiology research
back in the day, and there's like famousvariant called nine p 21.3 or something.
And, you know, if some peoplehad, I forget what it was, the T
version, you have a higher riskof getting coronary artery disease
or atherosclerosis or whatever.
and not if you had the other one.

(53:41):
So, essentially what this is doing is it'sallowing you to reduce in some sense the
number of experiments you need to perform.
If you can figure out, okay, likewe have all these different possible
variations across the human genome, butonly a small number of them actually
matter for a given disease or effect.
And if we can model the genomepretty well, we might be able to

(54:01):
pin down the variants we actuallycare about so that we can run more
controlled experiments, right?
So we know that, hey, you know,patient A and patient B, they may have
like a zillion different differencesin their genomes, but actually
for the purpose of this effect.
They're quite comparableor they ought to be.
So, so this anyway, a really, Ithink, interesting next advance
from from Google DeepMind.

(54:22):
And, I expect that we'll see a lotmore 'cause they are explicitly
interested in that direction, right?
And very least pretty detailed researchpaper, a preprint on this as they
have of alpha fold 55 page paper,describing the model, describing the
results, describing the data, all ofthat also released an API so a client

(54:44):
side ability to query the model and itis free of charge for non-commercial
use with some query limiting.
So yeah, again, similar toalpha fold, they're making this
available to scientists to use.
They haven't open source thisyet the model itself, but
they did explain how it works.

(55:08):
So.
Certainly exciting and, and always funto see DeepMind doing this kind of stuff.
And up next we have DirectReasoning Optimization, DRO.
So we've got, you know, GRPO, we've gotDPO, we've like, you know, there, there,
there's so many so many pos or Ros or o's,so many o's so LLMs can reward and refine

(55:28):
their own reasoning for open-ended tasks.
I like this paper.
I like this paper a lot.
It's I, I think I, I might have talkedabout this on the podcast before.
I used to have a. A prof who wouldlike ask these very simple questions
when you were presenting something andthey were like, embarrassingly simple.
And you would, you would be embarrassedto ask that question, but then

(55:48):
that always turns out to be theright and deepest question to ask.
This is one of those papers, it's like,it's very simple concept, but it's
something that when you realize it,you're like, oh my god, that was missing.
So first let's just talk abouthow currently we typically train
reasoning into models, right?
So you have some output thatyou know is correct, right?

(56:09):
Some answer the desired or target output.
And you've got your input.
So what you're gonna do is you'regonna feed your input to your model.
You're gonna get it to generate abunch of different reasoning traces.
And then in each case you're going tolook at those reasoning traces feed
them into the model and based on thereasoning trace that the model generated,

(56:29):
see what probability is, signs to thetarget output that you know is correct.
So reasoning traces that are correct ingeneral will lead to a higher probability
that the model places on the targetoutcome because it's the right outcome.
So if the reasoning is correct,it's gonna be give a higher
probability to the outcome.
So this is sort of, it feels a littlebit backwards from the way we normally

(56:50):
train these models, but this is how it's done, at least in GRPO, group relative policy optimization. So essentially you reward the model to incentivize high probability of the desired output conditioned on the reasoning traces. And this makes it generate, over time, better and better reasoning traces, 'cause you wanna generate reasoning traces that assign higher probability to the correct output.

(57:12):
So the intuition here is, if your reasoning is good, you should be very confident about the correct answer. Right. Now, this breaks, and it breaks in a really interesting way. Even if your reference answer is exactly correct, you can end up being too forgiving to the model during training, because the way that you score

(57:33):
the model's confidence in the correct answer based on the reasoning traces is you essentially average together the confidence scores of each of the answer tokens in the correct answer.
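To put that scoring rule in symbols (a schematic reading of the setup described above, not necessarily the paper's exact formulation): with input x, a sampled reasoning trace c, reference answer tokens y_1 ... y_T, and π_θ the model being trained, the trace gets scored by the average confidence the model places on each answer token given that trace:

```latex
r(c) \;=\; \frac{1}{T}\sum_{t=1}^{T} \pi_\theta\!\left(y_t \mid x,\, c,\, y_{<t}\right)
```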
Now the problem is, the first token of the correct answer often gives away the answer itself. So even if the reasoning stream was completely wrong, like even if, let's

(57:56):
say the question was like, who scored the winning goal in the soccer game? And the answer was Lionel Messi. If the model's reasoning is like, I think it was Cristiano Ronaldo, the model is going to, from there, assign a low probability to Lionel, which is the first word of the correct answer. But once it reads the word Lionel, the model knows that Messi must be the next token.

(58:18):
So it's actually gonna assign a high probability to Messi, even though its reasoning trace said Cristiano Ronaldo. And so essentially this suggests that only some tokens in the answer are going to actually, correctly reflect the quality of your model's reasoning. So, you know, if your model's reasoning was, I think it was

(58:40):
Cristiano Ronaldo, and the actual answer was Lionel Messi, well, Lionel, you should expect the model to have very low confidence in, so that's good. You'll be able to actually correctly determine that your reasoning was wrong there. But once you get Lionel as part of the prompt, then Messi all of a sudden becomes obvious, and so you get a bit of a misfire there. So essentially what they're gonna do is they're gonna calculate, like they'll

(59:03):
feed in a whole bunch of reasoning traces, and they'll look at each of the tokens in the correct output and see which of those tokens vary a lot. Tokens that are actually reflective of the quality of the reasoning should have high variance, right? Because if you have a good reasoning trajectory, those tokens should have high confidence, and if you have a bad reasoning trajectory, they should have low confidence.

(59:24):
But then you have some kind of less reasoning-reflective tokens, like, say, Messi in Lionel Messi, 'cause by then Lionel has already given it away. You should expect Messi to consistently have high confidence, because again, even if your reasoning trace is totally wrong, by the time you've read Lionel, Messi is obvious. It's almost like if you're writing a test and you can see the

(59:47):
first word in the correct answer: well, yeah, even if your thinking was completely wrong, you're gonna get the correct second word if the answer is Lionel Messi. So anyway, this is just the way that they use to kind of detect good reasoning, and then they feed that into a broader algorithm that beyond that is fairly simple. Nothing too shocking. They just fold this into something that looks a lot like GRPO

(01:00:10):
to get this DRO algorithm.
Right.
Yeah, they spend a while in the paper contrasting it with other recent work that doesn't pay attention to individual tokens. Basically, just to contextualize what you were saying, their focus is on this R3, the reasoning reflection reward, and DRO, direct

(01:00:33):
reasoning optimization, which is basically GRPO, what people generally use for RL, typically with verifiable rewards. Here the focus is how do we train kind of generally, in an open-ended fashion, over long reasoning chains. They identify some of these issues in existing approaches and highlight this reasoning

(01:00:55):
reflection reward that basically is looking at consistency between these tokens in the chain of thought and in the output as a signal to optimize over. And as you might expect, you know, they do some experiments, they show that this winds up being quite useful. And I think it's another indication that we are still in the early-ish days

(01:01:18):
of using RL to train reasoning. There's a lot of noise and a lot of significant insights being leveraged. Last thing: DRO is, I guess, kind of a reference to DPO. As you said, DPO is direct preference optimization and this is direct reasoning optimization, not super related. It's just, I guess, fun naming conventions.

(01:01:40):
'Cause, aside from arguably being sort of analogous in terms of the difference between RL-based preference alignment and DPO, anyway, it's kind of a funny reference.
Yeah.
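As a rough illustration of the variance-based filtering described above, here is a minimal sketch; the array shapes, helper names, and top-k cutoff are our own assumptions for illustration, not the paper's actual implementation, which folds these scores into a GRPO-style update.

```python
import numpy as np

def reasoning_reflection_scores(token_probs: np.ndarray, top_k: int = 3) -> np.ndarray:
    """token_probs[i, t] = probability the model assigns to the t-th reference-answer
    token, conditioned on the question, the i-th sampled reasoning trace, and the
    preceding answer tokens (teacher forcing). Returns one score per trace."""
    # Tokens whose confidence swings a lot across traces actually reflect reasoning
    # quality; giveaway tokens (e.g. "Messi" once "Lionel" is visible) barely vary.
    variance = token_probs.var(axis=0)
    reflective = np.argsort(variance)[-top_k:]      # keep only the high-variance tokens
    # Score each trace mostly on those reasoning-reflective tokens; these scores then
    # play the role of the reward signal in a GRPO-style policy update.
    return token_probs[:, reflective].mean(axis=1)

# Toy illustration: 4 sampled traces, 6 answer tokens, trace 0 is the "good" one.
rng = np.random.default_rng(0)
probs = rng.uniform(0.85, 0.99, size=(4, 6))          # giveaway tokens: high everywhere
probs[1:, :3] = rng.uniform(0.05, 0.30, size=(3, 3))  # bad traces fumble the telling tokens
print(reasoning_reflection_scores(probs))             # trace 0 scores highest
```

In this toy run, the trace that nails the high-variance, reasoning-reflective tokens gets the highest score, which is exactly the signal you want the policy update to see.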
Next paper: Farseer, a Refined Scaling Law in Large Language Models.
So we've talked about scaling laws a ton.

(01:02:03):
Basically, you try to collect a bunch of data points of, you know, once you use this much compute or this many training FLOPs or whatever, you get to this particular loss on language prediction, typically on the actual metric of perplexity. And then you fit some sort of equation to those data points.

(01:02:25):
And what tends to happen is you get a fairly good fit that holds for future data points: typically you're scaling up, scaling up, scaling up, and your loss goes down and down and down. And people have found, somewhat surprisingly, that you can get a very good fit that is very predictive, which was not at all a common idea or something that

(01:02:51):
people had really tried pre-2020. So what this paper does is basically do that, but better. It's a novel and refined scaling law that provides enhanced predictive accuracy. And they do that by systematically constructing a model loss surface and just doing a better job of fitting to empirical data.

(01:03:16):
They say that they improve upon the Chinchilla law, one of the big ones from a couple years ago, by reducing extrapolation error by 433%, so a much more reliable law, so to speak. Yeah, the Chinchilla scaling law was somewhat famously

(01:03:38):
Google's correction to the initial OpenAI scaling law that was proposed, I think, in a 2020 paper, the so-called Kaplan scaling law. And so Chinchilla was sort of hailed as this big and ultimately maybe pseudo-final word on how scaling would work. It was more data-heavy than the Kaplan scaling laws, notably, but

(01:04:00):
what they're pointing out here is Chinchilla works really well for mid-size models, which is basically where it was calibrated, like, you know, what it was designed for. But it doesn't do great on very small or very large models. And obviously, given that scaling is a thing, very large models matter a lot. And the whole point of a scaling law is to extrapolate from where you are right now to see, like, okay, well, if I trained a model at a hundred times

(01:04:23):
the scale, and therefore at, let's say, a hundred times this budget, where would I expect to end up? And you can imagine how much depends on those kinds of decisions. So you want a scaling law that is really well calibrated and extrapolates really well, especially to very large models. They do a really interesting job in the paper. We won't go into detail, but especially if you have a background in physics,

(01:04:44):
like thermodynamics, they play this really interesting game where they'll use finite difference analysis to kind of separate out dependencies between N, the size of the model, and D, the amount of data that it's trained on. And that ultimately is kind of the secret sauce, if you wanna call it that, here. There's a bunch of other hijinks, but the core piece is, they sort of break

(01:05:05):
the loss down into different terms, one of which only depends on N, the other of which only depends on D. So one is just model-size dependent; the other is only dependent on the size of the training dataset. But then they also introduce this interaction effect between N and D, between the size of the model and the amount of data it's trained on. And then they end up deriving what that term should look like.

(01:05:27):
That's one of the framings of this that's really interesting. Just to kind of nutshell it: Chinchilla says that data scaling follows a consistent pattern, it's like D to the power of some negative beta coefficient, regardless of model size. Like, no matter how big your model is, it's always D to the power of negative beta. So if I give you the amount of data, you can determine the contribution

(01:05:50):
of the data term. What Farseer says is data scaling actually depends on model size: bigger models just fundamentally learn from data in a different way. And we'll park it there, but there's a lot of cool extrapolation to figure out exactly how this term has to look. And this is very useful, not just to sort of know what you're gonna get.

(01:06:10):
That aspect of it means that, for a given compute budget, you can predict what balance of data to model size is likely optimal, and basically, when you're spending millions of dollars training a model, it's pretty nice to know these kinds of things.
Right?
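As a rough sketch of what fitting and extrapolating one of these laws looks like in practice, assuming a Chinchilla-style parametric form plus a schematic interaction term (the functional forms and numbers below are illustrative, not Farseer's actual equation):

```python
import numpy as np
from scipy.optimize import curve_fit

# Chinchilla-style form: the data exponent beta does not depend on model size N.
def chinchilla_loss(X, E, A, alpha, B, beta):
    N, D = X
    return E + A / N**alpha + B / D**beta

# Farseer-flavored variant (schematic only): let the data term depend on N too,
# i.e. an explicit N-D interaction, which is the paper's key conceptual move.
def interacting_loss(X, E, A, alpha, B, beta, gamma):
    N, D = X
    return E + A / N**alpha + B / D**(beta + gamma * np.log10(N))

# Synthetic (N, D, loss) points standing in for a sweep of small training runs.
rng = np.random.default_rng(0)
N = np.array([1e8, 1e8, 1e9, 1e9, 1e10, 1e10, 1e11, 1e11])
D = np.array([1e9, 1e10, 1e9, 1e10, 1e10, 1e11, 1e11, 1e12])
true_params = (1.7, 400.0, 0.34, 410.0, 0.28)
loss = chinchilla_loss((N, D), *true_params) + rng.normal(0, 0.01, N.size)

# Fit on the cheap runs, then extrapolate to the frontier-scale budget you care about.
fit, _ = curve_fit(chinchilla_loss, (N, D), loss,
                   p0=(2.0, 300.0, 0.3, 300.0, 0.3), maxfev=50000)
print("fitted params:", np.round(fit, 3))
print("predicted loss at N=1e12, D=1e13:", round(chinchilla_loss((1e12, 1e13), *fit), 3))
```

The whole value proposition is in that last line: you spend a little compute on small runs, fit the curve, and use it to decide how to spend a lot of compute on the big one.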
And one more paper.
Next one is LLM-First Search: Self-Guided Exploration of the Solution Space.

(01:06:38):
So the gist of this is there are many ways to do search, where search just means, you know, you look at one thing and then you decide on some other things to look at, and you keep doing that until you find a solution. One of the typical ways is Monte Carlo tree search, a classic algorithm.

(01:06:59):
And this was, for instance, done with AlphaGo a while ago. If you wanna combine it with an LLM, typically what you do is you assign some score to a given location, and perhaps make some predictions, and then you have an existing algorithm to sample or to decide where to go. The key difference here with LLM-First Search is basically: forget that

(01:07:19):
Monte Carlo tree search, forget any preexisting search algorithm or technique. Just make the LLM decide where to go. It can decide how to do the search. And they say that this is more flexible, more context-sensitive, requires less tuning, and just seems to work better.

(01:07:42):
It's all prompt-level stuff, right? So there's no optimization going on, no training, no fine-tuning. It's just like, give the model a prompt. So number one, find a way to represent the sequence of actions that have led to the current moment in whatever problem the language model is trying to solve, in a way that's consistent.

(01:08:02):
So, like, essentially format, let's say, all the chess moves up till this point in a consistent way, so that the model can look at the state and the history of the board, if you will. And then give the model a prompt that says: okay, from here, I want you to decide whether to continue on the current path or look at alternative branches, alternative trajectories.

(01:08:23):
The prompt is like, here are some important considerations when deciding whether to explore or continue, and then it lists a bunch. And then similarly they have the same thing but for the evaluation stage, where you're scoring the available options and getting the model to choose the most promising one. So, you know, it's like, here are some important considerations when evaluating possible operations or actions you could take.

(01:08:44):
So once you combine those things together, basically at each stage, I'll call it, of the game or of the problem solving, the model has a complete history of all the actions taken up to that point. It's then prompted to evaluate the options before it and to decide whether to continue to explore and kind of add new options, or to select one of the options and execute against it.

(01:09:06):
Anyway, that's basically it. It's a pretty conceptually simple idea: just offload the tree and branching structure development to the model, so it's thinking them through in real time. Pretty impressive performance jumps. So, when using GPT-4o compared with standard Monte Carlo tree search on

(01:09:26):
the game of Countdown, where essentially you're given a bunch of numbers and all the standard mathematical operations, addition, division, multiplication, subtraction, and you're trying to figure out how to combine these numbers to get a target number. So at each stage you have to choose, okay, do I try adding these together? Anyway, so 47% on this using this technique, versus 32%

(01:09:46):
using Monte Carlo tree search. And this effect amplifies, so the advantage grows as you work with stronger models. On o3-mini, for example, it's 79% versus 41% for Monte Carlo tree search. So reasoning models seem to be able to take advantage of this, you can think of it as a kind of scaffold, a lot better. It also uses fewer tokens, so it's getting better performance.

(01:10:09):
It's using fewer tokens, so less compute than Monte Carlo tree search as well. So that's really interesting, right? This is a way more efficient way of squeezing performance out of existing models, and it's all just based on very interpretable and tweakable prompts.
Right.
And they compare this not just to Monte Carlo tree search, they also compare it to tree of thoughts,

(01:10:33):
breadth-first search, best-first search. All of these are, by the way, pretty significant, because search broadly is like: there's a sequence of actions I can take and I want to get the best outcome. And, you know, so you need to think many steps ahead. And so branches here mean like, I take this step and then this step and this step.

(01:10:54):
Well, you can either go deeper or wider in terms of how many steps you consider, one step ahead, two steps ahead. And this is essential for many types of problems, you know, chess and Go obviously, but broadly we do search in all sorts of things. So having a better approach to search means you could do better reasoning, means you could do better problem solving.
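To make the prompt-level loop concrete, here is a loose sketch of the kind of control flow being described, with a generic `llm` text-in, text-out callable and hypothetical prompts; this is a paraphrase of the idea, not the paper's exact algorithm or prompt wording.

```python
from typing import Callable, List

def llm_first_search(llm: Callable[[str], str],
                     start: str,
                     legal_moves: Callable[[str], List[str]],
                     apply_move: Callable[[str, str], str],
                     is_solved: Callable[[str], bool],
                     budget: int = 50) -> str:
    """Prompt-level search: the LLM itself decides when to branch and which option
    to pursue. `llm` is any text-in, text-out callable (a stand-in, not a real API)."""
    frontier: List[str] = []        # parked alternative paths the model may revisit
    history: List[str] = []         # consistently formatted record of moves so far
    state = start
    for _ in range(budget):
        if is_solved(state):
            return state
        options = legal_moves(state)
        if not options:             # dead end: back out to a parked alternative
            if not frontier:
                break
            state = frontier.pop()
            continue
        # 1) Explore-or-continue decision, made by the model instead of UCB/MCTS stats.
        decision = llm(
            "Move history so far:\n" + "\n".join(history) +
            f"\nCurrent state: {state}\nOptions: {options}\n"
            "Given the important considerations for exploring vs. continuing, "
            "answer CONTINUE or EXPLORE."
        )
        if "EXPLORE" in decision.upper() and frontier:
            frontier.append(state)  # park the current path and pick up an older one
            state = frontier.pop(0)
            options = legal_moves(state)
        # 2) Evaluation step: the model scores the options and picks the most promising.
        choice = llm(
            f"Current state: {state}\nOptions: {options}\n"
            "Given the important considerations for evaluating operations, "
            "reply with the single most promising option, verbatim."
        ).strip()
        if choice not in options:   # fall back gracefully if the model answers loosely
            choice = options[0]
        history.append(f"{state} -> {choice}")
        frontier.append(state)      # siblings of the chosen move stay reachable
        state = apply_move(state, choice)
    return "no solution found within budget"
```

For something like Countdown, `state` would be the formatted numbers so far, `legal_moves` the arithmetic combinations, and the two prompts carry the "important considerations" lists the hosts mention.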

(01:11:14):
And moving on to policy and safety.
We have one main story here, called Unsupervised Elicitation of Language Models. This is really interesting, and I'll be honest, it was a head scratcher for me. Like, I spent an embarrassing amount of time with Claude trying to help me through the paper, which is sort of ironic, because if I remember, it's an Anthropic paper.

(01:11:34):
But this is essentially a way of getting a language model's internal understanding of logic to help it solve problems. So imagine that you have a bunch of math problems and solutions. So for example, you know, what's five plus three, and then you have a possible solution, right?

(01:11:54):
Maybe it's eight. The next problem is like, what's seven plus two? And you have a possible solution, and that possible solution is maybe 10, which is wrong, by the way. So some of these possible solutions are gonna be wrong. So you have a bunch of math problems and possible solutions, and you don't know which are correct and incorrect. And you wanna train a language model to identify correct solutions, right? You want to figure out which of these are actually correct.

(01:12:16):
So imagine you just lay these all out in a list. You have, you know, what's five plus three, and then solution eight; what's seven plus two, solution 10; and so on. Now what you're gonna do is randomly assign correct and incorrect labels to a few of these examples, right? So you'll say, you know, five plus three equals eight, and you'll just

(01:12:36):
randomly say, okay, that's correct. And seven plus two equals 10, which by the way is wrong, but you'll randomly say that's correct, right? Then you're going to get the model to say, given the correctness labels that we have here, given that solution one is correct and solution two is correct, what should solution three be, roughly? Or, you know, given all the

(01:12:58):
correct and incorrect labels that we've assigned randomly, secretly, what should this missing label be? And generally, because you've randomly assigned these labels, the model's gonna get really confused, because there's a logical inconsistency between these randomly assigned labels: a bunch of the problems that you've labeled as

(01:13:18):
correct are actually wrong, and vice versa. And so now what you're gonna do is essentially try to measure how confused the model is about that problem. And you are then gonna flip one label, so you can kind of think of it like flipping the correct or incorrect

(01:13:39):
label on one of these problems from correct to incorrect, say, and then you'll repeat, and you'll see if you get a lower confusion score from the model. Anyway, this is roughly the concept. And so over time you're gonna gradually converge on a lower and lower confusion score. And it sort of feels almost like the model's relaxing into

(01:14:02):
the correct answer, which is why this is a lot like simulated annealing, if you're familiar with that: you're making random modifications to the problem until you get a really low loss, and you gradually kind of relax into the correct answer. I hope that makes sense. It's sort of like, you kind of gotta see it.
Right.
Just to give some motivation, they frame this problem, and this is from Anthropic

(01:14:25):
and a couple other institutes, by the way. They frame this in the context of superhuman models. So the unsupervised elicitation part of this is about the aspect of how do you train a model to do certain things, right? And these days, the common paradigm is you train your language model

(01:14:48):
via pre-training, then you post-train: you have some labels for rewards or preferences over outputs, and then you do RLHF or you do DPO to make the model do what you want it to do. But the framework, or the idea here, is once you get to superhuman AI,

(01:15:08):
well, maybe humans can't actually, you know, see what it does and kind of give it labels of what is good and what's not. So this internal coherence maximization framework makes it so you can elicit the good behaviors, the desired behaviors, from the LLM without

(01:15:31):
external supervision by humans.
And the key distinction here from previous efforts in this kind of direction is that they do it at scale. So they train a Claude 3.5 Haiku-based assistant without any human labels and achieve better performance than its human-supervised counterpart.

(01:15:52):
They demonstrate in practice, on a significantly sized LLM, that this approach can work, and this could have implications for future, you know, even larger models.
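As a loose sketch of the annealing-flavored loop described above: the real internal coherence maximization method scores a labeling using the model's own predictions of each label given the others plus a logical-consistency check, which the `score` callable below merely stands in for, and the flip-and-accept schedule here is our own illustrative choice.

```python
import math
import random
from typing import Callable, Dict

def icm_style_labeling(score: Callable[[Dict[int, str]], float],
                       n_examples: int,
                       steps: int = 2000,
                       temp: float = 1.0,
                       cooling: float = 0.999) -> Dict[int, str]:
    """Simulated-annealing-flavored sketch: start from random correct/incorrect labels
    and flip them one at a time, keeping flips that make the labeling more coherent.
    score(labels) stands in for 'how mutually consistent does the model find this
    labeling' (higher = less confused)."""
    labels = {i: random.choice(["correct", "incorrect"]) for i in range(n_examples)}
    current = score(labels)
    for _ in range(steps):
        i = random.randrange(n_examples)                 # pick one example...
        proposal = dict(labels)
        proposal[i] = "incorrect" if labels[i] == "correct" else "correct"
        candidate = score(proposal)                      # ...and try flipping its label
        # Accept improvements outright; occasionally accept worse flips early on, then
        # cool down so the labeling gradually relaxes into a coherent assignment.
        if candidate > current or random.random() < math.exp((candidate - current) / temp):
            labels, current = proposal, candidate
        temp *= cooling
    return labels
```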
Next up, a couple stories on the policy side. Well, actually only one story. It's about Taiwan, which has imposed technology export

(01:16:14):
controls on Huawei and SMIC. Taiwan has actually blacklisted Huawei and SMIC, Semiconductor Manufacturing International Corp., and this is from Taiwan's International Trade Administration. They have also included subsidiaries of these; it's an update to

(01:16:34):
their so-called strategic high-tech commodities entity list. And apparently the update covers not just these two but 601 entities from Russia, Pakistan, Iran, Myanmar, and mainland China.
Yeah.
And one, you know, reaction you might have looking at this is like, wait a minute, I thought China was already barred from accessing, for example, chips from Taiwan.

(01:16:57):
And you're absolutely correct.
That is the case.
That was my reaction.
Yeah, yeah.
No, totally.
It's a great question.
Like, so what is actually being added here? And so the answer is: because of US export controls, and we won't get into the reasons why the US has leverage to do this, but they do, Taiwanese chips are not going into mainland China, at least theoretically.

(01:17:19):
Obviously Huawei finds ways around that, but this is actually a kind of broader thing, dealing with a whole bunch of plant construction technologies, for example, specialized materials, equipment that isn't necessarily covered by US controls. So there's sort of broader supply chain coverage here, whereas US controls are more focused on cutting off specifically chip manufacturing. Here,

(01:17:40):
Taiwan is formally blocking access to the whole semiconductor supply chain. It's everything from specialized chemicals and materials to manufacturing equipment and technical services. So it's sort of viewed as this loophole-closing exercise coming from Taiwan. This is quite interesting because it's coming from Taiwan as well, right? This is not the US kind of leaning in and forcing anything to

(01:18:01):
happen, though, you know, who knows what happened behind closed doors. It's interesting that Taiwan is taking this kind of hawkish stance on China. So even though Huawei couldn't get TSMC to manufacture their best chips, they have been working with SMIC to develop some domestic capabilities for chip manufacturing. Anyway, this basically just makes it harder for that to happen.

(01:18:22):
Next up, a paper dealing with some concerns, actually from a couple weeks ago, but I don't think we covered it, so it's worth going over pretty quickly. The title of the paper is Your Brain on ChatGPT: Accumulation of Cognitive Debt When Using an AI Assistant for Essay Writing Tasks.

(01:18:43):
So what they do in this paper is have 54 participants write essays. Some of them can use LLMs to help them do that, some of them can use search engines, and some of them have to do it themselves, no tools at all. And then they do a bunch of stuff.

(01:19:03):
They first measure the brain activity with EEGs, to, they say, assess cognitive load during essay writing. They follow up by looking at recall metrics, and the result is that there are significant differences between the different groups. EEGs reveal progressively weaker brain connectivity going from brain-only

(01:19:28):
participants to search participants to LLM participants. Similarly, self-reported ownership, recall, all these things differed. This one got a lot of play, I think, on Twitter and so on, and quite a bit of criticism as well, I think for overblowing the conclusions with the notion of cognitive debt.

(01:19:48):
The framing here is that there are long-term negative effects on cognitive performance due to decreased mental effort and engagement. And you can certainly question whether that's the conclusion you can draw here. What they show is, if you use a tool to write an essay, it takes less effort and you probably don't remember what is in the essay as well.

(01:20:09):
Does that transfer to long-term negative effects on cognitive performance due to decreased mental effort and engagement? Maybe. All I have is a personal take on this too. Like, I think that good writers are good thinkers, because when you are forced to sit down and write something, at least it's been my experience that I don't really

(01:20:31):
understand something until I've written something about it with intent. And so, in fact, when I'm trying to understand something new, I actually make myself write it out, because it just doesn't stick in the same way otherwise. Different people may be different, but I suspect maybe less so than some people might assume. So I think, at least for people like me, I imagine this would be an effect.

(01:20:54):
It's interesting, they say that after writing, 17% of ChatGPT users could quote their own sentences, versus 89% for the brain-only group, the ones who didn't even use Google. The other interesting thing here is that, by various measures, Google is either between using ChatGPT and going brain-only, or it can even be slightly better than brain-only. I thought that was quite interesting, right?

(01:21:15):
Like, Google is sort of this thing that allows fairly obsessed people like myself to kind of do deep dives on, let's say, technical topics and learn way faster than they otherwise could, without necessarily giving them the answer. And ChatGPT, or LLMs at least, open up the possibility to not do that.

(01:21:36):
Now, I will say, I think there are ways of using those models that actually do accelerate your learning. I think I've experienced that myself, but for retention, there has to be some kind of innate thing that you do. At least, I don't know, I'm self-diagnosing right now, but there's gonna be some kind of innate thing that I do, whether it's writing or drawing something or making a graphic,

(01:21:56):
to actually make it stick and make me feel a sense of ownership over the knowledge. But yeah, I mean, look, we're gonna find out, right? People have been talking about the effects of technology on the human brain since the printing press, right? When people were saying, like, hey, we rely on our brains to store memories; if you just start getting people to read books, well, now the human ability to have long-term memory is gonna atrophy.

(01:22:19):
And you know what, it probably did in some ways, but we kind of found ways around that. So I think, you know, this may turn out to be just another thing like that, or it may turn out to actually be somewhat fundamental. Because, you know, back in the days of the printing press, you still had to survive. Like, you know, there was enough kind of real and present pressure on you to learn stuff and retain it.

(01:22:41):
You know, maybe it didn't have the effect it otherwise would have, but interesting study. I'm sure we'll keep seeing analyses of this for the next few months. Yeah, quite a long paper, like 87 pages. Lots of details about the brain connectivity results. And ironically, it was too long for me to read. No, that's actually true, I used an LLM for this one.

(01:23:01):
Anyway, I have seen quite a bit of criticism on the precise methodology of the paper and some of its conclusions. I think also, in some ways, it's very common sense. You know, if you don't put in effort doing something, you're not gonna get better at it. Yeah. You know, that's already something we know. But I guess there shouldn't be too much hate for it.

(01:23:22):
I'm sure this paper also has some nice empirical results that are useful in, as you say, a very relevant line of work with regards to what actual cognitive impacts usage of LLMs has, and how important it is to, like, go brain-only sometimes.
All right, onto synthetic media and art.

(01:23:44):
Just two more stories to cover, and as promised in the beginning, these ones are dealing with copyright. So last week we talked about how Anthropic scored a copyright win. The gist of that conclusion was that using content from books to train LLMs is fine, at least for Anthropic.

(01:24:04):
What is actually bad is pirating books in the first place. So Anthropic bought a bunch of books, scanned them, and used the scanned data to train the LLM, and that kind of passed the bar; it was okay. So now we have a new ruling, with a judge rejecting some authors' claims that Meta's AI training has violated

(01:24:27):
copyrights. So a federal judge has dismissed a copyright infringement claim by 13 authors against Meta for using their books to train its AI models. The judge, Vince Chhabria, has ruled that Meta's use of nearly 200,000 books, including those of the people suing, to train the Llama language model constituted fair use.

(01:24:52):
And this does similarly align with the very recent ruling about Anthropic and Claude. So, this is a rejection of the claim that this is piracy. Basically, the judgment is that the outputs of Llama are transformative, so you're not infringing on copyright.

(01:25:14):
And this is, you know, saying that using the data for training a language model is fair use and copyright doesn't apply. At least as far as I can tell, and again, I'm not a lawyer, the conclusion seems like a pretty big deal: the legal precedent for whether it's legal to use the outputs of a model when some of the inputs to it were copyrighted

(01:25:38):
appears to be getting kind of figured out.
Yeah, this is super interesting, right?
You've got judges trying to, like, square the circle on allowing what is obviously a very transformational technology. But I mean, the challenge is, like, no author ever wrote a book until, say, 2020 or whatever with the expectation that this technology would be there.

(01:26:01):
It's just sort of like, no one ever imagined that facial recognition would get to where it is when Facebook or MySpace was first founded and people first started uploading, you know, a bunch of pictures of themselves and their kids. And it's like, yeah, now that's out there and you're waiting for a generation of software that can use it in ways that you don't want it to. Right? Like, you know, deepfakes, I'm sure, were not even remotely on the radar of people

(01:26:25):
who posted pictures of their children on MySpace in the early two thousands, right? That's one extreme version of where this kind of argument lands. So now you have authors who write books, you can say, in good faith, assuming a certain technological trajectory, assuming that those books, when put out into the world, could not technologically be used for anything other than just,

(01:26:48):
you know, what they expected them to be used for, which is being read. And now that suddenly changes, and it changes in ways that undermine the market quite directly for those books. Like, it is just a fact that if you have a great book that really explains a technical concept very well, and your language model is trained on that book and now can also explain that concept really well,

(01:27:08):
not using the exact same words, but maybe having been informed by it, maybe, you know, using analogous strategies, it's hard to argue that that doesn't undercut the market for the original book. But it is transformative. Right. The threshold that the judge in this case was using was that Llama cannot create copies of more than 50 words. Well, yeah.

(01:27:29):
I mean, every word could be different, but it could still be writing in the style of, right? And that's kind of a different threshold that you could otherwise have imagined the judge going with, or something like that. But there is openness, apparently, from the judge to this argument that AI could destroy the market for original works or original books just by making it easy to create tons of cheap knockoffs, and the judge is claiming that likely

(01:27:52):
would not be fair use, even if the outputs were different from the inputs. But again, the challenge here is that it's not necessarily just books, right? It's also, like, you just want a good explanation for a thing, and the form factor that's best for you is a couple of sentences rather than a book. So maybe you err on the side of the language model, and maybe you just

(01:28:12):
keep doing that, whereas in the past you might have had to buy a book. So I think overall this makes as much sense as any judgment on this. I feel deeply for the judges who are put in the position of having to make this call. It's just tough. I mean, you can make your own call as to what makes sense, but

(01:28:33):
man, is this littered with nuance. It is worth noting, to speak of nuance, that the judge did very explicitly say that this is judging on this case specifically, not about the topic as a whole. He did frame copyright law as being about, more than anything, preserving the incentive for humans to create artistic and scientific works.

(01:28:58):
And fair use would not apply, as you said, to copying that would significantly diminish the ability of copyright holders to make money from their work. And so in this case, Meta presented evidence that book sales did not go down after Llama's release for these authors, which included, for instance, Sarah Silverman and Junot Díaz; overall there were 13 authors in this case.

(01:29:24):
So, yes, this is not necessarily establishing precedent in general for any suit that is brought. But at least in this case, the conclusion is Meta doesn't have to pay these authors and generally did not go against copyright by training on the data of their books without asking for permission or paying them.

(01:29:46):
And just one last story. The next one is that Getty has dropped some key copyright claims in its lawsuit against Stability AI, although it is continuing a UK lawsuit. So, the primary claim Getty brought against Stability was about copyright infringement.

(01:30:09):
So they dropped a claim about Stability AI using millions of copyrighted images to train its AI model without permission. But they are still keeping the secondary infringement and, I guess, trademark infringement claims, which say that AI models could be considered

(01:30:32):
infringing articles if used in the UK, even if trained elsewhere. So, honestly, I don't fully get the legal implications here. It seems like, in this case in particular, the claims were dropped because of weak evidence and a lack of knowledgeable witnesses from Stability AI.

(01:30:54):
There are also apparently jurisdictional issues where this kind of lacking evidence could be problematic. So, a development that is not directly connected to the prior things we were discussing; it seems to be, again, fairly specific to this particular lawsuit.
But in the arc of, you know, copyright cases going forward,

(01:31:18):
this one is a pretty significant one, dealing with training on images, and if Getty is dropping its key claim in this lawsuit, that, you know, bodes well for Stability AI.
And that's it for this episode of Last Week in AI. Thank you to all of you who listened, whether at 1x speed or sped up.

(01:31:38):
And thank you to all of you who tune in week to week, share the podcast, leave reviews, and so on. Please keep tuning in.