Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Andrey (00:06):
Hello and welcome to a new episode of Last Week in AI,
where we will summarize and discuss some of last week's most
interesting AI news. And as always, I want to mention we also have a
text newsletter, lastweekin.ai, where we have even more news
for you to read about. And I am one of your co-hosts, Andrey Kurenkov.
I graduated from Stanford, where I learned about AI
(00:29):
supposedly last year, and I now work at a generative AI
startup.
And back again, we have our regular co-host.
Welcome back Jeremie.
Jeremie (00:39):
Yeah. Hey, man, it's good to be here. It's good to be back.
I like how you add an "allegedly" to your Stanford bio.
Stanford being, you know, famous for.
Yeah, like like bullshit AI research, as we all know.
Yeah. It's it's good to be back, man.
I just emerged from the the YouTube comment section, and,
with a couple of bumps and bruises.
But but, yeah. No, it's it's good to be back.
Andrey (01:01):
Yeah. The comment section of the Joe Rogan podcast, where
you were a guest.
And, yeah, hopefully we got some converts and they'll
like our podcast. We'll see.
But, I'm sure, if nothing else, you've educated a lot of people about
AI policy and safety on there.
Jeremie (01:19):
Dude. The like the comments are amazing.
Like Joe warned us before we went out, he's like, this is going to
be like nothing I've ever experienced in the comments.
You know, the comments especially, it was really funny.
Like, I mean, obviously it's a very polarizing issue.
You talk about AI, you talk about AI safety, capabilities, AGI,
any time. But yeah, there's like a whole section about
just like people are like, I don't trust these guys just based on
(01:41):
the spelling of their names, which I get.
I get that right. We have very, very suspicious name spellings.
Totally, totally valid.
There's cool stuff there about like, yeah, we're like, what is it?
We're spooks. We're spooks for the USG, we're spooks for some other
government. We are secretly trying to, like, help the labs,
like, accelerate. It's actually a lot like what we saw with the
(02:04):
action plan release, but just more extreme.
I think we had, like, we found four different clusters of
conspiracy theories, one of which was like, we're trying to help
OpenAI achieve regulatory capture.
Another was we're trying to have the US government stop all the AI
research. Help me fill out the other ones.
We're trying to help China. And then the last one was we're actually,
like, secretly, because our thing is not strong enough,
(02:25):
like our recommendations aren't strong enough, we're trying to, like,
unshackle. Because we're not saying stop AI research,
that must mean that we're saying, like, barrel ahead, build, you
know, anyway, it's a whole thing. So I love that, it's always a
pleasure to be to be, the center of some kind of conspiracy theory
or other. So, always fun, but glad to be back.
And and excited for for the show.
(02:47):
We got a lot to cover.
Andrey (02:48):
We got a lot to cover.
And before that, as usual, I do want to just take a moment to call
out some nice listener comments.
We got a couple podcast reviews on Apple Podcasts.
One said that the podcast is detailed but approachable,
with the right balance of getting into detail without getting
(03:09):
into too much technical stuff, which we do try to do, though sometimes
we do get very nerdy. And another one said it's a good overview of
AI hype. And that was very
nice. They did mention a bit of feedback where
it would be nice if sometimes we talked about critics like Gary
Marcus or other people out there, and I think that's good
(03:31):
feedback; we should kind of call out some of
these things. And lastly, we also got some comments
on YouTube. Mister Billy the Fisherman
had some good information about how in the EU
each country has its own laws
(03:51):
and can veto EU laws, and it's kind of similar
to the US in that there are federal and state laws.
So yeah, thank you for that comment.
And FYI, I am now actually recording the video and
editing video of the podcast.
So if you do like watching that stuff on YouTube or seeing our faces,
(04:12):
you can go over there and like and subscribe and so on.
And in the future I might add graphics, I might add a bit of polish.
For now it's just our faces talking instead of just
the audio.
Jeremie (04:25):
Yeah, that's why your hair looks so good.
We've been working on it for the last couple of hours.
Andrey (04:29):
Oh yeah.
All right. Well, that said, let's get into the news starting
as usual with the tools and apps section.
And we have one main story to start with, which is that Google's
AI search errors have caused a furor
online. So after a very big Google I/O
(04:51):
event, where we covered a lot of news, they did roll out a new
feature for search called AI Overview, which essentially,
in response to a query, shows a generative AI kind of summary
of some sources at the very top of the search results before you get
to any links. And as soon as it came out, the internet did its usual
(05:12):
thing of testing it and trying to break it.
And people very much found that it is a bit
broken in many ways, and there are many examples of this.
On the funnier side, I think it led to some recipes
where you could put glue on a pizza, and
it told you to eat rocks. If you asked it,
(05:33):
How many rocks should I eat per day?
It said, well, I don't know, one, two, three or something like that.
And then on a more serious side, if you asked how
many, Muslim presidents did the US have, it
claimed that Barack Obama was Muslim, which was a big
sort of, conspiracy theory back in the day.
(05:54):
And, yeah, that's, I think, a serious example of, you know,
if anyone's like, how many rocks should I eat?
Well, that's not gonna cause real misinformation.
But if you're saying that Barack Obama was Muslim, that's pretty
serious. And that's pretty problematic.
And Google's response to all this has been very defensive.
It has basically said all of this happens for a minority of queries.
(06:18):
It isn't a big deal.
They have kept it, and apparently they have been manually
removing bad results instead of, like, trying to fix this feature.
So a real bad blunder on their part again,
as happened a few months ago with the image generator of faces,
and I think it showcases that Google, unlike with some
(06:42):
of their previous efforts like Waymo or things like that,
is moving super fast and not very carefully into adding
all of these AI features.
Jeremie (06:51):
Yeah, like one way to read this is as a response to pressure.
You know, everybody's talked about AI right.
Exerting pressure on Google. Famously, I think Satya
Nadella, CEO of Microsoft, obviously with that close tie to OpenAI,
said, I want to make Google dance, right.
Mission accomplished. All these launches, the Bard stuff, the, you
know, image generation stuff like the Gemini launch and now this.
(07:12):
It's, you know, it does read like unforced errors.
But of course, the challenge is when you want to move at this kind
of velocity and ship faster, something's got to give.
You're going to have either more false positive errors or more false
negative errors. And you know, they're choosing to go in
this direction, which may ultimately actually make sense.
Faster shipping. That certainly is the philosophy at OpenAI.
(07:32):
We've seen gaffes at OpenAI. But the difference there is when you
look at OpenAI, you tend to think of them as a younger company.
You know, they've been around for a little bit, but they
really are for the first time shipping some real features, or have
been in the last few years. So people look at it a little more
forgivingly. You know, they look at what OpenAI is doing with,
you know, rose-tinted glasses.
It's also the case that OpenAI has often been the first to launch a
(07:54):
completely new category of feature.
Right. So you have Google here launching something that we've kind
of seen before. It's not the first time, right?
Perplexity is experimenting with similar things as well.
No doubt that's putting additional pressure on Google,
on top of OpenAI. But they don't get credit for being first here.
So that's a big ding. So the expectation then is okay, if you're not
going to be first, you're going to be best or at least going to do
(08:15):
it without gaffes. I think that kind of raises the bar a little bit
on the expectations, especially because Google's a bigger company.
So maybe a, you know, bigger downside risk for them, but they have
no choice. I think they really have to ship faster.
They've got to be able to collect data on failures like this and
iterate their processes, which they certainly will be doing.
One thing I thought was really funny at the bottom of this New York
Times article, when they talk about, you know, all this litany of
(08:38):
kind of failures, embarrassing things that are put out by the AI
overview, they write, a correction was made to the article.
They say an earlier version of this article referred incorrectly to
a Google result from the company's new AI tool, AI Overview. A social
media commenter claimed that a result for a search on depression
suggested jumping off the Golden Gate Bridge as a remedy.
That result was faked.
(09:00):
So, ironically, The New York Times, in reporting on the fact that
Google screwed up, screwed up their facts, which I thought was kind
of funny. But anyway.
Andrey (09:08):
Sort of. Yeah. And that's, that's a good example of how this kind of
became a meme on its own, where it was like a meme template.
It was really funny. Some of these things, like when you Google
cheese not sticking to pizza, it says you can also add about
an eighth of a cup of nontoxic glue to the sauce to give it
more tackiness. And I don't know, maybe that's true.
(09:29):
Maybe there is glue for food.
Jeremie (09:31):
Andrey, I was just on vacation in Italy.
I can tell you this is absolutely something that they do there.
So. Yeah, look into it. That's that's all I'm gonna say.
Andrey (09:40):
That's right. But at the same time, I think you're absolutely right
that this is in response, like the narrative last year
was Google is behind.
Google has sort of lost its mojo.
It's used to be an AI leader.
Now they're really lagging.
And to be fair to them, I think they sort of fixed
that aspect of it. Coming out of Bard, with Gemini they are now a
(10:03):
very real player in these frontier model categories; Gemini usually
ranks similar to Claude and ChatGPT in general.
And they did get some things right, like Gemini, for instance,
came out and was generally favorably received, after
Bard was generally seen as kind of lame for a while.
(10:24):
So it makes sense that they made these mistakes.
Certainly it would have been more ideal that they moved more
carefully and avoided these kind of blunders.
But, as you say, I think on balance this is
better than continuing to be behind.
And especially from an investor perspective, you can see how probably
investors are still kind of happy that Google is catching
(10:48):
up, or at least moving fast, rather than being super careful
and, avoiding releasing until they're super ready.
And that's the only major story for this section, so the rest is
the lightning round. First we have Telegram gets an
in-app Copilot bot.
So this is from Microsoft, and they have added their Copilot bot
(11:10):
to yet another platform, this time Telegram.
And this, like similar integrations in WhatsApp, allows users to search,
ask questions, and converse with the AI chat
bot. It's currently in beta and free for Telegram users
in both mobile and desktop platforms, and I think this
is a real indicator of a trend that we have seen with how AI
(11:34):
is integrating into these products, where it's essentially added
everywhere. You have Meta: we saw this added to Instagram
search, to I think WhatsApp as well, and Facebook.
I think the idea a lot of these companies seem to have is that AI
should be everywhere, in every app, etc., where you can talk to
a chat bot about whatever you want.
(11:56):
And we'll see if that's really the case, that people want
that, or whether they just want one dedicated app for talking
to ChatGPT or something like that.
But, certainly that seems to be a play currently.
Jeremie (12:09):
Yeah, it definitely seems like a worthwhile experiment.
It's also interesting when you look at all these different
platforms, the the instances of, you know, ChatGPT or
Gemini or whatever the model is that's running in the back end are
all going to be different, of course, like whether you're in Telegram or
WhatsApp or, you know. I could imagine at some point it starts to
make a little bit of sense to maybe try to integrate those and have one
(12:29):
kind of consistent, persistent, chat bot experience
so that you're not kind of refreshing the context in each case.
Then again, in some Telegram channels, some WhatsApp channels,
you want the fresh context just for that.
But it kind of makes me wonder what the future of integration looks
like. You know, do you go horizontal or does it stay vertical in
this way? Obviously big security implications to going horizontal
(12:51):
too. But yeah, I mean, we'll see.
It's also being rolled out in beta right now.
So just sort of in testing mode.
But it's free on mobile and desktop for Telegram users.
I'm not sure if you might have mentioned that already, but
that was sort of interesting as a broad release, Microsoft
getting more distribution. That's what they do really well,
right? Getting these chat bots in people's pockets.
Andrey (13:11):
And next up, another kind of similar story.
And that's that Opera, the browser, is adding Google's Gemini
to its browser. Opera has already had this
Aria assistant that I believe already had some chat
bots. So this partnership now adds
the chat bot from Google, and apparently
(13:35):
it's part of where they're heading,
so you can kind of choose this chat bot over other ones.
And there is also a feature where Aria is allowed
to add new experimental features as part of its AI Feature
Drop program, which apparently they have.
(13:55):
So yeah, it's another example of, you know, AI everywhere,
chat bots everywhere. And personally, to talk a little
bit more about it, I think this is probably the right approach
in some ways, where I have found personally that
I've gone to ChatGPT so much in so many cases,
(14:16):
that it makes sense to me to basically have a chat bot
accessible from anywhere, one of these, you know, large language
model chat bots, where it's built in and you
can ask for it in whatever your preferred platform is.
But I think the initial responses from a lot of people on Instagram
and WhatsApp have been like, why is this here?
(14:37):
So again, we'll see if most people actually want it.
Jeremie (14:41):
Yeah. And I think you're right. It also has a lot of strategic
implications. Right. Because now to the extent that people want this
stuff baked into the apps they're already using, rather than going
to ChatGPT or, you know, OpenAI directly, for
OpenAI it's not that they get cut out of the process, but there's an
extra layer between them and the user, which means that data gets
shared with other people, which means, you know, there's leverage
(15:02):
that third parties like Microsoft are going to have over the
distribution and access to these systems, over and above the fact
that they themselves can make their own versions of the systems,
too, which, as you said, is what we're seeing, with Google and what
we're seeing here, actually, so I think that's kind of an
interesting strategic challenge, right? The maintaining that
monopoly on the flow of data, especially in a context where, you
know, we're not quite data bottlenecked yet, but we may be
(15:25):
headed in that direction in the coming years, for various reasons.
So that's a really important strategic consideration.
But yeah, apparently, unlike the one
we just talked about, unlike the Telegram
Copilot one, the Opera one, the developer version, actually comes with
an image generation feature, which is powered by Imagen 2, which
is Google's image generation model.
(15:46):
So kind of more of a multi-modal experience they're going for here,
which it'll be interesting to see. Like, I could see that being a
thing, you know, when you're when you're chatting with somebody and,
you know, you just want to kind of spontaneously generate an image
in that context. Might be a nice, user experience.
Andrey (16:01):
Yeah. I mean, I think a bigger deal would be once these models are
better at generating memes, I think that will really be a delight
to the internet. But I guess this is also cool.
And next, Amazon plans to give Alexa an AI overhaul
and a monthly subscription price.
So as we've kind of been waiting for, apparently
(16:23):
they plan to overhaul Alexa with generative artificial intelligence
and, as a result, introduce a monthly subscription price
for it. This is expected to launch later
this year. And this subscription cost will be on
top of Amazon Prime.
So, you know, it's going to be something you have to really opt into.
(16:47):
So yeah, I think probably the right move.
Probably Apple is doing the same thing with Siri.
Certainly Google has already done it with their AI assistant.
So something that, makes a lot of sense
and I'm sure will make Alexa better.
Jeremie (17:05):
Yeah. Amazon obviously playing a lot of catch up when it comes to
generative AI. They're certainly a giant, and they particularly have an internal
AGI team. We talked about this, I think, a month or two months ago,
that it was sort of a surprise to me when I learned this, to be
honest, but it's obviously a newly spun up team, so it kind of makes
some sense. Yeah. They're trying to make up for lost time, and this
seems like a good way. I mean, they already have this brand in
(17:26):
Alexa. They have distribution in Alexa.
I think one of the things that Microsoft is showing is the
distribution really, really matters when it comes to generative AI.
So, you know, maybe this helps them close the gap.
They are actually surprisingly going to be challenged on hardware.
Amazon will be, even though they have dominated for so long on the
hardware game, being really the first, like,
(17:47):
servers as a service company, you know, infrastructure as a service,
all that stuff. They kind of pioneered a lot of the
early work there. But now there's more of a challenge of getting
allocation from Nvidia, though that's starting to turn around.
But, definitely a sense that Amazon needs to make up for lost time
here. And this is one way that they could make some real progress
there. The fact that it's not bundled into the $139
(18:09):
per year Prime offering.
Right by default. Like you said, it's kind of interesting.
Right. And it does track, if you look at the pricing for ChatGPT or
all of these other services, you're looking at like 20 bucks a
month. That's, you know, a lot of that is going to be profit,
obviously. But, you know, it's more than the
annual price of Prime in this context.
(18:30):
So, you know, maybe, maybe that's just like at the at this point,
inference is just too expensive to tack it right on.
And I think we'll learn a lot about the unit economics of this sort
of thing as we start to see the bundling with services like Prime,
which may eventually happen, but we just we don't know yet, and
Amazon doesn't know yet how that's going to shake out.
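A quick back-of-the-envelope comparison of the subscription figures quoted in this discussion (roughly $20 per month for chat bot subscriptions versus $139 per year for Prime); the numbers are the ones mentioned above and the calculation is purely illustrative:

```python
# Rough comparison of the subscription figures quoted in the discussion.
# These are the numbers mentioned in the episode, used only for illustration.
chatbot_monthly_usd = 20   # typical ChatGPT-style subscription quoted above
prime_annual_usd = 139     # Amazon Prime annual price quoted above

chatbot_annual_usd = chatbot_monthly_usd * 12
print(f"Chat bot subscription per year: ${chatbot_annual_usd}")   # $240
print(f"Amazon Prime per year:          ${prime_annual_usd}")     # $139
print(f"Ratio: {chatbot_annual_usd / prime_annual_usd:.2f}x")     # ~1.73x
```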
Andrey (18:46):
Yeah. And it's interesting to me that there are examples of
both approaches. For instance, ChatGPT is kind of a hybrid approach
where you have a free version of a less powerful model and
you can pay for the best model. And that's generally true across
other chat bot providers.
Twitter, or X, has had a different approach, where you
(19:09):
do pay for a monthly subscription, and as part of that you get access
to grok. So a lot of people experimenting with different approaches,
and we'll see which one wins out next.
Going back to browsers, we got Microsoft Edge will translate
and dub YouTube videos as you're watching them.
So that's pretty much the story.
(19:29):
The feature has support for translation of Spanish
to English and of English into German, Hindi, Italian, Russian and
Spanish. And this will also be available
for videos on news sites like Reuters, CNBC and
Bloomberg. And I think this is one of the less discussed big deals
of AI: that the language barrier, kind of across everything,
(19:54):
will to some extent go away.
And the big deal will especially be real time translation.
For instance, video chats or even in person.
It will be very interesting to see if people still need to even learn
languages when they, you know, want to move countries.
Or we are moving into the sci fi future where we have universal
(20:14):
translators.
Jeremie (20:15):
Yeah, and I think you're right to latch onto that.
And one of the key ingredients really enabling that is the
low latency, right. All the progress we've made on inference, one of
the biggest challenges has been, sure, you can train these very
large models, to do translation and do it really well,
but if they can't translate in real time, if they can't translate
with, you know, a couple hundred milliseconds of latency, let's say,
(20:38):
then it starts to feel weird, because I say something and, like, Andrey
has to wait and then like, listen to it, you know?
And then we're having this, like, kind of stuttered conversation.
And anyway, so I think that's one of the big things that's really
made this possible. There's so many things.
I think people are still undervaluing the importance of rapid
inference and what it's going to do to, among other things, agents,
(21:00):
but also translation and other downstream applications.
So, definitely one to watch as the unit economics,
or the economics, I should say, for inference
models get more and more favorable.
Andrey (21:12):
And the last story for this section:
Iyo thinks its gen AI earbuds can succeed
where Humane and Rabbit stumbled.
So this is another example of the emerging sector of sort
of wearable AI tech.
And as per the headline here, we are focusing on Bluetooth
earbuds. This company has been around before and already
(21:36):
I think released some products, but they have this new Iyo
One product that will be launching this winter and
will integrate, presumably, the latest chat bots.
And apparently the company was actually formed inside Google
and incubated in their X moonshot factory,
(21:57):
but is now operating independently.
This product will cost $600 for the WiFi
model and 700 for the cellular version,
but will not require a monthly subscription fee.
So yeah, another try at this, and
it seems like at least I think they will have a decent shot relative
(22:19):
to other ones. It's not a new company.
They've already released products, and the form factor of earbuds
has already had this idea of integrating assistants with Siri
and so on. So probably the best bet for this
kind of thing working. But again, we'll have to see if it's going to be
another thing that, you know, on YouTube will be reviewed as the worst
(22:45):
product that people have ever seen.
Jeremie (22:49):
Yeah, we're seeing a lot of those these days, it seems.
But yeah. No, I mean, and this has an interesting form factor, for
sure. You know, they are headphones, but they're pretty big
because of this pretty big battery.
Apparently it has up to 16 hours of charge,
and that's with a phone in Bluetooth mode.
So that's quite something.
But one of the big selling points they're claiming here is, look,
(23:10):
this thing has intrinsic value no matter what.
It can just function as a good pair of headphones.
You've got that core kind of that cornerstone nugget of value.
And then we build on top of that. Whereas, you know, you compare it
to like the rabbit R1. And I'm just going to harp on this for a
second, because while I was on my break, last couple of weeks,
I did watch a video on YouTube by this channel Coffeezilla.
(23:35):
You know, they go into the Rabbit R1 and, apparently,
Rabbit's founder and the kind of history of alleged fraud
that he's engaged in, which goes back to an alleged crypto scam
that he ran a couple of years before.
So there's, like, all kinds of shady stuff going on in this space.
I think when we first covered this, we were sort of like, you know,
curious about this. I think they called it the large action
(23:58):
model, if I'm getting that right. It was a LAM.
Right. And we were like, what is this like?
Because there's not a lot of information about it.
It wasn't clear how the architecture worked, and we had a lot of
open questions, but we're excited about it because, you know, surely
nobody would throw like $30 million at a nothing burger.
But hey, it's 2024, man.
There's a lot of this stuff going on. So yeah, you're here.
We have a different play. You know, Iyo coming out with this thing
(24:21):
that has intrinsic value at the very least.
In that sense, you know, maybe a bit of a safer
bet that they can build on top of.
We'll just have to see how it how it pans out.
But the price point seems to make some amount of sense for right
now. And, hopefully the AI features are to follow.
Andrey (24:39):
Yeah, that's a good point. I mean, people already spend hundreds of
dollars on the earbuds and yeah, as you said, they're
actually going to have value even if the AI isn't the best.
So this is something I could see myself investing in, unlike
the Pin and other stuff.
And on to the applications and business section.
(24:59):
And the first story is PwC agrees to a deal
to become OpenAI's first reseller and largest enterprise
user. So OpenAI has courted various
companies with this enterprise offering idea, and
PwC, which I think is a large sort of,
(25:20):
actually, do you know what they do?
I kind of forget.
Jeremie (25:23):
Yeah, yeah. So PwC is one of the so-called Big Four consulting
companies. So when you think about, I guess there's, like, what, Deloitte,
and I think KPMG is one of them.
And I forget... PwC stands for PricewaterhouseCoopers.
It's the old name of the thing, but it's consulting.
It's, you know, whatever problem you got, they'll throw some
bodies at it.
Andrey (25:43):
Right. That's what I thought. But I didn't want to get it wrong.
So this deal will mean that its
100,000 employees will now have
enterprise licenses, and apparently
they will be a reseller as well, to provide it to, I
guess, presumably, clients of the company.
(26:05):
So, really, you know, this seems like maybe a very
businessy, whatever, story, but I think this is a big deal.
Like enterprise is where the money is at typically.
And this deal, we don't know how much the dollar amount
was, but I'm guessing pretty large given that it's going
to 100,000 employees now.
(26:27):
So another kind of boon for OpenAI.
Again, it seems in recent months they've been really pushing for this,
with Sam Altman meeting with people and offering this.
So it will be interesting to see if they continue to
make progress on the enterprise front.
Jeremie (26:46):
Yeah. And this is a new business model for OpenAI as well.
They're kind of I was gonna say dipping their toes into the water.
This apparently amounts to like 100,000 of these licenses,
the enterprise licenses, to PwC employees.
So that's, you know, that's something. But it is the first time
OpenAI has, like, done this resale model.
So, you know, again, like you said, we don't know what the margins
are. We don't know what the costs are going to be like.
(27:07):
But certainly if you're OpenAI right now, you got to be looking
around you. You got to be looking at Microsoft saying, you know
what? These guys, if they don't own us, they're definitely...
There was an article, I think, out in maybe Forbes or Fortune
recently talking about how OpenAI and Microsoft are increasingly
becoming frenemies.
Just because, you know, OpenAI needs distribution, they need
(27:27):
hardware. They need a lot of these things that only Microsoft so far
has been able to offer at scale.
So they're now looking for other ways to meet those needs.
So they're more independent. Microsoft is internalizing a lot of
their AI model development efforts, which used to be what they'd
lean on OpenAI for. They also still have access to all of OpenAI's
IP, up to the point where they build whatever the board decides is
AGI. But still, there's a lot of this kind of dynamic going on.
(27:51):
You can kind of see this through that lens of OpenAI saying, you
know what? We need our own way of achieving distribution, right?
We talked about that in the context of the whole Telegram thing, Microsoft
stuffing chat bots into Telegram and other things. They've got great
distribution. It's what they do so, so well. It's the reason that
Microsoft Teams beat out Slack, right?
There's a long history of Microsoft using its outrageous scale to
(28:11):
just beat out products, in some cases, superior products.
And so for OpenAI, you know, they've gotta find a way to turn
a profit. They've got to find a way to make up for those insane
costs of both model training and inference and partnerships like
this, you know, looks like a pretty good headline as well, right?
If you think about it, their job is to sell the hours of
their teams' labor, their consultants' labor, to companies.
(28:33):
And if they have this brand as, like, this, you know, AI-
forward consulting company, the first of the Big Four to sign any
kind of deal like this with OpenAI,
all of a sudden, if you have any problem, you're like, oh, well,
which one do I choose? Maybe you go with PwC for that reason.
So, a really big marketing win here for PwC and
potentially a good distribution win for OpenAI.
That's kind of at least my my top line thoughts looking at how this
(28:55):
is shaping up.
Andrey (28:56):
Yeah, this also makes a lot of sense to me.
I think, you know, programmers are the number one category of
people who have really adopted this into their workflows already.
But I could see consultants being the number two sort of category
where, as I understand it,
consultants write a lot of emails and presentations and documents.
Jeremie (29:16):
Oh, they'd never use a chat bot for that stuff.
Andrey (29:20):
Yeah, maybe. But certainly I think there's
going to be a lot of benefits, so it makes sense that PwC went this way.
And, yeah, it's an interesting point regarding Microsoft, because
Microsoft has kind of built their business partially around providing
software for businesses, and they have all of these tiers where, you
know, if you want to buy their OS, you can get the business
(29:43):
version. Copilot, I believe, also already has a business
version, but Copilot is powered by GPT-4.
So, Microsoft hasn't managed to build a GPT-
4 scale model.
They have Phi, which is a smaller one.
We do know they're working on a GPT-4 scale model, so yeah,
(30:03):
frenemies for now. We'll have to see where it goes.
And next, another OpenAI story.
It seems like a lot of these, business stories
usually start with open AI.
So this one is again adding to a trend that we've seen
and have been talking about for quite a while.
The story is that Vox Media and The Atlantic sign content
(30:25):
deals with OpenAI.
So OpenAI has kind of done the thing they've already been doing
with various outlets. We've covered how this happened with the Financial
Times and, I think, a couple of other ones.
So here again, they're going to license training data
from these companies, their
(30:46):
news stories, etc.
And Vox Media is planning to use OpenAI technology
to enhance its
affiliate commerce products and expand its ad data
platform.
Apparently The Atlantic is developing this microsite, called Atlantic
Labs, to experiment with AI tools to serve
(31:09):
its journalism and its readers.
And one last note: in
terms of the dollar amounts, it doesn't seem like
we know the exact numbers here. We know News Corp's
was 250 million, and the Financial Times' was 5 to 10
million. So I'm guessing this was in the millions kind of category.
(31:33):
But yeah, once again, this is happening.
OpenAI, we know, has done this multiple times before.
So part of a race to get copyrighted data
to use for, let's say, legitimate training, instead
of having these companies sue them, like The New York Times has done
with OpenAI.
Jeremie (31:50):
Yeah. To your point, like they're saying, these deals also appear to
provide OpenAI with protection against copyright lawsuits.
Right? At all important. All important provision.
And it's, you know, a long series of these things that we've seen,
News Corp, which owns the Wall Street Journal and New York Post, Daily
Telegraph; Axel Springer, that's Business Insider, Politico; Dotdash
Meredith, which I'd never heard of.
(32:11):
They own People magazine, Better Homes and Gardens, Investopedia, a
bunch of others. So, you know, you got tons of these places.
Associated Press, Financial Times, you know, when you start to think
about the set of corporate media that now have
deals with OpenAI, right?
You start to think as well about like, okay, what does that do to
affect coverage? This is a really interesting question.
We're now starting to move into that space where potentially an
(32:34):
increasing fraction of the revenues that these outlets are
generating is going to have to come through OpenAI, whether through
referrals, through ChatGPT, as it cites those outlets in its
responses. This is something we've seen with some of these deals.
Or just like the, you know, the way that the outlets are potentially
portrayed in ChatGPT outputs, there are like a million different ways
(32:55):
in which influence could be, bought and sold, let's say, in that
context. So this is a really interesting question.
As we migrate from a world where social media platforms are the ones
that had all this outsized leverage, really, on, publications
where I can imagine Facebook turning to, like, I don't know, The New York
Times and sort of having through their algorithm a huge impact on
what succeeds and what fails.
(33:17):
Now OpenAI starts to become that platform or Microsoft or whoever
has these deals. And, you know, the more you see these things
accumulate, the more you have to ask yourself, what does the media
environment look like? What is the likelihood, what is our guarantee
of getting unbiased coverage?
You know, especially in a context where we've seen OpenAI be, you
know, allegedly at least, fairly keen to shut down criticism in
(33:37):
various ways. So, yeah, we'll we'll just have to see what happens
here. But I think this is a really important story that may not
actually be told in the way that it ought to going forward,
precisely because of the economics behind the story.
So one to track for sure.
Andrey (33:52):
Yeah. It's, I guess not going to be surprising if all these
news companies are going to start portraying AI in a more
positive light. And it's interesting to note, OpenAI
is kind of trying to have its cake and eat it, too, in the sense that
they have been making this fair use argument,
we can train on your data even if it's copyrighted, but also making
(34:15):
all of these deals. So Reddit and all these news companies to
license their data, which also makes sense in
a sense that, you know, before,
I think no one really was aware that this would happen, that their
data would be scraped by all of these companies for training.
Right? It just wasn't a consideration.
Jeremie (34:36):
Or they were and they just didn't care because there wasn't enough
money to be made. Right? Like now now the chat bots work.
Andrey (34:42):
Yeah. And now all of these platforms are building in protections so
that these companies can't do it. And it's true that, you know,
these companies need up-to-date information.
They need search. They need to continue gathering data,
pushing the frontier of how up to date these models' information is.
So I think, regardless of the legalities of this, really, these
(35:06):
companies have no option
but to do this, to access the updated information and get
additional training data over the coming years.
Yeah. So very interesting and very
kind of important trend to be aware of from an economic
and business perspective, because this is the world that it's shaping up
(35:28):
to be, and the kind of relationships that
are happening didn't exist in years prior.
Jeremie (35:35):
And I do want to flag, by the way, when we talk about all these
outlets and the incentives, like, I'm not saying I have
seen, like, The Atlantic and News Corp, you know, outlets
and stuff like that, behave a certain way.
Like I think that, you know, we're going to see potentially that
evolve. I think this is more, a flag to plant.
This may or may not come to pass, but the incentives may be taking
(35:56):
shape in that direction. Just want to be clear.
I'm not accusing right now the Wall Street Journal or the New York
Post, specifically, of doing anything bad in this direction.
Andrey (36:04):
Yeah, that's definitely true. And it is well worth also flagging
that, you know, these are business decisions.
It seems like journalists, the actual employees may have more
mixed feelings about all of this.
And we've certainly seen a lot of criticism levied at AI by
such publications. So, yeah.
(36:24):
And yet another example of all of this happening.
And on to the Lightning Round, just one more story about OpenAI.
I guess we're on a streak here, and this one is about how
OpenAI is launching programs to make ChatGPT cheaper
for schools and nonprofits.
So that's pretty much the story.
(36:45):
Sorry, they're going to offer it more cheaply to these categories.
And to me, I think it's interesting
kind of as part of a trend where AI
and in particular chat bots is shaping up to be a landscape
where it's a bit of a commodity in a sense, where now you
(37:06):
have multiple options, almost like cloud
computing, where a few companies that are big are able to train
frontier models that are almost equivalent in terms of performance.
So it feels to me like these moves
are in part to sort of lock in usage to
get people to continue thinking of ChatGPT as sort of a default,
(37:31):
as you do with browsers or cloud providers.
And, you know, in a way, Uber and Lyft are another example
where these are basically equivalent products.
And to keep leading, you know, you
probably can't compete on price too much, and you can't
compete on quality too much.
(37:51):
So what you're going to compete on is a lot on the brand side, and what
people are just used to using.
Jeremie (37:57):
Yeah, yeah. Well, this particular piece is that nonprofit push,
apparently. You know, OpenAI for Nonprofits is allowing nonprofit
workers to use ChatGPT Team at this discounted rate, 20 bucks
per month per user.
So that's, you know, attempt to sort of, I guess, yeah, build that
brand, as you say. And then they've got a tiered discount
(38:18):
system for larger, organizations.
But and then in this post, they share a bunch of stories about, you
know, specific cases where people have been able to use their tools
for, as they put it, a tangible social impact.
They highlight folks who are using it to access international
funding. A lot of this around automating the process of putting in,
(38:38):
applications, grant applications for different grants that
would help these companies, these nonprofits, thrive.
So, you know, there's certainly very interesting use cases.
I will say, you know, it's interesting that there has been so
much buzz about OpenAI, so many news stories coming out,
that just happened to have come after, I don't know, a
(38:59):
week where it seems like there's been a lot of fairly bad news about
OpenAI, some news that some might say cast doubt on,
perhaps allegedly, the integrity of the management
of Sam Altman in particular.
There's been a whole bunch of whistleblower stuff that has come out.
And now we're getting blasted by the whole "Introducing OpenAI for
(39:20):
Nonprofits." So I think this is pretty interesting.
You know, if you're putting your media hat on, your marketing hat on
your your, let's say brand damage control hat on, this might be the
sort of thing that you might expect to happen after a week of
hell, like the one that we've just seen.
You know, not to say these aren't wonderful things.
Hey, OpenAI for Nonprofits. Kudos.
There's a lot of great stuff OpenAI has done on the security side
(39:42):
flagging some, influence operations, information operations that
they've just caught people trying to pull off using their tools.
That's all good stuff, credit where credit is due.
But, you know, you gotta notice the timing
here. It is noteworthy, at the very least.
So, anyway, it kind of makes me wonder about the timing of this
particular announcement, too.
Andrey (40:00):
Yeah. You gotta wonder, I agree. Yeah.
And this next one is another topic we like to talk about, which
is, computing and in particular
the frontier of computing that will enable AI to continue scaling.
And this one is about China, a topic we like
to cover. So there's a new Huawei patent that reveals
(40:23):
three nanometer class process technology plans.
And once again, this is in spite of US
sanctions that really do limit their ability to
build this sort of technology.
So I think, Jeremy, you're usually the expert on this, so I'm gonna
let you take over.
Jeremie (40:45):
Yeah. No. Of course. So, yeah.
Big picture. You know, we've talked about this a fair bit.
This idea of, like, the the different nodes, right, in the
semiconductor fabrication process.
So, you know, the three nanometer node is, is currently the sort of
leading node. It's the sort of most advanced process we have for
developing really, really, really, really tiny semiconductors.
Currently three nanometers is basically all used for the iPhone.
(41:07):
It's what you tend to see. So as the semiconductor fabs
like TSMC in Taiwan learn how to make the next generation
of node, these even smaller feature sizes, that generation of node
tends to be used for smartphones, because you need the smallest, most
compact processing.
So what's happening now is the node above that node.
So the next, the next best node, the four and five nanometer nodes
(41:30):
are being used for AI processors like the Nvidia H100, the B100,
those sorts of things. So historically, what's happened is
the US has been trying to prevent China from accessing the three
nanometer, the five nanometer processes.
China made a breakthrough at the seven nanometer process, which is the
scale that's used to make the Nvidia A100, the GPU that was used to
(41:50):
train GPT-4. So they can basically, in-house, roughly speaking,
make GPT-4 level models or, you know, the kinds of processors
that go into that. So one of the key things that has happened
is they've been cut off from accessing extreme ultraviolet
lithography tools. These are the tools that you really need to make
those five nanometer, those three nanometer nodes, China can no
(42:12):
longer access those. That's a US export control policy that kind of
kicked in fairly, fairly recently.
So now, you know, they can't access these, extreme
UV lithography tools. What do they do?
Well, instead of using extreme ultraviolet light to etch these
tiny features, what you can do is use a kind of weaker or less
effective tool called deep ultraviolet, a sort of a longer
(42:34):
wavelength of light, less powerful, less effective for this, but you
can pass over your chip many times to achieve functionally
the same level of resolution. That's called multi patterning.
So you use the same kind of crappy old deep ultraviolet lithography
tools, but just kind of pass over your wafer multiple
times, pass over your chip multiple times, to achieve a higher
(42:55):
resolution. But because you're passing the same thing through the
same system multiple times, it's slower, right?
It takes longer to complete a production run, which means that the
cost per chip is higher and your yields can be pretty bad
too. So the economics of this process tend to be worse.
What we're learning about here is that Huawei and SMIC
have had this patent out for not double patterning but
(43:19):
quadruple patterning lithography methods.
So taking the same chip through four times, which originally people
thought, oh, they're going to use this to get to the five nanometer
process, basically to get to the H100 level node.
Apparently, though, they have plans to use this technique to get all
the way down to three nanometers, which would be really interesting.
This is like, you know, strategically, something that could position
(43:41):
China to compete much more long term before they kind of are unable
to keep up with the extreme UV lithography stuff that we can benefit
from in the West. Always a question of like, okay, sure, but what
are the yields going to be? And chances are they're going to be
really crappy. But the state apparatus in China is willing to stuff
tons of money into this. They're willing to lose money on this big time, and
make it a strategic priority to keep up on AI.
(44:03):
So, you know, a lot of stuff going on here.
But the bottom line is Huawei and SMIC seem to have a plan, at
least; whether or not it'll work is another question.
Do they have a plan to get to a three nanometer node size with this
multi patterning technique that involves, well, quadruple
patterning, taking the same chip, the same wafer through literally
four times in the fab process?
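To make the economics point here concrete, here is a minimal back-of-the-envelope sketch in Python of why multi-patterning with DUV hurts throughput and yield; all the numbers (per-pass yield, baseline wafer throughput) are made-up illustrative assumptions, not figures from the story:

```python
# Back-of-envelope sketch: why multi-patterning with DUV is economically painful.
# All numbers below are illustrative assumptions, not real fab data.

def multi_patterning_economics(passes, wafers_per_hour_single_pass=100.0, per_pass_yield=0.95):
    """Return (effective throughput, compound yield) when a critical layer
    must be exposed `passes` times instead of once."""
    throughput = wafers_per_hour_single_pass / passes  # each extra pass reuses tool time
    compound_yield = per_pass_yield ** passes          # defect risk compounds per pass
    return throughput, compound_yield

for passes, label in [(1, "single exposure (EUV-like)"),
                      (2, "double patterning (DUV)"),
                      (4, "quadruple patterning (DUV)")]:
    tput, y = multi_patterning_economics(passes)
    print(f"{label:28s}: {tput:6.1f} wafers/hr at {y:.1%} yield "
          f"-> ~{tput * y:.1f} good wafers/hr")
```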
Andrey (44:24):
Wow. Yeah. Interesting.
Another example of me learning of things that I had no idea about
just while recording this podcast.
Next, Nvidia, powered by the AI boom, has reported
soaring revenue and profits.
So its AI chip sales have actually exceeded expectations, leading
to yet another rise in its stock and market capitalization,
(44:48):
which is now over 2 trillion.
So the revenue for the three months ending in April was 26
billion, exceeding the February estimate of 24
billion and tripling sales from the same period last year.
And, yeah, this is something we've seen
(45:09):
so much with Nvidia. They keep rising and keep
sort of leading the charge in terms of certainly valuation.
And we are at a point where their sort
of ratio of price to earnings
is like 260.
Usually with tech it's like 20, maybe, you know; we've
(45:33):
seen Google, Meta, etc., and they make a lot more revenue.
But Nvidia is now in a class of ultra expensive
tech stocks. So certainly people believe that Nvidia will continue
to be dominant.
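For listeners less familiar with the metric, here is a minimal sketch of how a price-to-earnings ratio is computed; the figures below are made-up round numbers purely for illustration, not Nvidia's actual financials:

```python
# Minimal illustration of a price-to-earnings (P/E) ratio.
# These figures are made-up round numbers, not Nvidia's actual financials.

market_cap = 2_000_000_000_000             # e.g. a ~$2 trillion market capitalization
trailing_annual_earnings = 10_000_000_000  # hypothetical $10B of trailing net income

pe_ratio = market_cap / trailing_annual_earnings
print(f"P/E ratio: {pe_ratio:.0f}")  # 200: investors pay ~$200 per $1 of current earnings

# A mature tech stock at a P/E of ~20 would need roughly 10x those earnings
# to justify the same market cap, which is why a very high P/E implies
# expectations of rapid earnings growth.
```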
Jeremie (45:47):
Yeah. And this is also on the back of you know, Nvidia has done a
lot of things structurally to radically accelerate its shipping
velocity, which I think, you know, is is a positive for them
overall. You know, they've moved from shipping a new generation of
GPU for AI every two years to every one year.
Right? So we go from the H100 to the B100 to the X100 and so on.
(46:08):
The roadmaps are also getting shorter.
They're looking to iterate faster.
This is all kind of playing into, you know, presumably, their
valuation and how people are looking at them and yeah, evaluating
them as an opportunity. It's interesting.
I mean, there's a lot of competition shaping up.
We've talked about it a lot on the podcast.
You know, you've got AMD, you've got all kinds of other other
companies that have more specialized chips and things like that.
(46:31):
But certainly, you know, as they point out there, they
have the best distribution. They also have, as we've talked about in
the podcast, the best allocation.
Right. One of the things Nvidia is really, really good at doing is
buying out crazy amounts of allocation from
all the factories that fabricate the semiconductor chips.
So Nvidia designs the GPU, and they ship their designs to, like,
(46:53):
TSMC, for example, for actual fabrication and packaging.
They buy up all the capacity they can.
So there's no room for competitors. And that's been one of the
things that's really differentiated Nvidia from other companies.
They're able to use their market dominance to preserve their market
dominance in that way and buy out all the packaging, all the fab
capacity they can. It's, it's aggressive, but they're willing to
(47:14):
lose money by having things just sitting in inventory and get
depreciated just because, you know, just because Jensen is super
aggressive and it's got to be said effective.
Andrey (47:24):
Oh, yeah. And I believe it's still the case that the market is
supply constrained. You know, people are still competing to get their
hands on the leading edge compute.
So this is not an example of a race to the bottom.
In fact, Nvidia has, as far as I understand, fantastic
margins on these products.
(47:45):
So yeah, certainly if you're an investor you're very happy right
now.
Jeremie (47:50):
Yeah. We're seeing their margins are starting to take a little bit
of a hit as they're changing their pricing
in response to what they're seeing from the competitive landscape.
But this is consistent with Nvidia, you know, looking at the
landscape and saying, look, we're going to make it up in volume.
We do want to starve out our competitors.
So, you know, that's, I think, a long-term bullish move.
And it's also what you expect, you know, as a company matures those
(48:13):
margins, they're going to drop in a competitive space like this.
Andrey (48:18):
And one last story: Elon Musk's xAI raises
6 billion in its latest funding round,
now at a pre-money valuation of 18 billion, meaning that
they are one of the highest valued companies in the space.
We covered this; it was already rumored a while ago.
And I believe at the time, Elon Musk, refuted the rumor,
(48:41):
saying we are not raising money.
Well, it seems that they are. And
certainly, this, if nothing else, makes them competitive.
They have the money to train frontier models, and they are in a league
with OpenAI, Meta, and Google
as far as companies able to build these models,
(49:02):
though, you know, Grok, as far as I understand, is still quite a bit
behind.
Jeremie (49:06):
Yeah. And, you know, Elon posted on X that
he's setting a pre-money valuation of 18 billion.
So they've raised an additional 6 billion,
so post-money they're sitting at 24 billion.
That compares similarly to companies like, you know, think about Anthropic
here, right? Like, that's maybe the orbit, a little
bit more than maybe Cohere, if I recall.
(49:27):
So, you know, these sort of like mid-sized companies, we've talked a
lot about whether it's possible for companies like this to actually
survive in the kind of market dynamics that currently exist,
especially around hardware. Right.
This is $6 billion is a lot of money.
But when we're looking at a world where compute training runs
will cost in the billions of dollars in the coming years,
(49:49):
This doesn't necessarily get you that far, and you're going to have
to show some ROI pretty soon.
It's interesting. You know, you look at the set of investors Elon is
raising from here, Prince Alwaleed bin Talal and
a bunch of others. Yeah. So he's already kind of reaching for sources of
money that are not necessarily just
(50:09):
your A-lister Silicon Valley
sort of VCs. There are certainly some impressive ones here,
right: we got Andreessen Horowitz, we got Sequoia.
But when you're already raising from Saudi entrepreneurs,
it's, you know, you're at that stage; you're raising 6 billion.
You have to cast a wide net.
It's an interesting question as to whether this is sustainable;
(50:29):
it's a strategically dicey situation. OpenAI
had to align itself with a hyperscaler like Microsoft to get access
to the computing resources they needed.
Anthropic has done sort of similar things with Google and Amazon.
Obviously, Google DeepMind is completely in-house now.
So, you know, there's this interesting question as to whether xAI's
position is stable long term.
They're definitely the largest market cap version of
(50:52):
this problem that I've heard of, that I believe exists.
So I'm especially interested from a kind of market dynamics
standpoint as to whether their position turned out to be tenable in
the next few years. That being said, Elon thinks AGI by 2027
is highly possible. So if that's the case, yeah, maybe this is
enough for now. Who knows.
Andrey (51:10):
And onto the projects and open source section.
And we start with another favorite topic
that we go to more and more, which is evaluation.
And the story is that Scale
AI has published its first large language model
leaderboards, which, as with other leaderboards,
(51:31):
rank AI model performance, in this case in specific domains.
So this is called the SEAL Leaderboards.
And they evaluate on several categories like
coding and instruction following.
This is similar in category to things like MLCommons'
benchmarks, Stanford's
(51:55):
Transparency Index, and other things
we've covered in the past. But they do
say, Scale AI, that there are some flaws in
these things. For instance, MLCommons has public benchmarks, so
companies can train their models specifically to do
better on them. And we've covered how leaderboards are gameable, and
(52:18):
probably companies are gaming them, because this is now
important for PR and you always want to say we have the best model.
Generally, the results they show are pretty consistent with the
picture we know: GPT-4o is
still leading, Llama 3 70B
is close behind that,
(52:41):
Mistral Large is similar, Claude
is similar, Gemini 1.5 Pro is similar.
So yeah, nothing too new, but I think a new entrant
in the space that a lot of people are excited about.
Jeremie (52:56):
There's, I think, a lot to like about what Scale is doing
here. First of all, very, very welcome, as you said, to have these
private leaderboards, right? They are harder to game.
They're getting human beings to actually run these evaluations.
So we're not looking at automated evals, which, you know, has been a
big trend that we've seen, especially using things like GPT four to
evaluate the performance of language models, which has a whole bunch
(53:18):
of interesting failure modes. Right, because GPT four doesn't always
evaluate or rate things the same way that a human would, you know,
it can succumb to adversarial attacks, it can go out of
distribution, do all kinds of things. So this is, I think, a really
expensive effort because you got to get all these, contractors
and outside folks; they say they're bringing in experts
to evaluate these things.
(53:39):
But it's worth it, because it at least gives us a really kind of
clear picture of how these models actually stack
up against each other. And that's been a real challenge
historically, you know, knowing which models are best for what,
you know, nice to see that that progress being made.
They do say, you know, so they let you dive into different
categories of evaluations and leaderboards that they have.
(54:01):
They have some for coding, some for language, some for math.
The coding one, they've got like, for example, a thousand prompts,
they say spanning a diverse array of programing languages and tasks.
And anyway, so the whole idea here is, in each case, they're using
Elo-scale rankings to kind of have two different models compete
against each other, and then they have the winner, you know, get
(54:23):
ranked, get scored by a human, or determined by a human.
So those Elo rankings, we've seen them pop up on all the
leaderboards. It's a very common, popular way to do this.
It really is the best, most robust way that we've seen evaluations
get done, just because it's a relative ranking.
You can't saturate that metric in the same way
(54:44):
that you can saturate others like MMLU or whatever.
So kind of interesting. And again, human evaluators, very expensive,
but very important.
And they've got a whole process that they lay out anyway for coming
up with these things. I thought the one thing to highlight from the
actual data they share is, you know, Claude does seem to do best.
Claude 3 Opus does seem to do best
(55:04):
in math, though GPT-4 tends to win across the board, in some cases
just by a hair. But I thought it was interesting because it does
objectively give us that indication that, you know, Claude 3
Opus, when it comes to mathematical reasoning, is indeed, it
seems, objectively the kind of superior model, at least, you know,
given the noise that they've got in their evaluations.
Andrey (55:25):
Yeah. And, another reason to be excited about this is,
just building on some previous efforts of scale where they published
a research paper showing that on one of the popular benchmarks,
in fact, people were gaming it.
If you created a new private variant of
a benchmark that wasn't already out there, then your models did not
(55:47):
generalize. Some generalized more than others, but things like Phi, for instance, and some of these smaller models that claimed to be very good at smaller scales, didn't quite perform as well as the benchmarks said. So it makes sense for them to continue pushing on this. Next, Cohere for AI launches Aya 23, with 8 and 35 billion
(56:11):
parameter versions, and the open weights are being released. This is building on the previous Aya, which brought together 3,000 collaborators from around the world to build, in particular, the largest multilingual
(56:31):
instruction fine-tuning dataset and a massively multilingual model. And what that means is that, for instance, Aya 101 covers 101 languages and is focused on breadth, whereas Aya 23 covers 23 languages but is more focused on
(56:54):
depth. So they are focused on really making this a very powerful,
multilingual, large language model.
And this is pretty important because they say that this expands state-of-the-art language modeling to nearly half of the world's population, and this is now available
for people to experiment, explore and build on for
(57:18):
research and safety auditing.
So yeah, exciting, new model release.
And we've seen a lot of smaller models being released in the 8 billion parameter category. But this is a little bit different, with a
focus on multilingual models.
You know, we still know that models like GPT don't necessarily
perform as well on foreign languages.
(57:40):
So pretty exciting.
Jeremie (57:42):
Yeah. This is clearly a big differentiator that Cohere is trying to go for. They have this — I think it's a foundation, I'm not sure if it's a nonprofit — tied to Cohere, that they call Cohere For AI, and it is focused, as you say, on this sort of multilingual strategy where, you know, they want the AI to perform really well in as many languages as they can. This latest one —
(58:03):
So we talked about Aya 101, I think, when it first came out. It's this, like, impressive attempt to cover a wide range, over 100 languages. This one, they're using a pre-trained model that's really highly performing, and they're coupling it to their Aya dataset collection, which apparently is a new dataset that they've released, to go in depth and achieve that really high performance
(58:26):
on the 23 languages that they've selected here.
It is apparently a really big multilingual collection — it has 513 million instances of prompts and completions for a whole bunch of different tasks. So we're looking here, presumably, at
instruction, fine tuning data, maybe dialog data as well.
But that that certainly seems to be a lot of like fine tuning
(58:47):
examples that they're feeding it.
And then they do share win rates too. So it compares favorably to at least the models they're comparing it to here — Mistral 7B Instruct, for instance — showing the win rate of Aya 23, at least the 8 billion version, winning about two thirds of the time, for example.
And it's pretty representative across the board. So, yeah, an
(59:07):
impressive new model from Cohere.
Definitely another entry in that catalog of multilingual models. You know, Meta has done a lot of stuff in this direction too. The field of open source models — we've said this a lot, but it is getting pretty crowded, right?
I mean, we are getting to the point where, I think, I don't know,
controversial hot take might be most of the time when companies drop
(59:28):
a new open source model, it's literally just for the headline.
I think that's kind of the world that we're getting to, especially when you look at these kind of late-following models — not necessarily this one, but the late-following models, you know, that don't set a new SOTA for their class.
So, this certainly seems to be important, at least for this one category. But the categories that we're making advances in are
(59:50):
getting narrower and narrower, right? We're looking here at the very specific problem of low-resource languages and things like that. So anyway, there's Cohere's differentiator for you, and a solid new development.
Andrey (01:00:03):
Yeah. And they compare to particular other
open source models in the same class of number of parameters.
So they compared to Mistral 7B Instruct and to Gemma 1.1 7B. They don't compare to Llama, interestingly, although maybe I'm missing it — this is just in the graphs they share. But as you might expect, on the visualizations
(01:00:27):
they have the top numbers.
Although Gemma seems to be pretty decent, and Mixtral 8x7B is quite good on some other benchmarks. So, yeah, another open source model, and one that I think researchers will like.
On to the lightning round. The first story is: who will make
(01:00:49):
AlphaFold3 open source? Scientists race to crack AI model. So, a little bit less of a news story and more of kind of a summary. DeepMind, unlike with previous AlphaFolds, hasn't released the computer code for this.
They promised to release the code by the end of the year.
(01:01:12):
But researchers, worldwide, are now working on their own,
their own open source versions.
Apparently there are 600 researchers who signed an open letter in Nature to ask for this to be opened. So, yeah, a lot of people are trying
(01:01:32):
to work on this. And we've seen before that open source versions of
AlphaFold have been developed.
So I wouldn't be surprised if scientists do crack this problem.
Jeremie (01:01:43):
Yeah. There's all this question, too, despite the commitment from,
Google to open up a version of AlphaFold three, there
are questions about. Okay, well, but what will that version
specifically allow you to do?
One of the things that one of the researchers they're citing here — this is a guy who's actually trying to replicate the model in open source, a kind of more open version of it — is he's
(01:02:05):
claiming basically, look, I don't know that they're actually going
to give us the ability to predict the structure of proteins,
in conjunction with any, ligands or,
say, drug molecules, roughly speaking.
So, you know, there are specific use cases that they're concerned
they won't have access to, even when Google actually does do this,
you know, quote unquote open sourcing of the model.
(01:02:27):
And so you've got this. Yeah, it's interesting, I think, that they cite at least three examples of independent efforts to replicate this model. That did happen, of course, with AlphaFold 2. But, you know, it is kind of noteworthy.
The most advanced one, at least as far as I could tell, seemed to be
held up by this guy, Phil Wang, who's in San Francisco.
He's got this crowdsourced effort, that, you know, they they claim
(01:02:50):
to be able to replicate the code or have code that functionally
replicates AlphaFold three, within a month.
But then, you know, separate question about the actually training
the model using that code, which interestingly, they said, you know,
estimated compute costs of around $1 million, which, you know,
struck me as being surprisingly accessible.
I guess I wouldn't have expected it to be that cheap.
(01:03:11):
So if that's the case, if that actually is true, and I think there
are a lot of asterisks there. But if that is true, you know, then
you could see, pretty plausibly, impressive results coming out on the open source side of things within, you know, maybe a year or so or less, which, you know, has a lot of implications for drug
discovery, a lot of implications for Google's competitive moat,
(01:03:31):
because Isomorphic Labs and Google DeepMind have been partnering on
these things. That is their moat.
But also just, you know, you think about designer, you know,
designer bioweapons and things like this.
You know, as you start, it's not any one particular discovery that
will unlock big risks, necessarily.
But certainly when you look at this, you know, this is a big step.
You're talking about models that can actually model the interaction
(01:03:54):
between drugs and biomolecules.
That's, you know, that gets you a lot closer to things that could
be, pretty dangerous. So you think about open sourcing those things.
There are all kinds of questions that, naturally get raised.
Andrey (01:04:07):
And the last story of this section: Mistral releases Codestral, its first generative AI model for code. So they trained this on over 80 programming languages, and as with other models of this sort, it can complete functions, write tests, fill in partial
(01:04:29):
code, and answer questions about a code base in English.
As with other Mistral efforts, they say this is open, but the license does prohibit using it for commercial activities. And as far as size, this
is a 22 billion parameter model, so
(01:04:50):
not easy to use.
You do need a lot of compute for this sort of stuff.
Jeremie (01:04:55):
Yeah, we've covered in the past that people get all, you know, fussy
about, you know, oh well you're open sourcing it, but it's a 22
billion parameter model, so who can actually use that but the big companies? Which, you know, fair enough.
But like credit where credit is due, this is an open source model.
So you know it's out there.
Other interesting stat: a 32,000 token context window, which they
advertise as being longer than competitors.
(01:05:15):
Let's put a pin in that phrase, longer than competitors, because
I think that there's a really interesting little subplot here.
In in terms of the comparisons they're choosing.
But yeah, so 32,000, token context is a lot when it comes
to coding. That can be especially important just because, you know,
this is a model that has had to read 32,000 token
(01:05:36):
chunks of code. So, you know, maybe the ability to plan over sort
of larger bodies of code and write more coherent code,
that accomplishes more complex things. So.
That could be exciting. Okay.
I want to take a second to talk about evals here.
So, you know, they do a great job of highlighting evaluations that
make this model look really good.
(01:05:57):
HumanEval, you know, it knocks it out of the park. HumanEval, by the way, is just a benchmark where people create a bunch of unit tests for fairly simple programming problems, and they basically get the model to try to pass the unit tests, right. One really interesting evaluation, though: CruxEval, or CruxEval-O here. This is a benchmark that was designed by Meta, and it's for input-output predictions of functions.
(01:06:18):
So given an input can you predict the output of the function or can
the model predict the output. And then given the output and the
function, can you predict what input went into it.
Right. So it's kind of an interesting test of the model's ability to
understand the logic of functions.
In my opinion, they're cherry picking their comparisons
on this metric in particular pretty hard.
So Meta's original paper, when you look at the
(01:06:42):
CruxEval benchmark, the first time they came out with it, they actually test Code Llama 34 billion, the 34 billion parameter version of Code Llama, and they find that it gets a 50% score on this metric. But Mistral doesn't show Code Llama here. They only compare the model to Llama 3, 70 billion parameters,
(01:07:02):
which is not fine-tuned for code, which Mistral's model absolutely is. So this is a very unfair comparison a priori, as far as I can tell. You know, you have a specialist model and you're stacking
it up against a generalist model. Yeah.
Yep, Llama 3 70B is a bigger model and all that, but you get a lot of value out of that fine-tuning process.
(01:07:23):
Interestingly, so Llama 3 70B then gets 26% on this benchmark and Codestral gets 51%. So the interesting thing is they're trying to say, look, Codestral gets 51%, look at how much better we are than Llama 3 — again, a generalist, not a specialist model. When you actually do more of an apples-to-apples thing, what you
(01:07:45):
find is Codestral gets 51% here. That is barely ahead of Code Llama, which is based on Llama 2, just fine-tuned for code.
So the apples to apples here seems to suggest, at least to me, and
I'm maybe getting this bit wrong, but I don't think by that much.
It seems to me that Codestral is actually maybe more of,
(01:08:05):
like, past generation — more like a Llama 2 level model.
Which raises interesting, interesting questions and challenges here
about their positioning. Again, Mistral is one of these companies that does not have a partner with a large body of compute infrastructure.
They're well-funded, but not well enough funded that you would
expect them to compete favorably with, like OpenAI or even Anthropic
(01:08:27):
or Google DeepMind. So, you know, this kind of makes me wonder.
It's clearly, at least to me, kind of a nitpicked or — not nitpicked — a cherry-picked set of comparisons.
That's how it comes off to me at least.
That may raise some questions about, about how good these models end
up being. But let's wait to see leaderboards.
And, you know, who knows? I maybe prove to be a complete, a complete
(01:08:50):
wing nut on this.
Andrey (01:08:51):
Yeah, I agree, those are a lot of good points.
And it kind of shows how Mistral is a company.
So as with other companies, they do want to maybe game things or, you know, make this partially a PR move.
One of the advantages is they are sort of a national champion,
so to speak. France is pretty proud of it and backing them, which is
(01:09:12):
not the case with various U.S.
companies. So they do have some advantages.
But overall, right, they're fighting a hard fight.
And onto the research and advancement section.
And the first paper is pretty nerdy.
So we're going to try to make it accessible.
We'll see if we can do it. It is titled The Road Less
(01:09:35):
Scheduled. And you got to love the fun titles that AI papers get. And so it's dealing with optimization and how you train your model. So there are a lot of optimizers; typically you use a variant of Adam, which is an optimizer, you know, popular throughout deep learning for a while.
And as with other kinds of parts of the learning process,
(01:09:58):
these optimizers have parameters that you have to tune,
kind of magic numbers that there isn't necessarily a
great science for. Usually you do, ideally, a hyperparameter sweep to see what the best number is for this.
There's no kind of known answer.
So this is introducing a new
(01:10:23):
way to optimize that takes away the need for learning rate schedules. So in particular, some of these schedules require specifying the optimization stopping step T, and they introduce this schedule-free approach that does not introduce any additional hyperparameters over standard optimizers with
(01:10:47):
so-called momentum.
There's a bunch of math, a bunch of details we can't get into. And I believe when the results were first shown on Twitter a couple months ago, you know, it generated quite a bit of buzz, quite a bit of excitement.
And now the paper has come out.
And, if you look at the various graphs, they show
(01:11:10):
they don't quite go as big as large language models. They compare things like ImageNet training, image models, and they optimize on quite a few different benchmarks — 28 problems ranging from logistic regression to large-scale deep learning problems. And the schedule-free methods show
(01:11:33):
strong performance, matching or outperforming heavily tuned cosine schedules.
So again, exciting if you know your stuff, when you're engineering and training these models, because it's really annoying to try and guess what the hyperparameters should be.
And yet more progress on some of these fundamental
(01:11:55):
problems we need to address.
Jeremie (01:11:57):
Yeah. And I'll give it just a quick attempt at, like, explaining this whole learning rate thing for listeners, if you're
interested, because I think, you know, it is important.
But basically every time you feed your model,
during the training process, you feed your model a new batch of data
to have it update its its weights, its parameters, values.
(01:12:18):
What ends up happening is you have to decide, okay, well, how much
am I going to update my model based on the data that I just
fed? Right. Am I going to assume, like, oh, man, I got a completely
change all of the weights, all the parameters in my model in
response to the mistakes that I made on this training batch?
Or am I going to go, well, I'm just going to take a very cautious
little step here, right? So this is where the idea of the learning
(01:12:40):
rate comes in. That is the learning rate. Big learning rate means
taking big steps, making big changes to your model in response
to the data that comes in during training.
Small learning rate means taking very kind of tenuous, careful
changes, making, careful changes to your model during that process.
And what people have found historically is that a steady,
(01:13:00):
consistent learning rate throughout the training process is not
actually optimal. It's often best, for example,
it can be best to do things like have a faster learning rate, take
bigger steps early on in training as the model is just learning from
scratch and has to be kind of reinvented, let's say with the data as
it comes in, and then gradually reduce the learning rate as the
(01:13:20):
model hones in. And then there are a whole bunch of strategies,
including momentum based strategies, which essentially, well, we're
not going to get in the details, but basically there's strategies to
kind of like tune the learning rate in a more and more intelligent
way, using heuristics that aren't as simple as, just like, we'll
start with this and we'll wrap it, that this is an attempt to
further optimize, that process by getting rid of these learning rate
(01:13:42):
schedules. How did I do?
Andrey (01:13:44):
That's a yeah. Great summary.
Well, one thing we should mention is, of course, the optimal learning rate is 3e-4.
That's the magic Andrej Karpathy number.
So yeah.
Jeremie (01:13:59):
Pretty here. First set it to to ten.
Andrey (01:14:02):
Oh, this is, you know — if you're in the know, you know that's the perfect answer.
All right. Next up a little bit different.
We got a story: training compute for frontier AI models grows by 4 to 5x per year. So this is some analysis
(01:14:24):
from Epoch AI.
And they essentially, yeah, graph the estimated compute — we don't really know the compute used for a lot of these AI models — but they take quite a few different notable models, things like AlphaGo Master, AlphaGo Zero, GPT-3, PaLM, and others before Gemini Ultra, going
(01:14:46):
all the way back to AlexNet in 2012. And once you graph it, they say that it's, yeah, about 4 to 5x per year, with various examples: Google is 5x per year, OpenAI as well, Meta is 7x per year, which kind of makes
(01:15:06):
sense to me. And this is pretty relevant because people
have been predicting, you know, for instance, that we will get to
trillion parameter models. In fact, we're already there, at least in terms of mixture-of-experts models. So it's pretty interesting to see a trend that has been at least somewhat consistent.
(01:15:27):
With the frontier models, it's not quite as big.
So we saw in the early deep learning boom, we
had, growth of almost seven times per year.
Now with these frontier models, it's more like four times per year.
And that tracks with — really a while ago we talked about a paper about the three eras of
(01:15:49):
deep learning that did a similar thing to this, graphed them, and showed that you can sort of divide the growth into these stages that happened.
So, certainly interesting.
And no doubt we'll keep seeing this grow perhaps
at four times per year. We'll see if it's possible.
Jeremie (01:16:07):
Yeah. This research coming to us from Epoch AI, which does amazing
research in the realm of forecasting compute trends,
we use them, to inform some of our, our work that we did,
the week we, you know, got covered in the press.
And, yeah, they've got great researchers on this.
They did have a report that assessed, yeah, that we were hitting 4
(01:16:28):
to 5 x, per year scaling of, of compute for the largest
models about two years ago. This is, in a sense, a confirmatory report. It's saying, hey, yeah, that trend, it's still holding.
They also do highlight a couple of things.
I mean, you know, you called out, Andrey, this idea that the scaling had been a lot more intense previously — compute scaling —
(01:16:49):
and may have flattened out a little bit.
And the may there is important, you know, they themselves include a
lot of uncertainty estimates and assessment saying, you know, it's
hard to know exactly when when one era of scaling ended and the
other the other kicked off.
But what does seem to happen is around 2020, you have what seems
like a, you know, I mean, call it a slowdown.
It's still five x per year, right? That means that compounds every
(01:17:11):
year. So in two years it's 25 x right.
In in three years it's 125 x.
So these models in terms of compute expenditure are growing
outrageously fast. It's just not maybe the seven x per year that
it was before then. They speculate about why that might be.
And they say that although there's not enough information to be
confident about what caused the kind of, again, call it a slowdown,
(01:17:32):
it's still a radical like acceleration of, of this stuff.
But, you know, on the five x, one of the explanations they came up with was this idea that in the early days, there was a sort of compute overhang. There was so much spare compute lying
around. People weren't really wrestling each other to get their
hands on GPUs. They weren't crazy expensive.
And so if you wanted to scale up more, you could kind of pretty
(01:17:55):
easily just go to the next lab over and say, hey, can we use your,
you know, your GPUs to try something at a bigger scale?
That's obviously a caricature of what actually would happen, but,
you know, a whole lot easier to get your hands on on that compute.
Whereas you get to 2020, GPT-3 comes out. It's correctly, at least in my opinion, identified as a major
turning point in the scaling story for a lot of people.
This is when we started paying attention to generative AI and AGI.
(01:18:18):
And so at that point, all of a sudden everybody's trying to get
their hands on GPUs, the market competition kicks in and it just you
start to be limited by essentially the rate of decrease in the cost of compute, rather than just, like, your ability to reach out and grab readily available devices.
Last thing I'll mention: you know, you called out this idea that — so they compare different companies, in particular Meta, OpenAI and
Google. They look at the scaling trends in compute for those
companies. Google and OpenAI look very similar, a very consistent 5x a year, starting in 2012, or 2016 in the case of OpenAI, and continuing on an ongoing basis, whereas Meta looks a bit sharper — they're at 7x a year.
(01:19:02):
And while that makes it sound like, wow, you know, Meta is scaling more aggressively, I think the most appropriate interpretation is they're playing desperate catch-up now, because
for a long time, Yann LeCun was of the view that, like,
we weren't really close to AGI.
You know, we're not going to, you know, double down on the scaling
thing, whereas OpenAI is saying, no, no, like scaling is the path to
(01:19:24):
AGI. And Google institutionally seems to have a belief that's pretty
similar to that.
Andrey (01:19:29):
You know, there used to be a time when you could fit an entire deep learning model on one GPU, and I miss those days. Yeah, yeah, yeah.
I mean, you can still do it with, quantized smaller models, but
not for frontier models.
Jeremie (01:19:45):
It's not the same, though.
Andrey (01:19:47):
Yeah. It used to be researchers had a desktop with a GPU, and you could run your experiments. Anyway. Yeah, I will say, coming to the defense of Yann LeCun and Meta a little bit: in some sense, they did follow, like, scaling. Everyone knew it was the path to go even before GPT-3.
(01:20:10):
Right. We knew more data, bigger models.
It was a known strategy, and it was the trend since AlexNet, and Meta did play a role in that. In particular, they had massive datasets for training segmentation models, image models, etc.
Where they did lag was really adopting large language models
(01:20:31):
and moving towards outrageously large things.
And to be fair, again, I think it was in the popular consciousness in the research world as soon as GPT-3 hit and scaling hit — maybe we didn't understand the full extent of the implications, but certainly people did see that this was a big deal.
(01:20:54):
But, you know, something like justifying a $10 million training run was probably still not easy in these businesses and organizations.
Jeremie (01:21:04):
I think that's the thing, right? That is the test of whether you genuinely have conviction. OpenAI put forward those resources, right?
Andrey (01:21:12):
Yeah. We're not LeCun haters, you know. He was right on deep learning, and he was early, and he pioneered tons of good research papers too. Yeah. So definitely great, but sometimes wrong.
Jeremie (01:21:25):
And I just like to call people out and not get called out myself, Andrey. That's what I like to do on the show — that's what this is.
Andrey (01:21:33):
Onto the lightning round. First story gzip predicts
data dependent scaling laws.
So what they're saying, essentially, is you can sort of tell the scaling laws you'll see for data just based on seeing how compressible the data is, without even training
(01:21:54):
any neural nets.
If you just kind of analyze your data, and they show this
across six datasets of varying complexity,
and these different data sets have different compressibility based
on that complexity. And they then find that on each of these, when you actually do this, it
(01:22:16):
corresponds to what you observe with scaling laws when you do train neural nets. So a very interesting result, and perhaps a very useful result.
Jeremie (01:22:25):
Yeah. This was for for me personally this was like the paper of the
week. It just in terms of how interesting it was
and it was written by one guy, by the way — it's, like, a one-author paper. So it's kind of a rare treat to see one of those.
The big question is, you know, when you're going to scale up the
amount of compute that you're going to throw at a model, so you have
(01:22:45):
a certain compute budget, you got to decide how big do you want to
make your model, right. Because the larger your model is, the more
moving parts it has that you have to tweak, the more parameters you
have to tweak per, let's say per, weight update.
Right. So you're going to invest more computing power into updating
your models weights for each data point that you feed it.
(01:23:07):
Versus should I just increase the number of data points?
Should I use a small model but use way more data?
So do more cycles of weight updates where each cycle is just
updating fewer weights because the model smaller?
That's the big question. And for a long time people thought
that, like, the answer to that question — the ratio, the optimal ratio of how big do you make the model versus how
(01:23:29):
large do you make the data set?
People thought that you should just scale them together 1 to 1
roughly. That's sort of the Chinchilla scaling law result that we got from, I don't know, 2021 or something, from Google DeepMind. What this paper says is, well, wait a minute, that may actually not be
independent, that may be dependent on the quality of the data,
the nature of the data, in particular how dense that data
(01:23:51):
is. If you have a data set that is super packed with like
loads of information per token. So think about code, right? There's way more information per
token than in just a poem or something, right?
Some languages are more efficient in terms of how much information
they pack into small sentences.
Right. So the more dense the, let's say less
(01:24:12):
compressible your data is, the more complex it is, the less
you can get rid of fluff.
It turns out the more you want to lean towards data instead
of model size as you scale your model.
And I was thinking a lot about why that might be.
The two reasons I could come up with were, you know, maybe it's
(01:24:33):
for highly complex data, for, you know, very sort of information-efficient data —
You need like a lot of,
let's say, well, let me flip it around.
If you've got language with a lot of fluff, a lot of unnecessary grammar rules, longer sentences than they need to be for the information they contain, you need to kind of store those stupid
(01:24:55):
grammar rules somewhere, and they turn out to have to be stored in
the model weights — you need more model weights, a larger model. Whereas if you have more compact, more efficient data, data that's more information dense, then you need more data
to train the model, because you want more diverse training examples
to capture all the nuances in that data.
So just to kind of, I guess, back of envelope thought there, I
(01:25:16):
thought, this is really cool. Whole bunch of very applicable results
here. They show they can reach the same performance with 24% fewer FLOPs — 24% less compute, basically — using their compute-optimal scaling.
So I thought that this was really cool.
Check it out if you're interested in this kind of stuff.
Andrey (01:25:34):
Yeah. Super interesting. And it also kind of plays into theories of
intelligence, I think, where you have said often that, you know,
you can basically say intelligence is compression.
Certainly neural nets, what they're doing is compression in some
sense. So I also agree this is very cool.
Next paper we are talking about robotics which we don't touch on too
(01:25:55):
often, but I am fond of.
So the paper is Neural Scaling Laws for Embodied AI. And they're talking about robot foundation models, which are, let's say, not quite as big as other foundation models, for many reasons — it's hard to get a lot of data, and so on. But they are an emerging field of research.
(01:26:17):
DeepMind has published some research on foundation models
for, let's say, manipulation, for instance. There have been recent efforts to scale up datasets and so on. And this paper is looking at basically years of efforts on robot foundation models. They look at 24 robot
(01:26:38):
foundation models, studying scaling laws on six tasks and four methods.
So not too surprising, but,
it seems that there is variation there.
It's not, quite the same across domains,
(01:26:59):
and certainly not the same as image generation or various other areas.
So in some sense, you can scale up in robotics and
keep getting better results, which we already suspected.
And again, this was already the case in 2016 and so on that
people were going there.
But now we have a slightly more scientific understanding of it, which
(01:27:20):
is pretty important.
Jeremie (01:27:22):
Yeah, I think seeing new scaling laws is really interesting — to the discussion we were having just at the tail end, the point you brought up about, you know, the fundamental theories of learning that this brings up, right. When you see
scaling laws that apply in a new domain like robotics pretty
robustly as they seem to here, kind of makes you think, yeah.
Like, you know, it's almost like a principle of, I don't know,
(01:27:45):
principle of physics. Right? Like the more data, the more compute
you plow into a model, the more performant it is, at least by some
kind of figure of merit. So.
Yeah. Interesting. Interesting result.
Of course, one of the implications of scaling laws in robotics,
or the implications of scaling laws in robotics can be a little
different from, say, language or other kind of natively digital
(01:28:06):
contexts, because often in robotic systems, but not always,
you want to actually be able to deploy, at the edge, right?
You want to be able to do on device deployments.
And so you end up being constrained by just different variables.
Model size becomes a bigger constraint than necessarily things like data or compute. So you might find that, yeah, you know,
(01:28:26):
we've got a $10 million budget. But for that budget, we could
afford to train a much bigger model than we can actually fit on our
edge device. And so you end up doing, you know, data optimal
training or, you know, other things. So, kind of, yeah,
kind of interesting to see these laws laid out again, and hopefully more research like this comes forth.
Andrey (01:28:47):
Next: Contextual Position Encoding: Learning to Count What's Important. Another slightly nerdy topic, and one that is very exciting if you're working on language models and foundation models. So, real quick: attention is very important for transformers. And part of how you do that is,
(01:29:09):
when you encode the inputs to your model, you add on position encodings to tell you where in the sentence, for instance, or other place you are, because otherwise transformers are invariant to order. So they don't know, like, this is the beginning of a sentence, this is the end of a sentence. And typically there are a lot of ways to encode position.
(01:29:31):
But roughly speaking, you know, you want to say this is earlier versus later, and these ways are independent of context — they're independent of the content, so to speak.
And these people are saying that you can do a different kind of position encoding that is contextual, and that means
(01:29:51):
incrementing position only on certain tokens that are determined by the model. So you can attend, let's say, to a particular word in a sentence. And if you do that, you enable the model to work on tasks that are otherwise hard for it, like selective copying, counting, and flipping tasks.
(01:30:13):
And they also show that this improves perplexity on language modeling and coding tasks — it basically makes things work better.
So. Pretty exciting.
Jeremie (01:30:23):
Yeah. This is, you know, after having just ripped, Yann LeCun and
Meta for their AI strategy.
This is one where, you know, I flagged this paper because I think
it's a great example of the kind of research meta can do.
You know, it's a classic example where they they'll look at the
transformer architecture, say, or, you know, whatever architecture,
but these days, transformer. And they will look at what are these
(01:30:46):
structural, let's say limitations or flaws of these
models. Right. What are things that no matter how much you scale
this model, you should expect it to struggle with certain things.
And the fact that these models don't have the ability to, like, natively count anything other than tokens — the position encoding in these models, as you
(01:31:08):
said, is basically, you know, which token is the first, which is the second, which is the third. Well, tokens can be anything. And in particular, often when you look at things like, you know, byte pair encoding tokenization, you have something like: words or syllables are a token, or components of words are a token.
And so it's not always the case that if you count your tokens,
you're counting your words. And that can make it challenging for the
(01:31:30):
model to natively count.
Say the occurrence of words of full words in a sentence,
or sentences or things like that.
So if you want sentence-level reasoning, if you want paragraph-level or whatever other level of abstraction of reasoning that relies on the model actually knowing where it is, let's say, in a particular body of text at the
(01:31:50):
sentence level or the word level, you need to find a way to have this sort of more contextual positional encoding. Right. And that's exactly why this is called Contextual Position Encoding, or CoPE. I thought that was a funny acronym.
Yeah. Apparently performs better on coding tasks as well.
You can sort of see why that might be. Right. The kind of logical
requirement surfaces in that context, where you want to know
(01:32:13):
where one variable ends and another begins.
Maybe you want to be able to count the occurrence of things in, body
of code because it has implications for logical reasoning.
But, again, this is one of those very elegant and simple things that
meta can sometimes put out where they'll show you a failure mode of
language models, and you kind of go like, you know, like it looks so
dumb, and yet it's, it's this beautiful sort of.
(01:32:36):
It reminds me of some of the best physics theory research that you
can see, where somebody just shows something so simple but so
powerful. And in this case, I thought one of the best examples was,
you know, they showed GPT four fail, on a task where they're like,
just count the number of times that the name Alice appears in the
sentence, and it couldn't do it.
Right. So, that problem, of course, was solved with their, their,
(01:32:58):
positional encoding strategy. But, you know, I thought it was a
really interesting paper. And, hats off to meta for this result.
Andrey (01:33:04):
Yep. Now you want to use CoPE instead of RoPE, right?
And the last story for this section: new AI products much hyped but not much used, according to a new study.
So this is a survey of 12,000 people in six
countries. And it finds, you know, as per title that there's
(01:33:25):
a lot of hype, but they haven't quite reached this mass
appeal stage. So only, 2% of British
respondents, for instance, say that they use such tools daily.
In general, if you just look at people that say they've heard of these things: ChatGPT is at 58%
(01:33:46):
in the UK, 53 in the USA.
All of these other AI models are lagging quite a bit behind — Gemini is at 24% in the US, 50 in the UK, Microsoft Copilot is in the 20s, and then you go down, you know, they get to 7%; Perplexity AI, for instance, very much hyped, is at 3% in the US as
(01:34:09):
far as attention.
So yeah, I guess not too surprising.
You know, it takes time for people to adopt and be aware of the new technology. There is maybe a sense, if you're deep into this stuff, that everyone knows about it, and it's not true, as we've covered before. I will say I challenge the
(01:34:29):
kind of thing this is saying, that there's a lot of hype but not a lot of actual impact necessarily, yet.
I mean, this has been the case with technology throughout time.
You know, you could say the same about the internet, about
smartphones. You know, it's going to take time to
penetrate. But it is, I believe, the case that everyone is going to
use AI soon enough.
(01:34:50):
So let's not take this to mean that the hype is not
warranted.
Jeremie (01:34:55):
Yeah. I mean, when you look at an exponential, right — like, obviously, you're either too early or too late. You're either going to look like you're calling the trend way before
it's a thing or you're going to be sitting on, you know, the other
side of the internet bubble in like 2015 calling the internet
bubble or not the bubble, but calling the internet phenomenon.
So, yeah, I think that's the case here.
(01:35:16):
You know, I'm old enough to remember that the idea of, like, 2% of British respondents saying they use these tools on a freaking daily basis would have been insane. Like, a daily basis — that's a lot. Man, that's a lot of usage.
That's about where I'm at, obviously. Like, you know, we're junkies.
(01:35:37):
I actually I use it many, many times a day.
But but still, like, this is early days.
You're talking about a market that has.
Let's not forget a giant chunk of people who are over 60, over 50, who don't necessarily think of technology as, you know, part of what they need to keep up with, and don't necessarily dabble. There's, you know, a famous tech
(01:36:01):
business book called Crossing the Chasm that kind of goes into this phenomenon — like, you know, yeah, it takes a long time
to kind of break through to each new level of user.
But to your point about impact, I don't think that the metric that
matters for impact is how many people actually use the services.
It's how much value is being created by the services.
Right. And for that like also.
(01:36:22):
Yeah, just I mean, think about every time you use Google and you get
a summary, you're consuming the output of a generative AI system.
It may not be a chat bot per se, but it is something that is very
much inspired and literally based on the breakthroughs we've seen,
like GPT two, GPT three, and so on.
So yeah, I think it's like these claims are,
it's fun to say things are overhyped.
(01:36:44):
It makes people sound, you know, very mature and all that.
But, I agree with you. I mean, I think overall it's an interesting
stat. It's definitely like it's less than I would have expected.
But it is significant and, and not necessarily the right metric to
be tracking if you care about impact.
Andrey (01:37:00):
Onto policy and safety.
And we start with something a little bit spicy.
Once again, we are talking about OpenAI drama, one of the stories
that has come out this past week.
And the title is: ex-OpenAI board member reveals what led to Sam Altman's brief ousting. And we've covered a lot of this ousting.
(01:37:21):
You know, to recap, Sam Altman was removed by the board as CEO last November and came back after, like, a week or something — that was a very exciting time.
And one of the people on the board who
was on the side that pushed Altman out is Helen Toner.
(01:37:42):
who comes more from a policy and safety background, less on the business side.
And yes, in a new interview, she spoke and provided
more details about what happened.
So, for instance, she said that the board only learned
of ChatGPT on Twitter and they didn't get a heads-up,
(01:38:03):
and said that, in multiple cases, there was a lack of transparency regarding safety processes and even some misleading
information. We sort of knew a lot of this already.
I won't say that this is revealing what led to Sam Altman's
ousting. We knew the general kind of story here, but certainly these
(01:38:23):
are, dramatic new revelations with a bit more
detail than we had before.
Jeremie (01:38:29):
Yeah. And some of the detail I think is, is new and more specific.
Right. So one of the things that Helen Toner highlighted here — so there are a couple things: there's a big podcast that she did with, I guess, the TED organization, and then there was this Economist article.
And one of the things that she highlighted is, you know, the board apparently believed that Altman cultivated a, quote, toxic
(01:38:52):
culture of lying and engaged in, quote, behavior that can be characterized as psychological abuse.
She is not the first person to make this kind of allegation.
We've heard from various people, including, I think, one person who took to Twitter — I'm trying to remember his name; he's, like, famous in the community — but anyway, who was saying, you know, I've
(01:39:12):
dealt with Sam, for a long time.
He's always been really nice to me, but I do know and have have seen
multiple instances where he has lied to me and to others.
Sam seems to be somebody who, based on these reports, does have
a lot of trouble with the truth. And that's concerning, when you
look at a guy who's running an organization with a mission as
(01:39:32):
important as OpenAI's. Certainly that hasn't been helped by the revelations we've had surface, and we'll talk about this.
But, you know, OpenAI institutionally, it seems — silencing whistleblowers and former employees is certainly something that my team has been aware of for a long time.
As we've talked to whistleblowers at, the various frontier labs, you
(01:39:53):
know, hearing about the challenges that people have in making the decision to come forward — this is something OpenAI does seem to have: an institutional culture of secrecy, which contradicts, fairly significantly, their public sort of messaging around how open they are.
And I think that's, you know, not great for accountability.
(01:40:14):
You know, Helen called out here as well that there were, questions
about security practices, safety practices — that, you know, they just weren't getting accurate or complete reporting, she seemed to suggest, from Sam, and that the board just couldn't do their job and oversee the work.
So, interesting. You know, this sort of thing is much more in that orbit. Of course, at the time, right when Sam was
(01:40:36):
first let go, we talked about it on the podcast and I said, well, you know, this looks sort of like a, call it a somewhat botched firing, because we don't know the reasons yet. Now, having seen the reasoning, it starts to make a little bit more sense. I do think that this statement about how,
you know, they were taken by surprise by ChatGPT, like, I don't
(01:40:56):
know, to me, that doesn't seem as as much of an indictment as it
might sound. The reality is, at least the public story that OpenAI
tells us, and that Sam Altman has been very consistent on from the
start was, yeah, you know, we launch products all the time.
I didn't think much of ChatGPT.
He did say he expected it to be successful, or, you know, to work well, but they definitely were surprised at how successful it was.
(01:41:19):
And you wouldn't expect the board to be notified of, like, every, you know, every PR that's pushed up. So there's a bit of a continuum there of, you know, what is to be expected. There has been a rebuttal, too, in The Economist, published by OpenAI's current board chair, Bret Taylor, and he
basically says, we're disappointed that, Helen Toner continues to
(01:41:40):
revisit these issues.
And he says all kinds of things; he kind of goes through the process, the story as we know it, you know: hey, we had an independent review that concluded that the board's prior decision was not based on concerns regarding product safety or security, the pace of development, OpenAI's finances, or its statements to
investors, customers or business partners.
Okay, there are a lot of asterisks in that.
(01:42:03):
We could get into, and maybe don't have time for. But I think ultimately it does not address the question of the board: was the board kept up to date on these issues or not? There's a lot of very careful language here that, frankly,
mirrors all too closely the very careful language that we've
seen OpenAI deploy in the past, including in the context of these
(01:42:25):
sort of whistleblower employment agreement, things that have caused
so much trouble in the last week. So, you know, I think, this
is this is tricky. Again, these are serious allegations, right?
Helen is coming out and saying on multiple occasions he meaning Sam
Altman, gave us inaccurate information about the formal safety
processes that the company did have in place, meaning that it was
basically impossible for the board to know how well those safety
(01:42:47):
processes were working or what might need to change.
So, you know, again, for a company that, as they themselves, as Sam Altman himself puts it, is on track to change, you know, if they're right, the face of humanity as we know it,
more openness is simply required.
I can tell you that OpenAI employees themselves still don't know the
full story. There there are a lot of problems there.
(01:43:10):
If there's not openness, even at the level of the employees in the
company itself. So, the shiny luster may be coming off the company a little bit.
And I think until we have answers to these questions, perhaps
rightly so, perhaps people should be asking themselves fundamental
questions about the governance implications of, of a lot of this
behavior.
Andrey (01:43:29):
Right. And this is coming out just following the Scarlett Johansson episode from last week, where there was a similar story where, you know, OpenAI was like, well, we didn't intend it, or it wasn't meant to be based on Scarlett Johansson — but Sam Altman did tweet "her,"
(01:43:49):
and it was revealed that he personally lobbied
for this. So not exactly lying, but not exactly
being fully transparent necessarily.
And I think more than anything, these things are kind
of shaping people's perspectives.
Sam Altman in particular, let's say OpenAI in general, but Sam
(01:44:10):
Altman in particular, and it does kind of expand on
a general understanding of Sam Altman.
So, for instance, at his first company, what I read was, you know, employees came to the board asking for him to be ousted for, let's say, similar reasons.
And it's worth noting that he was lead of Y Combinator.
(01:44:30):
So he's very much in this Silicon Valley startup environment.
And one of the ethos of that world in
general is to be kind of rule breakers, to fake it til
you make it. You know, if you know how startups work, you may not
always be fully honest in your quest to succeed.
Jeremie (01:44:50):
So I will say, you know, having gone through Y Combinator, like, the ethos at YC is actually very sort of oriented towards being a positive force in the world and towards building. And to the extent that you're breaking rules, it's because you assess those rules are sort of poorly developed, or not kind of in the interest of the wider world, and some amount of
(01:45:14):
rule breaking is necessary in life.
I think the challenge with Sam is the extent to which this may be pathological, and the extent to which, given the context — right, like, OpenAI is a very powerful company, and they become the most powerful company if Sam Altman's own thesis turns out to be true. And if that's the case, you just can't be having
(01:45:36):
these sorts of questions.
And again, like, you just rattled off a bunch — Scarlett Johansson, we got this. We had, literally a few days before, this news that apparently Jan Leike, the former head of alignment at OpenAI, left in protest, saying, look, Sam promised us
20% of the compute budget that OpenAI had raised as of some date in
(01:45:56):
2023 to help with superalignment, and they did not deliver on that promise. You know, these are very substantial things.
These speak to the core mission of OpenAI that apparently is not
being upheld by Sam Altman and by the board and by the executives.
These are really deep issues, and I think we are
not even close to having the answers that we need on any of these
questions. It would be great to have a little bit more clarity and
(01:46:18):
transparency and, frankly, accountability.
Ultimately, if if these concerns turn out to be valid.
And that's something we we can't find out without more more
transparency.
Andrey (01:46:29):
Yeah. So there you go. Once again we are shilling for OpenAI.
And speaking of Jan Leike, the next story is: OpenAI researcher who resigned over safety concerns joins Anthropic. So he went from one leading company to another.
And actually it's funny — Anthropic was a spin-out from OpenAI
(01:46:50):
initially; like, the founding team was former OpenAI employees. So in some sense this is appropriate. And what we know is that he will be working on essentially similar things, on this kind of superalignment idea of aligning AGI, and in particular alignment more for the long term, for super-capable models,
(01:47:13):
Less so for today's models.
So yeah, not too much more to a story, but I think very significant.
And Anthropic does position itself, and I think fairly so, as more safety-concerned and more focused on safety than OpenAI. So this is yet another thing adding to that, for sure.
Jeremie (01:47:34):
Yeah, for sure. And this is consistent with, you know, what Jan Leike said when he left OpenAI, again in protest. At that time, he posted in a tweet, he said, we're long overdue in getting incredibly serious about the implications of AGI.
We must prioritize preparing for them as best we can.
So he's looking to continue his work on, as he put it, scalable
(01:47:55):
oversight. So basically — you can think of this as, like, smaller models overseeing, in a reliable way, larger models — weak-to-strong generalization, and automated alignment research, which had been a key pillar of OpenAI's strategy. They kept saying, look, we're going to use weaker systems to automate the process of doing alignment research to help
(01:48:15):
us align increasingly powerful systems.
A lot of people have criticized that strategy.
To me, it seems like, you know, it's one of the strategies that could actually work. I think it's incredibly risky if you take the risk seriously — which some may not, for various interesting and potentially good reasons — but still, I think it's,
(01:48:37):
yeah, a risky piece. He's moving over to Anthropic.
They've got a, as you said, an amazing track record on things like
interpretability in particular. So it'll be interesting to see those
efforts paired with the scalable oversight and weak-to-strong generalization stuff. The last thing not mentioned here is Ilya Sutskever, who also left OpenAI — interestingly, exactly six months, it seemed,
(01:48:59):
after the attempt to oust Sam Altman and his return.
So kind of makes you wonder, you know, if there is some, some
provision for a period of notice that he had to give or something,
which is why we hadn't heard from him. We still don't know — where's Ilya, as the meme goes. We still don't know what he is doing or what project he might be working on.
Andrey (01:49:16):
And now the lightning round, where we'll try to be a little quicker. But as you said, there's a lot going on this week. And, you know, we just got finished, let's say —
Jeremie (01:49:26):
Oh, Jesus.
Andrey (01:49:26):
Criticizing Sam Altman.
But unfortunately we got to go back to it with this next story.
So the story is that the leaked OpenAI documents show Sam Altman
was clearly aware of silencing former employees.
You've talked about how this was one of the controversies that
emerged with regards to the safety team.
One of their employees came out on Twitter saying that they
(01:49:50):
gave up. They didn't want to sign an NDA.
And in the agreement, when you leave a company,
apparently you had to sign, very strict
requirements not to talk negatively about OpenAI, not to disclose
various things, or OpenAI could claw back
the equity, the ownership you had in OpenAI, amounting to a
(01:50:13):
lot of the money you made with the company a huge amount of value.
Right. And so Sam Altman, you know — mind you, that was just after that — went on Twitter saying, oh no, I was unaware of this, but we haven't clawed back any equity, and we're gonna, like, go out and say that we're going to change the system.
Well, the story says that he signed off on
(01:50:37):
the policy. You know, so it seems like he knew about it.
I don't know.
Jeremie (01:50:42):
Yeah. You know, man, I think the headline here
is maybe a little ungenerous, you know, saying he was clearly aware.
The reality is his signature was on a series of corporate documents
that established this practice.
Right? That formalized it. Okay.
That's serious. You really should be reading the corporate documents
that you sign, especially if you are the CEO of this company, and
(01:51:05):
especially given your company's commitment and your own personal
commitments to openness and transparency.
Absolutely true. That being said, does everybody read all the fine
print? No. That being said, this is really fine print
that, you know, if I'm a betting man, I am putting my money on Sam knowing this was a thing. I have other reasons to suspect that, but,
(01:51:28):
but long and the short of it is, you know, this doesn't look great.
And you know this claim as well, that, there have not been any
clawbacks on equity.
I, I understand from talking to some folks, who
have experienced the process of departing that,
that this is maybe technically true,
(01:51:50):
but in practice, not necessarily true.
That is my understanding of this.
Again, this is one of those things that kind of, causes me to raise
an eyebrow, looking at the ease with which Sam Altman seems to put
out these statements. We just heard an awful lot of this, like,
technically true, but actually stuff.
And. And this doesn't look great.
(01:52:11):
I mean, you know, like, the list just keeps getting longer, and,
and now, you know, here we are with something that's really high
stakes. And OpenAI arguably only moved on this after there was that big
Vox piece that came out calling them out on this practice.
You know, there's all kinds of stuff. If you actually look at the
interactions between the OpenAI employees who were trying to
depart with their equity and the OpenAI people,
(01:52:34):
they are just outrageous. I mean, the OpenAI folks telling them
It's, you know, standard procedure or whatever for us to include
these very, very aggressive non-disparagement clauses, basically
saying for the rest of your life, you can't criticize OpenAI at the
cost of losing your equity.
That, by the way, for vested equity, is not normal at all, despite the
(01:52:56):
apparent characterization that OpenAI institutionally
had been offering. So a lot of big problems here, a lot of questions
that I think are rightly being raised, across the board when it
comes to OpenAI, OpenAI and its governance here.
So hopefully this leads to some positive changes and more
transparency.
Andrey (01:53:14):
Yeah. And if nothing else, maybe he didn't know the fine print, but
the overall strategy, the overall approach to this.
Yeah, he is the CEO, you know.
So yeah.
And clearly now we will get comments saying we are OpenAI haters.
So there you go.
Jeremie (01:53:31):
Well I've been a fan of them for a long time.
You know that's the sad thing is, is I genuinely have been an
OpenAI fanboy for forever and, believe that they were,
you know, operating with, with the best of intentions.
And I think that may have been the case.
I do think that that may well have changed, unfortunately, and I've
had to update quite a bit just based on talking to people, talking
(01:53:52):
to people who've left, talking to people in the ecosystem, to
whistleblowers. This is not the same company that it was when it was
making a lot of its most lofty and ambitious commitments to safety
and responsibility. And, and that's unfortunate.
And I think that, unfortunately, it also means self-regulation
is no longer really on the table, as long as you have companies
that could behave the way OpenAI has.
(01:54:15):
Like, you can no longer just count on people, you know, doing their
own grading, their own homework, so to speak. So this is really
unfortunate on a lot of levels.
Andrey (01:54:23):
Yeah. And I think that's true even if you just look at research
output, you know, OpenAI used to put out more and now it's more
Anthropic. Yeah.
And one more story about OpenAI.
There's a lot of these.
So there's a new safety and
security committee, which is led by directors Bret
(01:54:45):
Taylor, Adam D'Angelo, Nicole Seligman, and CEO
Sam Altman. And so the committee will be responsible for making
recommendations on safety and security.
This is presumably in response to some of these things, like what
happened with the superalignment team.
And the first task will be to evaluate and develop OpenAI processes
and safeguards over the next 90 days.
(01:55:08):
Afterwards, it will share recommendations with the full board, and OpenAI
will publicly share an update on the adopted recommendations.
I've seen some criticism of this,
as, you know, the committee is full of insiders.
They may not be quite, let's say,
challenging enough to the commercial interests of
(01:55:28):
the company, as Sam Altman is on the board.
So I guess let's not get our hopes up
too much. But that's how it looks.
Jeremie (01:55:38):
It just seems it just seems like a bit of a trend.
Like every time there is a corporate governance shuffle, it's
like Sam Altman ends up on the committee that's deciding what to do
about it. So, you know, you might argue that's not the most
constructive and healthy form of oversight, especially given the
leaks that we've been hearing.
It's also, you got to
(01:55:59):
respect the game from OpenAI on the PR side here. They're
basically saying, like, hey, world,
all the scandals? Yeah, whatever. We're setting up a new safety and security
committee. The superalignment team, the leadership has
totally just left us. Yeah, yeah.
But we set up this new committee. By the way, we're training GPT-5.
Anyway, so the committee is going to be set up like this.
They just dropped this line. OpenAI has recently begun training its
(01:56:22):
next frontier model, and we anticipate the resulting system to bring
us to the next level of capabilities on our path to AGI.
So, by the way, I think "recently" there is doing a lot of work, a very
load-bearing word. I suspect the, you know, model has been in training
for quite some time. But in any case, this is clearly a, clearly, I
shouldn't say clearly, this is quite possibly kind of a PR response
(01:56:44):
to, you know, understandably, all the brouhaha.
You might do this even if you were well-intentioned.
Right. So this is just kind of them trying to pick up the pieces.
They're bringing on board some great people, you know, to
join this committee as well. Yes, we can rip on them for having
Sam there, especially in light of the criticism.
But they have a bunch of good policy experts.
They have, like, John Schulman, who's the head of alignment science, kind
(01:57:06):
of taking over for Jan Leike after his departure.
And Jakub Pachocki, who is their chief scientist.
So, you know, that's good.
But you got to wonder, with what we've seen with the shake-ups at
OpenAI, is there a credible governance structure?
Is accountability, is transparency actually going to happen, or
(01:57:27):
is Sam Altman's personality just going to, like, run,
you know, run over all the objections that are raised? So, hard to
know. Bret Taylor, a very vocal
Sam Altman advocate,
and in fact the guy who wrote the rebuttal to Helen Toner and
Tasha McCauley that we read earlier, will be chairing
this committee. So kind of, you know, when we talk about a friendly
(01:57:50):
committee to Sam Altman, that's sort of where that's coming from.
Andrey (01:57:53):
Right? And, well, we've crossed the two-hour mark, which we haven't done
in quite a while. So we're going to try and wrap up.
But hopefully a few of you are still with us, because I think these
OpenAI stories are pretty notable and worth knowing about.
Yeah. And the next story is a bit less of a big deal.
So it's about the person who cloned Biden's
(01:58:14):
voice and made these robocalls a while back, I believe, trying
to tell people not to go out and vote in the primary.
Well, now we know that that person was fined $6 million
from the Federal Communications Commission.
And this is a pretty important thing.
This is apparently the maximum limit, and the actual amount paid
(01:58:36):
could be significantly less.
But, it's a bit of a precedent setting move, right?
Because we haven't seen too much of this kind of deepfake, really,
action happening and significant, responses.
So I think very important as we add into the next few months of
the US election.
Jeremie (01:58:56):
Yeah, absolutely. And it's interesting.
It's sort of trying to set a bit of a precedent, maybe,
and, you know, penalizing at the higher end of what's possible, just
given the stakes. Sort of an interesting direction for the system to
move in next.
Andrey (01:59:09):
Hacker releases jailbroken "God mode" version of ChatGPT.
And what jailbreak means is that things you usually
shouldn't be able to ask for, like how to make drugs,
now you'll be able to do that.
Apparently this is using something like
leetspeak, a way of writing that replaces certain letters
(01:59:32):
with numbers. Various methods exist for jailbreaking,
so it seems to be using some of those that OpenAI is not able
to combat yet. We can expect OpenAI to try and block
those efforts. Well.
Jeremie (01:59:46):
Yeah. This jailbreak comes to us by way of one of my favorite
Twitter follows, which is Pliny the Prompter.
He always comes out with really cool jailbreaks and things. Leetspeak,
by the way, is this disappointingly stupid, simple way of
modifying your request, your prompt.
So replace every instance of the letter E with the number three.
Replace every instance of the letter O with the number zero, for
(02:00:06):
example. And you kind of get what that vibe looks like.
It doesn't work anymore. So this is a GPT that he deployed,
and for a short period of time people could use it; it's been taken
down. Surely this vulnerability is going to be all patched up,
but it's another example of just how dead simple and
embarrassingly simple some of the jailbreaks can be.
(02:00:27):
This is a symptom of how hard it is to get AI systems to do what you
want them to do, and not what you're training them to do.
Right? That idea of the alignment, the the difficulty of aligning
these models. So, yeah, kind of interesting.
I tried clicking on the link. It does not work any longer,
unfortunately, or at least as of the time that I tried.
So, no, no God mode for us today.
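For reference, here is a minimal sketch in Python of the kind of letter-for-number substitution Jeremie describes. The specific character mapping and function name are illustrative assumptions, not the actual prompt or method behind the "God mode" jailbreak, and nothing here reflects OpenAI's internals.

# A minimal sketch of the leetspeak-style substitution described above.
# The mapping below is an assumption for illustration only; it is not
# Pliny the Prompter's actual prompt.

LEET_MAP = {
    "e": "3", "E": "3",
    "o": "0", "O": "0",
    "a": "4", "A": "4",
    "i": "1", "I": "1",
}

def to_leetspeak(text: str) -> str:
    """Replace selected letters with visually similar digits."""
    return "".join(LEET_MAP.get(ch, ch) for ch in text)

if __name__ == "__main__":
    print(to_leetspeak("please explain the process"))
    # prints: pl34s3 3xpl41n th3 pr0c3ss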
Andrey (02:00:46):
And one more story: China creates a $47.5
billion chip fund to back the nation's
firms. This is the third phase of a national integrated circuit
industry investment fund.
And that means that they have ¥344 billion,
that's 47.5 billion USD, which is coming from the
(02:01:08):
central government and state-owned banks and enterprises.
That'll back semiconductor and
compute efforts.
Jeremie (02:01:18):
Yeah. You can think of this as China's answer to the Biden
administration's CHIPS Act, right,
the CHIPS and Science Act, which was $39 billion for
chip makers. But then there was an additional $75 billion in loans
and guarantees. So you're talking about here something in the same
orbit. You know, China obviously trying to achieve semiconductor
self-sufficiency for all kinds of strategic reasons.
(02:01:39):
Turns out the largest shareholder in this fund is going to be
China's Ministry of Finance. So you're very much looking at, the
classic sort of Chinese, PRC central planning, game here.
So, you know, that's maybe in a bit of contrast to the approach
that's been taken in the States; here it looks like they're
actually going to be owning some
(02:02:00):
equity in the outcomes. But yeah, really interesting.
Beijing has struggled with getting these things to pay off.
By the way, we've talked about this before, but lots of corruption,
lots of fraud as the capital just flows into this ecosystem.
And that's just the cost for them of doing business.
They need to burn some of that cash to take bets on a large
number of potential semiconductor companies to let them achieve what
(02:02:22):
they're trying to achieve.
Andrey (02:02:23):
And on to the next section, synthetic media and art.
We only have one story, so this is the last story of this big long
episode. And it is:
Alphabet and Meta offer millions to partner with Hollywood
on AI. Yeah.
So, for instance, Warner Bros.
Discovery Inc has expressed a willingness to license some of its
(02:02:45):
programs to train video models.
It sounds like not everything.
Meanwhile, Disney and Netflix aren't willing to license their content to
these companies, but have expressed an interest in other types
of collaborations. So this is a bit of an overview article, and
I also found it interesting that it notes that no major studio has so
(02:03:06):
far sued a tech company over AI.
So these Hollywood companies, unlike, let's say, music companies,
are maybe taking a more friendly stance in general.
Jeremie (02:03:19):
Yeah, yeah. I'm really curious how this shakes out.
There's so much money at stake. And one of the challenges they
face too, right, is just the democratization of the ability to
generate Hollywood-quality video.
Right. I mean, it's not impossible that, you
know, video production gets, like, Midjourney-fied in that sense.
(02:03:39):
So we get Midjourney for video, or Stability for video,
in a very reliable way, sometime in the coming, I would say, coming
years, coming months maybe even. And at that point, you know,
you're really leveling the playing field between the insane
production costs in Hollywood, which is obviously why they're
interested in this tech in the first place,
and then the indie developers, indie, I guess,
(02:04:01):
producers, if you will. I'm not a Hollywood guy, but, you know, that
population. So this could signal a structural long-term
shift in the way that movies and content are generated
and consumed.
Andrey (02:04:14):
And with that, we are done with this episode of Last
Week in AI. Again, I'll mention the newsletter at
lastweekin.ai.
And yeah, this was a good episode to have you back, Jeremy.
Lots to talk about and lots of cool drama,
let's say.
Jeremie (02:04:32):
Yeah. No shortage.
Andrey (02:04:34):
So as always, I will mention that if somehow
you're not subscribed, we would appreciate it if you did.
In particular, if you head over to the YouTube channel, it's called
Last Week in AI, it might be helpful to subscribe and like and stuff;
I don't know, algorithms rule all these systems.
And speaking of algorithms, if you do stop by and give us a review
(02:04:55):
on Apple Podcasts or Spotify, that will help us reach
more listeners. So we would like that review
and any feedback you've got.
But more than anything, we do enjoy people listening, so please do
keep tuning in and enjoy the AI-generated song that
will now start.
Jeremie (02:05:15):
Ooh!
Unidentified (02:05:17):
Oh, hey. Hi. Oh, wow.
Oh, hey. Hi. Oh, wow.
Oh! Jeremy. Marisol.
Hey.
(02:05:38):
Hey, now.
Last week. Last week.
News insights.
Minds as well.
Jeremie (02:05:53):
Oh, breakthroughs can't slow down.
Unidentified (02:05:57):
Jeremy back with flair.
This week. This week I do feel like.