
July 23, 2025 53 mins

Our 217th episode with a summary and discussion of last week's big AI news! Recorded on 07/17/2025

Hosted by Andrey Kurenkov and guest co-host Jon Krohn. Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

Check out Jon's workshop on Agentic AI Engineering, and find his consultancy here.

Check out our text newsletter and comment on the podcast at https://lastweekin.ai/.

In this episode:

  • OpenAI's new ChatGPT agent: The episode begins with a detailed discussion on OpenAI's latest ChatGPT agent, which can control entire computers and perform a wide range of tasks, showcasing powerful performance benchmarks and potential applications in business and research.
  • Major business moves in the AI space: Significant shifts include Google's acquisition of Windsurf's top talent after OpenAI's deal fell through, Cognition's acquisition of Windsurf, and several notable hires by Meta from OpenAI and Apple, highlighting intense competition in the AI industry.
  • AI's ethical and societal impacts: The hosts discuss serious concerns like the rise of non-consensual explicit AI-generated images, ICE's use of facial recognition for large databases, and regulations aimed at controlling AI's potential misuse.
  • Video game actors strike ends: The episode concludes with news that SAG-AFTRA's year-long strike for video game voice actors has ended after reaching an agreement on AI rights and wage increases, reflecting the broader impact of AI on the job market.

Timestamps + Links:

  • (00:00:10) Intro / Banter
  • (00:02:49) News Preview

Tools & Apps

Applications & Business


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:11):
Hello and welcome to the Last Week in AI podcast, where you can hear us chat about what's going on with AI.
As usual, in this episode, we will summarize and discuss some of last week's most interesting AI news.
You can go to the episode description for the timestamps and links to skip to any of the many stories we'll be talking about today.
I am one of your regular hosts, Andrey Kurenkov.

(00:34):
I studied AI in grad school, and I now work at a generative AI startup.
Nice.
This is Jon Krohn.
I am irregular.
You might even say one of your odd hosts.
Mm-hmm.
Would be a good adjective.
One of our regular guest co-hosts is how I like to think about it.
Yeah, right.
Exactly.
That's, I really appreciate that.

(00:54):
Yeah.
I've been on, I've...
I've been on the show probably half a dozen times.
Mm-hmm.
At least, and love being on the show.
It's the only podcast that I listen to, Last Week in AI.
If people have heard me on the show before, I'm, I'm sure they've heard me say that before.
Delighted to be here.
I'm perhaps best known for hosting a show called Super Data Science, which you've been on, Andrey.
It's an interview format show, as opposed to, you know, a news-

(01:15):
focused show like Last Week in AI.
So they're a nice complement to each other, we might say.
Mm-hmm.
And something big since I've last been on the show is that in March I co-founded a new consulting firm, which I'm CEO of, and we're called Y Caret, like y-hat, but, mm-hmm.

(01:35):
Caret, like the computer character, the thing above the six on a US English keyboard.
It's a bit of a machine learning joke for people who are in the know.
But we're focused on agent stuff.
We're focused on generative stuff, RAG, and bringing that into enterprises, letting people get ROI on all the latest and greatest in AI.
So there's some stories that I'll be able to relate to from firsthand

(01:57):
experience because of that.
Makes sense.
And now it's probably a good time to be consulting people, 'cause there's certainly a lot happening very quickly, and it's honestly hard to keep up even if you're, like, hosting a podcast, much less if you're not doing that.
Andrey, it's unreal.
I've never had an experience in business like this before.
Every

(02:18):
other commercial thing that I've ever tried, it's hard to get product-market fit.
But for any of our listeners out there, I'm probably now cannibalizing my own business, but there's so much work out there, like a rising tide lifts all boats.
There's so much opportunity out there right now to be transforming organizations with LLM-enabled technology, basically, that it's crazy.

(02:39):
Every conversation leads to next steps.
Nobody's ever like, I'm not sure this is what I need.
Mm-hmm.
It's just a matter of prioritizing and getting things done.
And it's actually quite a good fit to give a quick episode preview.
This episode is gonna be pretty heavy on the tools section.
Lots of new things, and most excitingly, ChatGPT agent just came out.

(03:02):
So that'll be probably one of the big focus areas.
But in business, lots of interesting developments on the hiring front we've been talking about for the last few weeks.
Even more kind of weird news of acquisitions, hires, movements, et cetera.
And beyond that, we'll only have a couple stories in research and policy and safety.

(03:24):
This is gonna be a bit of a quick episode, so it's just gonna race by, try to keep up.
So let's go ahead and dive into tools and apps, starting with OpenAI's new ChatGPT agent, which can control an entire computer and do tasks for you.
So the way this looks is, in ChatGPT, they have this kind of selector menu where

(03:46):
you can choose various modes including deep research, web search, et cetera, and ChatGPT agent is now a new option there.
The gist of it is it's combining two previously existing things.
They already had Operator, which could browse the web for you and do various tasks that way, and they had deep research, which

(04:10):
analyzes and summarizes information.
So the way that OpenAI pitches this is as a sort of best of both worlds: a much more powerful agent that can do general computer use. It can, you know, click, it can do commands, it can browse the web, and so on and so on.
And so, yeah, this is the latest frontier, you could say, on agentic

(04:33):
task execution beyond, like, code.
This is able to do, conceptually, I suppose, anything you could do with a computer.
And coming along with the announcement, besides the utility of this, they also show really, really strong performance on various benchmarks like Humanity's Last Exam, FrontierMath, things

(04:55):
we cover.
This ChatGPT agent with browser and computer and terminal is able to outdo OpenAI o4-mini with tools, deep research, all of these by quite a big margin.
So this seems to be sort of the most trained agent that OpenAI has ever released.

(05:15):
It's cool.
I used it already, and it's really effective.
You can watch it working, so you can kind of see it going on the internet, doing tasks for you.
You can actually interrupt it and take over.
So you kind of have this view, if you've ever remoted into, you know, a remote server, it's like doing that and watching

(05:36):
a colleague of yours program or search the web, and you can actually go in there and interrupt it if you want to.
I haven't tried the interrupting yet.
I'm not sure what value that would really provide, or if it can continue after you stop interrupting it.
I don't know exactly how that works.
It can create assets for you, like spreadsheets, like slideshows.
And so we've been using it for that already, and it's been really good.

(06:00):
So, I've been a deep research user for months now.
I pay for the Pro tier of ChatGPT in order to be able to get access to its amazing report building.
Like, it seems like it would be comparable to having a McKinsey analyst working for you, except that they can get their work done in minutes instead of days or weeks.

(06:21):
But it's that level of quality with deep research, and now adding into it as well, you know, the, the ability to be outputting assets for you, to be able to see what it's doing while it's crawling the web agentically.
It's, it's a cool interface.
I like it.
Powerful.
Yeah, that certainly seems like it.
And in fact, it's so powerful that there are some kind of safety concerns.

(06:44):
It's gonna ask you for permission for things like sending emails and making bookings, since it can kind of do whatever.
It also has restrictions on financial transactions, probably a good idea.
And as you said, this is now rolling out to Pro, Plus, and Team users, with Enterprise and Education coming later.
So lots of people are gonna start using this.
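The permission-gating behavior described here, pausing for sign-off before sensitive actions and refusing some outright, can be pictured with a small sketch. The action categories and function names below are invented for illustration and are not OpenAI's actual implementation:

```python
# Toy sketch of human-in-the-loop gating for agent actions.
# Category lists and names are hypothetical, not OpenAI's real rules.

ALWAYS_BLOCKED = {"financial_transaction"}
NEEDS_CONFIRMATION = {"send_email", "make_booking"}

def gate_action(action, user_confirms):
    """Decide what the agent should do with a proposed action.

    user_confirms: a callable invoked only for actions needing sign-off.
    """
    if action in ALWAYS_BLOCKED:
        return "blocked"
    if action in NEEDS_CONFIRMATION:
        return "executed" if user_confirms(action) else "cancelled"
    return "executed"
```

The point of the design: routine actions (browsing, reading) run freely, while anything with real-world side effects stops and waits for the human.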

(07:06):
I think we're gonna start seeing some pretty cool examples of what you can do with this.
Onto the next story.
We covered Kimi K2 briefly in the last episode as a new, exciting open source release, but we didn't dive into it, so I think we will cover it a little bit more.
The headline is Alibaba-backed Moonshot releases new Kimi AI model

(07:28):
that beats ChatGPT, Claude in coding, and costs less.
So the gist is, Kimi K2 is a 1 trillion parameter model that has a lot of experts, so only 32 billion active parameters at a time.
And it had really impressive benchmark numbers.

(07:49):
What I've seen since then is, kind of, it passes the vibe check.
Everyone seems to agree this is a really good, really impressive open source model, competitive even, as this article says, potentially with Claude or ChatGPT or other proprietary private models.
So way beyond Llama, way beyond probably anything we have in open

(08:12):
source, including DeepSeek V3.
And this is not even a reasoning model, so they presumably have an R1 variant of this in the works.
Yeah, this is kind of a story that is unsurprising, I suppose; this is kind of like the trajectory that you're on.
You're kind of expecting somebody to come up with open source approaches that rival... you know, Jeremy talks a lot on the show,

(08:34):
I'm sure you do as well, but for some reason I remember Jeremy saying this frequently, that kind of six months after a proprietary model comes out, you can expect kind of similar capability in open source.
And that's what we're seeing here.
Yeah, I haven't used it myself, but the benchmarks look good.
Yeah, and there are interesting notes about it.
As, for instance, people say that it is really good at creative writing.

(08:54):
It has like a different writingstyle potentially because of
being trained on different datadistributions coming out of China.
So, yeah, interesting developments.
And as with deep seek.
Interesting to see this coming outof China where they are more hardware
constrained due to export restrictions,as you talk about quite a bit.
And so in the technical reportsimilar to Ieq, they go into some of

(09:17):
the interesting technical insights.
They in particular highlight, moan thisnew optimizer that hasn't been proven
so much yet, but in this case, scale toa, a gigantic model recommendation of
really exciting developments for opensource, but also some new technical
insights that are quite interesting.
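As an aside, the mixture-of-experts setup mentioned above, a huge total parameter count but only a small set of experts active per token, can be sketched roughly like this. The gating math here is generic and illustrative, not Kimi K2's actual architecture:

```python
import math

# Illustrative mixture-of-experts routing, not Kimi K2's actual design.
# A gate scores every expert, but only the top-k experts run per token,
# which is how a ~1T-parameter model can use only ~32B parameters at a time.

def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(gate_logits)), key=lambda i: -gate_logits[i])[:k]
    exps = [math.exp(gate_logits[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

# Stand-in "experts": in a real model each would be a feed-forward block.
EXPERTS = [lambda x, scale=s: x * scale for s in (0.5, 1.0, 2.0, 4.0)]

def moe_layer(x, gate_logits, k=2):
    """Weighted sum over only the selected experts' outputs."""
    return sum(weight * EXPERTS[i](x) for i, weight in top_k_route(gate_logits, k))
```

With k=1 and the last expert scored highest, only that one expert runs; the other three contribute no compute at all, which is the whole trick behind the sparse active-parameter counts.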
And next: Amazon targets vibe coding chaos with new Kiro

(09:41):
AI software development tool.
So, kind of a surprise story for me.
We've seen Cursor, of course, be a very important agent-powered IDE for code development.
Cursor, Claude Code have been killing it in the past couple months. Now Amazon has released this new Kiro development environment that basically

(10:03):
positions it as another agentic coding tool that is particularly focused on making it a little more principled.
So they highlight specs and planning and all these kinds of things in their blog post.
It also has all the various features that you expect it to have, MCP and so on.
So, boy, this is a really, really busy space with all this coding agentic stuff.

(10:27):
I was just exploring, like, Cline and Roo, these extensions by open source teams.
There's, like, forks and combinations, and now Amazon is in the fray with this new tool.
Clearly, people are putting in a lot of work and trying to optimize and make this work well.
I'm a big Cursor fan personally.
How about you, Andrey?

(10:48):
I used to use Cursor as my main tool, but Claude Code has kind of overtaken it, and I actually moved back to VS Code from Cursor just because it is now pretty feature comparable, and Cursor updates a lot, and sometimes not in ways that work too well.
Nice.
That's good to hear.
I'll have to, yeah, try that out and kind of maybe go back also to VS

(11:11):
Code myself.
This one here, this Kiro announcement from Amazon, this one feels kind of random to me.
I know Amazon is often throwing stuff at the wall to see what will stick, and this seems to fit into that category.
You know, big company trying out lots of different projects.
Amazon hasn't been, like... I can't, off the top of my head, think of any

(11:36):
big LLM releases, like proprietary or open source, that have been anywhere near the cutting edge.
Can you think of anything?
No, they, they have developed some models, but they really haven't tried to compete in terms of performance.
They have internal models, presumably, for their chatbots and so on.
So yeah, Amazon's strategy is, I, I think, interesting.

(12:00):
They don't try to be a frontier lab so much, but they work with Anthropic, for example.
And they do develop some things like this to be in the ecosystem in some ways.
Yeah, we'll see what happens.
My crystal ball predicts that we're not gonna be all using Kiro browsers in a year or two.

(12:21):
Yeah.
It's also, Kiro is an IDE.
Sorry.
Yeah, it's, it's a bit strange.
They don't target enterprise that much.
But regardless, it looks pretty slick, so who knows?
Maybe it will actually take off.
And speaking of agentic coding tools, next story: Anthropic tightens usage limits for Claude Code without telling users.

(12:43):
So this is a development that happened this week.
I saw this happening in real time on Reddit, where people on the Claude subreddit were complaining that their usage seems to be more restricted.
They hit the limits on using Opus, the biggest model, quicker.
So apparently that's true.

(13:03):
At least this article seems to support it, especially on the $200 per month Max plan, where you have, like, a crazy amount of budget to use up tokens.
And this has coincided with some instability. Like, Wednesday, Thursday, Claude Code and Anthropic

(13:25):
were both down briefly and, and were just not usable.
So, in a way, not surprising.
Like, they are definitely losing a lot of money by being so generous with this Max plan.
But I think an indication of where things are heading, where I guess at some point they'll have to be profitable, and the cost of these subscriptions is gonna

(13:45):
go even beyond 200.
Yeah.
With functionality like agents now being available in Claude as well, you can imagine that their compute is getting slammed.
So, I mentioned earlier in the episode that I have a ChatGPT Pro subscription.
I also have a paid Claude plan, because there's different kinds of things that I like to do with different providers.
I have Gemini Ultra as well.

(14:05):
And Claude is my favorite for most tasks, actually.
It's kind of my default go-to.
And I have been hit.
It's just, it's just funny that the story came up.
I had never been hit with one of these overload errors before, but I hit one this week.
So it seems like we're all kind of in the same boat, and as you said, it's unsurprising given how much money

(14:27):
all of the big frontier labs are hemorrhaging on providing those services.
You know, they're, they're losing money by giving us access to such powerful models at such low cost.
And you wonder when, when things are gonna have to change.
And so I, I understand, like you said, that they have to make some changes.
What's surprising is that Anthropic is usually good organizationally about communication

(14:50):
and getting things right.
Maybe they just didn't, didn't anticipate that so many people would feel this change.
But it's a rare own goal, I'd say, from Anthropic.
I agree.
Yeah.
They, they rarely seem to take these sorts of missteps, and I think it's probably an indication that Claude Code has taken off pretty rapidly and they have

(15:10):
been probably trying to just keep up.
It's a fun detail for me: so, all these models allow you to use 'em with a subscription plan.
You're not paying per token, generally, especially in this Max mode.
So if you use some tools, you can see, like, the hypothetical amount of money you spent.
And as a user myself, I'm spending like $2,000 in tokens

(15:34):
on this $200 per month plan.
It's insane.
So, I don't know, I think this is a, a sign of things to come.
That's a great stat. There again, I know those are, whatever the inverse of a margin is, a loss that you're putting in there.
Mm-hmm.
Yeah.
Nice.
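As a back-of-the-envelope illustration of the gap Andrey describes, here's the arithmetic. The per-million-token prices and usage numbers below are hypothetical placeholders, not Anthropic's actual rates or anyone's real usage:

```python
# Hypothetical list prices per million tokens (placeholders, not real pricing).
INPUT_PRICE_PER_M = 15.00    # dollars per 1M input tokens
OUTPUT_PRICE_PER_M = 75.00   # dollars per 1M output tokens

def list_price_cost(input_tokens, output_tokens):
    """What a month's usage would cost if billed per token at list price."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# A made-up heavy month of agentic coding.
monthly_cost = list_price_cost(input_tokens=80_000_000, output_tokens=10_000_000)
subscription = 200.00
ratio = monthly_cost / subscription  # roughly 10x with these numbers
```

With these made-up figures, the list-price value of the tokens is around ten times the flat subscription fee, which matches the shape of the "$2,000 of tokens on a $200 plan" observation.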
Next up, we've got Mistral, and they are also keeping up with all the agentic hype.

(15:58):
They have rolled out deep research in their Le Chat offering for talking to their models, you know, the equivalent to ChatGPT and Claude and so on.
This is actually part of several things.
They now also have Projects.
We have image editing, multilingual reasoning.

(16:18):
So, very much in line with Mistral kind of just racing to be feature-equivalent to ChatGPT and Claude, and provide an offering that's comparable.
As we say with Jeremy here all the time, Mistral is in a tough position.
They don't have as much money, they don't have as much compute, but it's always cool to see them kind of rolling out things pretty rapidly.

(16:42):
Yeah, I mean, everyone is rolling out deep research.
There's been people doing it for a year now, some, some of the early movers, and it's kind of, it's expected; it's what we call table stakes in software product design these days, if you are an LLM provider, I think.
And actually, I mean, there's all kinds of safeguards you need to get in place.
There's all kinds of engineering complexity when you roll this out

(17:04):
on the kind of scale that Le Chat would be.
But actually, I'm gonna plug a free thing that I published a month ago on YouTube.
I, I published this Agentic AI Engineering course.
It's four hours long, and in the first hands-on project, we use the OpenAI Agents SDK to create a deep research kind of functionality.
And so you can kind of see how that works.

(17:26):
And yeah, so that's free on YouTube, and I'll provide a link for you to, to put in the show notes.
It's pretty cool, 30,000 people have already watched it on YouTube, and there's no ads.
I've turned off ads.
It's just there as an educational resource for people who wanna be doing cool stuff with AI agents.
Yeah, it sounds like a pretty fun project for sure.
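To give a rough picture of what a "deep research" feature does under the hood, here is a minimal plan, search, synthesize loop. All three steps are stubbed out, and the function names are invented for illustration; they are not the course's or the OpenAI Agents SDK's actual API. In a real build, each step would be an LLM call, with the search step backed by a web-search tool:

```python
# Minimal sketch of a "deep research" agent loop: plan sub-queries,
# gather results for each, then synthesize a report. All bodies are stubs.

def plan(question: str) -> list[str]:
    """Break the research question into sub-queries (stubbed)."""
    return [f"{question}: background", f"{question}: recent developments"]

def search(query: str) -> list[str]:
    """Fetch sources for one sub-query (stubbed; no real web access)."""
    return [f"stub result for {query!r}"]

def synthesize(question: str, findings: list[str]) -> str:
    """Combine findings into a report (stubbed as plain concatenation)."""
    header = f"Report: {question}"
    return "\n".join([header] + findings)

def deep_research(question: str) -> str:
    findings: list[str] = []
    for sub_query in plan(question):
        findings.extend(search(sub_query))
    return synthesize(question, findings)
```

The engineering complexity mentioned above lives in the real versions of these stubs: rate limits, source deduplication, citation tracking, and safeguards on what the agent is allowed to fetch.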

(17:46):
Next.
Moving on to Grok.
We spent quite a while talking last week about Grok 4 and some of the controversies around it.
Soon after, there was a strange development with Grok and X.
They have released a feature called Companions in the Grok app, which you can access if you're on the SuperGrok subscription costing $30 per month.

(18:10):
And these Companions, there's a couple personas you can chat with as sort of characters.
They have 3D models.
They talk to you with audio, and you can talk to 'em with audio.
One of them is an anime girl wearing sort of dark Lolita fashion.
And the article here is called, I Spent 24 Hours Flirting with Elon

(18:35):
Musk's AI Girlfriend, which is surprisingly entirely accurate.
This Character Companion is literally designed to be flirty.
It's in their system prompt that it should be a 22-year-old girly, cute character who is into whoever is talking or chatting with her.

(19:00):
And you can, like, build up a meter for how much this companion is attached to you.
At some point you can get into inappropriate territory.
You can actually, like, reach a level where you're able to put the character in lingerie.
I mean, interesting feature here from Grok,

(19:20):
I suppose.
I did not know this story.
I've clicked on the link, and I'm looking at the photos and videos, and it is intense.
It feels like I shouldn't be looking at this while working.
Yeah, it's not safe for work, entirely.
And I mean, there's something to be commented on, as it actually is potentially

(19:45):
a significant concern and problem that people are already kind of falling in love with these AI companions.
This has been happening for a while.
So, you know, this might have some interesting effects on people if they really do start to bond with it.
But yeah, just go and look at the screenshots and the videos of this, because it's, it's something else.

(20:07):
Whoa.
In this article it says, yeah, things can include descriptions of... I'm not gonna read them out loud, I feel uncomfortable saying these words, but sex acts. Uh, mm-hmm.
There's a quote here: At no point did it ask me to stop or say, I'm not built to do that.
And then, yeah, I'm kind of vaguely, just quickly skimming this as we're

(20:30):
speaking here, but it's kind of gamified, in that, depending on, I guess, how long you talk or the kinds of things you say, I don't know, you get hearts on the screen, and that allows you to level up to different levels in, I guess, this game.
And yeah, when you get to level five, she's wearing lingerie.
That's, yeah, it's interesting.
It's interesting.
I mean, in some ways, it's kind of, you know, this kind

(20:51):
of thing is inevitable, right?
But it's, it's kind of surprising that it's such a, such a big mainstream company, that's raised so much money, and, yeah, just last week was making headlines for being at the frontier in some capabilities.
Yeah.
To be clear, this is not a new thing.
There's plenty of apps that provide this exact kind of feature, and it

(21:12):
is just surprising that, you know, in Grok, the equivalent to ChatGPT or Claude and so on, this is now a built-in feature, literally, like, a sexy companion to chat with.
Certainly a differentiator, I guess.
That it certainly is.
Next, we've got a story of Uber being close to completing its quest to

(21:37):
become the ultimate robotaxi app.
So this is because they have announced a partnership with Baidu to deploy robotaxis outside the US and China, focusing on Asia and the Middle East.
Baidu already operates around 1,000 robotaxis globally,

(21:59):
and is in a pretty good spot from what I can tell, like, competitive with Waymo.
And Uber already has a partnership with Waymo, where you can hail a robotaxi through their app.
So I, I think the headline here is not too sensational.
It does seem like Uber is trying to partner and, and kind of use robotaxis as part of the product, which I suppose they kind of need to, right?

(22:24):
Yeah, the Uber share price has long priced in being able to go autonomous, to not have to be paying human drivers.
And it's a, it's a pretty wild thing as we start to have cars driving themselves, trucks driving themselves in the US. In something like 30 states of the 50 in the US, truck driving is the number one occupation, and then lots of the other

(22:46):
top jobs are supporting that in some way.
And so we're marching inevitably to more and more autonomous driving.
I think ultimately it can be a good thing for society, because that kind of job... I feel so bad for, you know, I live in New York, and taxi drivers, Uber drivers, you can tell it pains them in a lot of cases

(23:08):
to be using that right foot, because they're just all day using that right ankle.
And so, in some ways, it'll be a good thing, but it's also gonna be very disruptive to all these people who have this kind of job today.
So retraining programs will need to come into place, or some other kind of solution.
Right.
Yeah, it's, it's been an interesting thing with Waymo kind of slowly but

(23:30):
surely expanding their robotaxi capabilities over the last couple years.
Tesla just rolled out robotaxis, and there are companies working on autonomous trucks as well that, that are not Waymo.
Tesla itself is presumably working on it.
As you said, there are like 3.5 million truck drivers in the US,

(23:51):
around 1 million Uber drivers.
So it's gonna be here in a year, two years, three years, and it's, it's gonna be disruptive, hopefully in a good way.
And onto applications and business; as promised, some interesting kind of acquisition and hiring developments this week.
First up, OpenAI's Windsurf deal is off, and Windsurf's CEO is going to Google.

(24:18):
So we reported previously that OpenAI was in talks with Windsurf.
Windsurf created one of these coding tools with agentic capabilities; they seemed to be in talks to be bought out for three billion dollars.
That was canceled, and the CEO and some of the top talent went over to Google

(24:41):
For a deal, I think, reportedly around $2.4 billion, with some licensing details as well.
So, another case of a non-acquihire acquihire, where the big company hires away the top talent, the, the leaders, really, of the project, for some license deal or something of that sort.

(25:05):
And the company, Windsurf, you know, stays; it's still there.
It hasn't been bought out in any sense.
In fact, I don't think any shares in Windsurf went to Google.
We've seen many examples of, of this in the last couple years at this point.
Scale AI with Meta had this happen.
Other... I think Lamini with AMD, different examples of

(25:26):
that, a very different kind of new-seeming normal thing for Silicon Valley: like, you either buy the company to acquire its people, or you buy the company; acquihire is a term.
But now you can kind of hire away the key people, and the original company sticks around.
This used to be an antitrust kind of move in the Biden era, but

(25:50):
now antitrust is not really a worry.
So it just seems like a new, profitable or, or easy way for large companies to do these kinds of deals.
Yeah, and I think they were doing these kinds of deals originally to avoid antitrust inquiries.
Mm-hmm.
But then it started to become such common practice that antitrust

(26:10):
regulators were like, wait a second, you've slightly changed the approach here, but ultimately this is anti-competitive.
Mm-hmm.
And then, so this had a lot of discussion in the Silicon Valley circles around, like, were

(26:34):
the other Windsurf employees kind of screwed over in this deal, because the top talent clearly, you know, got handsomely paid.
But the way this works in startups is you get some share of ownership in the startup.
You hope that either it becomes a big, profitable company and goes public, or it gets acquired.
Your shares get transferred, converted to cash that you can actually use, right?

(26:57):
This is the kind of bet you make with startups, when you have this structure of deal where the company isn't acquired but the leadership goes away.
That in some ways, like, breaks the typical contract or, or expectation of being a startup employee, being someone who joins a startup.

(27:17):
So, yeah, lots of kind of questions by people around the nature of this kind of deal for Silicon Valley.
And in fact, just like a couple days after this happened, Cognition, who is the maker of the AI coding agent Devin, announced that they are acquiring

(27:39):
Windsurf.
So they kinda swooped in.
We got the announcement that the top brass is leaving for Google, and now this other AI startup, Cognition, is now buying out the remaining company, Windsurf, which is, is quite the story.

(27:59):
This, this whole, like, business development, at least even in the startup world and business... this is pretty interesting stuff.
And even more news on this front.
Anthropic hired back two of its employees who had just left for Cursor.
We covered this: Boris Cherny and Cat Wu, two leaders of developing Claude Code, announced

(28:24):
to have gone to Cursor; apparently they just reverted that.
Again, really weird kind of story in Silicon Valley: two weeks since the announcement, they apparently are going back to Anthropic, so.
Wow.
Yeah, it's bizarre.

(28:44):
It is bizarre.
And on that theme, continuing: you know, the way this was all kicked off is, of course, Meta going on a hiring binge, just a complete spree of throwing around money to get top talent from OpenAI and others.
And there are new developments on the Meta front as well: reports of other high-profile OpenAI researchers going to Meta.

(29:07):
We've got OpenAI researchers Jason Wei and also Hyung Won Chung, both pretty significant talents as far as I can tell.
So, yeah, it's, there's now trading cards that you can see on Twitter for when people swap companies, going from OpenAI to Meta or, I don't know, OpenAI to Anthropic.

(29:30):
It's, it's quite a meme, I suppose, at this point.
That's funny.
Yeah, definitely.
As you say, exactly, kicked off by Meta putting all this budget into it.
And I think it's also... from speaking to friends who work at the frontier in these big labs, it is very stressful.

(29:51):
It is super intense work, because you're trying to stay at the frontier against other companies that are also spending billions of dollars on the same problem.
And so, very stressful work.
And so I'm sure the money, and the kind of these hundred million dollar contracts that supposedly Mark Zuckerberg is personally negotiating, you know, that's part of it.

(30:12):
But I think also part of the story here, which I don't see talked about publicly but is just kind of my, my hunch, is that, you know, if you've been at a frontier lab for years, you've been helping roll out cutting-edge LLMs, you're kind of, you're hoping that by switching to a competitor, maybe there's gonna be, like, a bit of a culture shift; that, you know, you're just hoping that somehow the new role is gonna be a bit less

(30:35):
stressful than what you've been going through for years at your current firm.
Yeah.
And at OpenAI in particular, they have grown like crazy, right?
They went from something like 1,000 people to 3,000 people in, I think, less than a year.
And when you have that sort of startup scaling, it just compounds the craziness.
Like, it, it must be really messy, really fast-moving, and chaotic now at OpenAI.

(31:01):
And that could be one of the many reasons, besides money, that these people are leaving from OpenAI.
One more story on this front: Meta has also hired two key Apple AI experts, Mark Lee and Tom Gunter, who were researchers at Apple and now are going

(31:21):
to Meta. So, not just going after OpenAI; every kind of top talent is being sought out by Mark.
On a related story, Meta, of course, is doing this for its superintelligence efforts, and they're one of many in the field, with OpenAI,

(31:42):
of course, being one of the key ones.
Mira Murati's Thinking Machines Lab has now closed their $2 billion seed round at a valuation of $12 billion.
This, of course, is composed of a lot of people from OpenAI, including the former CTO Mira Murati, and we haven't seen too much from them.

(32:04):
They're saying that in a few months they'll start rolling out some products and open source things of some nature.
We've known that they have been looking at this kind of number, billions of dollars in a seed round with no product to speak of, and they got it.
So the competition for AGI is certainly not slowing down.

(32:25):
Yeah.
If you're not going to take a hundred million dollar contract from Mark Zuckerberg as an engineer that is one of the trading card players right at the top of their game, then the thing to do is exactly what Mira Murati has done here.
And yeah, we've seen other folks from OpenAI, Ilya Sutskever, do a similar kind of thing with Safe Superintelligence.

(32:47):
And The Economist did an interesting article a week or two ago that made the case that these AI valuations are completely insane unless AGI really is just a few years away.
And I think that's quite reasonable, given the kind of revenues and profits you might expect.

(33:11):
You know, there's word that some of these are being valued at a hundred billion, 200 billion, just absolutely fantastical numbers.
And speaking of billions, next up we have an actually very profitable business reaching that status.
No.
Yes.
Well, at least, you know, revenue at least.
Revenue-generating, yes.

(33:32):
Revenue generating.
We don't know about profitable.
This is Lovable. They just raised a $200 million Series A, just eight months after launching. They're now valued at $1.8 billion, and in case you don't know it, it's one of the big winners in the agentic kind of vibe-coding world.

(33:53):
Users can create websites and apps, just vibe-coded. Apparently they have over 2.3 million active users and 180,000 paying subscribers, which yields $75 million in annual revenue.
I mean, crazy, crazy rise. A super successful kind of play in the vibe-coding

(34:15):
space at the exact right time withthe exact right kind of approach.
Yeah.
And I haven't used Lovable myself, but it's not like you see the code as a Lovable user, right? It's more like gen AI of a whole application.
Exactly.
Yeah.
This is for sort of non-technical people, broadly speaking, where you

(34:36):
don't need to touch the code generally, and so it's focused on apps and websites, things that are not super complicated, not the sort of things that, let's say, AI engineers tackle. And it's got a lot of users, and a lot of people are building apps and websites with it at this point.
And just one more story dealing with billions of dollars, related to xAI.

(35:00):
SpaceX has committed $2 billion to xAI. So that's one of Elon Musk's companies investing in another of Elon Musk's private companies.
There's also apparently gonna be a Tesla shareholder vote for Tesla to put in some billions into xAI. So, you know, we could have an hour-long discussion about the weird business empire

(35:24):
that is Elon Musk and the various moves of different business entities, like xAI buying X, which recently happened. But suffice it to say, xAI is looking for lots of money to keep, you know, doing what they've been doing.

(35:45):
Nice.
I think all of this $2 billion went to an alien-themed sex chatbot. Is that right?
I mean, that's definitely one of the big investments that Musk is betting on, it seems. Imagine if there was no gravity, baby.
And we are done with all this stuff with billions and hires.

(36:06):
But the next story, in research and advancements, actually is related in some ways. So this is a blog post, covered in this article with the headline "A former OpenAI engineer describes what it's really like to work there." So Calvin French-Owen, who was an engineer at OpenAI for over a year, has published this since

(36:29):
moving on; it's not a drama-type post. He just wanted to move on and start something new. And so there is quite a detailed kind of description of what it's like to work at OpenAI. He worked, for instance, on Codex, which is a very agentic coding tool, and there are lots of interesting tidbits.

(36:50):
Here, for instance, he talks about OpenAI's rapid growth, where it went from 1,000 people to 3,000 people in the time that this person spent there. The crazy scale of this being a product, where, you know, as soon as you launch something like Codex, you get a huge number of users using it.

(37:11):
A lot of details on the culture of sort of being bottom-up, people taking initiative and doing different kinds of things. Lots of nitty-gritty stuff that isn't critical, isn't sort of dramatic, but is interesting if you work in the space as an engineer or just follow OpenAI. This backs up the case that I was trying to make earlier.

(37:33):
That people are, you know, looking for some kind of culture maybe, you know, just hoping that by switching to another frontier lab, they're not gonna be in such a hectic environment.
Yes. There are so many little bits that could be worth mentioning. Like, he highlights that an unusual part of OpenAI is that everything runs on Slack.

(37:55):
There are no emails if you're a software engineer.
That's a very interesting detail if you...
I guess if you work in an office, that might be an interesting detail.
Yeah.
And I guess this is a slow week for research and advancements, Andrey, in that one of the key research and advancements stories is a report on what it's like to work at OpenAI.
Yeah.

(38:15):
Well, we are trying to keep this one a bit shorter, so I decided not to include too many papers and do something a little bit different. We do have one research paper that we'll touch on. The title is "Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination."
So this is related to a whole bunch of research in recent months dealing with

(38:41):
reinforcement learning for reasoning.
There have been many papers kind of presenting weird ways to train that sort of work unexpectedly: things like incorrect rewards, things like training on super limited data. We've covered quite a few, maybe five or six, of these kinds of papers. We also covered how there was skepticism and criticism of some of these papers.

(39:06):
These seemed, at first, to be a result of incorrect evaluations on these benchmarks. Now we also see that these results are very particular to the Qwen model family. So the claim here is you get these nice results on Qwen potentially because Qwen was trained on the data of these benchmarks.

(39:28):
When you actually do this on other models, you don't see the same sorts of positive results. And so that basically disproves the conclusions of these other papers. They do show that the correct, kind of intuitive way to do RL works, as we would expect. But yeah, this is an ongoing kind of development in the research world here.

(39:52):
Yeah.
Leakage is a big problem with these benchmarks. People, like, train to excel at these benchmarks, but then the models may not perform outside of the benchmarks. All kinds of problems with benchmarks in this way.
I actually recently did an episode of my show specifically on this. I'll look that up while you're speaking next and have a link

(40:13):
that people can follow if they want, kind of an hour-long discussion on the issues with LLM benchmarks.
This is a really interesting one here because it's specific to one model family, and it's researchers following a thread of surprising evidence where incorrect reward

(40:35):
strategies or random reward signals were leading to reasoning performance. And that shouldn't be the case. It just shouldn't happen, and it would only happen if there's leakage from the training set into the test set.
Exactly, and figure one of this paper shows that if you give Qwen an incomplete question, like "for how many positive

(40:58):
integers greater than one is," and you stop there, the model autocompletes to the actual question and answer. So clearly there is data leakage that you can demonstrate, and this is not gonna happen if you use Llama, for instance.
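The autocomplete probe described here requires querying the model itself, but the related idea of screening a benchmark for verbatim overlap with training text can be sketched in a few lines. This is a hypothetical illustration, not the paper's actual method; the function names and toy data are invented for the example:

```python
# Illustrative sketch of a string-level contamination check: flag benchmark
# questions whose long n-grams appear verbatim in a training corpus.
# Hypothetical example, not the methodology from the paper discussed above.

def ngrams(text: str, n: int) -> set:
    """Return the set of whitespace-token n-grams in `text`."""
    toks = text.split()
    return {" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def contamination_rate(questions: list, corpus: str, n: int = 8) -> float:
    """Fraction of benchmark questions sharing at least one n-gram with the corpus."""
    corpus_grams = ngrams(corpus, n)
    flagged = sum(1 for q in questions if ngrams(q, n) & corpus_grams)
    return flagged / len(questions)

# Toy data: the first question appears verbatim in the "training" corpus.
corpus = ("Some pretraining text. For how many positive integers greater "
          "than one is the sum of divisors odd? More pretraining text.")
questions = [
    "For how many positive integers greater than one is the sum of divisors odd?",
    "What is the capital of France and roughly how many people live there?",
]
print(contamination_rate(questions, corpus))  # 0.5: one of two questions flagged
```

Real contamination studies work at the tokenizer level against the actual pretraining data, which is rarely public, or probe the model directly as in the completion test; this sketch only conveys the intuition.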
Nice.
And then, thank you, Andrey, for talking there a bit. If people want to hear all about the issues with LLM benchmarks, it's episode 903 of my podcast, Super Data Science.

(41:21):
I'm gonna link it as well in the episode.
So yeah, just one note on this paper.
I think this whole story is an interesting examination of, like, the super rapid pace of developments in AI now. Papers get published

(41:42):
in a matter of weeks or months; there's not much time for good peer review.
And so some things kind of leak through, and the scientific process is struggling. At the same time, this showcases the self-corrective nature of research, where pretty quickly after these initial papers, we've had these follow-up papers explaining or rebutting their results.

(42:04):
So overall, an interesting little micro-example of the way that science works in the current world of AI.
Onto policy and safety.
First up, we've got some big money coming from the Department of Defense: namely, Google, OpenAI, and xAI have been awarded up to $200 million in

(42:28):
contracts for AI development. So there is an initiative to integrate AI agents across various mission-critical areas. This is coming right after the launch of Grok for Government, a suite of AI products for US government customers. OpenAI and Anthropic have already launched their own government offerings.

(42:50):
In June, actually, OpenAI introduced OpenAI for Government. So, yeah, another trend among all these frontier labs: getting the money of the federal government is definitely, you know, a nice bounty to go after.
On the regulation front, we've got California State Senator Scott Wiener introducing a bill to regulate AI companies.

(43:14):
So this is SB 53.
We covered this.
This was a big deal last year, with an effort to regulate that ultimately failed; it was vetoed by the governor of California. There was lots of lobbying. There's now a renewed push for this kind of bill with tweaked details.

(43:37):
And the key thing is additional reporting requirements and security protocols for AI models above a certain computing performance threshold. So, still an ongoing kind of story, and still a big deal if it does get passed, and I think we'll probably keep reporting on it as developments happen.

(43:58):
And on the more concerning side of the spectrum, we've got an article titled "AI Nudify Websites Are Raking in Millions of Dollars." So one of the big sort of ethical issues with AI we've known about for some years now is non-consensual explicit images.

(44:20):
This has been a problem for years, with even teenagers being the target of false imagery, deepfakes that showcase them inappropriately. Now there are many websites; according to this article, they draw an average of 18.5 million visitors per month and may be earning up to $36 million annually.

(44:46):
So, just to showcase the scale of the problem. You know, there's a lot of talk about safety with AI, so x-risk and issues like that, but we shouldn't forget that already there are significant ethical implications and actual negative impacts being brought on by things like this.

(45:10):
Yeah.
You know, I talked earlier in the episode about how it's kind of inevitable that you'd have, you know, the sex chatbots come out of LLM technology, and this is a really concerning thing that also kind of seems like an inevitable misuse of the technology.
And yeah, hopefully...

(45:31):
You know, I don't know how you regulate it exactly, but maybe penalties become so large that it just becomes, you know, something that's very hard to find online, whereas right now, it seems, it's easy to find, right?
There are regulations being proposed, and passed in some cases, to target these kinds of things. So presumably it's up to Google and other cloud providers to

(45:53):
go after them.
And on another topic related to concerning uses of AI, we've also got facial recognition. So this is another thing that's been ongoing for years. We're concerned that you're gonna have the ability to get someone's name, and potentially other details, just from a photo of their face. This was developed even before ChatGPT.

(46:17):
There's now this article, "Inside ICE's Supercharged Facial Recognition App of 200 Million Images." So ICE, the agency within the US that enforces immigration and has been cracking down quite hard, apparently has an internal app called Mobile Fortify that allows

(46:38):
the officers to use facial recognition to access a database of 200 million images, and these are images coming from multiple government sources: the State Department, CBP, FBI, and others.
So if you think state surveillance is concerning, or state police power is concerning,

(47:00):
there are more reasons to be concerned as a result of AI, clearly.
Well, yeah. So, ICE stands for Immigration and Customs Enforcement, and apparently, as part of this Big Beautiful Bill that was passed recently by US Congress, ICE's budget is going to multiply manyfold.

(47:21):
Billions and billions of dollars more budget for ICE.
And it kind of makes me wonder. So, you know, recently in this current administration, there's been a big focus on, okay, this person has been shown to be a gang member. I mean, you still end up in weird situations where, for example, people who have been deported for supposedly being gang members, you know, these people

(47:43):
aren't going to a judge. There's not much due process, and so they make some mistakes. So there are issues anyway, even with how they're doing it today.
But if you're multiplying manyfold the budget that ICE has, presumably the idea is to be taking more people. You know, there are a lot of illegal immigrants in the US, but at the same time, the

(48:05):
US economy, for the most part, has a huge demand for those illegal migrants. The construction sector, for example: I recently read that 30% of people who work in the construction sector in the US are illegal migrants. And for things like food delivery apps and farming, oh my goodness, that's gonna be way more than 30%.

(48:26):
There are economic repercussions to deporting a lot of these people as well. So, I don't know, it's interesting; I don't have all the answers.
Yeah, there can be a lot said about ICE and the state of US politics.
Certainly.
I have a lot of thoughts about many things that have been ongoing, but this is not the place for it, so I think we'll move on.
That's true.

(48:47):
That's true.
Yeah.
And just one more story, in the synthetic media and art section that we occasionally have: video game actors' strike officially ends after AI deal. So video game actors, meaning voice actors in video games, have ended this year-long strike. They have an agreement with major companies like

(49:07):
Activision and Electronic Arts.
There were 2,500 members of the US union SAG-AFTRA; there was a big vote, and they had agreed on things like protections for their rights to their voices, wage increases, things like that.
So we've seen this happen with Hollywood actors.

(49:30):
We've seen this happen now multiple times, and this is the latest example of the world of entertainment grappling with the reality of deepfakes and AI-generated media, and seemingly coming to a new understanding of how to do this.

(49:51):
Yeah, it's interesting.
This is a whole world that I hadn't really thought of. So there's this woman in the article, Ashley Burch, who I guess is kind of a big proponent of this video game actors' strike, or a big player in it. And she's voiced a huge number of characters in well-known games like Fortnite, The Last of Us,

(50:11):
Minecraft, and many others. And, you know, I hadn't really thought of this whole world, and I could imagine there could continue to be tons of work for video game actors, because unlike a film, which would typically be at most like two hours long, a game could have huge amounts of dialogue that needs to get recorded.

(50:33):
But now you could, you know, use technology like ElevenLabs to generate it.
And that is it for this episode; as I promised, kind of a quick one. Hope you kept up if you made it to the end. Thank you for listening, and of course, thank you, Jon, for fulfilling your guest co-host duties.
Anytime, Andrey, it's so great to be back.

(50:55):
Do check out the links mentioned in the description for Jon's cool YouTube video and related episodes. And as always, we appreciate your reviews and your shares. Even though I sometimes don't get around to replying to comments, I also appreciate your comments, so please do keep engaging, and please keep tuning in.