
Is AI finally ready to do your job — better, faster, and cheaper?

In this week’s Leveraging AI news recap, host Isar Meitis unpacks a flurry of groundbreaking developments in the world of artificial intelligence — from the release of GPT-5.2 to jaw-dropping advances in recursive self-improving AI (yes, it’s as intense as it sounds).

Whether you lead a business, a team, or just need to stay ahead of the AI curve — this episode is your executive summary for everything that matters (and nothing that doesn’t).

We’ll also dig into the billion-dollar OpenAI–Disney partnership, how real users are actually leveraging AI in the wild, and why the Fed is finally admitting AI is changing the job market.

In this session, you'll discover:

  • The GPT-5.2 release: performance benchmarks and real-world capabilities
  • Is GPT-5.2 better than humans at actual work? (71% of the time, yes)
  • Why OpenAI’s new “not-an-ad” ad rollout caused a user revolt
  • OpenAI x Disney: Why $1B is being bet on AI-generated Mickey Mouse content
  • GPT-5.2’s weak spots and where Claude Opus still dominates
  • What Recursive Self-Improving AI means (and why Eric Schmidt is nervous)
  • AI designing its own hardware: A startup that could rewrite Moore’s Law
  • New usage data from OpenRouter, Microsoft, SAP & Perplexity – how people actually use AI
  • Why prompt length is exploding (and what that means for your business)
  • AI agents in browsers: the productivity revolution or a security nightmare?
  • Databricks proves AI sucks at raw documents (and how to fix it)
  • The psychological bias against AI-created work — it’s real
  • Claude’s new Slack integration: is this the dev team you didn’t hire?
  • Apple’s AI brain drain & why it matters
  • Gartner says: Block AI browsers (for now)
  • AI and unemployment: The Fed finally connects the dots

Want to future-proof your team’s AI skills?
Isar’s AI Business Transformation Course launches again in January — a proven, real-world guide to using AI across content, research, operations, and strategy.
👉 Learn more & enroll here: https://multiplai.ai/ai-course/

About Leveraging AI

If you’ve enjoyed or benefited from some of the insights of this episode, leave us a five-star review on your favorite podcast platform, and let us know what you learned, found helpful, or liked most about this show!


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker (00:00):
Hello and welcome to a Weekend News episode of the

(00:02):
Leveraging AI Podcast, a podcast that shares practical, ethical ways to leverage AI to improve efficiency, grow your business, and advance your career. This is Isar Meitis, your host, and we had another really exciting week behind us. First of all, we got the release of GPT-5.2. That is a very interesting model, and we're gonna dive into that as the first topic. Then we're also going to dive into multiple

(00:25):
research reports from several different leading labs on how people are actually using AI, for what purposes, and how that has evolved through this past year. And we're also going to dive into recursive self-improving AI, which suddenly became a big topic from multiple different sources, all in the last 10 days or so. So these are gonna be our three main topics, and then we have a

(00:45):
lot of other small topics to cover, with lots of new, interesting releases and features, and lawsuits, and interesting partnerships, including Disney and OpenAI. So we have a lot to cover, so let's get started. As all the rumors suggested last week, GPT-5.2 was released this

(01:07):
week, and it is a model that is focusing on something very specific. Instead of telling you what that specific thing is, I'm just gonna read a few quotes that either were part of the release notes from OpenAI or that several of the leading figures at OpenAI shared in different interviews and press releases, and then it will all become very clear.

(01:27):
So the first one is from the release, which says: overall, GPT-5.2 brings significant improvements in general intelligence, long-context understanding, agentic tool calling, and vision. Another quote from the release: we designed GPT-5.2 to unlock even more economic value, better at creating spreadsheets, building presentations, and handling complex multi-step projects.

(01:48):
Fidji Simo, OpenAI's CEO of Applications, said: we designed 5.2 to unlock even more economic value for people. It is better at creating spreadsheets, building presentations, writing code, perceiving images, understanding long context, using tools, and handling complex multi-step projects. She also said it is the most advanced frontier model and the strongest yet in the market for professional use.

(02:08):
Greg Brockman said: GPT-5.2 is here, the most advanced frontier model for professional work and long-running agents. Brad Lightcap said: introducing 5.2, our latest model and the most capable for knowledge work; it's a new state of the art across many benchmarks. So if it's not clear, the focus, at least from a messaging perspective, is real-life work and agents.

(02:30):
That is definitely the focus of GPT-5.2. This is where they invested most of the resources in aligning this model, and this is where they're investing all their focus when it comes to messaging why this model is important. The overall feedback across the web was very positive, especially from people who had early access, such as Ethan Mollick, who I follow all the time.

(02:51):
And Ethan Mollick was very excited about its ability to run long and complex tasks based on a one-shot, single, not-too-sophisticated prompt. The example that he gave: he asked GPT-5.2 Pro to generate a new type of shader, and with a very simple prompt, he got a complete, running, working shader that included a creative interpretation of what he requested, the mathematical

(03:14):
precision of actually developing the capabilities behind it, and the code to actually run it properly. He specifically called it a challenging assignment and not just a toy example, meaning this is something that can be used in real life, and he was able to create it with a single-shot, relatively simple prompt. Mollick was also very positive about the success of the model

(03:34):
on GDPval, which is a benchmark that OpenAI themselves invented. But he said that GDPval is probably the most economically relevant AI benchmark so far. For those of you who don't remember what GDPval is, it's a benchmark that uses 44 occupations across nine different sectors. It reflects actual, real-life professional work versus the different puzzles and specific structured things that a model needs

(03:56):
to solve. It measures tasks that usually take humans four to eight hours to complete, and 5.2 outperformed humans 71% of the time, as judged by other humans. So other humans got to see the responses without knowing which one is AI and which is not; they just pick the response they like more, and GPT-5.2 performed better 71% of the time

(04:19):
than humans when other humans evaluated the work. This is very significant. It also performed the work 10 times faster and at 1% of the cost of humans performing the same tasks. To put things in perspective, the next model is Claude Opus 4.5 with 59%, so it's 71% compared to 59%. That's a very big spread, and the spread to Gemini 3 Pro is

(04:40):
even higher. Gemini 3 Pro is at 53.5%. Now, I am not a huge believer in benchmarks in general, but they're still a good way for us to understand how the models rank, at least on specific topics. So on SWE-bench Verified, which is real-world coding tasks, GPT-5.2 Thinking is at 80%, Claude Opus 4.5 is at 81, and Gemini 3 Pro is at 76.

(05:03):
So the three are very close together, very high on that ranking. On GPQA Diamond, which is graduate-level science, GPT-5.2 Thinking is at 92.4%, the Pro version is at 93.2%, Gemini 3 Deep Think is at 93.8%, so slightly higher, and Claude Opus 4.5 is at 87,

(05:24):
a little bit behind, but all relatively close. On AIME 2025, which is a math benchmark with no tools, GPT-5.2 reached 100% accuracy, again, without using tools. On ARC-AGI, which is more of an abstract reasoning test, it's the kind of puzzle that the systems cannot prepare for; they just have to understand it. So it really measures the ability of these models to

(05:45):
understand a problem and develop a logic in order to solve it. GPT-5.2 Thinking scored 53% and 5.2 Pro scored 54%, again compared to Gemini 3 Deep Think with 45%, so a very big spread, and Claude Opus 4.5 with 38%. On Humanity's Last Exam, GPT-5.2 scored 36, still slightly behind Gemini 3 Deep Think with

(06:06):
41%. So again, the focus of this model was very clearly to develop a model that would be good at real-life work versus being really good at benchmarks. We'll have to wait a little longer to see if this focus actually translates into real value for people. You know, the model right now is only in its early stages on the LLM Arena.

(06:26):
Right now it's not doing very well there, but I assume we'll start to see it rank higher in the next few days as more and more people start using it and testing it on the LLM Arena, which conceptually tests for real-life use cases, because it's people loading their own prompts for actual things they're trying to do. So whether the promise from OpenAI is going to materialize

(06:50):
is something we will be able to measure better within the next few days or weeks, once we start seeing real feedback. From the initial use cases that I put it through, I was not impressed. It actually failed on me several times on a few very specific things that I was trying, including getting correct references to topics that I was researching, and it actually did

(07:11):
worse than 5.1, and even 5, at getting me accurate links to relevant articles on something that I was researching. And yes, that's a very specific edge case, so it doesn't imply anything else. I did not yet try it on the things that they're saying it does very well, which is spreadsheets and slides and multi-step projects. Right now, for those things, my favorite tool is Claude 4.5

(07:33):
Opus, and the same goes for coding. I find Claude Opus 4.5 an incredible coder, and I'm going to record an entire episode on the project I've done with it in the past few days, but that's for a later episode. In general, though, a new model from OpenAI. By the way, several different people specifically said that this is not related to their code red event that we reported on last week.

(07:54):
This is a model that they've been working on for months and that has had this release date for months. So the code red thing will probably be an early Q1 release, potentially January, and this was an important and yet incremental upgrade to GPT-5.1 that was scheduled for this release and has been worked on for a while now.

(08:15):
And since we're talking about OpenAI, a few more interesting things from OpenAI this week. First of all, they have quietly been testing a new memory search tool that basically allows you to search through the growing list of items in the memory that it saves about you. So one of the strongest features that ChatGPT has is that it has a really long memory of you if you have been sharing stuff

(08:37):
with it, and I intentionally do that. I actually teach ChatGPT about me so it provides better and better answers, which means the stuff it has in its memory about me is now really, really long, and the ability to search that will actually be very powerful. This was shared by TestingCatalog, which is a company that monitors what's actually happening and being released across the board, including behind-the-scenes pieces of code

(08:58):
and exactly what companies are doing now. The new tool mirrors a very similar search that is currently available in the Atlas browser on Mac. So for those of you who are like me, Mac users who are using the browser, there is a browser memory search tool where you can actually go and see everything, and a very similar implementation is potentially coming to the memory inside of ChatGPT, which I think

(09:20):
will be very helpful, to be able to find the weird reasoning for why ChatGPT does something specific based on its memory. You can find what the issue is and then either fix it, delete it, or at least know what it is. So a useful tool that I assume will roll out to all of us in the next few days. A big piece of news from OpenAI this week is that they've signed a very interesting deal with Disney.

(09:42):
So the way that deal works is Disney is going to invest $1 billion into OpenAI as part of a licensing deal that will allow ChatGPT and, more importantly, Sora to remix and create videos of all the known Disney characters across all its different brands, such as, you know, obviously Mickey Mouse,

(10:03):
Star Wars, Marvel, Pixar, all the characters that we all love and that everybody probably wants to create videos of. Now you'll be able to do this in Sora with a licensed approach from Disney. Now, the deal itself is really weird. On one hand, yes, it provides OpenAI with another billion dollars, which is really great for OpenAI.

(10:23):
The specific article that shared this information comes from The Information, and they said that this $1 billion is a small amount but a very, very important one, because OpenAI is planning to burn through $115 billion between now and the end of 2029, and so they need to continuously raise large amounts of cash in order to finance the crazy losses that they're planning to

(10:44):
have. But what the deal doesn't share, or at least what wasn't shared publicly, is exactly what Disney is actually getting out of it. Disney is investing $1 billion when they don't have huge amounts of cash on their balance sheet. As of the end of Q3, they have just over $5 billion on their balance sheet, meaning 20% of that is going to go to OpenAI. That's a very large percentage for an investment, and it doesn't

(11:06):
clearly state exactly what they are getting back as far as the licensing deal itself. So if I create a video of Luke Skywalker, what does Disney get in return? There are hints that they may get additional shares at the current valuation if this happens, but that means they only make money if the value of OpenAI actually

(11:26):
increases, because otherwise they're just getting shares at face value, and there's not much value in that. So it's not really clear what Disney is getting out of this; they're definitely betting a lot of money on this deal. What they're getting for sure, and that's something I've mentioned on this podcast several times before, is more exposure on a future platform that will allow the next generation to know more about Disney characters.

(11:46):
So I definitely see that as a benefit. I assume there is some kind of a licensing deal behind the scenes that will allow Disney to make money when its characters are being used in Sora and/or ChatGPT. And as I've mentioned several times on this podcast, I really hope that's the direction everything will go: that we'll find a way to give AI access to all the IP but compensate the

(12:07):
creators one way or another, similar to how Spotify figured it out and is now paying creators, while everybody can enjoy the music in a much easier way than buying CDs or vinyl records. So hopefully this is the direction it is going. This may lead, and I don't know how far this envelope can be stretched, to people, individuals or groups, creating new series or complete films about the Disney characters,

(12:31):
because now it is available in Sora, and if it is available legally, I can create as many scenes as I want, stick them together, and then potentially release my own episodes. I don't know what that would mean for my legal exposure with Disney if I do this, but if the tool legally allows it, that definitely opens the door for stuff like that. Another big piece of news from OpenAI this week is that Slack's CEO,

(12:54):
Denise Dresser, defected from Slack, which means Salesforce, and joined OpenAI as the Chief Revenue Officer. OpenAI has seen incredible growth in its enterprise segment in 2025, and they are now pushing that even further to figure out how to drive significant revenue from this channel. Dresser's experience, 14 years at Salesforce and at

(13:16):
Slack as the CEO since 2023, is the perfect fit for stuff like that, right? She has amazing experience in how to drive revenue from implementing enterprise-level software solutions that drive additional value for companies. OpenAI's CEO of Applications, Fidji Simo, praised the hire in an internal memo that said: we're on the path to put AI tools into the hands of millions of workers

(13:39):
across every industry. Denise has led that kind of shift before, and her experience will help us make AI useful, reliable, and accessible for businesses everywhere. So this is just another big hire of a very prominent, dominant figure in the software world when it comes to implementing AI in enterprises. Another big one is obviously Fidji Simo herself, who became

(14:02):
the CEO of Applications. This is just another example of the crazy talent wars that are going on between the leading labs and companies across the board. In the past few weeks there have been several major departures from Apple to all the major labs, including Meta, OpenAI, and Anthropic. So, not good for Apple. We didn't dive into this in any of the episodes, but it's all in

(14:24):
the links in the newsletter if you wanna learn more about it. But overall, a big win, I think, for OpenAI when it comes to developing the right relationships with enterprises for ChatGPT implementation. The bad news for OpenAI this week came from a big backlash from users online who were bitching about the fact that ChatGPT started showing them ads. One of these was Benjamin De Kraker, who posted on X the

(14:47):
following: I'm in ChatGPT (paid Plus subscription) asking about Windows BitLocker, and it's fucking showing me ads to shop at Target. Yes, screw this, lose all your users. And then he posted a screenshot of that offer, and there have been multiple other users showing the same thing. OpenAI very quickly came out and said that these are not ads, that

(15:07):
they're not testing ads, at least at this point, and that they're not getting any financial compensation for this. The idea was to promote new applications inside of ChatGPT, to tell people that they actually exist and that they can use them. Initially, OpenAI denied it even exists, but then later that day, Mark Chen, the chief research officer of OpenAI, posted on X, admitting: I agree that anything that feels like an ad needs to

(15:30):
be handled with care, and we fell short. We've turned off this kind of suggestion while we improve the model's precision. We're also looking at better controls so you can dial this down or off if you don't find it helpful. ChatGPT head Nick Turley followed up on X: there are no live tests for ads; any screenshots you've seen are either not real or not ads. And

(15:51):
yet there were several other examples of unsolicited plugs for Peloton workouts and, again, Target shopping, that are very clearly real. So what does this tell us? It tells us three different interesting things. The first one is that OpenAI is really bad at PR, and this is a very basic mistake that just shouldn't happen.

(16:11):
You cannot release something like this into the wild without testing it first and telling people that it's coming and explaining what you're doing. It just looks bad. The second thing is that the feature itself makes absolutely no sense, because it is showing these quote-unquote not-ads when the context is completely irrelevant. If you are looking for something about Microsoft programming and you're getting ads or promotional suggestions to go and

(16:34):
shop in the Target application inside of ChatGPT, it makes absolutely no sense, and they have the context; they know what the user is looking for. So this just looks very, very bad from an implementation perspective as well. The third thing, and this is to me the most interesting aspect of this, is that people are really not willing to see ads inside of ChatGPT, especially when they're paying for it.

(16:55):
And it is very obvious that OpenAI, and probably the other labs as well, are looking at this as a great way to make more money and as a legitimate channel. The fact that users are so against anything like this in their feed should raise a lot of red flags in the OpenAI universe and drive them to figure out how they can actually integrate this in a way that will be accepted by their user base.

(17:18):
Maybe the solution is to have a free ChatGPT that is less limited than the current one and is ad-supported. That would allow them to reach a lot more users who are not currently willing to spend the 20 bucks a month but would still be able to use the entire platform, because it would be paid for by ads. And the last point that I will make about all of this is that it's another place where Google can win big, because people already expect Google to show ads.

(17:39):
This is part of what we know. Google is also very good at showing you relevant ads that complement what you're doing, when you're doing it, as you need it, which makes a lot more sense than just random ads. So I think Google will find it a lot easier to implement ads in specific, relevant scenarios, because they have the experience in doing this. They have the infrastructure to do this, they have the

(18:00):
frameworks to do this. They have everything they need in order to do this more successfully. So overall, not a good stunt by OpenAI across the board. And since we talked about GPT-5.2's release, a few other releases this week. Mistral AI, the French company, has deployed their next-generation vibe coding platform, called Devra 2, which uses their recently released Mistral 3 models.

(18:22):
It is very obvious that 2025 has been the year of code writing with AI and coding agents, and they're just joining the bunch. Their models are definitely not in line with the frontier models; they are more or less nowhere to be seen on the LLM Arena ranking. But these are open-source models that are really cheap to run.

(18:44):
They have two different variations: one with 123 billion parameters that requires a minimum of four H100 GPUs to run, and its smaller brother with 24 billion parameters that can run on your local computer. So the fact that these are open-source models that can write code decently well might be relevant to different people. And we're gonna see later on, when we discuss the

(19:04):
research about global usage of AI, that open-source models for coding and stuff like that are actually gaining a lot of traction. I do think they will find it very, very hard to compete with the Chinese models, which are also open source and currently much better than Mistral's models. Another interesting release this week comes from the UK: a company called Loci, which is like lock, L-O-C AI, just released

(19:27):
their first model, which is a crowdsourced, UK-built chatbot that is claiming to beat ChatGPT on a few specific benchmarks. They're claiming, based on their internal evaluations, that Loci, or Loc AI, I am not sure how to pronounce it, surpasses GPT-5, Gemini, and DeepSeek in conversational ability and human
(19:47):
preference.
What does that exactly mean?
I'm not exactly sure, do Ireally think they have a chance
to compete with the leaningmodels?
I don't.
But the interesting thing isactually the business model.
And again, the fact that it'spowered by blockchain based user
computing instead of let's builda data center for a hundred
billion dollars.
And so the fact that they'reusing a distributed blockchain

(20:09):
based net of computers withalmost no investment in
infrastructure.
And while allowing it to createa model that is worthwhile
mentioning is interesting to me,or as their CEO Jameson said,
Britain does not need tooutspend the world to lead in
ai.
We need to outthink it becausewe will not win the AI race
simply by building bigger datacenters.

(20:30):
A few things that were mentionedabout this model by the people
who got access to it before it'sreleased says that it is a lot
more polite and British than theaverage models that we know
today, but it is lacking almostall the other bells and whistles
and other capabilities and toolsthat we are expecting right now,
such as a mobile app and imagegeneration, voice mode, limited
tool calling, et cetera.
So right now it's a very basicmodel, but again, I am

(20:52):
interested less in the modelitself and more in the concept
of how it was developed.
And it will be very interestingto see if that is actually
scalable and can deliver realresults compared to, yeah, let's
invest$50 billion and build abunch of data centers.
But maybe the most interestingmodel announcement of this week
actually comes from Japan, fromseveral ex Googlers who have
built a model that they'reclaiming is the first real A GI

(21:15):
model.
But they've built it as a way tobuild robots to learn
independently with no data setsand no handholding.
So Integral AI has shown thatthey have built an autonomous
skill learning solution thatallows robots to learn new
skills without being trained onany of those in areas that are

(21:36):
not similar to things that theywere trained on before.
So in unsupervised trialsbasically means it's something
new that the robot doesn't knowthat exists.
A system taught robots freshtasks that are uncharted
domains, things they haven'tseen before.
While being able to delivermulti-layered, very nuanced
precision, including languagethought and very specific

(21:57):
dexterity to complete thesetasks.
They're also claiming that theamount of energy it took to
train these robots to do thesetasks is similar to the amount
of energy it will require humansto learn these tasks.
They're calling it a fundamentalleap compared to how AI is being
trained till today.
That is extremely data intensivewith a huge investment in

(22:20):
infrastructure, that is notrequired in the way they have
implemented it or as their CEOJad Tariff said.
Today's announcement is morethan a technical achievement.
It marks the next chapter in thestory of human civilization.
Our mission now is to scale thisa GI capable model still in its
infancy towards embodied superintelligence that expands

(22:40):
freedom and collective agency.
What they have sharedimmediately reminded me of the
interview with Ilia Sr.
From just over a week ago thatwe discussed in the episode last
week in the interview.
Ilia is sharing that what hebelieves about Superintelligence
is not necessarily the sheerimmediate intelligence, but its
ability to learn on its own, andimprove by learning.
So developing a system that isbetter at learning, which seems

(23:03):
to be what this group hasdeveloped.
Now while they are focusing onembodied intelligence, basically
robotics, the same concepts I'msure can be applied not just to
robotics and developing systemsthat will learn very effectively
without having access to hugeamounts of data.
This may not replace largelanguage models as we know them
right now, but it might behighly relevant for specific

(23:23):
fields.
This connects directly to thewhole concept of recursive
self-improving ai, which we'regoing to talk about in a minute
because there've been a lot ofmentions about this in the last
few days.
Before that, a few moreinteresting releases.
N8N just released N8N versiontwo.
Those of you who do not know N8Nis a workflow automation
platform that gain huge successin the last couple of years,

(23:46):
especially in the more geeky,more technical community.
So think about tools like Zapierand make and Relevance and so
on.
So N8N belongs to these, butit's a lot more technical, less
user friendly, but hence a lotmore capable because it can run
code and a lot of other thingsin much more effective ways.
And in this release, they sharedsome information about the crazy
growth that they're seeing.
So this is an open sourceproject, and they shared that

(24:09):
since version one that wasdeployed in July of 2023, GitHub
stars skyrocketed from 30,000 to160,000.
Their forum members grew from6,000 to over 115,000 members,
and their team, the core team,grew from 30 to 190.
I actually, I really like N8N.
I find it to be extremelypowerful despite the fact it is

(24:32):
not user friendly, and I use itrelatively frequently.
Together with make.com that I'malso using.
So when I need something verysimple, I go to make.com.
When I need something morecomplex, I will go to N8N.
The biggest benefit of N8N, as Imentioned, is that it is open
source, which means you canself-host it and then only pay
for the hosting.
And you can run as manyautomations as you want without
paying per step or perautomation.

(24:53):
So what's the big deal ofversion two?
On the surface, it looks verysimilar.
There's a facelift to the userinterface and there's a side
menu that's supposed to make thework in N8N a little better.
But most of the upgrades happenunder the hood focusing on much
higher levels of security andperformance.
So these are the key things, is.
Better performance, highersecurity, and higher level of

(25:14):
stability and consistency as theautomation is running.
All of them are really importantif you want to implement this to
actually run operations in yourbusiness.
I haven't seen any feedback fromactual users yet.
It was literally just released,but I will update you on what is
the overall sentiment after morepeople share what they found
about working with N8N versiontwo.
And I'll do our second bigtopic, which is recursive

(25:37):
self-improving ai, also known asRSI.
Why are we suddenly talkingabout this?
Because within a week and ahalf, there's been multiple
angles that have addressed thisspecific topic.
The first one was Eric Schmidt,the former CEO of Google, who
spoke at the Harvard KennedySchool, and he was specifically
talking about RSI as a near termgovernance problem.

(25:59):
Schmidt was talking aboutrecursive self-improvement AI in
very practical term versus thetheoretical concepts that were
available until not too longago.
He explained that today's AIsystems still require humans to
design the training runs and themodels and to create the
infrastructure and so on, but heis arguing that this is
temporary and that this canchange relatively quickly.

(26:20):
He was talking about the factthat there is a lot of
discussion that this can happenwithin the next two years.
His timelines are slightly morestretched.
He's talking about four years,but it's still in the very near
future for us to deal with thesekind of things.
He suggested that AI systemcapable of autonomously
generating new scientifichypothesis.
Discovering new mathematicalinsights and processing new

(26:41):
medical fields are likely veryclose again in the next few
years.
And Schmidt wanted to sound thisalarm saying that as a society,
this is something that we needto address and we need to
clearly draw the line of howmuch agency we're willing to
give machines to self-improvethemselves.
So before we dive into whatother relevant people are saying

(27:01):
and the other facts, I want togive you a quick explanation of
what the hell is recursiveself-improving ai.
So if you think about, again.
Eric Schmidt said, the way AI isdeveloped today is you have a
bunch of scientists who collecthuge amounts of data.
They create the models, theycreate the training runs, they
improve the algorithmsthemselves, and then this way,
AI gets better over time.
However, a self-improving AI cancreate its own data sets as

(27:26):
we'll see in a minute, cancreate its own hardware, can
create its own new algorithms,and then create a new version of
AI that will then be able to dothe same thing again, just
faster and better, because it'sa better model to then do this
again, faster and better and soon.
This is what's called, in theprofessional term, a fast
takeoff.
It's the point that you losecontrol because the AI can
develop better and better ai,faster and faster building on

(27:48):
the improvements of the previousmodel to build a better next
model.
And this is how we might losecontrol on what AI and computers
can do.
Almost at the same exact time,within a day or two.
After the interview with EricSchmidt, we heard an
announcement from Anna Goldieand Azalea Rosini, that they
have founded a new companycalled Recursive Intelligence.
Both of these founderspreviously worked at Google

(28:10):
affiliated researchorganizations and were deeply
involved in AI drivensemiconductor design, most
notably the Alpha Chipinitiative.
What is this?
It was an initiative to use AIto design better chips that
these designs were actually usedin chip building.
And they were able to prove thatAI can design better chips than
humans cans while coming up withoriginal ways to design chips

(28:32):
that humans did not think aboutpreviously.
So what are they trying to do?
They're trying to build AImodels to design the next
variation of chips.
Those chips will allow to trainthe next version of models
faster and cheaper, which willthen be able to design the next
version of chips, and the cyclecontinues.
So unlike the concepts that EricSchmidt was talking about, which
was only software, they're alsotalking about the hardware

(28:54):
aspect of this.
Their goal is to compress amulti-year process of
semiconductor design into weeks.
This means that the iterationsof next variations of AI
specific hardware can happen inmuch faster cycles, which will
enable even faster developmentof models, and the rest is very,
very clear.
Based on their previous success,they were able to raise$35

(29:15):
million seed round led bySequoia at an estimated of$750
million valuation.
If they can figure this out,this will completely change the
way computer hardware isdeveloped right now, which may
put the entire AI hardwareecosystem into a very
interesting scenario.
Just think about how much moneyis being poured right now into

(29:38):
GPUs and building data centersand I shared with you last week
in the bubble discussions thatthere are very big questions on
what is the actual life lengthof these chips in an effective
way?
And the discussion is aroundfive years.
Some people are saying slightlymore, some people are saying
slightly less.
But if this thing happens andthe next cycle of chips can
happen in, let's say threemonths, then the previous
version becomes obsolete andthen three months later and so

(30:01):
on, which puts a very bigquestion mark about the current
business model of AI hardware.
Obviously amplifies the wholerecursive development question
because now the hardware getsbetter at a much higher pace
than it is right now.
Another angle that adds to themix came from open ai.
So open AI's alignment andsafety research has issued
another paper, and that paperspecifically recursive

(30:22):
self-improving as one of thecategories of concern.
Now in this paper, they're notclaiming that they're there yet,
that they have achievedrecursive self-improving AI or
that it even exists today, butit is treating it as a future
threshold of risk, which is acapability that should require
systems to be controllable,auditable, and aligned with
human values.
The emphasis in the paper is notabout stopping the progress of

(30:44):
these systems, but ensuring thatthese systems improve in a way
that humans still stay incontrol and have complete
oversight and can stop it beforeit goes out of control Again,
this is the first time thatOpenAI names RSI, specifically
in its papers as an area ofconcern.
Another reference to this camefrom Jared Kaplan, who's the
chief scientist officer andco-founder of Philanthropic, and

(31:05):
in early December, he discussedrecursive self-improvement in
interviews on public commentaryassociated with philanthropics
safety messaging.
He also mentioned that RSI as acivilizational decision point,
he emphasized that such processis inherently uncertain.
You basically do not know whatAI will do.
Once system begins to designtheir successors, humans may no

(31:28):
longer be able to fully predictthe outcomes which is obviously
a huge risk for humanity.
So from all of these and someadditional reference, we can
learn a few things.
First of all, these RSI systemscan self-improve across multiple
verticals.
What is the architecture thatthey're built on?
What is the training process andthe training data that it is
using?
What is the hardware that isrunning on, which algorithm it

(31:49):
is using, and the actualresearch on how to develop
better system.
All of these will be able to bedone by ai, that they will
create a better version thatwill be able to do the same
thing better.
So it's not just the algorithms,it's not just the code, it's all
of the different components allcombined.
Now, where can this go?
This can go in three differentways.
One of them is human governed.
RSI.
Basically, humans remain tightlyin the loop and can pull the

(32:11):
plug whenever they see that itis the right time to pull the
plug.
The other option is partiallyautonomous RSI, basically
systems that can handle most ofthe tasks on their own with
human oversights only at bigsteps and stages.
And then the last thing is fullyautonomous RSI, which is very
obvious.
It just runs and does whateverit wants and there's obviously
conflicting interests herebetween safety and the potential

(32:33):
benefit of such a system, right?
So if you are in the race forglobal dominance, letting the
system run autonomously willhappen significantly faster, but
from a safety perspective, couldlead to a complete catastrophe.
And so when all these leadersall roughly at the same time
start talking about the risksthat this represent, it means
one thing and one thing only,which is they're seeing the

(32:53):
glimpse of this in their labsalready.
You need to remember that allthese labs have models that are
significantly more powerful thanthe one they're releasing that
we have access to.
So when we are comparingbenchmarks and things of GPT 5.2
to Cloud Opus 4.5 to Geminithree Pro.
All of these are the thingsthey're releasing, what they
have in their labs and thatthey're testing and
experimenting and researching isbetween six to probably 18

(33:16):
months ahead of what they'reactually releasing.
And if all of them suddenlyrough at the same week, started
talking about these things, itmeans that they're seeing the
glimpse of that in their labs.
And this is becoming very, veryreal to them.
And this is scary because thiswith the wrong process could
lead to us practically losingcontrol on how AI operates and

(33:37):
what it does and what it can do,which could, in specific
scenarios, represent anexistential threat to humanity.
And so.
On one hand, I'm happy thateverybody's raising the flag.
On the other hand, I would loveto see significantly tighter
collaboration between labsacross the world and governments
across the world to specificallylook into AI safety and RSI

(33:58):
included in that, and make surethat we are all aligned to make
sure that nothing catastrophichappens in the future.
This might be more dangerousthan atomic weapons, and we know
that that was a successfulinternational initiative.
So I really hope we'll startseeing this coming together in
the near future.
And our last deep dive topiccomes from multiple sources and
several different labs andorganizations released research

(34:20):
based on actual usage of AIacross different platforms,
showing how people actually useai.
The first one.
That we're going to discusscomes from open router.
Those of you who don't know OpenRouter.
Open router is basically anaggregator.
It's a hub of more or less,every AI API out there.
And all you need, you need oneconnection to them and just
their API key.
And you can consume tokensacross all the different AI APIs

(34:43):
that are out there.
I've used it for multiple usecases previously.
It's a very effective way to dothis.
They take a few percentage ofrevenue, basically arbitrage
between the real cost of thetokens to what they're charging
you.
But you are getting a verysimple integration to as many
models as you want, includingrollback and including
redundancy capabilities that canroll over to other models based
on what you define.

(35:04):
So very useful tool overall.
So they work together with a 16Z to review the usage of how
people are actually using thesemodels across over a hundred
trillion tokens.
That was consumed in 2025.
And in this paper open routertogether with a 16 Z has.
Evaluated how models are beingused across over a hundred

(35:25):
trillion tokens that wereconsumed through the open router
platform between November, 2024and November, 2025, across over
300 models from 60 differentproviders.
So this is a lot of tokens beingconsumed now, to put things in
perspective, even though thenumber sounds really
significant, and it is.
Google Gemini shared recentlythat they are generating over a

(35:47):
quadrillion, which is 1000trillion tokens in a month.
So a hundred trillion is stillnot a crazy amount, but it's
definitely a big enough datasetto look at trends, for sure.
So what are the things thatthey've shared?
US?
LED closed source model likeopen ai, philanthropic are still
leading by a very big spreadwith 70% of tokens.
But at the same time, opensource models have been growing

(36:09):
specifically Chinese open sourcemodels that have grown from 1.2%
in late 2024 to over 30% in peakweeks in 2025, and an average of
numbers in the teens for theseopen source models now.
Also how people are using it hasshifted dramatically through
this year.
The usage in 2025 hadprogramming related tokens.

(36:31):
Were only 11% at the end of2024, and it is over 50% for all
tokens by late 2025 Anthropic,Claude is leading the pack with
over 60% for most of the year.
Deeping just below that inNovember of 2025 while open
source models, and again, mostlyChinese use most of the tokens
for role play and gaming, orbasically day-to-day

(36:52):
conversations.
What does that tell us?
It tells us that when peoplewant reliable code writing,
they're willing to pay more forthe closed source models, but
when they don't need the samelevel of reliability and
consistency across really largedata sets, then they are going
for price in which the Chinesemodels are delivering very good
value.
Another big clear trend is theusage of reasoning models That

(37:14):
has hit more than 50% now, andif you think about it, the first
reasoning model was justreleased by open ai.
And if you think about how crazythis is, the first reasoning
models OpenAI is, oh one wasreleased in December of 2024, so
less than a year ago, and nowit's already being used for more
than 50% of these tokens underopen router.
Another big difference is thegrowth in the length of prompts.

(37:35):
So prompts, tokens hasquadrupled in this past year to
6,000 tokens in average perprompt.
Another thing that has grewdramatically is the length of
sequences, basically longer backand forth conversations.
Which tells you that peopleinvest more time in writing
better prompts, and that they'reusing it for more and more
complex processes.
As far as where these tokens arebeing consumed, north America is
still number one with 47.2%, butAsia is surging and it's now

(37:59):
28.6%, so still way behind theNorth American market, but
growing very, very fast.
They also looked at stickiness.
So how many people are switchingmodels versus how many people
are staying with the same model?
And they found that it dependson the specific use cases where
there were serious,sophisticated, hardcore use
cases, people stuck with thesame models longer.

(38:21):
And while it was day-to-daytasks, uh, there was no
stickiness almost at all.
And people would switch back andforth a lot more frequently.
So what can we learn from this?
The first thing I wanna mentionis a caveat about the specific
tool.
A, again, this is not a hugeamount of tokens, but definitely
a good statistical size to lookat.
But B, the people who are usingopen router are more technical
people who are building theirown applications through an API,

(38:44):
which is definitely not yourcommon user.
So I think in the common users,we will not see more than 50%
going to programming.
But that being said, thesefindings are very interesting
and they're definitely showingthe trends.
To me, the most interestingparameter out of all of this is
the huge growth in the lengthsof prompts and the length of
change.
It basically signals thatdespite all the conversation

(39:04):
about how the models areunderstanding us much better and
that prompt engineering is goingaway, it is clear that people
understand that as they writebetter, more detailed, prompt,
and know how to construct longerdetailed conversations in a
structured way, they are gettingbetter results.
I can very clearly see that onmyself, especially since I
started voice typing, most of myinteractions with my computer

(39:26):
and a lot of it is AI related.
My prompts became significantlylonger, significantly more
nuanced and details and I, andwhile I follow specific
frameworks as far as thestructure of the prompt, I'm
providing each of the componentsof the framework with more and
more details, making it a lotmore nuanced, and I'm getting
much better results Now,speaking of knowing how to build
prompts better and gettingbetter results.

(39:49):
If you want to learn how to dothis properly, if you want to
learn the art and the science ofprompting, if you wanna know how
to use AI for data analysis andresearch, if you wanna know how
to create content, both visual,written and video content with
ai, basically, if you wanna knowthe fundamentals on how to use
AI effectively for businessusage, our next cohort of our
highly successful AI businesstransformation course starts at

(40:12):
the third week of January.
If you have not yet invested inmore structured AI education for
either yourself or people inyour team or your company, you
owe yourself to do this in thebeginning of 2026 because the
gap, the chasm between thepeople who know how to use AI
effectively and do not know howto use AI effectively is growing
every single day and havesignificant impact on careers

(40:34):
and futures of companies.
So if you're interested, there'sa link in the show notes.
You can click on that and learneverything about the course.
I started teaching this coursein April of 2023.
So it's been two and a halfyears of evolvement.
I've been upgrading this courseevery single month and I've
taught it to some of the largestcompanies and enterprises in the
world.
And then about once a quarter, Iopen it to the public.
So don't wait.

(40:54):
If you need AI training and youdo, if you haven't done this so
far, come and join us at the endof January.
Now back to release ofinteresting findings of how
people are using ai.
Perplexity has released a studytogether with Harvard and
they've looked together athundreds of millions of
anonymized Comet and Cometassistant interactions.
So those of you who don't knowComet come is their agent

(41:16):
browser, and Comet assistant isthe agentic aspect that runs
within the browser.
And then we're trying to analyzehow people are using Comet for
what purposes and how it evolvesover time.
And they shared some reallyinteresting findings.
First of all, 57% of agentactivity focuses on cognitive
tasks, 36% on productivityworkflows and 21% on learning

(41:37):
and research.
They also saw a very clearpartnership between the users
and the agents where the humansdelegate and expand the
capabilities to gather andsynthesize information while
allowing the humans to make thefinal call, which is more or
less how I use AI for more orless everything.
Another very interesting thingthat they found is that when
people get started, they startwith very simple, low stake, fun

(42:01):
stuff like travel and trivia.
But then as time goes by, peoplego to more and more high power,
high quality, how utilityworkflows that actually provide
them real value.
They have broken the overallusage to six different
categories.
The largest one is productivityand workflow at 36.2%.
This category includes documentand form, editing, account

(42:21):
management, email management,spreadsheet and data editing,
computer programming, uh,investing in banking, and a few
others.
The second largest category islearning and research at 20.8%
of overall usage with courses at69% and research at 37%.
The next category was media andentertainment at 20.8%.

(42:42):
Roughly the same level aslearning and research.
And over there they have socialmedia and messaging, movies, TV,
videos, online games, music andpodcasts, and some other smaller
categories.
The next one was shopping andcommerce at 10.9% with buying
goods at 89% and services at10.3%.
And then the two smallercategories were travel and

(43:02):
leisure at 7.1% and job andcareer at another 7.1%.
As I mentioned.
The other graph that was veryinteresting to me in that
research talks about how peoplechange their behavior between
their first queries and theaverage of all queries.
So on first queries, basicallythe first time people use this
media entertainment was numberone by a big spread followed by
travel and leisure.

(43:22):
And then shopping and commerce.
But if you look at the overallqueries, number one is
productivity and workflows, andnumber two is learning and
research, which tells you thatas people start to understand
how they can use agenticbrowsers for actual real work,
this is the direction thatthey're going to take, which
gives us a hint on how thefuture of Agentic universe looks
like with agents helping usbasically across the board, but

(43:46):
with a very clear focus onproductivity on business life.
I've been using Comet and Atlasfor a while now, and I can say
that I use them mostly forbuilding automations and helping
me troubleshoot code when I needto, and it's actually very
effective at doing that.
It is saving you the copying andpasting into a chat with a
ChatGPT or Clot or Gemini.

(44:06):
And it has all the context ofthe specific websites and the
flows because you can see thebrowser and it can research, uh,
things that are going on andlook at different components
that you don't necessarily giveit in your screenshots.
So I find it to be very, veryeffective in these kind of use
cases.
The same thing can obviously beexpanded to any other process
that you are doing online andyou need somebody to hold your
hand or help you because you'renot an expert.

(44:28):
And so I highly recommend youtry that as well.
Microsoft research shared theirdive into 37 and a half million
anonymized copilot chats showinghow people are using copilots in
the wild.
And what I mean in the wild,this research looked only at
users that are not in theeducation or enterprise
licenses.
So basically the open to thepublic version of Microsoft

(44:48):
copilot.
And they're sharing some veryinteresting findings.
First of all, mobile users haveused copilot.
A lot more for health queriesaround the clock.
While desktop users focused alot more on business related
tasks between nine to five,which makes perfect sense.
Programming related prompts haspeaked on weekdays while gaming
and exploration has peaked onweekends again, makes perfect

(45:10):
sense.
But the other thing that wasinteresting when it comes to
weekends is that a lot of peoplewent to copilot for
philosophical questions,including as they quoted
existential clarity.
And these kind of conversationshappened a lot more after dark
in the late night hours wherepeople start to wonder and have
AI help them think about theirlives, the future of the world,
and so on.

(45:30):
More philosophical question.
So what does these patternbasically tell us?
They tell us that AI is gettingembedded across the board to
everything that we do.
This is what it's actuallyshowing, right?
So this is from our daily livesto our psychological wellbeing,
to business use cases andday-to-day requirements.
Which is telling you that AI isbecoming mainstream, right?
It's no longer just a geekything, but everybody's using it

(45:51):
for more or less everythingacross the board.
Now, while this is interestingand it's.
Cool findings.
You need to remember that thegeneral public is using copilot,
almost none whatsoever.
So just over 3% of globalchatbot share belongs to copilot
compared to close to 80% of chatGPT as an example, another
interesting research that comesto how AI is actually failing on

(46:13):
specific things came fromDatabricks, uh, this week when
they released office qa, whichis a new benchmark that is
looking through 89,000 pages ofUS government information as the
dataset to see how well AI canactually find, identify, and
synthesize information.
And GPT five barely got 43%accuracy on document heavy

(46:34):
tasks.
And that was actually evenbetter than Claude Opus 4.5
agents that hit only 37.4% onraw PDFs at large scale.
Which basically comes to tellyou that if you just unleash AI
on a large set of documents thatyou currently have in your
company, you should expect lessthan 50% accuracy, which is
obviously unacceptable.
Now, the reason obviouslyDatabricks share this is because

(46:56):
they are selling a service tomake that better.
So they have a service that ispre-processing of documents that
basically parse the documentsinto a structure that makes it
easier for AI to analyze andusing this process.
Claude Opus jumps to 67.8%, soover 30% growth.
GPT 5.1 climbs to 52%, a 9.3%jump.

(47:16):
And a lot of it has to do withthe structure of the documents,
such as removing nested headersand merged sales in Excels and
so on.
They also shared that agentfailed pretty bad on visual
charts and came up withplausible but wrong answers In
many of these cases, and as Imentioned that most of these
models plateau at around 40% ontough multi-step problems, which

(47:37):
basically signals the needbeyond just OCR parsing to get
real relevant results.
I agree with that 100%.
I shared with you on severaldifferent episodes that I also
have a software company.
That software company developeda really amazing product that
knows how to do invoice vouchingand reconciliation automatically
connected straight into your ERPor accounting system.

(47:58):
Now.
A potential client asked me in ameeting this week, what is the
system doing differently thanjust traditional OCR of scanning
and getting the invoices into amuch higher level accuracy that
our system does compared totraditional systems?
And the answer was exactly this.
It's agents that look at thecontext and the understanding of
what the process and what mightthe outcome needs to be way
beyond just the basic OCR.

(48:19):
Great examples are discounts andrefunds.
They come in different shapesand different sizes, so the OCR
itself can pick a line, but whatdoes it actually mean?
How does it apply in your actualsystem?
How should it be represented ornet, NetSuite or your accounting
system?
All these things are aspect thatif you don't build a system
around it, actual agents don'tunderstand context and
understand what the OCR maymean.

(48:41):
You might get very wronganswers.
Which basically tells us thatreal life is a lot more complex
and nuanced than just abenchmark.
Hence why in many cases you aretrying a new process and you do
a quick test of with arelatively small sample size and
it works really well.
And then when you actually tryto run it at a large scale
company wide usage, it failsmiserably.

(49:02):
And it's because of that, it'sbecause real life is a lot more
nuanced with much more edgecases than we tend to think.
And without being able to trainthe systems properly and teach
them how to handle these caseswith a lot more context, AI is
just not there yet.
Another great example of this,actually, you're gonna hear.
In an episode that's coming inthe next few weeks with Nate
Amon, who helps largeenterprises implement AI

(49:23):
effectively, and he was talkingabout how investing in the
structure of your documents andbuilding them correctly will
dramatically improve theirresults of AI when using these
documents.
He said something that is reallyprofound that I never thought
about before, that we need tostart creating documents that
will be easy for AI to read,that humans can also read versus
writing documents for humans andhoping AI will be able to read

(49:46):
them effectively.
Now the big question isobviously what do you do with
the tens of millions or evenmore than that documents that
enterprises already have andthat already existing?
And for that, I'm sure there'sgonna be different optimization
processes such as the one thatDatabricks is offering.
I also assume that AI will getbetter at reading them to be.
The best example is Claude'slatest ability to read
significantly more complex andless structured spreadsheets.

(50:09):
That literally blew my mind when they came out with a recent version that does that.
So I think AI getting better at reading these documents, together with some kind of optimization mechanism, will allow us to make better use of documents with AI.
But from the last research we're going to share this week, we're going to learn something even more interesting: the AI itself is just half the problem.
The other half is how we humans perceive AI work.

(50:30):
SAP just ran a really interesting experiment in which they gave five different SAP consultant teams analysis that was presumably generated by junior interns.
Four of the teams hailed the analysis as impressive and validated it as high quality.
The fifth team was told that SAP's AI platform, Joule, had generated the research, and this team dismissed most of the

(50:53):
findings and went back to double-check almost every aspect of the work.
So the same exact work was given to five different teams.
Four did not know it was AI and praised the results; one was told it was AI, said it was pure junk, and felt it had to go and redo the work.
What we learn from this is that the current perception of AI is highly negative, in that people have serious, deep fears about what

(51:16):
AI produces.
Now, the question is: do I think that is justified or not?
I think there are cases where it is justified and cases where it is not.
I'll give you two quick examples.
If it is mission critical to get one hundred percent accuracy, like in a financial report, then having second thoughts about the accuracy of AI is justified.
However, if you are using AI for online research and you're
However, if you are using AI foronline research and you're

(51:38):
asking it to provide clearcitations and you're asking it
for links to where the citationsare taken from, and you're
asking it to tell you exactlywhat was the quote from each and
every one of these sources, andyou're looking for a trend or a
broad brush direction for aspecific topic, it is definitely
good enough and it is mostlikely going to be better than
what you can do on your own.
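For what it's worth, here is a minimal sketch of the kind of prompt scaffolding I mean; the exact wording is illustrative, not a guaranteed recipe.

```python
# Sketch: a reusable research prompt that forces checkable citations.
RESEARCH_PROMPT = """\
Research question: {question}

Rules:
1. Support every claim with a citation and a working link to the source.
2. For each citation, include the exact quote the claim is based on.
3. Keep what the sources say separate from your own synthesis.
4. End with a short note on confidence and gaps in the evidence.
"""

print(RESEARCH_PROMPT.format(
    question="What broad trends exist in enterprise AI adoption?"))
```

The point is that every claim arrives with something a human can spot-check, which is exactly what makes the broad-brush answer trustworthy.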
Now, the other thing they shared about Joule, again, which

(52:01):
is their own home-grown AI agent, is that consultants who are learning how to properly prompt it are getting significantly better results and outputs than those who just give it very simple prompts. Which takes us back to learning how to prompt properly, which takes us back to training.
You need training for yourself, for your company, and for your team in order to learn how to use AI effectively, in a much

(52:22):
better way.
Last week I ran a workshop for a large enterprise, teaching their senior salespeople and their CBEs how to properly use AI to research and respond to government-style, large-scale RFPs.
And it blew their minds to see what is possible that they simply did not know about.
And now that they do know, I was able to show them how to more

(52:43):
effectively query documents, do client research and people research, and come up with hypotheses on how to properly respond to the RFP, and how to analyze the requirements, down to the individual requirement level, compared to your competitors, and so on.
All of that is possible if you know how to use AI effectively, and they had been doing all of that manually so far.
Staying on the topic of using AI at the enterprise level:

(53:06):
Anthropic just released a really interesting feature that connects Claude to Slack.
So what exactly does it do?
All you need to do is install the Claude Slack app from the marketplace, authorize your Claude account to connect to it, and ensure it has access to Claude Code on the web, and then it does the following magic.

(53:26):
It can look at the channels you give it access to and pull all the context, including everything that was discussed in that Slack channel: coding examples, bugs, feature requests.
And if it has access to your code base, it can spin up several different instances of Claude Code on the web, pull thread and channel context, and build code and create the right

(53:49):
repos to solve bugs, create new code, and so on, all while reporting back to the Slack channel.
So it basically acts like an independent code writer on the team, one that has the full context of everything that happened in an entire thread and can autonomously write code and fix bugs on its own while reporting back what it is doing.
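For a feel of the underlying pattern, here is a hedged sketch of the read-context-and-dispatch loop. It uses the real slack_sdk client for reading a channel; dispatch_to_agent and the channel ID are hypothetical placeholders, and this illustrates the concept, not Anthropic's actual implementation.

```python
# Sketch: pull a Slack channel's recent history as context for a coding agent.
import os
from slack_sdk import WebClient

def channel_context(client: WebClient, channel_id: str, limit: int = 200) -> str:
    # conversations_history returns newest-first; reverse into reading order.
    resp = client.conversations_history(channel=channel_id, limit=limit)
    return "\n".join(m.get("text", "") for m in reversed(resp["messages"]))

def dispatch_to_agent(task_context: str) -> None:
    # Hypothetical hand-off to a coding agent (e.g. Claude Code), which
    # would then report its progress back to the channel.
    raise NotImplementedError

if __name__ == "__main__":
    client = WebClient(token=os.environ["SLACK_BOT_TOKEN"])  # bot token required
    dispatch_to_agent(channel_context(client, "C0123456789"))  # placeholder ID
```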

(54:11):
Is this scary as hell? And can this become one of the most effective tools for code development teams? 100%.
Because a lot of the effort is happening in Slack right now. A lot of the coordination is happening in Slack right now.
And if you can tailor exactly what Claude Code can and cannot do in that environment (solve these kinds of bugs, but don't touch these kinds of infrastructure things), it can go

(54:33):
autonomously and do these tasks while reporting back, so somebody doing code review can check the changes.
As somebody who has been using Claude Code extensively this past week, I can tell you that Claude Code on Opus 4.5 is incredible. And as somebody who is the CEO of a software company, will I allow this to run in my company? At this stage, probably not. As a test on specific things?

(54:56):
Most likely, yes.
But I think this will evolve over time; I think the reliability, and later on the fear, will change dramatically, and this will become common in software companies sometime in the next 12 to 18 months.
And now to a few rapid-fire items.
The first one: Time Magazine just named its Person of the Year, something they've been doing every single year for the

(55:17):
last hundred years or so, since 1927.
And this year the Person of the Year is not just one person; they named the leading minds behind the AI revolution we are facing as their persons of the year.
These include Mark Zuckerberg; Lisa Su, the CEO of AMD; Elon Musk; Jensen Huang; Sam Altman; Demis Hassabis, the CEO and

(55:37):
the founder of DeepMind; Dario Amodei, the CEO of Anthropic; and Fei-Fei Li, the founder of World Labs, which we talked about a few weeks ago.
And now I'm quoting: for delivering the age of thinking machines, for wowing and worrying humanity, for transforming the present and transcending the possible, the architects of AI are Time's 2025 Person of the Year.

(55:59):
Interestingly, out of this group there are five billionaires, with a collective fortune of over $870 billion.
But I think the more important aspect is that these people are basically deciding the future of humanity every single day.
The way this world is going to look in the next decade and beyond is going to be dramatically different from everything we've

(56:19):
known so far because of the work these people are pushing forward.
But to me, the fact that Time Magazine chose these people as the persons of the year connects directly to what we discussed earlier about how people are using AI: AI went from something that geeks like me used in '23 and '24 to something everybody is using, from grandparents to grandchildren, across more or

(56:41):
less every task, in every aspect, and in every place around the world.
By the way, just for the cool factor, it is worth looking at the cover of Time Magazine.
They actually have two covers, and one of them mimics the iconic 1932 photograph Lunch Atop a Skyscraper, attributed to Charles C. Ebbets, which is one of the most famous pictures maybe ever taken. It shows 11 construction workers casually sitting on a

(57:03):
steel beam hanging in the air above the skyline of New York City.
So they took the same concept and created an illustration of these people sitting on the same kind of beam.
I love the connection to the past, and to showing how the future is being built: this was how the future was built in 1932, and the new future is being built by AI and for AI.

(57:23):
Our next rapid-fire item is about the first time the Fed has actually admitted that AI plays a role in the cooling labor market.
Federal Reserve chair Jerome Powell declared that artificial intelligence is "part of the story" (that's the exact quote) behind the worsening job market.
He shared that during a December 9th economic forum, and it was reported by The Information.

(57:43):
The specific numbers he shared: unemployment in the US is now at 4.2%, with non-farm payrolls adding only 178,000 jobs, which was below expectations, and sectors like tech and finance showing hiring freezes because of AI automation.
Powell also talked about the benefits of AI, crediting it with the 2.8% productivity surge in Q3.

(58:07):
I'm not sure exactly how they measured or calculated that, but he is definitely warning of displacement risks from AI in white-collar jobs, based on the analysis they did.
If you have been listening to this podcast over the last few months, you know that I have very serious concerns about the job market in general, and about white-collar jobs in particular, over the next couple of years.
And you also know that in entry-level jobs this is becoming a

(58:29):
much, much, much bigger problem, because AI is much better at automating entry-level jobs, which are simply easier to automate.
Connected to that is research that OpenAI just shared: in an internal study, they found that AI saved 50 minutes per day for workers using different AI tools, including ChatGPT.
They did that by studying over 5,000 participants in

(58:50):
different roles, such as software engineering and marketing. That was shared by Bloomberg on December 8th.
The biggest time savings came in research tasks.
That's very obvious: I now use deep research at least once a day, sometimes multiple times a day, across different platforms, and that obviously saves me hours of research.
The lowest in the study was actually coding, with a saving of only 42 minutes per day.

(59:10):
But overall, 82% of users reported faster completion of tasks, with AI handling elements such as drafting emails, debugging snippets, doing research, and so on.
Now, do I think 50 minutes per day per person is a realistic number? I don't. But I can definitely tell you, from my own experience working with AI myself and with multiple companies as a consultant and workshop leader, that there

(59:33):
are specific areas in which it saves a lot more than that, and there are areas that drive significantly less savings.
Hence, I think it's very specific to the role and the specific task you are working on. But you can get very significant savings, and overall they are going to extend over longer periods of time and into more aspects of the business.
I mentioned earlier in the episode the exodus that is

(59:54):
happening from Apple right now.
We have talked multiple times on this podcast about how Apple has been failing with all of its AI initiatives in the past few years.
Dozens of AI scientists and researchers have been leaving Apple in the past few months.
Most of them are going to Meta and OpenAI, but they're also going to other places.
And the roles of the people departing span from

(01:00:15):
researchers to design engineers to people working on audio tech, watch interfaces, and robotics.
Combine that with the departure of some of the veterans who have been running AI at Apple for a very long time, including their machine learning chief, and you understand that Apple is in serious trouble when it comes to AI.
We have said on this podcast several times in the past that they should have bought somebody.
There were conversations about potentially buying Perplexity;

(01:00:36):
that didn't happen.
There were very vague rumors about buying Anthropic, which is probably what they should have done while it was still much cheaper than it is right now.
Either way, they are in serious trouble when it comes to their AI initiatives, and with the current turmoil and all of these really smart, experienced people leaving, it doesn't look very good for Apple.
And the next topic is actually interesting, and it's coming from an interesting direction: Gartner.

(01:00:57):
Gartner has issued a recommendation for enterprises and companies to immediately block all AI browsers, and I'm quoting, "for the foreseeable future."
The reason, they say, is overwhelming cybersecurity threats, which they detail in a report called "Cybersecurity Must Block AI Browsers, For Now." That's literally the name of the report.
They're claiming that tools like Perplexity's Comet and ChatGPT

(01:01:18):
Atlas have two risky aspects.
They're claiming that the biggest risk is that these autonomous features could wreak havoc by exploiting the authenticated environments users have access to, because users have their credentials saved in the browser, basically enabling unlimited data leaks and malicious actions in these otherwise protected

(01:01:39):
environments.
Now, we talked about this in our Friday AI Hangout that happened yesterday.
By the way, those of you who want to join us for these Friday Hangouts are more than welcome; there's a link in the show notes to come and join us.
It's an open-mic kind of environment where we talk about AI risks and AI implementation and review specific tools.
It is just a great community of people who are implementing AI and sharing what they're learning.
So we talked yesterday about this topic of how safe it is to

(01:02:01):
use these tools.
And my answer is: it depends how you use them.
If you use them for things that are not your core business, you do not give them access to your bank account or your accounting system and so on, and you keep using them for very specific tasks, they are safe and provide exceptional benefits. You just have to be smart about what you do and do not allow them access to, which can definitely be done.

(01:02:21):
If you do deploy them organization-wide, then you have an issue, and that issue comes with training and data security concerns that you have to deal with before you deploy them company-wide.
That's it for this week.
There are a lot of other really interesting stories that we just don't have time for, and you can read about all of them in our weekly newsletter.
It includes the links to all the articles we did cover and to all

(01:02:41):
the articles we did not cover.
And I believe this week there are more articles we did not cover than ones we did, including Nadella admitting that Microsoft has a massive disadvantage in the AI race; Adobe's new announcements about infusing ChatGPT with Photoshop; how a company is using Claude Code to run over a thousand machine learning experiments every single day; and another lawsuit, from a family

(01:03:05):
suing OpenAI and Microsoft over a murder-suicide case that was allegedly fueled by ChatGPT, and a lot more.
So if you want to learn more about these other topics, just sign up for our newsletter.
It will also tell you exactly how you can join our events, which are free and provide access to really amazing experts, and a lot of other stuff.
So it's a great way to complement what we do here on the podcast.

(01:03:25):
That is it for this week.
If you are enjoying this podcast, please like us and give us a review on Apple Podcasts and/or Spotify.
And while you're at it, click the share button and share this podcast with other people you know who would benefit from it.
I'm sure you know more than a few people who would benefit from this podcast as much as you do, and all you have to do is click the share button and invest 10 seconds in helping those people and helping us, and I

(01:03:47):
will be really grateful if you do that.
Don't forget to check out our course. It can literally change your life, and it starts at the end of January, so go ahead and sign up.
It always sells out before the deadline, so if you want to be part of the course, sign up right now.
We'll be back on Tuesday with an amazing episode that will show you how to use Claude Code to build amazing agents without

(01:04:10):
knowing how to code, or even knowing what code is, and create incredibly powerful agents that you can apply to more or less everything in your business and in your life.
Keep exploring AI, keep sharing with us and with others what you learn, and until next time, have an amazing rest of your weekend.