June 30, 2025 • 17 mins
Leak reveals Grok might soon edit your spreadsheets
Salesforce Embraces AI for Efficiency, Leading to Workforce Reductions in Tech Industry
DeepSeek's AI Model Development Hindered by Nvidia Processor Shortage and U.S. Export Restrictions
Claude artifacts expand to enable AI-powered app creation
Imagen 4 Launches with Enhanced Text Rendering for Diverse Creative Needs
ElevenLabs Launches 11ai Voice Assistant with 5,000 Voices and AI System Integration
ElevenLabs releases a stand-alone voice-generation app
OpenAI and Microsoft Negotiate Terms Amid Development of AI-Powered Productivity Tools
How Far Can Reasoning Models Scale
#AI, #VoiceGeneration, #ProductivityTools, #ReasoningModels, #Salesforce, #Nvidia, #TextRendering

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
Welcome to Innovation Pulse, your quick no-nonsense update on the latest in AI.

(00:09):
First, we will cover the latest news.
xAI is advancing a file editor for Grok, Salesforce increases AI automation, DeepSeek
faces hardware hurdles, Anthropic enhances app creation, and Google's Imagen 4 impresses.
After this, we'll dive deep into the potential and challenges of scaling AI reasoning models.

(00:34):
Leaked code hints that xAI is working on an advanced file editor with spreadsheet support
for Grok, aiming to rival OpenAI, Google, and Microsoft by integrating AI copilots into
productivity tools.
Reverse engineer Nima Owji revealed this, highlighting Grok's ability to assist users
while editing files.

(00:55):
TechCrunch has reached out to xAI for confirmation.
Although xAI hasn't shared detailed plans for interactive AI workspaces, their recent
announcements suggest a focus on these tools.
In April 2025, xAI introduced Grok Studio, a split-screen workspace for collaborative

(01:16):
document and code generation.
Users can also organize files and conversations in one place.
Google's Gemini workspace offers similar features, but is limited to Google's ecosystem.
xAI's editor could be a step toward Elon Musk's vision of an everything app that combines

(01:37):
documents, chat and social media.
Salesforce is ramping up its use of AI to automate tasks, as shared by CEO Marc Benioff.
He emphasized the shift toward AI performing tasks previously done by humans, freeing
employees for higher-value work.

(01:57):
Currently, AI handles 30 to 50% of Salesforce's tasks.
This move reflects a broader trend in the tech industry, where companies are using AI to
cut costs and boost efficiency.
Earlier this year, Salesforce cut over 1,000 jobs as part of its AI-focused restructuring.

(02:18):
Similar moves have been seen across the industry, with companies like CrowdStrike and Klarna
reducing their workforce due to AI advancements.
Benioff describes this as a digital labor revolution, with Salesforce achieving 93% accuracy in
AI tasks.
However, he noted that reaching 100% accuracy is unrealistic and other vendors lag behind

(02:44):
due to less data and metadata.
Join us as we step into the challenges of hardware dependency.
DeepSeek's progress with its R2 AI model has stalled due to a shortage of NVIDIA's H20
processors in China.
The R1 model was trained using a vast number of GPUs supplied by High-Flyer Capital Management.

(03:09):
However, it's unclear if R2 is fully pre-trained.
CEO Liang Wenfeng is not satisfied with R2's current capabilities, and improvements are
ongoing.
The R1 model had wide adoption among various users, but restrictions on H20 shipments are
now affecting its use and complicating R2's launch.

(03:33):
US government restrictions on NVIDIA's H20 sales have impacted Chinese AI companies
reliant on these processors.
DeepSeek, with AI software optimized for NVIDIA, is particularly affected.
Despite claiming efficiency in development, DeepSeek faces challenges due to dependency
on American hardware.

(03:54):
Meanwhile, OpenAI has unofficially accused DeepSeek of using its proprietary models for
R1.
Anthropic has expanded Claude's Artifacts feature to enable AI-powered app creation.
This interactive development platform allows users to build applications that process real-time

(04:16):
input, eliminating deployment complexity and enabling cost-free sharing.
Users can describe their needs in natural language, and Claude generates functional
code with React components, file processing, and Claude API integration.
Currently in beta, the feature is available for Free, Pro, and Max plan users.

(04:38):
The system manages infrastructure, authentication, and API usage automatically, allowing users
to focus on application logic.
While current limitations include no external API calls or persistent storage, the platform
supports sophisticated functionality.
This development opens opportunities for marketing teams to create custom tools without needing

(05:03):
technical resources, addressing repetitive tasks, and enhancing client interactions.
Future expansions may include more integrations and capabilities.
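For a feel of the underlying mechanic, here is a minimal sketch of the same idea driven through the Anthropic Python SDK: describe an app in plain language and let Claude return runnable React code. This is an illustration only, not the Artifacts feature itself, which handles hosting, authentication, and API usage inside Claude's own interface; the model ID and prompt are assumptions.

```python
# A minimal sketch, assuming direct use of the Anthropic Python SDK rather than
# the Artifacts UI. Artifacts wraps this same capability in Claude's interface
# and manages infrastructure, auth, and API usage for you.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # assumed model ID; substitute a current one
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": (
            "Build a single-file React component that accepts a CSV upload, "
            "parses it in the browser, and renders a sortable table."
        ),
    }],
)

# The generated component arrives as text; in Artifacts, Claude renders and
# shares the resulting app for you instead.
print(response.content[0].text)
```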
Imagen 4, the latest text-to-image model from Google, is now available for paid preview
in the Gemini API and limited free testing in Google AI Studio.

(05:29):
This model offers improved text rendering and image generation quality over its predecessor,
Imagen 3.
There are two versions: the standard Imagen 4, priced at 4 cents per image, and Imagen
4 Ultra, which provides more precise alignment with text prompts and costs 6 cents per image.
These models are designed to meet diverse creative needs, offering versatility in style and content.

(05:56):
Higher rate limits can be requested, and new billing tiers will be introduced soon.
Users are encouraged to explore the capabilities of Imagen 4 through provided documentation
and cookbooks.
The team is eager to see the creative outcomes users will achieve with these advanced models.
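As a concrete illustration of the API access mentioned above, here is a minimal sketch of generating an image with Imagen 4 through the Gemini API using the google-genai Python SDK. The preview model ID and the prompt are assumptions; check Google's documentation for the current identifier and pricing.

```python
# A minimal sketch, assuming the google-genai Python SDK and an assumed
# Imagen 4 preview model ID.
from google import genai
from google.genai import types

client = genai.Client()  # reads the Gemini API key from the environment

result = client.models.generate_images(
    model="imagen-4.0-generate-preview-06-06",  # assumed preview model ID
    prompt="A hand-lettered poster that reads 'Innovation Pulse' in bold type",
    config=types.GenerateImagesConfig(number_of_images=1),
)

# Each generated image is returned as raw bytes that can be written to disk.
with open("poster.png", "wb") as f:
    f.write(result.generated_images[0].image.image_bytes)
```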

(06:18):
Join us as we discover the future of voice AI.
AI assistants are rapidly emerging, aiming to automate daily tasks and boost productivity.
ElevenLabs has introduced 11ai, a voice assistant with over 5,000 voices, setting
itself apart by integrating Anthropic's Model Context Protocol (MCP).

(06:40):
This integration allows 11ai to connect with tools like Slack and Google Calendar,
executing commands through voice interaction.
Users can clone their own voice if desired.
The MCP ensures secure connections, giving users control over the assistant's actions.
Currently in alpha, 11ai seeks feedback during its experimental phase.

(07:05):
Users can customize voice speed, appearance, and behaviors.
ElevenLabs may use personal data to improve AI models, though it attempts to anonymize
this data.
The company has also launched a mobile app for creating voiceovers in 70 languages.
New users receive 10,000 free characters monthly.
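To make the MCP piece less abstract, here is a minimal sketch of a tool server built with the official Python MCP SDK, the kind of endpoint an MCP-capable assistant such as 11ai could discover and call. The calendar tool is a hypothetical stand-in, not ElevenLabs' actual Slack or Google Calendar integration.

```python
# A minimal sketch of an MCP tool server, assuming the official Python MCP SDK.
# The calendar tool is hypothetical and only illustrates how an assistant
# discovers and invokes tools over the protocol.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("calendar-demo")

@mcp.tool()
def add_event(title: str, start_time: str, duration_minutes: int = 30) -> str:
    """Pretend to add a calendar event and report what was scheduled."""
    # A real integration would call a calendar API with proper authorization;
    # MCP is the layer that lets the assistant find and call this tool securely.
    return f"Scheduled '{title}' at {start_time} for {duration_minutes} minutes."

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```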

(07:30):
ElevenLabs has launched a mobile app for iOS and Android, allowing users to generate
voice clips from text on the go.
Previously, users needed the web app for this functionality.
The mobile app offers about 10 minutes of free audio generation and lets users choose
different models to balance cost and quality, sharing credits between web and mobile.

(07:52):
It includes access to V3 Alpha models for expressive control.
Jack McDermott, mobile growth lead, noted that many creators were already using mobile
browsers for voice samples, prompting the native app development.
ElevenLabs aims to enhance creative projects for content creators, marketers, and professionals.

(08:14):
The app will compete with tools like Speechify and Captions.
This follows the release of their Reader app, which supports article and e-book listening.
Future plans include features like Speech to Text and AI Agent Tools.
OpenAI is navigating a critical phase with Microsoft, which owns nearly half of OpenAI's

(08:37):
for-profit unit.
They are negotiating restructuring terms.
OpenAI plans to introduce features similar to Microsoft Office 365 and Google Workspace,
integrating generative AI into business IT.
Beyond collaboration tools, OpenAI is building a broader ecosystem with a browser, AI hardware,

(09:00):
and a social content feed in ChatGPT, aiming for more user control in content creation and
sharing.
For businesses using ChatGPT alone, these features could make it a central tool, prompting a
rethink of their software bundles traditionally dominated by Microsoft and Google.
While no official product launches or timelines are known, these moves suggest OpenAI is poised

(09:27):
to significantly impact the enterprise technology sector as AI reshapes productivity tools.
And now, let's pivot our discussion toward the main AI topic.
Alright everybody, welcome to another deep dive on innovation pulse.

(09:48):
I'm Alex, and as always, I'm here with my co-host, Jakov Lasker.
Today we're tackling something that's been on everyone's mind in the AI world.
Just how far can these new reasoning models actually go?
Hey Alex, great to be back.
And yeah, this is fascinating stuff.
We're talking about models like OpenAI's o3, which are less than a year old, but already

(10:09):
showing these incredible leaps in capability.
The big question is whether we're looking at a new frontier that can scale indefinitely,
or if we're about to hit some pretty hard walls.
Right.
Before we dive into the scaling question, let's make sure everyone's on the same page.
When we say reasoning models, we're not just talking about your typical language model,

(10:30):
are we?
Exactly.
Think of it this way.
Traditional large language models are like really well-read students who can recall information
and write coherently.
But reasoning models are more like students who have learned to show their work, to think
step by step through complex problems.
They go through an additional training stage after the initial pre-training, where they
learn to reason through difficult problems using reinforcement learning.

(10:54):
That's a great analogy, and the results have been pretty dramatic.
OpenAI just showed us that o3 is a 10x scale-up in training compute from o1, and this happened
in just four months.
That's pretty wild, right?
It's absolutely wild.
To put this in perspective, if you imagine compute scaling like climbing a mountain,

(11:16):
most AI development has been a steady climb upward.
But what we're seeing with reasoning models is more like taking a helicopter ride up the
mountain.
The question is, how much fuel is left in the tank?
Lay it on me.
What do we actually know about how much compute these models are using?
This is where it gets really interesting, and honestly a bit murky.
The companies aren't super transparent about this, but we can piece together some clues.

(11:40):
Take DeepSeek's R1 model, which performs similarly to o1 on many benchmarks.
Researchers estimate it used about 6×10^23 FLOP for its reasoning training.
That's roughly a million dollars' worth of compute.
A million dollars sounds like a lot to regular folks, but in the world of frontier AI training,
that's actually...

(12:00):
That's pocket change.
The pre-training for these models costs tens or hundreds of millions.
So the reasoning training stage is currently just a small fraction of the total cost.
And here's the kicker: Dario Amodei of Anthropic recently said that companies are working
to scale this reasoning stage up to hundreds of millions and billions of dollars.
So we're still in the early innings.

(12:23):
But here's what I'm wondering.
If they keep scaling at this 10x every few months pace, when do they hit the ceiling?
That's the million dollar question, literally.
If the current reasoning training is around that million dollar mark and they're scaling
10x every few months, let's do some quick math.
10x gets you to 10 million.

(12:43):
Another 10x gets you to 100 million, then a billion.
At the current pace, we're talking about hitting the frontier of total training compute,
which is in the hundreds of billions to trillions, within about a year.
And then what happens?
The helicopter runs out of fuel.
Exactly.
Once reasoning training hits that frontier, its growth rate has to converge with the overall

(13:04):
growth rate of computing power, which is much more modest, about 4x per year instead of
10x every few months.
It's like switching from that helicopter back to hiking up the mountain.
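The hosts' back-of-the-envelope math can be made explicit. The sketch below restates it in a few lines of Python; the starting cost, the months-per-10x cadence, and the frontier budget are rough assumptions drawn from the conversation, not reported figures.

```python
# Back-of-the-envelope version of the hosts' arithmetic. All three inputs are
# rough assumptions taken from the conversation, not reported numbers.
import math

start_cost = 1e6      # ~$1M for current reasoning training (R1-scale estimate)
frontier_cost = 5e8   # assume frontier total training runs cost a few hundred $M
months_per_10x = 4    # "10x scale-up in about four months" (o1 -> o3 pace)

# How many tenfold jumps until reasoning training alone reaches frontier scale?
jumps = math.log10(frontier_cost / start_cost)
months_to_frontier = jumps * months_per_10x
print(f"~{jumps:.1f} tenfold jumps, roughly {months_to_frontier:.0f} months out")

# After that, growth converges with overall compute trends (~4x per year),
# versus the ~1000x per year the current cadence implies.
per_year_now = 10 ** (12 / months_per_10x)
print(f"Current pace: ~{per_year_now:,.0f}x per year; afterwards: ~4x per year")
```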
But hold on, scaling isn't just about throwing more GPUs at the problem, right?
There have to be other bottlenecks.
You're absolutely right to bring that up.

(13:26):
Think about it this way.
These reasoning models need to practice on difficult problems, but there isn't an infinite
supply of good diverse problems out there.
It's like trying to train a chess grandmaster.
At some point, you run out of challenging opponents and novel situations.
That makes sense, and I'm guessing the problems get even trickier when you move beyond math

(13:46):
and coding, which seem to be where these models shine brightest.
Exactly.
Math and coding have clear right and wrong answers, but what about creative writing or
understanding human emotions or navigating complex social situations?
The scaling laws that work beautifully for mathematical reasoning might not translate
to these fuzzier domains.
So we might see this incredible progress continue for another year or so, but then things could

(14:10):
plateau pretty quickly.
That's one scenario, but here's the thing.
The researchers at OpenAI and other companies are extremely optimistic about continued scaling.
They have inside knowledge about what's coming next that we don't.
And historically, when AI researchers say they can push further, they often find ways
to surprise us.
True, but let's think about the implications either way.

(14:32):
What does this mean for the AI landscape over the next couple of years?
If reasoning scaling continues at this breakneck pace, we could see some pretty dramatic improvements
in the short term.
We're talking about models that could potentially handle much more complex coding tasks, advanced
mathematical proofs, maybe even scientific research assistance.
The keyword here is could.

(14:54):
A lot depends on whether these models can generalize beyond their current strengths.
And if they hit those scaling limits sooner rather than later, then we might see progress
slow down to a more traditional pace, but that doesn't mean progress stops.
Think about it like the difference between a rocket launch and a cross-country road trip.
The rocket gets you up fast, but once you're cruising at altitude, steady progress becomes

(15:16):
the name of the game.
That's a helpful way to think about it.
So for our listeners who are trying to make sense of all this, what's the bottom line?
The bottom line is that we're in a really unique moment.
We have this powerful new paradigm that's still early on its scaling curve, which means
we're likely to see some impressive capabilities emerge over the next year or so.

(15:37):
But the laws of physics and economics haven't been repealed.
Eventually, this rapid scaling will slow down and converge with the broader trends in computing
power.
And the smart money is on enjoying the ride while it lasts, but also preparing for a world
where AI progress returns to a more predictable, steady pace.
Exactly, whether you're a developer, a business leader, or just someone trying to understand

(15:59):
where AI is headed, the key is to pay attention to both the short-term acceleration and the
longer-term trajectory.
The capabilities we see emerge in the next year could be pretty remarkable, but they
might also represent the end of this particular phase of rapid scaling.
Fascinating stuff, Jakov.
Thanks for walking us through the technical details and helping us understand what might

(16:21):
be one of the most important questions in AI development right now.
Always a pleasure, Alex.
And to our listeners, keep an eye on this space.
The next year could be pretty wild in terms of reasoning model capabilities.
But remember that sustainable progress often comes from steady, methodical advances rather
than just raw computational power.
Absolutely.
Thanks for tuning in to Innovation Pulse, everyone.

(16:43):
We'll keep tracking these developments and bringing you the insights you need to stay
ahead of the curve.
Until next time.
That wraps up today's podcast as we explore new developments in AI-driven tools and discuss
the limits and potential of AI reasoning models.

(17:04):
Don't forget to like, subscribe, and share this episode with your friends and colleagues
so they can also stay updated on the latest news and gain powerful insights.
Stay tuned for more updates.