
May 22, 2025 • 29 mins
Google Introduces "Google AI Ultra" Subscription with Exclusive AI Models and Premium Features
The latest Google Gemma AI model can run on phones
Google Releases Jules Coding Agent to Take on OpenAI Codex
Google Advances AI-Powered Smart Glasses with Gemini and Android XR Collaboration
Google Enhances Gemini AI Models with Deep Think and New Releases at I/O 2025
Report: Spring 2025 AI Model Usage Trends
Google Meet is getting real-time speech translation
OpenAI Unveils Codex: AI-Powered Tool for Automated Code Generation in Research Preview
OpenAI's Ambitious Plan: Launching Screenless AI Device by 2026
All the announcements from Google I/O 2025
#GoogleIO2025, #AI, #GoogleAIUltra, #GeminiAI, #AIModels, #GoogleMeet, #OpenAI

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
Welcome to Innovation Pulse, your quick no-nonsense update on the latest in AI.

(00:10):
First, we'll cover the latest news.
Google launches AI Ultra for advanced tools, unveils efficient mobile models, introduces
Gemini-powered smart glasses, and enhances Meet with real-time translation.
After this, we'll dive deep into OpenAI's Codex and its impact on coding.

(00:32):
Google has unveiled Google AI Ultra, a subscription service offering exclusive access to its advanced
AI products.
Priced at $249.99 per month, the plan provides high usage limits, early access to experimental
features and a YouTube premium subscription.

(00:52):
Josh Woodward from Google Labs described it as a VIP pass for AI enthusiasts.
The subscription includes 30 terabytes of storage and access to the Gemini app's Deep
Think Mode for in-depth research, along with tools like Flow for filmmaking and NotebookLM
for converting notes to podcasts.

(01:13):
Additional features include Project Mariner for multitasking and the Veo 3 video generation
tool.
This launch is part of Google's strategy to monetize AI products and diversify revenue
amid slowing ad growth and competition from AI chatbots like OpenAI's ChatGPT.
The service is initially available in the US and will expand globally.

(01:40):
Google has unveiled Gemma 3n, a new AI model designed to run efficiently on devices like
phones and tablets.
Announced at Google I/O 2025, Gemma 3n supports audio, text, images and videos and can operate
offline, preserving privacy.
It's engineered to function on devices with less than 2GB of RAM.
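The episode doesn't say how such a model is run, but as a rough illustration of what sub-2GB operation implies, here is a minimal sketch using llama-cpp-python with a 4-bit quantized Gemma-family checkpoint. The file name is a hypothetical placeholder, and Gemma 3n itself ships through Google's own on-device tooling rather than this exact route.

```python
# Hedged sketch: running a small quantized Gemma-family model locally.
# The GGUF file name is a hypothetical placeholder, not an official artifact.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-small-q4.gguf",  # assumed 4-bit quantized checkpoint
    n_ctx=2048,                        # modest context window to keep RAM low
)

out = llm("Summarize: on-device AI keeps your data private.", max_tokens=48)
print(out["choices"][0]["text"])
```

Aggressive quantization like this is the usual way a multi-billion-parameter model fits into a phone-sized memory budget.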

(02:04):
Alongside Gemma3n, Google introduced MedGemma, a model for health-related text and image
analysis, and SignGemma, which translates sign language into spoken text, focusing on
American Sign Language and English.
Despite criticisms of its licensing terms, Gemma models remain popular, downloaded

(02:26):
millions of times.
Google's latest releases aim to enhance AI accessibility and functionality across various
applications, inviting developers to innovate further, especially for health and accessibility
needs.
Join us as we discuss the revolutionary Jules coding agent.

(02:48):
Google's Jules is an innovative coding agent that autonomously enhances your projects
by fixing bugs and adding features.
As an asynchronous agent, Jules allows you to assign tasks it completes in the background,
operating securely in a Google Cloud virtual machine.
Introduced at the I/O 2025 event, Jules reads and understands your code base, performing

(03:13):
tasks similar to OpenAI's Codex, such as writing tests, building features and updating dependencies.
It provides detailed feedback on its actions, including reasoning and change logs.
Your private code remains secure, not used for AI training.
Jules supports parallel task execution, allowing multiple simultaneous assignments, and integrates

(03:38):
with GitHub for project context.
Currently, in public beta, it's free to use within certain limits, offering a powerful
tool for developers worldwide.
Smart glasses powered by Google's Gemini AI can describe surroundings, summarize book pages
and identify locations and videos.

(04:00):
Google envisions AI integrated glasses as a future tech frontier, planning devices with
partners like Warby Parker.
The Android XR system, co-developed with Samsung, will debut with a mixed reality headset, Project
Moohan, and later focus on glasses.
These AI glasses aim to enhance user experience by connecting to Google Play, offering features

(04:25):
like messaging, music playback and live translation.
Google plans to collaborate with eyewear brands to accelerate the launch of these glasses.
Despite competition from Meta and Apple, Google's focus is to make AI more integrated with the
user's environment.
The company is working on a range of products, from basic video see-through headsets to advanced

(04:50):
AI glasses, exploring AR capabilities further.
Google has announced upgrades to its Gemini AI models, introducing Deep Think for Gemini
2.5 Pro.
This enhanced reasoning mode allows the model to evaluate multiple responses before answering,

(05:11):
improving its performance in benchmarks.
According to Demis Hassabis from Google DeepMind, Deep Think uses advanced reasoning and parallel
techniques.
While details are scarce, it may resemble OpenAI's models in synthesizing optimal solutions.
Gemini 2.5 Pro with Deep Think outperformed competitors on coding and reasoning tests

(05:36):
and is currently available to select testers.
Google is also updating the Gemini 2.5 Flash model for better efficiency and performance
in coding and reasoning tasks, with a wider release expected in June.
Additionally, Google introduced Gemini Diffusion, a fast model delivering results significantly

(05:58):
quicker than its peers, now available to trusted testers.
Now, we're about to explore AI model preferences.
The AI landscape is evolving rapidly, with shifting model preferences due to constant
upgrades and new releases.
The report comes from Poe, whose platform offers access to cutting-edge models, revealing trends in AI usage.

(06:22):
From January to May 2025, there has been significant growth in reasoning models, spurred by
DeepSeek's viral moment.
Image and video generation are becoming more competitive, with models like GPT-Image-1
and Kling 2.0 gaining traction.
In audio generation, ElevenLabs leads, though new competitors are emerging.

(06:48):
OpenAI's and Google's models have seen substantial growth in usage, while some older models are
being replaced by newer versions.
The report highlights the dynamic nature of AI development and the ongoing competition
among providers.
Poe aims to be a central hub for exploring and comparing AI models, reflecting broader

(07:10):
shifts in the ecosystem.
At Google I/O 2025, Google unveiled real-time speech translation for Google Meet.
This feature uses a large language audio model from Google DeepMind, enabling natural conversations
across languages.
Spoken words are translated into the listener's preferred language in real-time while preserving

(07:34):
voice, tone, and expression.
The feature offers various uses, such as connecting English-speaking grandchildren with Spanish-speaking
grandparents or facilitating communication for companies with global teams.
Google highlights the low latency of the speech translation, allowing multiple people
to converse seamlessly, a first for such technology.

(07:58):
When someone speaks, their original voice is faintly heard, with the translation overlaid.
The feature starts rolling out in beta to consumer AI subscribers, initially supporting
English and Spanish, with Italian, German, and Portuguese to follow.
Early testing for businesses will be available to Workspace customers this year.

(08:22):
OpenAI has released Codex, an agentic coding tool in research preview, designed to assist
developers by handling simple programming tasks.
Codex generates production-ready code and provides explanations.
It is accessible through the ChatGPT web app's sidebar, where users can prompt it to code
or answer questions.

(08:44):
Codex operates within a container, preloaded with the user's code base, mirroring their
development environment.
Developers can enhance Codex's effectiveness by adding an AGENTS.md file to their repository,
which offers custom instructions for the AI, similar to a README.md but tailored for AI

(09:05):
understanding.
Codex is based on codex-1, a version of OpenAI's o3 reasoning model, trained with
reinforcement learning on various coding tasks to efficiently analyze and produce code while
iterating through tests.
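To make that concrete, here's a hedged illustration of the kind of guidance an AGENTS.md file might carry. The specific commands and rules below are hypothetical examples, not OpenAI's documented schema.

```
# AGENTS.md (illustrative example)

## Testing
- Run the full suite with `pytest -q` before proposing a change.

## Style
- Follow PEP 8; format new code with `black`.

## Boundaries
- Do not edit files under vendor/ or regenerate lockfiles.
```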
Next we'll discuss the screenless AI device concept.

(09:28):
OpenAI CEO Sam Altman aims to launch a groundbreaking AI device by the end of 2026.
Targeting sales of 100 million units, OpenAI envisions this device as essential as a smartphone.
OpenAI recently acquired a company led by former Apple designer Jony Ive for $6.5 billion,

(09:50):
which could potentially add significant value to OpenAI.
The device, neither a phone nor a wearable, is designed to integrate seamlessly into daily
life, being fully aware of its user's environment.
Despite challenges faced by earlier AI devices, Altman believes this innovation is key for

(10:12):
OpenAI to directly reach consumers.
However, the company faces financial hurdles, not expecting profits until 2029.
The acquisition of Ive's company is a strategic step towards achieving this ambitious goal.
And now, let's pivot our discussion towards the main AI topic.

(10:39):
Welcome to Innovation Pulse, your window into the world of cutting edge technology.
I'm your host, Alex.
Today we're going to explore all the exciting announcements from Google's biggest developer
conference of the year, Google I/O 2025.
This annual event showcases the latest advancements across Google's expansive portfolio, from

(11:02):
Android to AI, and this year's conference was particularly packed with groundbreaking
innovations.
To help us unpack everything that was announced, we have tech analyst and industry insider
Yakov Lasker joining us today.
Thanks for having me on Innovation Pulse, Alex.
It's always a pleasure to discuss these technological milestones, and Google I/O 2025 certainly delivered

(11:24):
plenty to talk about.
Why don't you hit me with your first question and we can dive into the details?
Let's start with the headline offering.
Google announced something called Google AI Ultra at a pretty steep price point.
What exactly is it, and what does it offer users willing to pay that premium?
Google AI Ultra is essentially Google's highest tier of AI access, available only in the US

(11:47):
for now, and priced at $249.99 per month.
It's designed for those who want the most advanced AI capabilities Google has to offer.
The subscription includes access to Google's Veo 3 video generator, their new Flow video editing
app, and an upcoming AI capability called Gemini 2.5 Pro Deep Think Mode, which hasn't

(12:11):
launched yet.
Beyond those flagship features, subscribers also get higher limits in Google's NotebookLM
platform and Whisk image remixing app, access to the Gemini chatbot in Chrome, agentic
tools powered by Project Mariner, YouTube Premium, and a whopping 30TB of storage across Google

(12:33):
Drive, Photos, and Gmail.
It's clearly aimed at professional creators and power users who can leverage these advanced
AI tools for their work.
That's quite a package.
You mentioned Deep Think Mode as part of this.
Can you elaborate on what this feature is and how it enhances Gemini's capabilities?
Deep Think is an enhanced reasoning mode specifically for Google's flagship Gemini 2.5 Pro model.

(12:58):
What makes it special is its ability to consider multiple answers to questions before responding,
which notably boosts its performance on certain benchmarks.
It's similar to how a human might think through various approaches to a problem before settling
on the best solution.
Google hasn't revealed the specific technical details of how Deep Think works, but industry

(13:19):
speculation suggests it might be similar to OpenAI's o1-pro and upcoming o3-pro models,
which likely use an engine to search for and synthesize optimal solutions to complex problems.
Currently, Deep Think is only available to trusted testers via the Gemini API, as Google
is taking additional time to conduct safety evaluations before a wider rollout.
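As a rough mental model of that "consider multiple answers" behavior, here is a hedged Python sketch of best-of-n sampling: draw several candidate responses in parallel and keep the one a scorer prefers. The generate and score functions are toy placeholders, not Google's actual Deep Think mechanism.

```python
# Toy best-of-n sketch of "weigh several answers, then pick one".
# generate() and score() are hypothetical stand-ins, not a real API.
from concurrent.futures import ThreadPoolExecutor

def generate(prompt: str, seed: int) -> str:
    # Stand-in for one sampled model response.
    return f"candidate {seed} for: {prompt}"

def score(answer: str) -> float:
    # Stand-in for a verifier or self-consistency check.
    return float(len(answer))  # toy heuristic only

def best_of_n(prompt: str, n: int = 4) -> str:
    # Sample n candidates in parallel, return the highest-scoring one.
    with ThreadPoolExecutor(max_workers=n) as pool:
        candidates = list(pool.map(lambda s: generate(prompt, s), range(n)))
    return max(candidates, key=score)

print(best_of_n("What is 17 * 24?"))
```

The trade-off is straightforward: n parallel samples cost roughly n times the compute for one answer, in exchange for a better shot at the best one.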

(13:42):
Fascinating!
Let's shift to content creation.
Google unveiled Veo 3, their latest video-generating AI model.
What capabilities does it bring to the table?
Veo 3 represents a significant advancement in AI-generated video.
What sets it apart is its ability to generate not just visuals, but also sound effects,
background noises, and even dialogue to accompany the videos it creates.

(14:07):
This creates a much more immersive and complete video experience compared to earlier models.
Google also claims Veo 3 substantially improves upon its predecessor, Veo 2, in terms of the
overall quality and realism of the footage it generates.
It's already available in the Gemini Chatbot app for AI Ultra subscribers, and users can

(14:27):
prompt it with either text or images to create videos.
This puts Google in direct competition with other major players in the AI video generation
space.
And it wasn't just video generation that got an upgrade.
Tell us about Imagen 4, Google's latest AI image generator.
Imagen 4 is Google's newest AI image generator, and speed is one of its stand-out features.

(14:49):
It's already faster than Imagen 3, and Google plans to release a variant in the near future
that will be up to 10 times quicker than its predecessor, which is quite remarkable in
terms of computational efficiency.
As for image quality, Imagen 4 excels at rendering fine details, such as fabrics, water droplets,
and animal fur.
It's also versatile, capable of handling both photorealistic and abstract styles, and

(15:15):
can create images in various aspect ratios at resolutions up to 2K.
Both Imagen 4 and Veo 3 will power Flow,
Google's AI-powered video tool designed for filmmakers, creating a comprehensive ecosystem
for visual content creation.
The Gemini app has gained quite a following.
What updates did Google announce for users of these applications?

(15:38):
Google revealed that Gemini apps now have more than 400 million monthly active users,
which is quite impressive.
The most immediate update is that Gemini Live's camera and screen sharing capabilities are
rolling out this week to all users on iOS and Android.
This feature, powered by Project Astra, enables near real-time verbal conversations with Gemini,

(15:58):
while streaming video from the smartphone's camera or screen to the AI model.
Additionally, Google is planning deeper integration with its other apps in the coming weeks.
Soon, Gemini Live will be able to offer directions from Google Maps, create events in Google Calendar,
and make to-do lists with Google Tasks.
They're also updating Deep Research, Gemini's AI agent that generates thorough research reports,

(16:23):
by allowing users to upload their own private PDFs and images for analysis, making it much
more versatile for personal research needs.
For developers and designers, Google introduced a tool called Stitch.
What problem is it trying to solve?
Stitch is an AI-powered tool specifically designed to help people create web and mobile app front-ends
without extensive coding knowledge.

(16:44):
It generates the necessary UI elements and code based on simple prompts.
Users can describe what they want in a few words or even upload an image, and Stitch will
provide the HTML and CSS markup needed to create those designs.
While Stitch is somewhat more limited in functionality compared to some competing no-code or AI-assisted
development tools, it offers a fair amount of customization options.

(17:09):
This represents Google's entry into the growing space of AI-assisted design and development
tools that aim to make software creation more accessible to non-developers.
They've also expanded access to Jules,
their AI agent aimed at helping developers fix bugs in code.
Project Mariner sounds like it could change how we interact with websites.

(17:30):
What exactly is it, and what can it do for users?
Project Mariner is Google's experimental AI agent that essentially
browses and uses websites on behalf of users.
Google announced significant updates to how it works.
Most notably, the ability to handle nearly a dozen tasks simultaneously, and they're
now rolling it out to users more broadly.
The practical application is quite revolutionary.

(17:54):
Project Mariner users can complete complex online tasks without ever directly visiting
a third-party website.
For example, they can purchase baseball game tickets or buy groceries online by simply
chatting with Google's AI agent, which then visits the necessary websites and takes actions
on their behalf.
This represents a potential shift in how we interact with the web, moving from direct

(18:17):
browsing to delegated conversation-based interactions.
Another intriguing project mentioned was Project Astra.
How does it differ from other AI initiatives?
And what applications does Google envision for it?
Project Astra is Google's low-latency, multimodal AI experience that originated from Google
DeepMind.
What makes it unique is its ability to process multiple types of input, like vision, audio,

(18:41):
and text in near real-time, creating a much more responsive and natural AI interaction
than what's currently standard.
Google is planning to integrate Project Astra across multiple products, including Search,
the Gemini AI app, and even third-party developer offerings.
Perhaps most interestingly, they're building Project Astra glasses with partners, including

(19:04):
Samsung and Warby Parker, suggesting a push into augmented reality applications.
Although they haven't announced a specific launch date yet, this initiative seems positioned
to bring ambient, context-aware AI into our daily lives in new ways.
Google Search is getting an AI upgrade, too, with AI Mode.
What new capabilities will users notice when they use this feature?

(19:27):
AI Mode is Google's experimental search feature that's rolling out to US users this week.
It transforms the traditional search experience by allowing people to ask complex, multi-part
questions through an AI interface rather than typing keywords, making searches feel more
conversational and potentially more precise.
Some specific enhancements include support for complex data queries in sports and finance,

(19:52):
and try-it-on options for apparel, bringing a virtual shopping element to search.
Perhaps most exciting is Search Live, coming later this summer, which will let users ask
questions based on what their phone's camera is seeing in real-time, essentially allowing
you to point your camera at something and immediately get information about it.
Also noteworthy is that Gmail will be the first app supported with personalized context

(20:15):
in AI Mode.
The Beam 3D teleconferencing system sounds like something straight out of science fiction.
How does it work, and what makes it different from traditional video calls?
Beam, previously known as Project Starline, is Google's attempt to create truly immersive remote communication.
It uses a sophisticated combination of hardware and software, including a 6-camera array and

(20:38):
custom light field display, to create the illusion that you're sitting across from
someone who might be thousands of miles away.
The technology works by using AI to convert video from multiple cameras positioned at
different angles into a 3D rendering of the person.
Google claims Beam offers near-perfect millimeter-level head tracking and 60fps video streaming.

(21:01):
When used with Google Meet, it also provides AI-powered real-time speech translation that
preserves the original speaker's voice, tone, and expressions.
This moves video conferencing much closer to the experience of being physically present
with another person.
Beyond the specific products we've discussed, Google announced several other AI updates across

(21:22):
their ecosystem.
Could you summarize some of the most significant ones?
There were indeed numerous AI updates across Google's product lineup.
Gemini is coming to Chrome as a browsing assistant that helps users quickly understand
web page context and complete tasks.
They're also launching Gemma 3n, a model designed to run efficiently on phones, laptops,

(21:43):
and tablets that can handle audio, text, images, and videos.
For Workspace, Google announced several AI features coming to Gmail, Google Docs, and
Google Vids.
Gmail is getting personalized smart replies and a new inbox cleaning feature, while Vids
is receiving new content creation and editing capabilities.

(22:06):
Additionally, video overviews are coming to NotebookLM, and they've rolled out SynthID
Detector, a verification portal that uses Google's watermarking technology to help
identify AI-generated content.
Lyria RealTime, the AI model powering their experimental music production app, is now
available via an API.
Wearables are a growing segment of Google's ecosystem.

(22:28):
What updates are coming to Wear OS in its sixth iteration?
Wear OS 6 brings several refinements to Google's wearable operating system.
One of the more noticeable changes is a unified font for tiles, creating a cleaner app appearance
across the watch interface.
Pixel Watch users will benefit from dynamic theming that synchronizes app colors with

(22:49):
watch faces, creating a more cohesive visual experience.
The core of this update focuses on a new design reference platform that enables developers
to build better customization options in their apps, along with seamless transitions between
screens and states.
To support developers in implementing these changes, Google is releasing comprehensive

(23:11):
design guidelines along with Figma design files.
These improvements aim to make the Wear OS experience more polished and consistent for
users, while giving developers more tools to create engaging watch applications.
The Google Play Store is a critical touch point for Android users.
What enhancements is Google bringing to this platform?

(23:33):
Google is significantly upgrading the Play Store experience for both developers and users.
For developers, they're introducing fresh tools to handle subscriptions, including multi-product
checkout that allows offering subscription add-ons alongside main subscriptions under
a single payment.
They're also adding dedicated pages for testing and releases, and tools to monitor and improve

(23:53):
app rollouts, with the ability to halt live app releases if critical problems arise.
For users, the Play Store is adding topic pages that allow people to dive into specific
interests and topic browse pages for movies and shows in the US that will connect users
to apps related to specific shows and movies.

(24:14):
Another user-friendly addition is audio samples, giving folks a preview of app content before
downloading.
These changes collectively aim to make app discovery more intuitive, while giving developers
more control over their app distribution.
Android Studio is the development environment of choice for Android developers.
How is Google enhancing it with AI capabilities?

(24:36):
Android Studio is receiving a substantial AI upgrade with the integration of several new
features.
The most significant addition is Journeys, an agentic AI capability that coincides with
the release of the Gemini 2.5 Pro model.
This will be complemented by an agent mode designed to handle more complex development
processes than current AI assistants can manage.

(25:00):
Another practical enhancement is an improved crash insights feature in the App Quality
Insights panel.
Powered by Gemini, this feature will analyze an app's source code to identify potential
causes of crashes and suggest specific fixes, potentially saving developers considerable
debugging time.
These additions reflect Google's broader strategy of integrating AI throughout their

(25:21):
developer tools to streamline the app creation process.
This has been an incredibly comprehensive overview.
Looking at all these announcements together, what would you say is the overarching direction
or strategy for Google in 2025 based on what we saw at I/O?
The strategy emerging from Google I/O 2025 clearly shows Google's all-in commitment

(25:44):
to AI integration across their entire ecosystem, with particular emphasis on making these
advanced technologies accessible through natural, multimodal interactions.
They're creating a comprehensive AI stack that spans from foundational models like Gemini
2.5 Pro to consumer-facing applications like AI Mode in Search.

(26:06):
What's particularly notable is their focus on agentic AI, systems that can take actions
on behalf of users as seen in Project Mariner.
There's also a clear push toward visual computing with Imagen 4, Veo 3, and Project Astra.
Together these initiatives suggest Google is positioning themselves not just for incremental

(26:28):
improvements to existing products, but for a fundamental shift in how we interact with
technology, moving away from explicit commands and toward collaborative, contextual
assistance that spans devices and modalities.
It's an ambitious vision that builds on their traditional strengths while pushing into
new frontiers.

(26:49):
Yakov, thank you so much for this enlightening conversation about Google I/O 2025.
You've helped our listeners understand not just what was announced, but the significance
and potential impact of these innovations.
My pleasure, Alex.
If there's one takeaway for your listeners, it's that the pace of AI innovation continues

(27:11):
to accelerate and Google is positioning itself at the forefront of making these advanced
capabilities accessible and useful in our daily lives.
Whether it's the creative possibilities of Veo 3 and Imagen 4, the efficiency of agentic
AI like Project Mariner, or the immersive experiences enabled by Beam, we're seeing

(27:32):
technology increasingly adapt to human needs rather than the other way around.
I think it's worth watching how these announcements materialize in actual products over the coming
months, particularly how Google addresses the ethical considerations and privacy implications
of these powerful technologies.
The most exciting innovations are those that enhance human capability and creativity while

(27:54):
respecting user agency and privacy, and that balance will be crucial as these tools become
more integrated into our lives.
And that concludes today's episode of Innovation Pulse.
I'd like to thank Yakov Lasker for joining us to decode Google I/O 2025 and thank you,
our listeners, for tuning in.

(28:15):
If you enjoyed this episode, don't forget to subscribe for more insights into the latest
technological innovations.
Until next time, stay curious, stay informed, and keep your finger on the pulse of innovation.
Thank you, Alex.
Have a great day, everybody, signing off.

(28:59):
We're redefining how we interact with technology across various platforms.
Don't forget to like, subscribe, and share this episode with your friends and colleagues,
so they can also stay updated on the latest news and gain powerful insights.
Stay tuned for more updates.