Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Logan (00:00):
The best image generation and editing model in the world.
Ejaaz (00:03):
It's scary how realistic this stuff is. Veo 3 has kind of killed the VFX studio.
Logan (00:09):
And this is, I think, principally enabled by vibe coding. My hope is that it actually ends up creating more opportunity for the experts and the specialists.
Josh (00:15):
How many of the tools that you build do you find are built with vibe coding?
Logan (00:19):
Almost 85% of everything that I do is vibe coded.
Ejaaz (00:22):
I remember when I first booted up a PC and I just had access to all these different wonderful applications, all within one suite. This kind of feels like that moment for AI.
Josh (00:31):
Gemini is feeling faster, but it's also feeling better, and it's also getting cheaper.
Ejaaz (00:36):
What's happening behind the scenes?
Logan (00:37):
We crossed a quadrillion tokens, which comes after a trillion, if you haven't thought about numbers higher than a trillion before. It's what comes after a trillion, and there's no slowdown in sight.
Josh (00:51):
We have an incredibly exciting episode today because we are joined by Logan Kilpatrick. Logan is the product lead working on the Gemini platform at Google DeepMind. We have an exciting announcement to break right here today with Logan, which is the announcement of a model that we previously knew as Nano Banana. The reality is this is a brand new image generation model coming out of Google, and you can access it today. So Logan, tell us about this brand new model and what we need to be excited about.
Logan (01:15):
Yeah, for people who are not chronically online and seeing all the tweets and everything like that, part of the excitement has been, over the last six months or so, we've seen the emergence of native image generation and editing models. Historically, you would see models that could do a really good job of generating images. They usually tended to be very beautiful, aesthetic images. The challenge was how do you actually use these things in practice to do a lot of stuff. That's where this editing capability is really helpful. And then we started to see these models that can actually edit images: if you provided an image and then prompted it, it would actually change that image.

What's really interesting, though, is this fusion of those two capabilities with the actual base intelligence of the Gemini model. And there's a lot of really cool ways in which this manifests itself, and we'll look at some examples of this. But it's this benefit of the world knowledge. The model is smart. So as you ask it to do things, and as you ask it to make changes, it doesn't just take what you're saying at face value. It takes what you're saying in the context of its understanding of the world and its understanding of physics, its understanding of light and all this other stuff, and it makes those changes. So it's not just blindly making edits or generations. They're actually grounded in reality and the context in which that's useful.

And we can look at some examples of this. My favorite thing is actually this editing capability. So this is in AI Studio, and we'll have a link somewhere, hopefully in the show notes, that will let us do this. My friend Amar, who is on our team and drives all of our design stuff, built this, and it's called Past Forward. What you can do is you can put in an image of yourself, and it'll regenerate a version of yourself in this sort of Polaroid-esque vibe, following all the different trends from the last 10, 20, 30 years. So if you look at this example, this is me from the 1950s. And I'm sure I have a picture of my dad from the 1950s somewhere, or my grandpa, who looks somewhat similar to that. Here's me in the 1980s, which I love. Here's me.
Ejaaz (03:24):
Some of these facial expressions are also different. Like, you're showing your teeth more in some, and then it's a smirk in others. That's super cool.
Logan (03:32):
I like this sweater. I actually have a sweater that almost looks exactly like this 1970s one, though I don't like my hair in this 1970s one. Same with the 2000s. So one of the cool things about this new model, and one of the features I think folks are going to be most excited about, is this character consistency, which is: as you took the original image and made the translation to this 1950s image, it actually still looks like me, which is really cool. So there's lots of these really interesting use cases. I think we'll go out with a sports card demo where you can sort of turn yourself into, you know, a figurine sports card, which is really cool. So lots of really interesting examples like this.

And another thing you'll notice is actually the speed. And this is where the underlying model, the code name was Nano Banana, but the actual model is built on Gemini 2.5 Flash, which is our workhorse model. It's super fast, it's super efficient, and it's competitively priced in the market, which is awesome, so you can actually use it at scale. And so this model behind the scenes, for developers and people who want to build with it, is Gemini 2.5 Flash Image, which is awesome. So this is a use case that I love, and it's a ton of fun. You can do this in the Gemini app or in AI Studio.
Ejaaz (04:49):
I mean, as you said, the character consistency just from these examples is, like, astounding.
Josh (04:57):
I need to give a round of applause. This has been my biggest issue when I'm generating images of myself.
Ejaaz (05:01):
Genuinely. And Josh and I are early users of, you know, Midjourney V1, OpenAI's image generator as well. And one of our pet peeves was it just couldn't do the most simplistic things, right? We could just say, hey, keep this photo and portrait of me exactly the same, but can you show me what I would look like in a different hairstyle, or me holding a bottle of Coca-Cola instead of this martini? And it just could not do that, right? Just simple photo editing. Can you give us a bit of a background as to what Google did to be able to achieve this? Because, you know, I've been racking my head around why other AI companies couldn't do this. Like, what's happening behind the scenes? Can you give us a bit of insight?
Logan (05:43):
Yeah, that's a good question. I think this actually goes back to, and I'll share another example in a second as well, but I think this goes back to this story of what happens when you build a model that has the fusion of all these capabilities together. And this is a sort of parallel example to this, but it's another example of why building a unified model to do all this stuff, and not having a separate model that doesn't have world knowledge and all these other capabilities, is useful. The same thing is actually true on video. Part of the story, and we have a bunch of stuff coming that sort of tells this a little bit more elegantly than I will right now, but part of the story of Veo 3 having these really state-of-the-art video generation capabilities, if folks have seen this, is that the Gemini models themselves have these state-of-the-art video understanding capabilities.

And it's a very similar context, actually, on the image side, which is: since the original Gemini model, we've, with the exception of probably a couple of months in that two-and-a-half-year time horizon, had state-of-the-art image understanding capabilities. And I think there is this capability transfer, which is really interesting, as you go to do the generation step. And if you can fuse those two things together in the same model, you end up just being able to do things that other models aren't able to do. And this was part of the bet originally. The original Gemini 1.0 model was built to be natively multimodal. It was built to be natively multimodal because the belief at the time, and I think this is turning out to be true, is that that's on the path to AGI: you combine these capabilities together, and similar to what humans are able to do, we have this fusion of all these capabilities in a single entity, just like these models should be able to do.
Ejaaz (07:28):
Wow. So if I were to distill what you just said here, Logan, the way you've trained Gemini 2.5, or all future Google Gemini models, is in a very multimodal fashion. So basically, it gets smarter in one particular facet, which trains itself, or has transferable capabilities, to other facets, whether it's image generation, video generation, or even text LLMs to some extent. I just think that's fascinating.

I'm curious, I have one question for you, which I want to hear your take on. How are you going to surface this to the regular consumer, right? Because right now, you provide all of these capabilities through an amazing suite, you know, called Google AI Studio. But if I wanted to use this in, say, an Instagram app, or my random photo editing app, is this something that could be easily provided to someone or sourced? Or do we need to go via some other route right now?
Logan (08:23):
Let me just diverge really quickly, which is: if any of the researchers who I work with are watching this, they'll make sure that I know that the capability transfer we just talked about, you oftentimes don't get that out of the box. So there is some emergence where you get a little bit of that, but there's real, true research and engineering work that has to happen to make sure that that capability fusion happens. It's not often that you just make the model really good at one thing and then it translates. Oftentimes, actually, it has a negative effect, which is: as you make the models really good at code, for example, you trade that off against something else, creative writing, as a random example of this. So you have to do a lot of active research and engineering work to make sure that you don't lose a capability as you make another one better. But then ultimately, if you can make them on the same level, they benefit from this interleaved capability together.

To answer the question about where this is going to be available: the Gemini app is the place that, by and large, most people should be going to. So if you go to Gemini.Google.com, there'll be sort of a landing page experience that showcases this new model and makes it really easy. And you can put in all your images and do tons of fun stuff like the example that I was showing. If you're a developer and you want to build something with this, in AI Studio we have this build tab. And that's what we were just looking at, an example of one of the applets that's available in the build tab. The general essence is that all of these applets can be forked and remixed and edited and modified, so that you can keep doing all the things that you want to do with the AI capability built in. So it'll continue to be powered by the same model. It'll do all that stuff, which is awesome. So there's lots of cool fusion capabilities that we have with this. Same thing with this other example that we're looking at.

So if you want to go outside of this environment, we have an API. You could go and build whatever you saw. If your website is, you know, AIphotos.com or whatever, you could go and build with the Gemini API, use the new Gemini 2.5 Flash Image model to do a bunch of this stuff, which is awesome.
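For developers who want to try what Logan describes here, a minimal sketch of that API call might look like the following. It assumes the google-genai Python SDK; the model ID shown was the launch-time preview name, and the file names and prompt are placeholders, so check the current Gemini API docs before relying on it.

```python
# pip install google-genai pillow
from io import BytesIO

from google import genai
from PIL import Image

# Reads the API key from the GEMINI_API_KEY environment variable.
client = genai.Client()

# An image-editing request: one source image plus a plain-English instruction.
source = Image.open("photo.png")  # placeholder file name
response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",  # assumed model ID; may have changed
    contents=["Put an anime filter on this photo", source],
)

# The response interleaves text and image parts; save any returned image.
for part in response.candidates[0].content.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        Image.open(BytesIO(part.inline_data.data)).save("edited.png")
```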
Josh (10:29):
Awesome. So while this is baking, I noticed you had another tab open, which means maybe there's another demo that you were prepared to share.
Logan (10:35):
There is another demo. This one I actually haven't tried yet. But it's this idea of: how can you take a photo editing experience and make it super, super simple? So I'll grab an image. Actually, we'll take this picture, which is a picture of Demis and I.
Josh (10:52):
Legends.
Logan (10:53):
We'll put an anime filter on it and we'll see. And so this is a completely vibe coded UI experience, and all the code behind the scenes is vibe coded as well. And we'll see how well this works with Demis and I.
Josh (11:05):
How many of the tools that you build do you find are built with vibe coding instead of just hard coding software? Are you writing a lot of this as vibe coded through the Gemini model?
Logan (11:14):
I think sometimes you're able to do some of this stuff completely vibe coded. It depends on how specific you want to get. Almost 85% of everything that I do is vibe coded. Somebody else on my team built this one, so I don't want to misrepresent the work. It could have all been human programmed, because we have an incredible set of engineers. The general idea is: how can you make this, oh, interesting, how can you make this Photoshop-like experience? Let's go '90s. Or do you have suggestions? What would a good filter for this be? I don't know. Oh.
Josh (11:48):
Man, yeah, perhaps going back to the last example, maybe like a '90s film or an '80s film grain. All right, and I guess while we wait for that to load, is there a simple way that you would describe Nano Banana, or this new image model, to just the average person on the street? Oh, look, there we go, we have the film grain. Okay, so what we're watching, for the people who are listening: you're retouching. You can retouch parts of the image, you can crop and adjust, there are filters to be applied.
Logan (12:12):
I'm just clicking through buttons, to be honest. I've never done it before, so it's been fun. Live demo, day one. This is the exploration you're going to get to do as a user as you play around with this.
Ejaaz (12:21):
Logan is vibe editing, and that's what's happening. Yeah. He's experimenting.
Logan (12:25):
Vibe editing, which is fun. I love it. That's a great way to put it. And the cool thing, again, is what I love about this experience is, as you're going through... oh, interesting, this one's giving me an edited outline.
Josh (12:36):
Oh, yeah, a little outline. This is helpful for our thumbnail generation. We do a lot of this stuff.
Logan (12:40):
Let's see if I can remove the background as well.
Ejaaz (12:42):
Oh, yeah. Let's see. It should be.
Josh (12:44):
If this removes the background, this is going to be trouble, because this is a big feature that we use for a lot of our imagery.
Logan (12:49):
Hopefully. Come on. Oh, nice.
Ejaaz (12:51):
Oh, done.
Josh (12:53):
Nicely done.
Ejaaz (12:54):
For those of you who are listening, he's typed in, "Put me in the Library of Congress." So we're going to hopefully see Logan.
Logan (13:02):
Yeah, the context on that image was that Demis and I were in the library of the DeepMind office.
Ejaaz (13:08):
Oh, nice.
Logan (13:09):
Yeah, so that was the Library of Congress reference in my mind. But yeah, so much that you can do. Again, what I love about this experience is that as you go around and play with this stuff, if you want to modify this experience, you can do so on the left-hand side. If you say, actually, here are these five editing features that I really care about, the model will go and rewrite the code, and then it'll still be attached to this new 2.5 Flash Image model. So you can do all these types of cool stuff. This experience is something that I'm really excited about, that we've been pushing on.
Josh (13:42):
Yeah, this is amazing, because I do photography a lot. I was a photographer in my past life, and I rely very heavily on Photoshop and Lightroom for editing, which is a very manual process. And they have these smart tools, but they're not quite like this. I mean, this saves a tremendous amount of time if I could just say: hey, realign, straighten the image, remove the background, add a filter. I think the plain English version of this makes it really approachable, but also way faster.
Logan (14:07):
Yeah, it is. It is crazy fast. I think about this all the time. There's definitely cases where you want to go deep with whatever the pro tool is. But there's actually something interesting on the near horizon that our team has thought a lot about, which is how you can have this experience and how you can, in a generative UI capacity, have the experience subtly expose additional detail to users.

And I think about this like: if you're a new Photoshop user, as an example, and you show up, the chance that you're going to use all of the bells and whistles is zero. You want, like, the three things: I want to remove a background, I want to crop something, whatever it is. Don't actually show me all of these bells and whistles. I think the exciting thing about the progress on coding models is that, in the future... the challenge with doing this in the present, rather, is that software is deterministic. You have to build software, and building the sort of modified version of that software for all of these different skill sets and use cases is extremely expensive. It's not feasible. It doesn't scale to production environments. But if you can have this generative UI capability, where the model sort of knows, and as you talk to the model, it realizes, oh, you might actually benefit from these other things, it can create the code to do that on the fly and expose them to you, which is really interesting. So I think there's lots of stuff that is going to be possible as the models keep getting better.
Josh (15:31):
This is amazing. So the TLDR on this new announcement: if I were to go explain to my friend what this does and why this is special, how would you kind of sell it to me?
Logan (15:41):
The best image generation and editing model in the world: 2.5 Flash Image, or Nano Banana, whichever you prefer, is the model that can do this. And I think there's so many creative use cases where you're actually bounded by the creative tool. And I feel like this is one of these examples where I'm 10x more capable. I was literally helping my friend yesterday, doing a bunch of iterations on his LinkedIn picture, because, you know, the background was slightly weird or something like that. I did like 15 iterations, and now he's got a great new LinkedIn background, which is awesome. So there's so many actual practical use cases, and I literally just built a custom tool on the fly, vibe coding, in order to solve that use case, which was a ton of fun.
Josh (16:31):
Yeah, this is so cool. Okay, so this model, Nano Banana, Gemini 2.5 Flash Image, it's out today. So we'll link that in the description for people who want to try it out. I think one of my complaints for the longest time, and I've mentioned this on the show a few times, is a lot of times when I'm engaging with this incredible form of intelligence, I just have a text box. And it's up to me to kind of pull the creativity out of my own mind, and I don't get a lot of help along the way. But one of the things that you spend your time in is this thing called Google AI Studio. And I've used AI Studio a lot, because it solves a problem for me that was annoying, which is just the blank text box. It has a lot of prompts, it has a lot of helpers, it has a lot of guidance to help me extract value out of the model. So what I'd love for you to do for people who aren't familiar, Logan, is just kind of explain to everyone what Google AI Studio is and why it's so important and why it's so great.
Logan (17:17):
Yeah, I love this, Josh. I appreciate that you like using AI Studio. It is a labor of love. Lots of people across Google have put in a ton of time to make progress on this. So I'll make a caveat, which is we have this entirely redesigned AI Studio experience that's coming very soon. I won't spoil it in this episode, because it's, like, half faked right now, and I wish I could show it. And I think actually some of the features that you might see in this UI might be slightly different at launch time than what you see here. So take this with a grain of salt. We've got a bunch of new stuff coming. And I think actually it should help with this problem that you're describing, which is: as you show up to a bunch of these tools today, the onus is really on you as a user to try to figure out what's capable, what all the different models are capable of, what even are all the different models, all of that stuff.

So at a high level, we built AI Studio for this AI builder audience. If you want to take AI models and actually build something with them, and not just, you know, chat with AI models, this is the product that was built for you. We have a way, in this chat UI experience, to sort of play with the different capabilities of the model, feel what's possible. What is Gemini good at? What's it not good at? What are the different tools it has access to? As you go into AI Studio, you'll see something that looks like this. You know, we're highlighting a bunch of the new capabilities that we have right now: this URL context tool, which is really great for information retrieval; this native speech generation capability, which is really cool. If folks have used NotebookLM and you want to build a NotebookLM-like experience, we have an API for people who want to build something like that. And we have this live audio-to-audio dialogue experience, where you can share a screen with the model and talk to it, and it can see the things that you see and engage with it. Of course, we have our native image generation and editing model, the old version 2.0 Flash, now the new version 2.5 Flash. And lots of other stuff that's available as you sort of experience what these models are capable of.

So really, this playground experience is one version. We have this chat prompt on the left-hand side. We have this stream: this is where you can talk to Gemini and sort of share your screen. And actually, you can show it things on the webcam and be like, what's this? How do I use this thing? You can do this on mobile as well, which is really cool. We have this generative media experience, where, if you want to build things with them, we have a music model, we have Veo, which is our video generation model, and we have all the text-to-speech stuff, which is really cool. As I overwhelm people with so much stuff that you can do in AI Studio, the sort of key thread of all this is: we built AI Studio to showcase a bunch of these capabilities, and everything you see in AI Studio has an underlying API and developer experience. So if you want to build something like any of these experiences, all of this is possible. There's no Google secret magic that's happening pretty much anywhere in AI Studio. It's all things that you could build, as someone using a vibe coding product or, you know, by hand writing the code. You could build all these things and even more.

And that is the perfect segue to this build tab, where we're also trying to, you know, actually help you get started building a bunch of stuff. So you can use these templates that we have, you can use a bunch of the suggestions, you can look through our gallery of different stuff. And we're really, in this experience, trying to help you build AI-powered apps, which we think is something that folks are really, really excited about. And we'll have much more to share around all the AI app building stuff in the near future.
Josh (20:50):
Awesome, thanks for the rundown. So as I'm looking at this, I'm wondering, who do you think this is for? What type of person should come to AI Studio and tinker around here?
Logan (20:58):
Yeah, so I think, you know, historically, and you'll see a little bit of this transition if you play around the product, where there's some interesting edges, we were originally focused on building for developers. So there is a part of the experience which is tied to the Gemini API, which tends to be used mostly by developers. So if you go to the dashboard, you can see all your API keys and check your usage and billing and things like that. By and large, though, I think the really cool opportunity of what's happening right now is this transition of who is creating software. And this is, I think, principally enabled by vibe coding. And because of that, we've recentered ourselves to be really focused on this AI builder persona, which is people who want to build things using AI tools. Also, people who are trying to build AI experiences, which we think is going to be the market that creates value for the world.

So if you're excited about all the things that you're seeing, if you want to build things, AI Studio is very much a builder-first platform. If you're just looking for a great everyday AI assistant product, you, you know, want to get help on coding questions or homework or life advice or all that type of stuff, the Gemini app is the right place for this. It's very much a DAU type of product, where you come back and it has memory and personalization and all this other stuff, which makes it really great as an assistant to help you in your life. Versus AI Studio, where the artifact is: we help you create something, and then you go put that thing into the world in some way. And you don't necessarily need to come back and use it every day. You use it whenever you want to build something.
Ejaaz (22:37):
It's funny. I'm dating myself a bit here, but I remember when I first booted up a PC and I loaded up Microsoft Office, and I just had access to all these different wonderful applications that were at the time super new, all within one suite. This kind of feels like that moment for AI. And you might not take that as a compliment, because it's a completely different company, but it was what I built my childhood off of, and my fascination with computers. So I appreciate this, and I love that it's this massively cohesive experience.

But kind of zooming out, Logan, I was thinking a lot about Google AI and what that means to me personally. I have to say, it's the only company that I think of beyond an LLM. And what I mean by that is, when I think of Google AI, I don't just think of Gemini. I think of the amazing image gen stuff that you have. I think of the amazing video outputs that you guys have. I think of the text-to-voice generation that you just demoed, and all those kinds of things.

I remember seeing this advert that appeared on my timeline. And I remember thinking, wow, this must be the new GTA. Then I was like, no, no, that's Florida. That's Miami. No, people are doing wild stuff. That's an alien. Hang on a second. This can't be real. And then I learned that it was a Google Veo 3 generation of an advert for Kalshi, which is, you know, this prediction markets situation. And I remember thinking, how on earth have we got to AI-generated video that is this high quality and this high fidelity? I think, in my mind, Veo 3 has kind of killed the VFX studio. It's kind of killed a lot of Hollywood production studios as well. Give me a breakdown and insight into how you guys built Veo 3, and what that means for the future of movie and video production and more.
Logan (24:33):
Yeah, that's a great question. I think there's something really interesting along these threads, and not to push back on the notion that it's killing Hollywood, because I think it's an interesting conversation. The way that I have seen this play out, and the great example of this, is that folks have seen Flow, which is our sort of creative video tool. If you're using Veo and you want to get the most out of Veo, Flow is the tool to do that. If you see lots of the creators who are building, you know, minute-long videos using Veo, where it's this really cohesive story and it has a clear visual identity, similar to what you'd get from, probably not the extent of a Hollywood production, but somebody thoughtfully choreographing a film, Flow is the product to do that. And actually, interestingly, Flow was built in conjunction with filmmakers.

And I feel this way about vibe coding as well. It's this thought experiment that I'm always running through in my head, which is: yes, I think AI is raising the floor for everyone. Now everyone can create. What does that mean for people who have expertise? And I think in most cases, what it means is actually that the value of your expertise continues to go up. And this is my personal bet, and I don't know how much this tracks to everyone else's worldview. My personal bet is that expertise, in the world where the floor is lifted for everyone across all these dimensions, is actually more important.

And I think video production is a great example for me, because I would never have been able to make a video. It's not in the cards, for my skill set, my creative ability, my financial ability. I will never be able to make a video. I can make things with Veo. And now I'm a little bit closer to imagining, okay, if I'm serious about this, I need to go out and actually engage with people. It's whetted my appetite in a way that I don't think it would have otherwise. It was just too far away.

And I think software is another example, with vibe coding. If you were to pull a random person off the street and you start talking to them about coding and seeing C++ and deploying stuff and all this, their brain turns off: not interested, I don't want to learn to code, that's not cool, it's not fun, it sounds horrible. And then vibe coding rolls around, and it's like, oh, wait, I can actually build stuff. And yeah, I don't really need to understand all of the details. But there's still a limit to what I can build, and who is actually well positioned to help me take the next step? Like, I, you know, vibe code something, I'm like, this is awesome, I share it with my friends, they all love it, I want to, you know, go build a business around this thing that I vibe coded. There's still a software engineer that needs to help make that thing actually happen. So if anything, it's increasing this. I mean, on the software side, there's this infinite demand for software, and it's increasing the total addressable market of what software engineers need to help people build.

I think there'll be something similar on the video side. You know, there will be downsides to AI technology in some ways. As the technology shift happens, there is some amount of disruption that's taking place, and someone's workflow is being disrupted. But I do think there's this really interesting thread to pull on, which is: my hope is that it actually ends up creating more opportunity for the experts and the specialists.
Ejaaz (28:01):
So it sounds like you're not saying VFX studio teams are going to be replaced by software engineers, but rather that team in itself will become more adept at using these AI tools and products to kind of enhance their own skill set beyond what it is today. Is that right?
Logan (28:19):
Yeah, yeah. And I think we've seen this already play out in some ways, which is interesting. I think code has a little bit wider distribution than perhaps VFX. And VFX is also a space that I'm less familiar with personally. But yeah, I think this is likely what is going to play out, if I had to guess and bet.
Ejaaz (28:39):
Can you help us understand how a product like Veo 3 gets used beyond just the major Hollywood production stuff, right? Because I've seen a bunch of these videos now, and I'll be honest with you, Logan, it's scary how realistic this stuff is, right? It's everything from a high-quality triple-A game demo all the way to something that is shot like an A24 film, you know, the scenes, the cuts, the changes. I think it's awesome. I'm wondering whether that goes beyond entertainment in any way. Do you have any thoughts or ideas there?
Logan (29:15):
Yeah, that is interesting. I think one of the ones that is related, sort of one skip away from video generation itself, was Genie, which was our world simulation work that was happening. If folks haven't seen this, go look up Genie 3 and you can see a video. It's mind-blowing. It's a fully playable game world simulation. You can prompt on the go and this environment will change. You can control it on your keyboard, similar to a game. I think that work translates actually really well to robotics, which is cool.

For folks who aren't familiar with this, one of the principal reasons we don't just have robots walking around everywhere, and the reason why we have LLMs that can actually do lots of useful stuff, is this data problem: there's lots of, you know, text data and other data that's representative of the intelligence of humans and all this stuff that's available. There's actually not a lot of data that is useful for making robotics work. And I think Veo, or generally that sort of segment of video generation, and this physics understanding and all that other stuff, could be really helpful in actually making the long tail of robotics use cases work. Then I can finally have a robot that will fold my laundry, so that I don't need to spend my time doing that. But that's my outside-of-entertainment bet, as far as where that use case ends up creating value in the world.
Ejaaz (30:42):
With Veo 3, the goal is to enable humans to become a better version of themselves, a 10x, 100x better version of themselves, using these different tools. So in the example of a VFX studio, you can now kind of create much better movies. How does that apply to Genie 3, exactly, right? You gave the example of being able to create simulated environments, but that's to train these robots. That's to train these models. What about us? What about the flesh-and-blood humans that are out there? Can you give us some examples about where this might be applied or used?
Logan (31:14):
Yeah, that's a good example. I mean, the robot answer is that the robots will be there to help us, which is nice. So hopefully there's a bunch of stuff that you don't want to do that you'll be able to get your robot to do. Or there are industries that are dangerous for humans to operate in, where, if you can do that simulation without needing to collect a bunch of human data to do those things, I could see that being super valuable.

My initial reaction to the Genie use case: actually, the two that come to mind are, one, entertainment, which I think will be cool. Humans want to be entertained. It's a story as old as time. I think there will be some entertainment value of a product experience like Genie. I think the other one is actually back to a bunch of use cases where you'd actually want robotics to be able to do some of that work, but the robot product experience isn't actually there yet. This could be things like, you know, mining or heavy industries, things like that, where there's actually a safety aspect of: how can you do these realistic simulation training experiences in order to make sure that you don't have to physically put yourself in harm's way to understand the bounds, or the failure cases? Disaster recovery, things like that, where you don't want to have to show up at a hurricane the first time to really understand what the environment could be like. And being able to do those types of simulations is interesting, and building software deterministically to solve that problem would actually be really difficult and expensive, and probably isn't a large market that lots of companies are going to go after. But if you have this model that has really great world knowledge, you can throw all these random variables at it and sort of do that type of training and simulation. So yeah, it's perhaps an interesting use case. I don't know if there's actually a plan to use it for things like that, but those are things that come to mind.
Josh (33:18):
This is something I've been dying to ask you about, because this is something that I've been fascinated by. When I watched the Genie 3 demo for the first time, it just kind of shattered my perception of where we were at, because you see it work. And I saw this great demo where someone was painting the wall. We actually filmed an entire episode about this, and it retained all of the information. And one theme, as I'm hearing you describe these things, as I'm hearing you describe Veo 3, Genie 3: you are building this deep understanding of the physical world. And I can't help but notice this trend, like you are just starting to understand the world more and more. And I could see this when it comes to making games, as an example, where a lot of people were using Genie 3 to make these, not necessarily games, but virtual worlds that you can walk around and interact with. And I'm wondering if you could just kind of share the long-term reasoning why, because clearly there's a reason, there's a lot of value to it. Is it from being able to create maybe artificial data for robots, since if you can emulate the physical world, you can create data to train these robots? Is it because it creates great experiences, like perhaps we'll see AAA design studios using Genie 5 to make AAA games like Grand Theft Auto? I'm curious about the reasoning behind this urge to understand the physical world, and even emulate it.
undefined
Logan:
I had a conversation with Demis about this who's our CEO at DeepMind and someone (34:24):
undefined
Logan:
who's been pushing on this for a long time. (34:30):
undefined
Logan:
I think a lot of this goes back to like there's two dimensions. (34:32):
undefined
Logan:
It goes back to like the original ethos of like why DeepMind was created and (34:35):
undefined
Logan:
a bunch of the work the initial work that was happening in DeepMind around reinforcement learning. (34:39):
undefined
Logan:
If folks haven't seen this, one of the challenges of, again, (34:44):
undefined
Logan:
making AI work is that you need this flywheel of continuing to iterate and you (34:48):
undefined
Logan:
need a reward function, which is what is the actual outcome that you're trying to achieve? (34:55):
undefined
Logan:
And the thing that's interesting about these simulated environments is it's (35:03):
undefined
Logan:
really easy to have a constrained environment. (35:07):
undefined
Logan:
World, and it's really easy to also, or not maybe really easy, (35:10):
undefined
Logan:
is overly ambitious. It's possible to define a simple reward function and then (35:15):
undefined
Logan:
actually infinitely scale this up. (35:21):
undefined
Logan:
And the opposite example of this, if folks have saw there was some work a very (35:23):
undefined
Logan:
long time ago, and this is in the AI weeds, but there was this hand, (35:29):
undefined
Logan:
this physical hand that could like robotic hand that could manipulate a Rubik's cube. (35:33):
undefined
Logan:
And they were using AI to like help try to solve this Rubik's cube. (35:39):
undefined
Logan:
And the, the, again, the analogy of why this, of why Genie and some of this (35:43):
undefined
Logan:
work is so interesting is if you were to go and try to like, (35:47):
undefined
Logan:
Hey, we need all the data to go and try to make this little hand, (35:50):
undefined
Logan:
physical robotic hand, be able to do this. (35:54):
undefined
Logan:
It's actually really challenging to scale that up. You need to go and build a bunch of hands. (35:57):
undefined
Logan:
You need to like, what happens when the rubik's cube drops you need (36:01):
undefined
Logan:
to have some system to like go and pick it back up and you just like go (36:04):
undefined
Logan:
through the long tail of this stuff the hand probably can't run (36:07):
undefined
Logan:
24 hours a day like there's all these challenges with getting um the the like (36:10):
undefined
Logan:
data in that environment to scale up um and these virtual environments don't (36:15):
undefined
Logan:
have this problem which is if you can emulate and like self-driving cars is (36:20):
undefined
Logan:
another example of this like again for folks who aren't familiar lots of you You know, (36:24):
undefined
Logan:
there's lots of real world data that's involved in self-driving cars. (36:28):
undefined
Logan:
There's also lots of simulated environments where they've built simulations of the world. (36:31):
undefined
Logan:
And this is how they can get like a thousand X scale up of this like data understanding (36:35):
undefined
Logan:
is by having these simulated environments. (36:41):
undefined
Logan:
Robotics will be exactly the same. If you want robotics to work, (36:44):
undefined
Logan:
it's almost 100% true that you're going to have to have these simulated environments (36:47):
undefined
Logan:
where the robot can fall down the stairs a thousand times. (36:51):
undefined
Logan:
And that's okay because it's a simulated environment and it's not actually going (36:56):
undefined
Logan:
to fall down your stairs. (36:59):
undefined
Logan:
So I think Genie is, there is definitely like an entertainment aspect to it. (37:01):
undefined
Logan:
I think it's more so going to be useful for this like simulated environment (37:05):
undefined
Logan:
to help us not have to do things in the real world and, but still have like (37:10):
undefined
Logan:
a really good proxy of what will happen in the real world when we do them. (37:15):
undefined
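To make the reward-function idea concrete: the sketch below is not Genie, just the generic simulated-environment loop Logan describes, written with the off-the-shelf gymnasium package and a toy environment. The random policy is a stand-in for whatever is actually being trained; the point is that failures cost nothing and trials scale without physical hardware.

```python
# pip install gymnasium
import gymnasium as gym

# A simulated environment: the agent can fail thousands of times at zero
# real-world cost, and the reward function is a simple scalar signal.
env = gym.make("CartPole-v1")

total_reward = 0.0
obs, info = env.reset(seed=0)
for _ in range(1000):
    action = env.action_space.sample()  # stand-in for a learned policy
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    if terminated or truncated:  # the pole fell over: just reset and try again
        obs, info = env.reset()

env.close()
print(f"reward accumulated across simulated trials: {total_reward}")
```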
Ejaaz (37:19):
That's pretty funny. I spent the weekend watching the World Robot Olympics, and there were some very real fails and crashes of these robots, which is pretty funny. Okay, so when I think of Genie, it blows my mind, because I still can't get my head around how it predicts what I'm going to look at. I remember seeing this demo of someone just taking a simple video of them walking, and, you know, it was like a rainy day on a gravel path, and they stuck that into Genie 3, and they could look down and see their reflection in the puddle. So the physics was astoundingly accurate and astute. Can you give us a basic breakdown of how this works? Is this like a real game engine happening in the background? Or is there something deeper happening? Help us understand.
Logan (38:13):
My intuition, and we can gut check this with folks on the research side to make sure that I'm not fabricating my intuition: if folks have an intuition for how next token prediction works, if you're looking through a sentence of text, for each word in that sentence there's a distribution, between zero and one, basically, of how likely that word was to be the next word in the sequence. This is the basic principle of LLMs. This is why, if you were to ask the same question multiple times, the LLM will inherently perhaps give you a different answer. And that's why small changes in the inputs to LLMs actually change this, because, again, it's this distribution. So if you make a one-letter difference, it perhaps puts you on a branching trajectory that looks very different from the original output that you got from the model.

It's a similar rough approximation of this, just much more computationally difficult, and I think they use a bunch of architectural differences, so it's not truly next token prediction that's happening for the sort of...
Ejaaz (39:30):
...of, like, pixels, colors, a bunch of other things, yeah.
Logan (39:33):
Exactly, yeah. You can roughly map the mental model: as the model, or the figure, looks down in some environment, it has all this context about the state of the world, but it also knows what pixels precede it, et cetera. You could approximate it as loosely doing next-pixel prediction at the Genie level, which is an interesting way to think about it.
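To make the next-token idea above concrete, here is a toy sketch in Python. The prefixes, words, and probabilities are invented for illustration; nothing here reflects Gemini's or Genie's actual implementation, which, as Logan notes, uses different architectures. It only shows how sampling from a distribution yields different answers across reruns, and how a one-letter change in the prefix branches the trajectory.

    import random

    # Toy next-token distributions. A real model computes these scores
    # with a neural network over a huge vocabulary; these are made up.
    DISTRIBUTIONS = {
        "the cat sat on the": {"mat": 0.55, "sofa": 0.25, "roof": 0.15, "moon": 0.05},
        # a one-letter change in the prefix yields a different distribution entirely
        "the bat sat on the": {"cave": 0.40, "branch": 0.35, "mat": 0.15, "roof": 0.10},
    }

    def sample_next_token(prefix: str) -> str:
        """Sample one continuation from the prefix's probability distribution."""
        dist = DISTRIBUTIONS[prefix]
        return random.choices(list(dist), weights=list(dist.values()), k=1)[0]

    for prefix in DISTRIBUTIONS:
        # Same prefix, five reruns: sampling can give a different answer each
        # time, which is the branching behavior described above.
        print(prefix, "->", [sample_next_token(prefix) for _ in range(5)])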
Josh (40:04):
So, Ejaaz, one of the things you were mentioning was that it's happening much faster, right? And presumably much cheaper, because I heard this crazy stat: hundreds of trillions of tokens per month are being pushed out by Gemini. It's unbelievable. I want to get into the kind of infrastructure that enables this, because Gemini is feeling faster, but it's also feeling better, and it's also getting cheaper. Earlier in the show you mentioned you have a TPU behind you, and I understand TPUs are part of this solution. So walk us through how this is happening. How are we getting these quality improvements across the board, and what type of hardware or software is enabling that to happen?
Logan (40:42):
One, you have to give credit to all of these infrastructure teams across Google that are making this happen. I think about this a lot: what is Google's differentiated advantage? What does our expertise lend us well to do in the ecosystem? What are the things we shouldn't do because of that, and what are the things we should? It's something I think about as somebody who builds products, and one of the things I always come back to is our infrastructure. The thing Google has been able to do time and time again is scale up multiple products to billions of users and have them work with high reliability, et cetera. That's a uniquely difficult problem, and it's an even more difficult one in the age of AI, where the software is not deterministic, the compute footprint required is demanding, and the models are a little bit tricky and finicky to work with sometimes. So again, our infrastructure teams have done an incredible job making that scale up.

Logan (41:42):
I think the stat was: at I/O 2024 we were doing roughly 50 trillion tokens a month. At I/O 2025, I think it was about 480 trillion tokens a month, if I remember correctly. And just a month or two later, and this was in the conversation I had with Demis, we crossed a quadrillion tokens, which comes after a trillion, if you haven't thought about numbers higher than a trillion before. And there's no slowdown in sight.

Logan (42:12):
I think this is just a great reminder that so many of these AI markets and product ecosystems are still so early, and there's this massive expansion ahead. I think about my own life: how much AI do I really have helping me? Not really that much, on the margin; maybe tens of millions of tokens a month, maximum. Then you think about a future where there are billions of tokens being spent on a monthly basis to help you in whatever you're doing in your professional life, your work, your personal life, whatever it is. We're still so early.
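For scale, a quick back-of-the-envelope on the growth figures Logan quotes. The token counts are his approximate, as-stated stats; the ratios are simple arithmetic.

    # Monthly tokens served, using the figures quoted in the episode.
    io_2024  = 50e12    # ~50 trillion tokens/month at I/O 2024
    io_2025  = 480e12   # ~480 trillion tokens/month at I/O 2025
    mid_2025 = 1e15     # ~1 quadrillion (1,000 trillion) tokens/month shortly after

    print(f"I/O 2024 -> I/O 2025: {io_2025 / io_2024:.1f}x")      # ~9.6x in a year
    print(f"I/O 2025 -> quadrillion: {mid_2025 / io_2025:.1f}x")  # ~2.1x in a month or two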
Logan (42:50):
And TPUs are a core part of that, because they allow us to control every layer of the hardware and software delivery, all the way down to the actual silicon the model is running on. We can do a bunch of optimizations and customizations other people can't, because they don't actually control the hardware itself. There are some good examples of what this enables. One of them: we've been at the Pareto frontier from a cost-performance perspective for a very long time. Again, if folks aren't familiar, the Pareto frontier is the trade-off of cost and intelligence; you want to be at the highest intelligence for the lowest cost. We've been sitting on that frontier for basically the entirety of the Gemini life cycle so far, which is really important, so people get a ton of value from the Gemini models.
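A minimal sketch of the Pareto-frontier idea Logan describes: a model sits on the frontier if no other model is at least as cheap and at least as capable. The model names, prices, and scores below are invented for illustration.

    def pareto_frontier(models):
        """Keep models not dominated by any other
        (dominated = some other model is both cheaper and smarter)."""
        frontier = []
        for name, cost, score in models:
            dominated = any(
                c <= cost and s >= score and (c, s) != (cost, score)
                for _, c, s in models
            )
            if not dominated:
                frontier.append(name)
        return frontier

    # Hypothetical ($ per 1M tokens, benchmark score) pairs, for illustration only.
    models = [
        ("model-a", 0.10, 62.0),
        ("model-b", 0.50, 78.0),
        ("model-c", 0.60, 71.0),  # dominated: model-b is cheaper AND smarter
        ("model-d", 2.00, 85.0),
    ]
    print(pareto_frontier(models))  # ['model-a', 'model-b', 'model-d']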
Logan (43:40):
Another example of this is long context. Again, if folks aren't familiar, there's a limit on how many tokens you can pass to a model at a given time. Gemini has had one-million or two-million-token context windows since its initial launch, which has been awesome, and there's a bunch of research showing we could scale that all the way up to 10 million if we wanted to. That is a core infrastructure-enabled thing. There's a lot of really important research to make that work and make that possible, but it's also really difficult on the infrastructure side, and you have to be willing to do that work and pay that price. It's a beautiful outcome for us, because we have the infrastructure teams with the expertise to do it.
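To ground those context-window numbers: a request has to fit within the model's token limit, and anything larger gets chunked or summarized. A tiny sketch, using a crude four-characters-per-token approximation rather than any real tokenizer:

    # Crude context-window check. Real tokenizers vary; 4 characters per
    # token is only a rough approximation for English text.
    CONTEXT_WINDOW = 1_000_000  # tokens, the figure quoted for Gemini

    def rough_token_count(text: str) -> int:
        return max(1, len(text) // 4)

    document = "some long report " * 100_000  # stand-in for a big input
    needed = rough_token_count(document)
    print(needed, "tokens;",
          "fits in one request" if needed <= CONTEXT_WINDOW else "must chunk or summarize")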
Josh (44:21):
Okay, Logan, one quadrillion tokens. That's a big number. We need to talk about this for a little bit, because that is an outrageously, mind-bendingly big number. When I hear you say that number, I'm reminded of Jevons paradox; for people who don't know, that's when increased technological efficiency in using a resource leads to higher total consumption of that resource. Clearly, with these cool new TPUs and this vertically integrated stack you've built, you're able to generate tokens much more cheaply and produce a lot more of them, hence the one quadrillion tokens. Do you see this trend continuing? Is there going to be a continued need to just produce more tokens, or will it eventually be a battle to produce smarter tokens? I guess the question I'm asking is: is the quality of the tokens more important than the quantity? And do you see a limit where the value of sheer token quantity starts to fall off a cliff?
Logan (45:12):
Yeah, I could buy that story. Some of this is actually super top of mind for our teams on the Gemini model side, around this whole idea of thinking efficiency: ideally, you want to get to the best answer using the least thinking possible. Same thing with humans. Take the example of taking a test: ideally you want the shortest number of mental hops to get you to the answer of whatever the question was. You don't want to have to think for an hour to answer one question. There are a bunch of odd parallels between models and humans in that respect.

Logan (45:52):
So I do think thinking efficiency is top of mind; you don't want to use tokens just for the sake of tokens. But even if we were to reduce the number of tokens required by 10x, which would be awesome, a great innovation, models much more token-efficient, I think there's a pretty low ceiling on how far that can go, specifically because of this next-token prediction paradigm, the way the models actually approach solving problems using the token as a unit. So it's not clear to me that you'll be able to, say, 1,000x-reduce the number of tokens required to solve a problem; I think it probably looks much more like 10x or something like that. So there'll be a 10x reduction in the tokens required to solve a problem, and a 10,000x increase in the total amount of AI and token consumption in the world. Even if we made that reduction happen, I think the graph still looks like it's going up and to the right, for the most part.
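Logan's hypothetical numbers make the Jevons-paradox point directly; here is that arithmetic spelled out. The 10x and 10,000x figures are his; the rest is division.

    # Jevons-style arithmetic from the hypothetical above: a 10x efficiency
    # gain per problem, swamped by a 10,000x increase in total AI usage.
    tokens_saved_per_problem = 10   # 10x fewer tokens to solve each problem
    total_usage_growth = 10_000     # 10,000x more AI/token consumption overall

    net_growth = total_usage_growth / tokens_saved_per_problem
    print(f"net token consumption still grows ~{net_growth:.0f}x")  # ~1000x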
Josh (46:55):
It still keeps going. There is no wall. We have virtual data to train models on; we have tons of new tokens coming into play. There's another question I wanted to ask, which is more personal. I find that when a lot of people leave comments on the show and talk about their experience with AI, a lot of them are just using the ChatGPT app, or they have Grok on their phone. And I think Gemini has some underrated features that don't quite get enough attention. So what I'd like you to do is highlight one or two of the features you've shipped recently that you think are criminally underrated. What should people try out that not enough people are using?
Logan (47:32):
I think the one that continues to surprise me the most is Deep Research. Deep Research is the North Star for building an AI product experience. If folks aren't familiar with it: you can show up with a pretty ill-defined question, something very open and vague, and the model will traverse the internet, hundreds or thousands of different web pages, try to accumulate enough context, and then come back to you with what is basically a research report; it could be a 40-page report in some cases that I've seen.

Logan (48:10):
You might hear "40-page report" and say that's not very useful to me, because I'm not going to read 40 pages. And I'd say you and me are exactly the same, because I'm not reading 40 pages either. But there's a beautiful feature, the audio overviews feature, if you've used NotebookLM. The same thing actually exists inside the Gemini app with Deep Research: you can just press that button and get a 10- or 15-minute podcast that goes through and explains all the research that happened. You can listen to that on your commute or on a walk and not need to read 40 pages, which is awesome.
Logan (48:43):
The part that makes it such an interesting experience to me, and I don't know if other people have felt this before, is that most AI products have, back to your point, Josh, that blank-slate problem, that empty-chat-box problem: you, as the user of the product, have to put in so much work to get useful stuff out. I talk to people all the time who say, yeah, I use these models and they're just not useful for me. What's actually happening behind the scenes is that the models are super capable and really useful; it just requires that you give them enough context. There's this new, emerging prompt engineering 2.0, the context engineering problem: how do you get the right information in so that the model can make a decision on behalf of the user?

Logan (49:28):
Deep Research strikes this really nice balance of doing that context engineering for you, bringing all that context into the model's window, answering your original question, and, principally, showing you its proof of work up front. I think about this proof-of-work concept in AI all the time. I have so much more trust in Deep Research because as soon as I kick off a query, boom, it's already at 50 web pages, and I'm like, great, because I was never going to visit 50 web pages. There's pretty much nothing I research that deeply: I could be buying a car, and I'm going to look at fewer than 50 web pages for that; or a house, fewer than 50 web pages. At least, that's maybe personal to me, and other people do more research, I don't know. So automatically I'm in awe of how much more work this thing is doing. Again, this is the North Star from an AI product experience standpoint, and so few products have made that experience work. Every time I go back to Deep Research I'm reminded of this. That team crushed it.
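A hypothetical sketch of the context-engineering loop Logan describes, where the product, rather than the user, gathers context before answering. The helpers (llm, search, fetch_page) are stand-in callables, not real Gemini or Google APIs, and the flow is a guess at the general shape of such an agent, not Deep Research's actual design.

    # Hypothetical shape of a research agent's context-engineering loop.
    # `llm`, `search`, and `fetch_page` are stand-ins, not real APIs.
    def deep_research(question, llm, search, fetch_page, page_budget=50):
        # 1. Turn a vague question into concrete sub-questions.
        sub_questions = llm(f"List sub-questions for: {question}")
        notes = []
        per_question = max(1, page_budget // max(1, len(sub_questions)))
        for sub_q in sub_questions:
            # 2. Gather pages and keep only relevant extracts (the "context
            #    engineering" the user would otherwise do by hand).
            for url in search(sub_q)[:per_question]:
                page = fetch_page(url)
                notes.append((url, llm(f"Extract facts about {sub_q!r}: {page}")))
        # 3. Answer the original question from the accumulated, cited context.
        return llm(f"Write a cited report answering {question!r} from: {notes}")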
Ejaaz (50:35):
And it's not just Deep Research from an LLM context that's so fascinating about Google AI. You've created some of the most fascinating tools to advance science, and I don't think you get enough flowers for what you've built. Some of my favorites: AlphaFold 3 is crazy. It's a model that can predict what certain molecular structures are going to look like, and it could be applied to so many different industries, the most obvious being drug design: creating cheaper, more effective drugs for a variety of different diseases. And then I was thinking about that model you launched where apparently we could translate what dolphins were saying to us, and vice versa.

Ejaaz (51:26):
Stepping back from all of these examples, can you help me understand Google's obsession with AI and science, and why you think it's such an important area to focus on? Are we at a point now where we can advance science to infinity, or where are we right now? Are we at our ChatGPT moment, or do we have more to go?
Logan (51:45):
I'll start with a couple of cheeky answers. Demis, who is the only foundation model lab CEO to have a Nobel Prize in a science domain, chemistry in his case, had this comment, which is actually really true: there are lots of people talking about the impact of AI on science and humanity, and there are very few, if not only one, that being the DeepMind research lab, actually doing the science work. It's a great example of what's in DeepMind's culture and DNA: Demis is a scientist, all of these folks around DeepMind are scientists, and they want to push the science and push what's possible in this future of discovery using our models.

Logan (52:34):
I was in London a couple of weeks ago meeting with Pushmeet, who leads our science team, and hearing about the breadth of the science that's happening. DolphinGemma is a great, kind of funny example, because it's not super applicable in a lot of cases, but it's interesting to think about. As for AlphaFold: if folks haven't watched the movie The Thinking Game, it's about the early days of Google DeepMind, and they talk about folding proteins and why this is such an interesting space.
Logan (53:09):
I'm not a scientist, but to hit the point really quickly on why AlphaFold is so interesting: the historical context is that for humans to fold a single protein would take many people, millions of dollars, and on the order of five years. The original impetus, and why Demis won the Nobel Prize in chemistry for this, was that DeepMind figured out, using reinforcement learning and other techniques, how to fold every protein in the known universe, millions of proteins, then released them publicly and made them available to everyone. That dramatically accelerated the advancement of human medicine and a bunch of other domains and disciplines.

Logan (54:01):
And now, with Isomorphic Labs, which is part of DeepMind, they're actually pursuing some of the breakthroughs they found and doing drug discovery and things like that. So, overnight, you see hundreds of thousands of human-years and hundreds of millions of dollars of research and development costs saved through a single innovation, and I think we're going to continue to see that acceleration of new stuff happening. A recent example is Alpha Evolve, our sort of geospatial model that came out, fusing the Google Earth engine together with AI and this understanding of the world.
Logan (54:44):
There's just so much cool science, and so much is possible when you layer AI capability onto all these disciplines. So, to answer the question: I think we're going to see this acceleration of science progress, and I think DeepMind is going to continue to be at the forefront of it, which is really exciting. The cool thing, even for people who aren't in science, is that all of that innovation and all of those research breakthroughs feed back into the mainline Gemini model. We had a bunch of research work on doing proofs for math, and at face value that might not sound very interesting, but that research feeds back into the mainline Gemini model: it makes it better at reasoning, better able to understand really long and difficult problems, which then benefits every agent use case that exists, because the models are better at reasoning through all these difficult problem domains. So there's this really cool research-to-reality, science-to-practical-impact flywheel that happens at DeepMind.
Ejaaz (55:47):
As a former biologist, this warms my heart. It's amazing to see this get applied at such scale. Okay, we can't talk about Google AI without talking about search. This is your bread and butter, right? However, I've personally noticed a shift in my own habits. I've used a computer for decades now, and I've always used Google Search to find things, Google Chrome, whatever it might be. But I've now started to cheat on this feature: I've started using LLMs directly to do all my searching for me, to get all my sources for me. And you've got to be thinking about this, Logan, right? Is this eating the search business? Is it aiding the search business? Or are we creating a whole different form factor here? What are your thoughts?
Logan (56:33):
There's an interesting form-factor discussion. On one hand, the AI answer market feels distinctly different from the search market, to a certain degree. We've seen lots of AI products reach hundreds of millions of users, and search continues to be a great business with billions of people using it, all that stuff.

Logan (56:53):
There's also this interesting question: what's the obligation of Google in this moment, with this platform shift and all this innovation happening? As somebody who doesn't work on search, but who is a fan of all the work happening inside Google and has empathy for the folks building these products, it is really interesting. My perspective has always been that search, as the front door to the internet, has a stewardship position that means it actually can't disrupt itself, for the right reasons, at the same pace that small players in the market can.
And my assertion has always been that like, actually, this is the best thing (57:33):
undefined
Logan:
for the world, the best thing for the world and for the Internet and for this (57:37):
undefined
Logan:
entire economy that that Google has enabled through the Internet and bringing (57:40):
undefined
Logan:
people to websites and all this stuff doesn't benefit by like, (57:45):
undefined
Logan:
you know, day one of the LLM revolution happening. (57:48):
undefined
Logan:
All of a sudden it's like a fully llm (57:52):
undefined
Logan:
powered search product and like feels and looks completely different (57:55):
undefined
Logan:
not only i think whether that throw you know users who (57:58):
undefined
Logan:
are still trying to figure out like how do i use this technology what is the (58:02):
undefined
Logan:
way that i should be engaging with it um what are the things that it works well (58:05):
undefined
Logan:
for and it doesn't work for not only to throw those people into a bad um perspective (58:09):
undefined
Logan:
from like a user uh from a user journey but i think it also has impacts on like (58:13):
undefined
Logan:
people who rely on Google from a business perspective. (58:18):
undefined
Logan:
So I think you've seen this sort of like gradual transition and like lots of (58:21):
undefined
Logan:
shots on goal and lots of experiments happening on the search side. (58:26):
undefined
Logan:
And I think we're now getting to the place where like they have confidence that (58:30):
undefined
Logan:
they could do this in a way that is going to be super positive for the ecosystem (58:33):
undefined
Logan:
and is going to create lots of value for people who are going and using these products. (58:37):
undefined
Logan:
Like the understanding of AI technology has increased the adoption (58:43):
undefined
Logan:
and the models have gotten better and hallucinations have gone down and all (58:47):
undefined
Logan:
this stuff and I think there'll be also some like uniquely search things that (58:51):
undefined
Logan:
like only search can do and I've spent a bunch of time with folks on the search (58:57):
undefined
Logan:
team like Robbie Stein as an example who leads all the AI stuff in search and, (59:02):
undefined
Logan:
There's all of this infrastructure that search has built, which as you think (59:06):
undefined
Logan:
about this age of AI, where the ability to generate content, (59:11):
undefined
Logan:
which actually like looks somewhat plausible. (59:16):
undefined
Logan:
Has basically gone to zero. Like it's very easy to do that. (59:19):
undefined
Logan:
Great search is actually more, it's like this premium is like more important than ever. (59:23):
undefined
Logan:
There's going to be a million X or a thousand X or whatever than X number of (59:27):
undefined
Logan:
like growth and content on the internet. (59:32):
undefined
Logan (59:34):
How do you actually get people to the most relevant content, from people who have authority, who have expertise, and all of that? It's a really difficult problem. It's the problem of the decade; it's what search has been solving for the last 20 years, and it's now more important than ever. So I've never been more excited for the search team, and I think they've never had a bigger challenge ahead of them as they figure out how to make these internet-scale systems they've built continue to scale up to this next generation of problems, while also becoming this frontier AI product experience where billions of people are experiencing AI for the first time, in a different way than they've done before.

Logan (01:00:15):
There are so many interesting use cases, too. Image search is a great example of this new behavior: one of the fastest-growing ways people use search now is showing up with an image and asking questions about it. The way people traditionally used search has already changed; it's different than it was five years ago, or even two years ago. And I think we're going to continue to see that happen. Search as the product you see today will evolve to have things like multi-line text input fields as user questions change, and all that stuff. So there's so much cool stuff on the horizon for search that I'm really excited about.
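A toy sketch of the ranking trade-off Logan describes: balancing topical relevance against source authority so that plausible-looking generated content doesn't crowd out expert sources. The sites, scores, and weighting below are invented for illustration; this is not how Google Search actually ranks.

    # Toy authority-weighted ranking. Scores and the blend weight are made up.
    def rank(results, authority_weight=0.4):
        return sorted(
            results,
            key=lambda r: (1 - authority_weight) * r["relevance"]
                          + authority_weight * r["authority"],
            reverse=True,
        )

    results = [
        {"url": "expert-blog.example", "relevance": 0.80, "authority": 0.90},
        {"url": "ai-spam.example",     "relevance": 0.95, "authority": 0.05},
        {"url": "forum.example",       "relevance": 0.60, "authority": 0.70},
    ]
    for r in rank(results):
        print(r["url"])  # expert-blog, forum, then the plausible-looking spam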
Josh (01:01:00):
Yeah, as I'm hearing you describe all of these cool new things, particularly how they funnel into a single model, the science breakthroughs are unbelievable. And I think that's what gets me personally really excited, like Ejaaz: this is actually going to help people. This is going to make a difference in people's lives. Right now it's a productive thing, a fun thing, a creative thing, a lot of tools; but then there's also the science part, and a lot of this funnels down to one amazing model. I think it leaves us in a really exciting place to wrap up this conversation.

Josh (01:01:25):
So, Logan, thank you so much for coming and sharing all of this: the news about the new model, and all of the updates and progress you're making everywhere else. I really enjoyed the conversation. You also have a podcast, Around the Prompt. Is there anything you want to leave listeners with, to go check that out, or the new AI Studio, or the new model? Let us know what you have going on.
Logan (01:01:46):
I love seeing feedback about AI Studio. So if you have things that don't work that you wish worked, even for both of you, please send them to me; I would love to make things better. Same for the new model: these are still the early days of what this model is going to be capable of, so if folks have feedback on edge cases or use cases that don't work well, please reach out to our team. Send us examples on X, or Twitter, or the like; we'd love to help make some of those use cases come to life. And I appreciate both of you for all the thoughtful questions and for the conversation. This was a ton of fun. We've got to do it again sometime.
Josh (01:02:23):
Awesome. Yeah, we'd love to. Anytime, please come and join us; we really enjoyed the conversation. Thank you so much for watching. For the people who enjoyed it, please don't forget to like it, share it with your friends, and do all the good things, and we'll be back again for another episode soon. Thank you so much.

Josh (01:02:39):
I have a fun little bonus for those of you still listening all the way to the end, the real fans. When we first went to record with Logan, we actually had no idea that he would break the exclusive news of Nano Banana on our show. It was super cool, so we restructured the episode to put that at the front. We did record a separate intro, where I said: hey, Google makes some really good stuff; in fact, you have an eighty-something percent chance of being the best model in the world by the end of this month. Can you explain why Google is so amazing at what it does? What follows was the answer to that question. So here's a nice little nugget to take you out of the episode. Thanks for listening. I really hope you enjoyed it, and we'll see you in the next one.
Logan (01:03:13):
My general worldview on why Google is in such a good place for AI right now: there are many layers to this, depending on what vantage point you want to look from. On one hand, I think search is an incredible part of the story. People have historically looked at Google Search as this legacy Google product, but search is going through this transition. Actually, just announced today, as we're recording this, AI Mode is rolling out to 180-plus countries, English supported right now and hopefully other languages in the future.

Logan (01:03:50):
It's a great example, with AI Overviews, and AI Overviews sort of double-clicking into AI Mode, of a product that, for billions of people around the world, is the first AI product experience they actually touch. And I think there's something really interesting here: Google has been on this mission of deploying AI. Some naysayers on Twitter will say Google created the transformer and then did nothing with it, and that's actually very far from the truth. Search has been powered by the transformer, the architecture behind language models and Gemini, for the last seven years or so. The product experience maybe looks slightly different today than it did then, but Google has been an AI-first company for as long as I can remember, basically as long as AI has existed. And now we're seeing more and more of these product surfaces become frontier AI products as Google builds the infrastructure to make that the case.
Logan (01:04:55):
I think people also forget that it's not easy logistically to deploy AI to billions of people around the world, and Google has, I think, five or six billion-plus-user products. Even making a small AI product work today is hard, if anyone's played around with stuff or tried vibe coding something; doing it at the billion-user scale is also very difficult. So I continue to get more and more bullish, and part of what allows us to do that billion-user-scale deployment is the whole infrastructure story. If you're watching on video, I don't know if you can see, but I have a couple of TPUs sitting behind me. That TPU advantage, which is our sort of equivalent to GPUs, is something I think is going to continue to play out. So there are so many things that I get excited about, and the future is looking very bright.