Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
- Okay.
(00:01):
- Man, this is great.
We are now like fully in color on video.
So we are experimenting with a video podcast this time.
So-- - Going multi-modal.
- That's right.
So if you're listening,
you could be watching us on YouTube.
And also, if you're listening--
- Are we going full YouTube video or shorts?
(00:23):
- Maybe both, I don't know.
So we'll kind of see that.
- Yeah.
- But we're doing a lot of experiments here with the podcast.
So we just got this really cool software called Opus.
- Opus--
So not like the Anthropic Opus,
but a different Opus.
So this is Opus Pro,
which it'll take a full-length video.
(00:46):
And then it will chop it up into different clips
that it thinks are viral.
And then our goal is to try to post those
all over the internet.
So we made accounts at YouTube.
We made an account on TikTok.
And then, well, I tried to make an account on Instagram Reels,
(01:07):
but it seems like we're having some issues.
So it immediately got banned.
So I'm going to try to figure that out.
- Interesting.
- Yeah.
Anyways, don't worry.
We will be there wherever you are.
We're going to try to make it so that you will be able
to find us.
- Yeah, I don't know what Mark has been doing on his Instagram
account, but we'll figure it out.
- And the service is actually amazing.
(01:28):
It not only generates clips, it transcribes the audio
and creates captions with highlighting
like those trendy short videos that you see
that pull snippets from popular YouTubers,
like Joe Rogan or whoever, and create little sound bites.
And the way it selects these clips is actually really cool too.
(01:49):
It picks them based on a couple different criteria
to maximize virality.
And gives you scores for maybe like 10, 15 different clips
that it generates from an hour long video.
So really amazing for creators.
It's very little effort.
It's like one click generates all of these videos
(02:10):
that have the potential to grow viral.
So really exciting.
- Yeah, it's super cool.
So we have this sort of like master plan.
So like both of us are working full time.
Like, you know, we don't necessarily have a lot of time
to be able to post daily clips.
- Yeah, we both just beat rush hour traffic
to get here to try to record a video for you guys.
- We have literally 30 minutes before we have to run
(02:33):
to the next thing.
So like we don't have a lot of time.
So like we really try to like optimize as much as possible.
But this Opus Pro or Opus Clips Pro, whatever it's called,
it has the ability to automatically post every day.
So at the very least, it'll
generate viral clips.
- That's right.
And I think automatically post them too.
- Okay.
(02:53):
- So at least I think it does.
- So like the idea would be, we can maybe do,
you know, a 30 minute or hour long podcast
with lots of little sound bites in it,
and then maybe post every day of the week.
So then you guys will be able to see us every single day.
Whereas like we may not actually be like
actually recording every day.
(03:14):
- Not that we really want to spam people
with a bunch of content that's not relevant.
But I think these short digestible bites are easier
to listen to as opposed to a long half-hour
or hour of audio podcast.
- That's right.
Because if you don't want to hear us like ramble
for 30 minutes, which I mean like, you know,
(03:34):
thank you for listening.
But we try to keep this interesting.
But if all you have is 30 seconds,
we want to meet you where you are.
So anyways, Shashank, what's in the news this week?
It's been kind of a slow week I think.
- It has been a little bit of a slow week.
I think we've been getting a lot of competition
(03:56):
from other companies releasing smaller models
and bigger models and models of all scales and sizes.
So we talked about Llama 3, which was, you know,
state of the art a week ago.
Now we have a new state of the art.
Microsoft released their Phi-3 models,
which are also open source.
I'm not sure about the license, but.
(04:19):
- Yeah, so anyways, we talked about Llama 3 last week.
So, you know, for full context,
here's a really brief recap in case you're just joining:
Llama 3 is Facebook's, or Meta's, new
state of the art model.
They have a bunch of different sizes.
They're working on a super big one.
I think it's like a 400 billion parameter one,
(04:41):
and they're still training it.
- Yeah.
And then they've released a 70 billion parameter one,
and then they released an eight billion dollar--
not a billion dollar, though they might lose their money--
but they released an eight billion parameter model.
And then, not to be left out,
Microsoft released Phi-3.
- Yeah, so for context, I think Llama was interesting
(05:04):
because what they tried differently,
among other things, I'm sure, is they just trained it
for much longer than what people usually do.
So the performance curve kind of flattens
over time, or rather with the amount of training data,
(05:24):
but they found that it doesn't stop getting better.
It just gets better more slowly over time.
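That diminishing-returns curve can be sketched with a toy power law. The constants here are made up for illustration, not Meta's actual loss values; real curves come from fitting actual training runs:

```python
# Toy Chinchilla-style scaling sketch (illustrative constants only):
# loss keeps falling with more training tokens, just more slowly.
def loss(tokens_billions, a=2.0, alpha=0.3, irreducible=1.7):
    # Power-law decay toward an irreducible floor: loss = a * D^(-alpha) + c
    return a * tokens_billions ** -alpha + irreducible

for d in [100, 1_000, 10_000]:
    print(f"{d:>6}B tokens -> loss {loss(d):.3f}")
```

Each 10x more data still lowers the loss, but by a smaller amount each time, which is the "gets better more slowly" behavior described above.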
But they're like, you know, for a model
that's gonna be used a lot, that kind of makes sense
and for them, shit, they serve,
they serve half the world's population,
(05:46):
like three billion users or something across all their apps,
WhatsApp, Instagram, Facebook, Messenger, et cetera.
And they started rolling it out to all of those apps
as far as I know.
I was looking at WhatsApp, and you can ask it to imagine,
slash imagine, an image, and it generates an image,
(06:08):
or responds with text.
And I'm assuming it's powered by their latest
and greatest Llama 3 model.
- Yeah, yeah, I would assume so,
I would say, I mean, like why wouldn't they use it?
And it's crazy Facebook scale.
I mean, like, insane.
- Yeah.
- Yeah.
- Anybody who's on the internet,
or like even has a computer,
(06:28):
probably has some sort of Facebook account.
- And speaking of people, organizations at scale,
Microsoft, another one, I don't understand
their position in this open source world,
but they're trying with Phi-3, and they released
a much smaller model, a three billion parameter model,
(06:52):
compared to all the other models,
which are like seven or eight billion, from Meta's Llama 3,
the Databricks, the Snowflakes,
and a bunch of other competitors.
But this one's a tiny three billion parameter model,
and it's supposedly better than the Llama 3 eight billion.
(07:15):
So it's like, they're doing a bunch of experiments
with the quality of the training data,
the amount of time that they train these models for,
and we're getting all sorts of trade-offs.
So this one specifically, it's very tiny,
very quick, very efficient, and very good
(07:35):
at certain kinds of reasoning,
but what it's not good at is all this general world knowledge
that you get when you just shove a bunch of data
into these large models.
- Yeah, so,
going on a bit of a tangent from that,
where you say it's not necessarily as good.
So I was listening to a podcast,
(07:57):
like yesterday, the other day,
where they were interviewing some researchers
from, I think it was Anthropic.
It was the Dwarkesh Patel podcast.
- Yeah, so basically they said that--
another fantastic podcast.
- Great podcast.
- So yeah, really, really good.
They just interviewed Mark Zuckerberg,
(08:18):
like a podcast or two ago.
- Anyways, amazing, insightful podcast.
- So they kind of said this sort of interesting thing,
at least I thought it was really interesting,
and they said that, in a certain sense,
the bigger the model is, like the more parameters it has,
the more kind of data it will be able to have
(08:40):
about the world, because really, you think about it,
what these models are trying to do
is they are taking all of the world's written knowledge
and trying to compress it down into this
little tiny thing.
So like the three billion parameter model,
I don't know how big it is, maybe a gigabyte or two.
So like super tiny.
So like, you think about it, like a gigabyte or two,
that's like the size of a movie, right?
(09:02):
So it's like, you either have like, you know,
like a Game of Thrones movie or whatever,
that's maybe even more than a gigabyte or two.
And then like something around the same size,
like on your computer,
would have like the entire world's knowledge.
So that's not feasible.
Right, exactly.
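The gigabyte-or-two figure checks out with some back-of-envelope arithmetic (our own numbers, sketched here, not from the episode's sources): a model's size on disk is just parameter count times bytes per parameter, so a 3-billion-parameter model only fits in "a gigabyte or two" once it's quantized.

```python
# Back-of-envelope model sizes at different precisions (illustrative).
def model_size_gb(params_billions, bytes_per_param):
    # total bytes = params * bytes-per-param; reported in decimal GB
    return params_billions * 1e9 * bytes_per_param / 1e9

for label, bpp in [("fp32", 4), ("fp16", 2), ("int8", 1), ("4-bit", 0.5)]:
    print(f"3B params @ {label}: ~{model_size_gb(3, bpp):.1f} GB")
```

So a 3B model is about 6 GB at fp16 and about 1.5 GB at 4-bit quantization, which is roughly the size of a movie download.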
So what they mentioned is,
(09:23):
if you have a model that has like a lot of parameters,
you may have something like, okay,
I'm gonna learn all the different types of birds.
So it's like if you're gonna say, okay, like,
I don't know, a bald eagle or a raven or a sparrow,
whatever, right?
But then the smaller models,
(09:44):
they might not know all the types and all the details
about each type, but they might just know, like, you know,
bird and the meaning of bird,
but they might not know
all the nuance of the different types.
So like, in a certain sense,
like we really need like the bigger and bigger models
to be able to actually like know more information
about the world,
because really like these LLMs are a compression
(10:05):
of the world's knowledge.
- Yeah, they're good at compressing all this information,
but not perfectly.
So it's not like a zip file that compresses losslessly.
It's a very lossy compression.
And it tries to extract
higher level deeper underlying patterns
(10:26):
out of this surface level information.
So it's like a weird way to compress information
or compress the, how do I say it?
Compress knowledge that's contained in text.
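The zip-file contrast can be made concrete with a toy example (our own sketch, not from the episode): zlib gives back the exact original bytes, while an LLM-style "compression" keeps only a gist that no decompressor can invert.

```python
import zlib

# Lossless: compress and get the exact original bytes back.
text = b"the quick brown fox jumps over the lazy dog " * 100
packed = zlib.compress(text)
assert zlib.decompress(packed) == text  # perfect reconstruction
print(f"lossless: {len(text)} -> {len(packed)} bytes")

# Lossy, LLM-style: keep only an abstract "gist" of the input.
# There is no decompress() that recovers the original text from this.
gist = "a pangram about a fox and a dog, repeated many times"
print(f"lossy gist: {len(gist)} chars")
```

The lossless path wins big here only because the input is repetitive; the lossy "gist" is far smaller still, at the cost of throwing away everything but the underlying pattern.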
- Yeah, but in a certain sense,
that's kind of what we do, right?
The LLMs are compressing data in a similar way
(10:48):
to the human brain, right?
Because if you think about it,
like we don't remember everything that we've ever learned.
Like people don't have a photographic memory
except in maybe really rare circumstances.
And like our brain is just like constantly filtering
(11:09):
everything all the time.
Like if we remembered everything,
we would just go insane, right?
Like forgetting things is a benefit.
But I think that in a certain sense,
like these LLMs when they're compressed
and they're smaller,
they're able to in a certain sense simulate the human brain
(11:30):
because like the compression is almost like forgetting.
- So maybe another thing that I was thinking of
that would be required to simulate the human brain
is to focus on specializing for different tasks.
So, the Snowflake model.
(11:51):
- Yeah.
They took this mixture of experts pattern
and just went wild with it.
- Yeah, so for reference, Snowflake
is a company that just created this new open source model.
It's got a lot of parameters.
It's very, very big.
And it has a lot of experts in its mixture.
(12:11):
But each expert is actually not that big.
It's just that they have a lot of experts.
- Yeah.
- So it's a mixture of experts model.
Anyways, yeah, go ahead.
- Yeah, so previously, I think the popular mixture of experts
models had like eight or so, less than 10, experts,
(12:32):
where an orchestrator, or a model router,
would then decide which one of these tiny experts
would be best suited to complete this prompt.
So what Snowflake did was to take that pattern
and build a mixture of experts model with 128 experts.
(12:55):
And that's like a crazy amount.
What some people mention is that then you run into trade-offs
where you have to focus on building a really good router
that decides which expert would be best suited
to complete this section of the prompt.
(13:15):
So I think they published a paper
and described their findings.
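A minimal sketch of that routing idea (a toy top-k router of our own, not Snowflake's actual architecture or code): the router scores every expert for an input, keeps only the top k, and renormalizes their weights, so with 128 experts only a couple of them actually run per token.

```python
# Toy mixture-of-experts router sketch (illustrative only).
import math
import random

NUM_EXPERTS = 128
TOP_K = 2

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(token_scores, top_k=TOP_K):
    """Pick the top_k experts by router score and renormalize their weights."""
    probs = softmax(token_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    total = sum(probs[i] for i in chosen)
    return [(i, probs[i] / total) for i in chosen]

random.seed(0)
scores = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]  # stand-in for a learned router
for expert, weight in route(scores):
    print(f"expert {expert:3d} gets weight {weight:.2f}")
```

With top-2 of 128, only a tiny fraction of the total parameters are active for any one token, which is why these models can be huge in total but cheap per token, and why the quality of the router itself becomes the bottleneck the hosts mention.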
But we're having a lot of different companies
experiment with a lot of different approaches.
And Snowflake's was the best model at the time,
but things change so quickly.
(13:36):
They were surpassed by Llama 3 last week,
which was then surpassed by Microsoft's Phi-3.
And now Apple released another open source model.
I don't think it is the best in any of these verticals.
(13:56):
But what's interesting about theirs is that
it's actually the least Apple-like
Apple product that I have ever seen.
Oh really?
It is the most open source model out there,
with an extremely permissive license
(14:18):
that allows anyone, anywhere, of any scale to use it.
Whereas Meta's Llama 3, most people can use it,
but if you're an organization
beyond a certain size, you're not allowed to use it
unless you talk to them and arrange some kind of deal.
Apple doesn't care.
They're like, anyone who wants to use it,
you're free to do so.
(14:39):
Maybe they're taking a page out of Mark Zuckerberg's
playbook, where they're maybe trying
to become the industry standard for things.
I have no idea.
They're a bit late to the game.
So they're trying something.
Well, I mean, I feel like Apple's always
going to be late to the game for everything,
but they try to do it right.
But I don't know, because like LLMs
are so experimental that it's hard to say
(15:01):
what doing it right even means.
So maybe they just need to kind of get it out
to the world and have people start banging on it.
Yeah, who knows?
Yeah.
Anyways.
So speaking of experimentation,
the Rabbit R1 reviews, those are coming out.
So for those that don't know,
(15:22):
we talked a little about it before.
The Rabbit R1 is a device.
It's about like half the size of a phone.
It's about like, if you could see my hands, about like,
yay big, I'd say like that.
A little bigger than a deck of cards, and it's square.
It's orange.
And what it does is it is a device
(15:45):
that all it has is an LLM.
It's not even a phone, right?
- Yeah.
So I was listening to their CEO on a podcast
with Jason Calacanis,
I think This Week in Startups, or one of his podcasts.
And I just really liked the narrative
(16:06):
that their founder had.
He had worked on a previous generation
of voice smart assistants,
similar to Alexa and Google Assistant.
And he had worked with Teenage Engineering,
which is a hardware company that builds
really iconic, beautiful music and audio engineering equipment.
(16:26):
And they built this device, or they showcased it
at like a Kickstarter or something.
And they just went viral because the design was so iconic.
Very simple, elegant, a little rectangular,
very, very orange device with a little scroll wheel.
(16:48):
And for me, what the killer feature was,
they were focusing on a large action model.
So this was maybe two, three months ago,
a couple months ago, when agents were the hot new thing.
So people were like, okay, LLM's are awesome.
How do we push this technology further?
(17:09):
And everyone was like, okay, agents, let's build
these systems that get multiple flavors of these LLM's
to work with each other to accomplish a harder task.
So rabbit R1 is supposed to be a device
that can learn your actions in your workflows
(17:32):
and repeat them.
So say you're going on Reddit,
maybe trying to scour the latest news in a specific domain,
let's say medicine or something,
and trying to make a list of the best breakthroughs.
(17:52):
You do that once or twice,
train this little device on how you do this workflow,
and then it just magically learns.
- So it just watches you do a thing.
- Exactly.
So like, if I'm scrolling through Facebook
and I just keep scrolling, just keep scrolling down,
it'll just be like, oh, I've learned
that you just keep scrolling.
(18:13):
- Yeah, and the LLM just keeps scrolling.
- Sure.
- At the same speed as you read.
So, as opposed to, like, you know,
without you ever touching the screen, it could do it for you.
- I suppose it could do that,
but the pitch was supposed to be a little more targeted
towards productive use cases.
- I mean.
- Specifically things where you do a lot of repetitive action,
(18:35):
that is a little tedious and extract some value out of that.
So for example, getting the ETA for your DoorDash order,
some food that you ordered, you're like, keep checking,
okay, how much longer is it gonna take?
How much longer?
Instead of having you do that manually,
you just ask this device, okay,
(18:56):
how much longer is it gonna take?
Show it how to do that: open up the DoorDash app,
scroll to your order, look at the ETA,
and then you have the response.
So you do that a couple times, you teach it how to do that.
And then it's learned this action,
and they're trying to build a large action model
that not only learns your specific actions
(19:18):
that you wanna teach it,
but then they wanna build an app store
for all these actions that people wanna teach it,
and then publish onto a store
that you can then share it with other people.
And the best part about this, it was a $200 device,
and that $200 gives you not only this device,
but also a free subscription to Perplexity for a year,
(19:39):
which is also $200, and I was like,
- Oh wow, so it's basically free.
- This is a great deal. And Perplexity is an awesome website,
which not only is a good alternative
to popular search engines,
although search engines today are fantastic,
please use more search engines.
But,
full disclaimer, right, I work at Google,
(20:01):
so it will definitely help me in the long run
if you keep using search engines,
but there are really cool alternatives
to the traditional search engines coming out.
And perplexity not only helps you search,
but they have this like co-pilot feature
that takes a search query and does like
a couple different steps of intermediate research,
(20:22):
and then comes up with the answer.
And the cool part about it is it allows you to access
all the state of the art models from their platform.
So if you wanna use Anthropic's
Claude 3, or Google's Gemini Pro,
or Meta's Llama 3, or whatever,
you can access all that through this one subscription.
(20:43):
- Do you have to pay extra for that?
- One subscription.
- Wait, just like $200?
- Yeah.
- Or $20 a month or something.
- Yeah.
- Whatever the standard rate is,
a similar ballpark.
- Wow.
- That seems like a really good deal.
- It's a great deal.
But there's, I don't think they're the only ones doing that.
There's a couple aggregators that allow you
to pick and choose which model you want.
(21:04):
- And this is not sponsored by anybody.
- Not at all.
- We're not making any money off of this.
- In fact, we're losing money on this.
But it's okay.
- Yeah, so getting back to the Rabbit R1,
this was maybe a week or two after
the Humane review, which was blasted
(21:26):
by all the tech reviewers.
- So the Humane is like a pin, right?
What does it do?
It has like a little camera, it can watch you.
It has like a little laser that can project onto your hand.
- Yeah.
- Just like a wearable LLM.
- Both of these are separate hardware devices
that have some kind of LLM built into it.
(21:49):
The Humane is primarily a voice assistant.
I think it's very similar to like the Alexa or Google Assistant.
It answers back with a voice.
- Well, you know, I can't say the G word here.
I work on Google hardware products
and have a lot of G devices that get triggered
(22:11):
as soon as I say that word.
But yeah, so the Humane AI pin is primarily
a traditional AI assistant with a small projector
that displays something on your palm if you want it to.
And the rabbit R1 doesn't have a projector,
it has a tiny little screen that carries out actions
(22:33):
and also answers regular queries.
- Yeah, I mean, it seems like a pretty cool device.
- In theory, in theory.
- Yeah, and I know that like, you know,
a lot of the reviewers now said that like,
it's not like quite there.
- But they said it's awful.
- The Humane AI pin was mentioned as the worst product
(22:54):
that they've ever reviewed.
And the Rabbit R1 was not that far behind.
- See, but like, here's the thing.
I feel like if the rabbit R1 was released,
let's say even two years ago, people would be like,
this thing is amazing.
They'd be like, are you talking about just the hardware,
(23:15):
the LLM, the action model, or everything?
- But I would say like as a package, right?
But like specifically the LLM
with the hardware. Like if, let's say,
instead of 2024, it was 2022, right?
This little box came out, this magical box
(23:39):
that could answer all your questions.
And people would be like, this is amazing.
Like, absolutely amazing.
- Yeah, I can talk to it, it talks back.
It's like, I mean, there's a lot of these things like,
Siri and whatever has been around for ever.
Right?
All the different devices.
But, you know, like I can ask Siri, like, you know,
(24:02):
what's the weather?
I can ask it, you know, to maybe set like an alarm
or a timer, right?
But like, it's not quite as good
at conversational UX, I would say.
Right?
And I think like the Rabbit R1,
since it's leveraging these large language models,
(24:23):
it's able to do that.
Now, like, I think we live in amazing times
when all the tech reviewers are saying,
oh, this is terrible.
No, this is amazing.
It's a thing that goes and hits the internet
and comes back and is able to answer your question.
Like you could go, like take a picture of a plant
(24:44):
and it can tell you what the plant is.
You can go and talk to it, ask it
all the same things you'd ask,
like, ChatGPT.
The goal posts have moved.
And like, I don't know, it's like that,
you know, old Louis C.K. bit,
where he goes into, like, I think it was the one
where he talks about,
(25:07):
like, you're on an airplane,
and it's like, oh, my phone is so slow,
you know what I'm saying?
Yeah, why can't I scroll TikTok on my airplane?
It's like, give it a minute.
I don't know.
So, yeah, to be fair,
the tech reviewers were taking into account
(25:30):
everything that's going on.
And they did mention that they have mad respect
for these companies,
for attempting something so bold, so unique
in a market that is like so uniform.
Every single cell phone is the same form factor.
It's a rectangular brick that is looking
(25:52):
more and more indistinguishable from
every other manufacturer out there.
And we have these two companies that are pushing the envelope
for hardware, for consumer devices.
And it's really cool.
But it's not a finished product today.
The promise is really cool.
(26:12):
And that's one of the reasons why I got this.
And the price tag is very affordable without a subscription.
So the Rabbit R1 doesn't have a subscription,
which also makes it easier to digest.
Whereas the Humane AI pin not only is like a $600, $700 device,
but also has a monthly $20 subscription or something,
(26:32):
which is hard to swallow.
I don't know.
I don't know how I feel about the pin, though.
Like, where do you wear it?
I mean, like, are you a phone?
Like, I just talk to it?
And, you know, full disclosure, I also have friends
who work at Humane.
So I'm rooting for them.
I hope something good comes out of it.
Because it's not often you see new revolutionary hardware
(26:57):
companies come about.
And like, I hope they do something cool.
They do have some execs from Apple who
build an amazing hardware product.
So all the reviewers did mention that even though the end
to end experience isn't that great.
Like, there's a lot of lag.
They can fix that.
But like--
You can fix it in software.
You can.
(27:17):
Yeah.
A lot of the things, a lot of the complaints
that the reviewers had, latency was a big one.
They can improve that over time.
But the build quality-- oh my god.
This is a beautiful device.
I mean, so we've heard.
Yeah, without playing with it ourselves.
I mean, I trust these reviewers.
Yeah.
They've been consistently honest and forthright
(27:40):
with all the reviews that they've done pointing out
the good and the bad.
So it seems like they've given an unbiased review.
But what really convinced me about the Rabbit R1
was the App Store or Action Model.
And that's like something I can get behind something
I'm looking forward to.
If I can train this device to learn how to do things that
(28:02):
I'm doing day-to-day and offload
that capability to the device, that would be fantastic.
That's super exciting.
That's amazing.
To be able to offload different actions.
I mean, think about all the repetitive tasks
that you do every single day.
And having an easy way to train this device,
just show it how you do something.
(28:23):
And then it magically learns.
As opposed to going into IFTT or Zapier
and manually dragging little plugins and endpoints
and hooking those up and making sure those are stable
across multiple updates of these different apps.
That is very tedious and very brittle too.
(28:46):
And this is something that not only can learn easily,
but can probably also withstand major updates
to these softwares and these applications
that it's running these actions on.
So that if this app, Uber, DoorDash, whatever
is changing their UI, updating the layout,
it can adapt to these little minor changes.
Yeah, I mean, there's
(29:07):
also a use case I can think of here.
Imagine you call a car to get to the office
every day at around the same time.
It could just learn that and then order the Uber
at the same time every day.
Or maybe the best bus route.
Yeah, it can order you the Uber every day.
Only when you're at your house.
(29:28):
That's a good one, maybe check your location.
That'd be cool.
That'd be cool.
So what else?
Like order your lunch every day.
That's repetitive.
Maybe if you order your groceries online,
order the same things.
They're the same food.
Help us automate the podcast publishing process.
That's right.
That's right.
Yeah.
We use AI heavily in this podcast.
(29:51):
Not enough.
And we want to use more.
Yes.
Yes.
Because honestly, I think us using the AI gives us
a use case to play around with it.
Because it's one thing to just look at the benchmarks,
but it's another thing to actually use it and like,
kind of sink your teeth into it.
You really get it then.
I like that.
And we're trying to offload all the tedious parts
(30:12):
that anyone can do.
It's not something that's creative.
Uploading these files, formatting it,
leveling the audio and things like that.
Maybe we can even see if we can make the video quality better.
Because like, don't they do that?
People do color correction stuff too.
You can find AI color correction.
(30:33):
That's the thing.
I don't know.
That's for the next episode.
Yeah.
Maybe.
Yeah.
Probably not this one.
But maybe in the future.
Because we have to run soon.
Time for the meetup.
Yes.
But anyways, thank you for listening.
And please, if you enjoyed this podcast,
or if you didn't, please give us feedback too.
(30:54):
Yeah.
Either one, it drives engagement.
But there are a couple things you can do.
One, if you're not subscribed, please subscribe
wherever you're listening.
On YouTube, it'll automatically download it.
Let's get both those numbers up.
And then if you already did that, please leave a review.
You can leave up to five stars
(31:16):
on most things, right?
So review,
and then subscribe.
And then if you've already done that,
please share this with a friend.
Anybody who you think might find value from this, please,
tell them about us.
Because we're not making any money off of this.
We're just trying to put something good out into the world.
So again, thank you for listening.
(31:37):
And we'll see you next week.