Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
All right, let's do something completely different.
Speaker 2 (00:02):
Now, you all know that I am a tech nerd,
and you know that I spent a fair bit of
time yesterday talking about DeepSeek, and even my own
experience, because I signed up for the app yesterday,
apparently before the DDoS attack that kind of sort of took
it down, to get a little experience with it. But
I'm barely a beginner at this stuff, and I wanted
(00:24):
to talk to an expert because, after all, the news
about this one AI engine took a trillion dollars off
the value of just, like, seven tech stocks, so this
is clearly a big deal. Joining us to talk about it,
Amit Rege is a PhD student in computer science
at CU Boulder. He specializes in theoretical machine learning, which
(00:45):
us non-experts might call AI, and his research focuses
on the interpretability of modern generative models, whatever that means,
and maybe he'll explain it.
Speaker 1 (00:57):
Hi, Amit, thanks for being here.
Speaker 3 (01:00):
Yeah, good to be here.
Speaker 2 (01:01):
What does interpretability of modern generative models mean?
Speaker 3 (01:06):
So, let's start with the generative part. Generative
models basically mean that you're producing some kind of generated output.
So an example would be ChatGPT, which produces text. Another
example would be visual generative models, which produce image data
or videos. Interpretability means that you're trying to understand what
(01:26):
led the model to make a decision that you see
on the outside. So these models seem to be black boxes.
So I'm trying to sort of open the black box
and understand what's going on.
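The black-box contrast here can be made concrete with a toy sketch (the words and probabilities below are made up purely for illustration): in this tiny "model" you can read the entire lookup table, whereas a real generative model buries the equivalent of that table in billions of learned weights.

```python
# A toy "generative model": a hand-written table of next-word choices.
# (Hypothetical data for illustration; a real model learns billions of
# parameters instead of a small readable table.)
next_word_probs = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"down": 1.0},
}

def generate(start, steps):
    """Greedily pick the most likely next word at each step."""
    words = [start]
    for _ in range(steps):
        probs = next_word_probs.get(words[-1])
        if probs is None:
            break  # no known continuation for this word
        words.append(max(probs, key=probs.get))
    return " ".join(words)

print(generate("the", 3))  # -> "the cat sat down"
```

Interpretability research, in this analogy, is the attempt to recover something like that table from the weights of a trained model, so each output choice can be traced to a reason.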
Speaker 2 (01:35):
Okay. So there are lots of aspects of DeepSeek to
talk about, but I think I'll start with something that
sounds like it's kind of sort of in the direction
that you're just talking about. I'm not going to open
the app right now and look at it, but I
asked it a few questions yesterday, and every time before
it answered the question, it first explained how it was
(01:55):
interpreting my question, which was very interesting. And I have
not seen that on ChatGPT. One of
my listeners said that his office, his work AI,
does that, but I hadn't seen it before.
Speaker 1 (02:09):
Is that important, and if so, why?
Speaker 3 (02:13):
It is. It is called, technically, chain of thought. It
basically, you know, explains how
the model is getting to a particular decision, and it's
a very modern sort of notion that has come
up in these kinds of models in the last year
or so. Actually, if you use a modern version of
(02:35):
ChatGPT, like if you pay for it, it allows
you to use, you know, more advanced models, and
they do have, you know, something called chain of thought as well,
where the model "thinks," in quotes. And, you know, ChatGPT
unfortunately doesn't show all of the chain of thought. DeepSeek,
being open source, shows you everything that goes along. So, yeah,
it is useful to understand what's going on inside.
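As a concrete illustration of what "showing the chain of thought" looks like in practice, here is a small sketch of how an app might separate the visible reasoning from the final answer (the response string is simplified and hypothetical; DeepSeek-R1 does mark its visible reasoning with <think> tags):

```python
import re

# A simplified, hypothetical model response: the visible chain of
# thought sits inside <think>...</think>, followed by the final answer.
response = (
    "<think>The user asks for 12 * 12. "
    "12 * 10 = 120 and 12 * 2 = 24, so 120 + 24 = 144.</think>"
    "12 * 12 = 144."
)

def split_reasoning(text):
    """Split a response into (chain_of_thought, final_answer)."""
    match = re.match(r"<think>(.*?)</think>(.*)", text, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return "", text.strip()  # no visible reasoning block

reasoning, answer = split_reasoning(response)
print(answer)  # -> "12 * 12 = 144."
```

An open model lets you read the reasoning string in full; a closed service can choose to return only the answer part.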
Speaker 2 (02:58):
How much does that particular kind of functionality impact whether
you think ordinary people and businesses will use AI,
will trust it and feel comfortable with it? And, related
to that, how much do you think that chain
of thought functionality can help people feel that this
(03:20):
isn't like the Terminator out to destroy them?
Speaker 3 (03:24):
Yeah, you know, that's a
great question. A large part of my research actually focuses
on some of those questions. I think the main problem
with, you know, chain of thought right now is that
we don't know if, you know, the model is actually
thinking what it's telling us it's thinking, right? We don't
know how faithful this chain of thought is to its
actual reasoning process, right? So that's something which is
(03:48):
an active research direction. But in general, you know,
we have to be careful about trusting the model's thinking
behavior, because we might look
at it and, you know, draw some sort of conclusion: oh,
you know, this is how the model is thinking. But
it might be thinking in a completely different way, right?
So in general, for the public, I think this is
(04:08):
a good step in the right direction, because, you know,
previously we had nothing. Now we at least have something.
But we have to be careful about, you know, not
overly ascribing human-like behaviors to these technical models.
Speaker 1 (04:21):
Fascinating.
Speaker 2 (04:22):
We're talking with Amit Rege from the University of Colorado,
and just one quick comment and then I'll move on
to another issue. So before you were born, when I
started with computers, I was programming in BASIC on, like,
a TRS-80. I don't know if you ever heard
of that. And an Apple II Plus was the first computer
I owned. And of course, when I wrote a program
(04:45):
in BASIC to do whatever it did, I would always
Speaker 1 (04:48):
Know exactly what it was thinking.
Speaker 2 (04:50):
And if the program did something wrong, I could go
find the line of code that was giving me the
wrong answer, where I accidentally put a minus sign instead
of a plus sign, or an AND instead of an OR.
It's incredible that at this level of computer programming, even
the programmer doesn't necessarily know what the program is doing.
(05:10):
Can you just give me seventeen, or no, thirty-seven
seconds on that?
Speaker 1 (05:15):
Because I like.
Speaker 2 (05:15):
Prime numbers, by the way. That's why I'm giving you
those numbers.
Speaker 3 (05:20):
Yeah, yeah, that's an awesome question. I think the main
sort of technical challenge towards something like this, intuitively speaking
at least, is that these models are so large and
so complex, and we don't know, you know, where this
functionality is arising from. And also, you know, a lot
of this functionality that we see on the outside actually
arises even though we don't explicitly train the model to do
(05:41):
something like this. Because we didn't program the model
to explicitly do, you know, reasoning, quote unquote,
and because it just arises, we don't know where it arises from.
So it's almost like a reverse engineering job for us
to figure out where it arises from. So that's where
the complexity is.
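The contrast the two speakers are drawing can be sketched in miniature (Python standing in for BASIC; the function and its bug are hypothetical): in a classic program, the error lives on a single line you can point to.

```python
def total_with_tax_buggy(price, tax):
    # BUG: a minus sign where a plus belongs -- exactly the kind of
    # one-character error you could hunt down line by line.
    return price - tax

def total_with_tax(price, tax):
    return price + tax  # the one-character fix

print(total_with_tax_buggy(100, 8))  # -> 92, visibly wrong
print(total_with_tax(100, 8))        # -> 108
```

In a large neural network there is no such line: the "reasoning" behavior is spread across billions of weights that no one wrote by hand, which is why understanding it becomes a reverse-engineering job.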
Speaker 2 (05:57):
Yeah, that's just incredible. Okay, so now let's talk about,
I'll ask this question, you can answer it any
way you like. Why did the release of DeepSeek's
AI app and its current functionality take a trillion dollars
off of the value of tech stocks yesterday?
Speaker 3 (06:19):
Yeah, I think, you know, this is a little
outside my expertise, but I'll try to answer it
as best as I can. So I think the main
sort of innovation, apart from the technical achievements, is that
DeepSeek actually open-sourced their model. You know, in
ordinary language, that means, you know, they put the
entire model out on the internet for anyone to use.
(06:40):
And basically, you know, for the top US companies, the
business model for now has been: you know, we create
a large model and we provide access to people on
the outside by them paying for it. Whereas here,
this company has put out its model for everyone to
use for free, and so now people worried that,
(07:01):
you know, if these kinds of, you know, highly performant
models are out there to use for free, who's going
to buy GPUs, who's going to buy, you know,
OpenAI services and things like that? So I think
that that is the overarching fear out there in the market.
Speaker 1 (07:16):
So, yeah, that's why.
Speaker 2 (07:17):
I think you're right. And I mean, ChatGPT,
OpenAI, is not public, so most of
the stuff that was getting crushed yesterday was
hardware, but there was some sell-off in Meta
and Google and, you know, companies that are also perceived
to be getting into the AI space. It
was mostly the hardware that got crushed. So
(07:39):
let's talk about that for a second. If you believe
DeepSeek's story about how they created it.
Speaker 1 (07:47):
And I probably believe it. I'm not sure. I'm always
skeptical of stuff coming out of China, but I probably,
I probably believe it.
Speaker 2 (07:55):
They're saying that they did this with five or six
million dollars and I don't know, maybe give or take
five percent or something of the number of chips that
a lot of people are talking about normally needing to
develop a model like this. So, first, do you believe them?
And if it's true that they did it for six
(08:17):
million dollars and a single-digit percentage of the number
of chips you would normally think about needing
Speaker 1 (08:22):
To buy, what does that mean for.
Speaker 2 (08:25):
The whole industry, not the financial side, what does it
mean for the development of AI?
Speaker 3 (08:30):
Yeah, I think those are great questions. I think I
do believe them when they say something like this, because
most of the research focus, most of the
focus of the top companies, has been on scaling these
models to become bigger and bigger. And no one's really, you know,
tried to constrain themselves within a fixed amount of computing
budget and try to figure out if there are, like, clever
optimizations to do. And what, you know, DeepSeek has
(08:53):
figured out is to do precisely that, you know, figure
out clever optimizations. And I think, for the broader sort
of industry in general, this is a great
step, because it goes a long way towards,
you know, providing everyone access to these highly performant models,
because there was a fear out there saying that, you know,
as these models become larger, ordinary people get left behind.
(09:16):
But now, you know, the smallest versions of the models
that DeepSeek has put out can be run on
your MacBook, which is a tremendous achievement compared to, like,
a couple of years ago.
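The point about small models fitting on a laptop follows from simple back-of-envelope arithmetic (the 7-billion-parameter figure below is an assumed, typical small-model scale, not a number from the interview):

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory needed just to hold a model's weights."""
    return num_params * bits_per_param / 8 / 1e9

# A hypothetical 7-billion-parameter model:
print(weight_memory_gb(7e9, 16))  # 16-bit weights -> 14.0 GB
print(weight_memory_gb(7e9, 4))   # 4-bit quantized -> 3.5 GB, laptop-sized
```

At 4-bit quantization the weights alone fit comfortably in the RAM of an ordinary laptop, which is what makes local use of the smallest open models practical.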
Speaker 1 (09:26):
Yeah, I mean, just a quick comment.
Speaker 2 (09:28):
Yeah, yesterday there's all this panic and all the sell-off
on the hardware side, because maybe people don't
need to buy as many chips. But there isn't, well,
there hasn't been yet. Maybe it'll start today. And that's
probably part of the reason Apple actually did well yesterday
and is doing very well today. I mean, AI is going
to be incredibly useful. AI is a transformative technology at
(09:50):
the level of how transformative
Speaker 1 (09:51):
The Internet itself was.
Speaker 2 (09:53):
And if suddenly AI can be had for cheap or
for free, and not be the, you know, cloistered, elitist
thing that only a few billionaires can handle, and
then they will deign to sell you little bits
of it, that's an amazing thing. Do you want to
offer any quick comment on that?
Speaker 3 (10:10):
Yeah, I think, you know, it's an incredible achievement,
you know, to democratize AI in this way. And I personally,
you know, do not believe that hardware demand is going
to go down. I think, you know, with democratization,
more people will want hardware, and so I think
the sell-off is a little short-sighted, in my opinion.
Speaker 1 (10:28):
That's really interesting.
Speaker 2 (10:30):
So instead of one hundred people, instead of one hundred
companies spending a billion dollars each, there could be ten
thousand companies spending ten million dollars each.
Speaker 3 (10:44):
Exactly. I think more people, you know, will want
to start a company, because it's such a force multiplier. Yeah,
people will want to start their companies. It's fewer
people and more machines, and therefore, you know, more machines.
Speaker 2 (10:56):
Last quick question. I think I saw one of your
web pages online, and it mentioned a few different cities. But
are you originally from Goa or from somewhere else?
Speaker 3 (11:04):
No, my family is from Goa.
Speaker 1 (11:06):
I come from Bombay. Gotcha, all right.
Speaker 2 (11:08):
I've actually, I was in both on my honeymoon.
I love Goa. Goa is just such a wonderful city,
and everybody should check it out. It's a very,
it's a very unusual Indian city, because it's one of the few
places that was a Portuguese colony. So you go to
Goa and there will be a lot of Indian folks
who have Portuguese names, like Joao, and the food's a
(11:32):
little different, and it's on the beach, and it's just gorgeous. Anyway,
I hadn't thought about Goa in a while, so I
read about it in your bio and I thought that
was cool.
Speaker 1 (11:40):
Amit Rege is a PhD
Speaker 2 (11:42):
candidate at the University of Colorado in computer science. It has
been great talking to you. Thank you for doing this.
I hope you'll come back on the show in the future.
Speaker 3 (11:50):
Thank you. Sure.