
May 2, 2024 51 mins

Breakthroughs in generative AI have created enormous opportunities for humans to learn from computers. We can use them to explain the news, understand historical concepts, fix our coding errors, and so forth. But of course, AI also has to learn from humans. The technology digests enormous amounts of written text, and often relies on human feedback to calibrate its models. Luis von Ahn has been at the forefront of these back-and-forth interactions for years. He is currently the CEO and co-founder of Duolingo, the language learning app, but prior to that, he was one of the original developers of CAPTCHAs, the little puzzles you solve to log into websites and prove that you're a human. And of course, in the process of proving your humanity, you're also training computers to get better at identifying red lights, crosswalks, bicycles and wavy letters. On this episode, we discuss the history of his work, the future of CAPTCHAs, the success of Duolingo and how he is using today's advanced AI models in aid of language learning.



Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:02):
Bloomberg Audio Studios, podcasts, radio, news.

Speaker 2 (00:17):
Hello and welcome to another episode of The Odd Lots podcast.
I'm Tracy Alloway and I'm Joe Weisenthal. Joe, you know,
I had a life realization recently.

Speaker 3 (00:27):
Okay, this should be good, go on.

Speaker 2 (00:30):
It struck me that I am spending a non-negligible
amount of my time proving that I am in fact
a human being.

Speaker 3 (00:39):
It's getting harder and harder. I know what you're talking about.
So we're talking. You know, you go to a website
and you have to enter the CAPTCHA, and it's
like, click all these squares that have a crosswalk
on them or a truck, and it feels like
it's just getting harder. And sometimes I'm like, no, trust me,
I'm a human.

Speaker 2 (00:55):
This is it. And every time it happens, I kind
of have a moment of self-doubt: is it just me?
Am I particularly bad at picking
out all the motorcycles in a set of pictures? Or
are they just becoming increasingly weird, or perhaps increasingly sophisticated,
in the face of new types of technology?

Speaker 3 (01:17):
It's not just you. I've heard this from multiple people.
In fact, prepping for this episode, I heard people talking
about exactly this. But you know, it's like a big problem.
You know, we did that Worldcoin episode. Everyone
is trying to figure out, in a world
of AI and bots and artificial intelligence and all that stuff,
how do you know whether someone you're interacting with is

(01:38):
in fact a person?

Speaker 2 (01:39):
Yeah, and I'm glad you mentioned AI, because obviously part
of this dynamic is AI seems to be getting better
at solving these particular types of problems, but also they're
being used more, right, to train AI models. So at
this point, I think we all know why we're constantly
trying to identify bikes in a bunch of photos. But

(02:00):
the whole idea behind CAPTCHAs is, or was, that humans
still have an edge. So there are some things that
humans are better able to do versus machines. And one
of the things that we used to talk about humans
having an edge in was linguistics. So there was this
idea that human language was so complex, so nuanced, that

(02:23):
machines would maybe never be able to fully appreciate all
the intricacies and subtleties of human language. But obviously,
since the arrival of generative AI and natural language processing,
I think there's more of a question mark around that. Yeah.

Speaker 3 (02:38):
I mean, look, I think a typical chatbot
right now is probably better than most people at just
typing out several paragraphs. It all sort of seems,
as they say on the internet, kind
of mid-curve to me. It never strikes me
as incredibly intelligent, but clearly computers can talk about
as well as humans, and so it raises all sorts
of interesting questions. You mentioned that CAPTCHAs are

(03:01):
part of this, like training computers. A big part of
these chatbots is the so-called real-life human feedback,
where people say this answer is better than another, this
answer is better than another, as they refine the models, et cetera.

Speaker 4 (03:12):
So I think there's like.

Speaker 3 (03:13):
An interesting moment where we're learning from computers and
computers are learning from us, maybe collaboratively, the two sides,
carbon and silicon, working together.

Speaker 2 (03:25):
I think that's a great way of putting it. Also,
mid-curve is such an underappreciated insult; calling people
top of the bell curve is one of my favorite
things to do online. Anyway, I am very pleased to
say that today we actually have the perfect guest. We're
going to be speaking to someone who was very instrumental

(03:45):
in the development of things like CAPTCHA and someone who
is doing a lot with AI, particularly in the field
of linguistics and language. Right now, we're going to be
speaking with Luis von Ahn. He is, of course, the
CEO and co-founder of Duolingo. So, Luis, thank
you so much for coming on.

Speaker 4 (04:04):
Thank you, thank you for having me.

Speaker 2 (04:06):
So maybe to begin with, talk to us about the
idea behind CAPTCHA and why it seems to have become,
I don't want to say a significant portion of my life,
but I certainly spend a couple of minutes every day doing
at least one version.

Speaker 4 (04:21):
Yeah. So the original CAPTCHA, the idea of a CAPTCHA,
was a test to distinguish humans from computers. There are
reasons why you may want to distinguish whether you're interacting with
a human or a computer online. For example, and this
is kind of the original motivation for it, companies offer
free email services, and you know they have the problem
that if you allow anything to sign up for a

(04:43):
free email service, either a computer or a human, somebody could
write a program to obtain millions of free email accounts,
whereas humans, because they are usually not that patient, cannot
get millions of email accounts for themselves. They can only
get one or two. So the original motivation for CAPTCHA
was to make a test to make sure that whoever
is getting a free email account is actually a human and

(05:04):
not a computer program that was written to obtain millions
of email accounts. And the way it worked, well, there are
many kinds of tests. Originally, the way it
worked is with distorted letters. So you would get a bunch
of letters that were distorted, and you had to type
what they were. And the reason that worked is because
human beings are very good at reading distorted letters. But at

(05:25):
the time, and this was, you know, more than twenty years ago,
computers just could not recognize distorted letters very well. So
that was a great test to determine whether you were
talking to a human or a computer. But what happened
is, over time, computers got quite good at
deciphering distorted text, so it was no longer possible

(05:45):
to give an image with distorted text and distinguish a
human from a computer, because computers pretty much got as
good as a human. At that point, these tests started
changing to other things. I mean, one of the more
popular ones that you see nowadays is kind of clicking
on the images of something. So you can see a grid,
like a four by four grid, and it may say
click on all the traffic lights, or click on all

(06:07):
the bicycles, et cetera. And by clicking on them, you know,
you're showing that you can actually recognize these things.
And the reason they're getting harder is because computers are
getting better and better at deciphering which ones are traffic lights,
et cetera. And by now, what you're getting here are
the things that we still think computers are not very

(06:30):
good at. So the image may be very blurry, or
you know, you may just get a tiny little corner
of it and things like that. So that's why they're
getting harder, and I expect that to continue happening.

Speaker 3 (06:41):
So you founded a company called
reCAPTCHA, which you sold to Google several years ago.
Is there gonna be a point where, I mean, I
assume computer vision and its ability to decode images or
recognize images is not done improving. I assume it's going
to get better, whereas humans' ability to decode images, I

(07:03):
doubt it's really getting any better. We've probably been about
the same for a couple thousand years now. Like, is
there going to be a point at which it's impossible
to create a visual test that humans are better at
than computers?

Speaker 4 (07:15):
I believe that will happen at some point. Yeah, it's
very hard to say when exactly, but you know, you
can just see at this point,
computers are getting better and better. And you know, the
other thing that is important to mention is this type
of test has extra constraints. It also has to be
the case that it's not just that humans can do

(07:36):
it. It's like, really, humans should be able to do it
pretty quickly and, you know, successfully.

Speaker 3 (07:43):
Quickly, and on a mobile phone and a very small
screen in which like my thumb is like half the
size of the screen.

Speaker 4 (07:49):
Yeah, yeah. And it may not be, you know, that quick.
I mean, it may take you, I don't know, thirty
seconds or a minute. But we cannot make a test
that takes you an hour. We can't do that. So
it has to be quick. It has to be done
on a mobile phone. It has to be the case
that the computer is able to grade it; the computer
should be able to know what the right answer was,
even though it can't solve it. So, because of all
of these constraints, I mean, my sense is at some

(08:12):
point this is just going to be impossible. I mean,
we knew this when we started the original CAPTCHA, that
at some point computers were going to get good enough,
but we just had no idea how long it was
going to take. And I still don't know how long
it's going to take. But you know, I would not
be surprised if in five to ten years there's just
not much that you can do that is really quick

(08:33):
online to be able to differentiate humans from computers.

Speaker 2 (08:36):
Yeah, that's when we get the eyeball-scanning orbs. But
I mean, you mentioned that you can't have a test
that takes an hour or something like that. But this
kind of raises the question in my mind of why
are people using these tests at all? So, like, okay,
obviously you want to distinguish between humans and robots, but
I sometimes get the sense that these are basically free

(08:59):
labor AI-training programs, right? So even if you can
verify identity in some other way, why not get people
on a mass scale to spend two minutes training self-driving cars?

Speaker 4 (09:12):
Yeah, I mean, this is what these things are doing.
That was the original idea of reCAPTCHA, which was my company.
The idea was that, at the same time
as you were proving that you are a human, you
could be doing something that computers could not yet do,
and that data could be used to improve computer programs
to do it. So certainly, when you're clicking on bicycles

(09:32):
or when you're clicking on traffic lights or whatever, that
is likely data that is being used. I say likely
because, you know, I don't know what CAPTCHA you're using.
There may be some that are not doing that, but
overall that data is being used to improve things like
self-driving cars, image-recognition programs, et cetera. So that
is happening, and that's you know, generally a good thing

(09:54):
because that's basically making AI smarter and smarter. But you know,
we still need it to be the case that it's a
good security mechanism. So if at some point computers
can just do that, then, you know, it's just not a
great security mechanism and it's not going to be used.
And my sense is, if we're gonna want to do something,
we are going to need something like real identity. Like,

(10:16):
I don't know if it's going to be eyeball scanning
or whatever. But the
nice thing about a CAPTCHA is it doesn't tie you to you;
it just proves that you're a human. Right,
we're probably going to need something that ties you to you.
We're probably going to need something that says, well, I
just know this is this specific person, because, you know, whatever,
we're scanning their eyeball, we're looking at their fingerprint, whatever

(10:39):
it is, and it is actually a real person, and
it is this person.

Speaker 3 (10:43):
Why don't we sort of zoom out and back up
for a second. So currently you are the CEO of
Duolingo, the popular language-learning app, a publicly traded company
that has done much better, stock-wise, than many companies that
came public in twenty twenty one. You know,
there was a boom when people had a bunch of time

(11:03):
on their hands, and then it's gone down. You're also one
of the most respected computer science thinkers coming
out of Carnegie Mellon University. What is the through
line of your work, or how would you characterize what
connects something like CAPTCHAs to language learning and Duolingo?

Speaker 4 (11:20):
It's similar to what you were talking about; I was smiling
when you were mentioning that. I mean, I think the general
through line is a combination of humans learning from computers
and computers learning from humans. And you know, CAPTCHA had
that: while you were typing a CAPTCHA, computers were learning
from what you were doing. In the case of Duolingo,
it's really a symbiotic thing, in that both are learning, in

(11:41):
that humans are learning a language, and in the case
of Duolingo, Duolingo is learning how to teach
humans better by interacting with humans a lot. So you know,
Duolingo just gets better with time, because we figure
out different ways in which humans are just learning better.
You know, humans are getting better with a language, and
Duolingo is getting better at teaching you languages.

Speaker 2 (12:19):
Joe, have you used Duolingo?

Speaker 3 (12:21):
I haven't. Well, okay, I hadn't up until recently. So
last week, as it turns out, I visited my mother,
who lives in Guatemala, which, Luis, I understand you're from.
And oh, wow, yeah. She's not
from there, but she visited a friend there eight years
ago and she loved it, and she's like, I'm just
gonna stay, and she just never left. She

(12:42):
loved it so much. And so I visited her for
the first time at her house near Lake Atitlan, and
then I was like, oh, this is a great life, and
maybe one day I'll even have that house. And I
should learn Spanish. And so I did, partly because of
that trip and partly to prepare for this episode. I
downloaded it and have started. I know a little bit
of Spanish, not much; like I can, you know, ask
for the bill and stuff. But it's like, oh, I should,

(13:04):
I should start to learn it.

Speaker 2 (13:05):
That's funny, because I also started learning Spanish right before
a trip to Guatemala, there you go, with Duolingo. And
I'm not the best advertisement for the app, I'm afraid;
like, the only thing I remember is basically like quisiera
una hapatas personas. That's all I remember from.

Speaker 3 (13:23):
It's pretty good.

Speaker 4 (13:25):
Thanks, that's pretty good.

Speaker 2 (13:26):
All right, I need to get back on it. But
why don't you talk to us a little bit about
the opportunity with AI in this sort of language learning space,
because intuitively, it would seem like things like chat bots
and generative AI and natural language processing and things like
that would be an amazing fit for this type of business.

Speaker 4 (13:49):
Yeah, it's a really good fit. So, okay. So, you know,
we teach languages at Duolingo. Historically, you know,
learning a language just has a lot of different components.
You've got to learn how to read the language,
you've got to learn some vocabulary, you've got to learn
how to listen to it. If there's a different writing system,
you've got to learn the writing system. You've got to
learn how to have a conversation. There are a lot of

(14:09):
different skills that are required in learning a language. Historically,
we have done pretty well in all the skills except
for one of them, which is having a multi-turn,
fluid conversation. So, historically,
we could teach you vocabulary really well.
We could teach you how to listen to a language,
generally just by getting you to

(14:30):
listen a lot to something. So we could teach you
all those things. But being able to practice an actual multi-turn
conversation was not something that we could do with
just a computer. Historically, that needed us to pair you
with another human. Now, at Duolingo, we never
paired people up with other humans, because it turns out
a very small fraction of people actually want to be
paired with a random person over the internet who speaks

(14:53):
a different language. It's just kind of too embarrassing
for most people. I never did that. Well, it may
be dangerous, yes, but it's also just that, like,
ninety percent of people are just not extroverted enough to
do that. They just don't want to do it. So
we always did these kinds

(15:14):
of wonky things to try to emulate short conversations, but
we could never do anything like what we can do
now, because with large language models, we really can get
you to practice, you know, it may not be a
three-hour conversation, but we can get you to practice
a multi-turn, you know, ten-minute conversation, and it's
pretty good. So that's what we're doing with

(15:34):
Duolingo. We're using it to help you learn conversational
skills a lot better, and that's helping out quite a bit.

Speaker 3 (15:41):
There are so many questions I have. And, you know,
I think my mom will really like this episode because,
in addition to the Guatemala connection, she is a linguist.
She speaks like seven languages, including Spanish, and,
well, not all the others, but
many, many others. But you know, something
that I was curious about, and maybe this is a

(16:03):
little bit of a random jumping-off point, you know: I think
about, like, chess computers. Originally they were sort of
trained on a corpus of famous chess games, and then
with some.

Speaker 4 (16:13):
Computers, they got better.

Speaker 3 (16:14):
And then the new generation essentially relearned chess from just
the rules, from first principles, and it turns out that
they're way better. And I'm wondering if, through
the process of building out and improving Duolingo, like,
are there forms of pedagogy in language learning, whether
it's the need for immersion or the need for rote drills,
or certain things that linguists have always thought were necessary

(16:37):
components of good language learning, that, when rebuilding education from
the ground up, like, old dictums just turn out
to be completely wrong? And when you rebuild the process from
the beginning, do novel forms of pedagogy emerge?

Speaker 4 (16:53):
It's a great question, and it's a hard question to
answer, for the following reason. At least for us, we
teach a language from an app. Historically, the way people
learned languages is basically by practicing with another human or
being in a classroom or whatever, whereas we teach from
an app. The setting is just very different, for one

(17:14):
key reason, which is that it is so easy to
leave the app, whereas leaving a classroom is just not
that easy. You kind of have to go; you're usually
forced by your parents to go to a classroom. And so,
generally, the thing about learning something by
yourself, when you're just learning it through a computer, is
that the hardest thing is motivation. It turns out that

(17:37):
the pedagogy is important, of course it is, but much
like exercising, what matters the most is that you're actually
motivated to do it every day. So, like, is the
elliptical better than the step climber or better than the treadmill? Yeah,
there are probably differences, but the reality is, what's most important
is that you do it often. And so

(17:57):
what we have found with Duolingo is that, if
we're going to teach with an app, there are
a lot of things that, historically, you know, language teachers
or linguists didn't think were the best ways to teach languages,
but if you're going to do it with an app,
you have to make it engaging. And we've had to
do it that way, and we have found that we
can do some things significantly better than human teachers, and

(18:20):
some things not as good, because it's a very different system.
But again, the most important thing is just to keep
you motivated. So examples of things that we've had to
do to keep people motivated are quote-unquote classes; a lesson
on Duolingo is not thirty minutes
or forty-five minutes, it's two and a half minutes.
If it's any longer, we start losing people's attention. So

(18:42):
stuff like that, I think, has been really important. Now,
I'll say, related to your question, one thing that has
been amazing is that, you know, we start out with
language experts, people with PhDs in second
language acquisition, who tell us how to best teach something.
But then the computer takes it from there
and optimizes it, and the computer starts finding different ways:
there are different orderings of things that are actually better

(19:05):
than what the people with PhDs in second language acquisition thought.
But that's because they just didn't have the data to
optimize this, whereas now, at Duolingo,
something like one billion
exercises are solved every day by people using Duolingo,
and that is a lot of data that helps
us teach better.

Speaker 2 (19:23):
This is exactly what I wanted to ask you, which
is how iterative is this technology? So how much is
it about the AI model sort of developing off the
data that you feed it, and then the AI model
improving the outcome for users and thereby generating more data
from which it can train.

Speaker 4 (19:43):
We're exactly doing that. And in particular, one
of the things that we've been able to optimize a
lot is which exercise we give to which person. So
when you start a lesson on Duolingo, you
may think that all lessons are the same for everybody.
They're absolutely not. When you use Duolingo, we
watch what you do, and, you know, the computer makes

(20:04):
a model of you as a student. So it sees
everything you get right, everything you get wrong, and, based
on that, it starts realizing you're not very good at
the past tense, or you're not very good at the
future tense, or whatever. And whenever you start a lesson,
it uses that model, specifically for you; it knows
that you're not very good at the past tense, so
it may give you more past tense. It does

(20:24):
stuff like that. And that definitely gets better with more
and more data. And I'll say another thing that is
really important. If we were to give you a lesson
only with the things that you're not good at, that
would be a horrible lesson, because it would be extremely frustrating.
It would basically be: here are the things you're bad at,
and we just do a lot more of that. So,
in addition to that, we have a system, and it

(20:45):
gets better and better over time, that is tuned so that,
for every exercise we have on Duolingo that we
could give you, it knows the probability that you're going
to get that exercise correct. And whenever we are giving
you an exercise, we optimize so that we try to
only give you exercises that you have about an eighty
percent chance of getting right. And that has been quite

(21:06):
good, because it turns out eighty percent is kind of
in this zone of proximal development where, basically, it's not
too easy. Having a one hundred
percent chance of getting it right, if it's too easy,
has two problems: not only is it boring that it's
too easy, but also you're probably not learning anything if
you have a hundred percent chance of getting it right.
And it's also not too hard, because humans get frustrated

(21:28):
if you're getting things right only thirty percent of the time.
So it turns out that we should give you things
that you have an eighty percent chance of getting right,
and that has been really successful, and, you know, we
keep getting better and better at finding the exact
exercise that you have an eighty percent chance of getting right.
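
The selection idea described above can be sketched in a few lines: a per-user model assigns each candidate exercise a predicted probability of success, and the app picks the one closest to the eighty percent target. This is a minimal illustrative sketch, not Duolingo's actual system; the function name, exercise names, and probabilities are all made up.

```python
# Illustrative sketch: pick the exercise whose predicted success
# probability is closest to the 80% "sweet spot" described above.
# All names and numbers are hypothetical.

TARGET = 0.80  # probability the user answers correctly

def pick_exercise(predicted_success):
    """predicted_success: dict mapping exercise id -> P(correct answer).

    Returns the exercise id whose predicted probability is nearest
    the target: not so easy it's boring, not so hard it's frustrating.
    """
    return min(predicted_success,
               key=lambda ex: abs(predicted_success[ex] - TARGET))

predictions = {
    "past_tense_1": 0.55,   # user is weak here: frustrating on its own
    "greetings_3": 0.99,    # too easy: little learning signal
    "food_vocab_2": 0.82,   # close to the sweet spot
}
print(pick_exercise(predictions))  # prints "food_vocab_2"
```

A real system would blend this with review of weak material, as the interview notes, rather than always chasing the target exactly.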

Speaker 3 (21:42):
Okay, I have another, I guess I would say, theory-of-language
question. And I think I read in one
of your interviews, you know, as part of the process
of making the Duolingo app better, you're always A/B
testing things, like should people learn vocabulary first, should
people learn adjectives before adverbs, or adverbs before verbs,
whatever it is, and that there's this constant process of

(22:04):
what is the correct sequence. Do rules about the sequence
of what you learn differ across languages? So, let's say,
someone learning Portuguese may have a different optimal path of
what to learn first, grammatically or vocabulary-wise, versus, say,
someone learning Chinese or Polish. Because I'm curious about whether

(22:24):
we can uncover deep facts about common grammar and language
from the sort of learning sequence that is optimal across languages.

Speaker 4 (22:33):
Yes, they definitely vary a lot based on the language
that you're learning, and, even more so, they also vary
based on your native language. So we actually have a
different course to learn English for Spanish speakers than the
course we have to learn English for Chinese speakers. They
are different courses, and there's a reason for that. It

(22:55):
turns out that what's hard for Spanish speakers in learning
English is different from what's hard for Chinese speakers in
learning English. Typically, you know, the things that are common
between languages are easy, and the things that are very
different between languages are hard. So, just a stupid example:
I mean, when you're learning English from Spanish, there are, you know,
a couple of thousand cognates. Those are words that are the

(23:18):
same or very close to the same, so you immediately
know those; we don't even need to teach you those words
if you're learning English from Spanish, because you already
know them automatically; they are the same word. That's
not quite true from Chinese. Other examples are, you know,
for me in particular, I started learning German, and for me,
German was quite hard to learn because, you know,

(23:40):
my native language is Spanish, and Spanish just does not have
a very developed concept of grammatical cases, whereas German does.
But learning German from, say, Russian, that's just not
a very hard concept to grasp. So it kind of
depends on what concepts your language has. You know, also,

(24:00):
not exactly concepts, but in terms of pronunciation: everybody says
that Spanish pronunciation is really easy, and it's true. Vowels
in Spanish are really easy because there are only about
five vowel sounds. It's a little more than that, but
it's about five vowel sounds, whereas, you know, there are
other languages that have fifteen vowel sounds. So
learning Spanish is easy. But, vice versa, if you're a
native Spanish speaker, learning the languages that have a lot

(24:22):
of vowel sounds is really hard, because you
can't even hear the difference. You know, it's very
funny: when you're learning English as a native Spanish speaker,
you cannot hear the difference between beach and bitch. You
cannot hear that difference, and, you know, people make funny
mistakes because of that.

Speaker 2 (24:37):
But I think there were a lot of T-shirts
that involved that at one point in time.

Speaker 4 (24:43):
Well, because really, if you're a native Spanish speaker, you
just cannot hear that difference.

Speaker 2 (24:48):
So one thing I wanted to ask you is the
type of model that you're actually using. So I believe
you're using GPT four for some things like your premium
subscription do a Lingo Max, but then you've also developed
your own proprietary AI model called bird Brain. And I'm
curious about the decision to both use an off the

(25:10):
shelf solution or platform and to also develop your own
model at the same time. How did you end up
going down that path.

Speaker 4 (25:20):
Yeah, it's a great question. I mean, I think the
difference is these are just very different things. Since,
I don't know, two years ago, large language
models, or generative AI, became very popular. Before that, there
were just different things that AI could be used
for. Us, we were not using AI, for example, for
practicing conversation, but we were using AI to determine which

(25:44):
exercise to give to which person, and for that we built our
own: the Birdbrain model is a model
that tries to figure out which exercise to give to
which person. For the last two years or so,
when people talk about models, they usually mean language models.
It's this specific type of AI model, and what it does

(26:05):
is it predicts the next word given the previous words.
That's what a language model does. The large language models
are particularly good at doing this, and we did not
develop our own large language model. We decided it's a
lot easier to just use something like GPT-4. But
we have our own model for something else that is
not a language model. It is an AI model to predict what exercise to give to

(26:28):
which user, which is a pretty different problem.
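
The definition above, predicting the next word given the previous words, can be illustrated with a toy bigram count model. Real large language models do this with neural networks trained on vast text, but the core interface is the same. Everything in this sketch (the corpus, function names) is made up for illustration.

```python
# Toy "language model": count which word follows which, then
# predict the most frequent follower. A minimal sketch only;
# real LLMs condition on long contexts, not just one word.
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Build counts of next-word occurrences from a list of sentences."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev_word):
    """Return the most frequent word seen after prev_word."""
    return counts[prev_word].most_common(1)[0][0]

corpus = ["i am learning spanish", "i am a human", "i am learning german"]
model = train_bigrams(corpus)
print(predict_next(model, "am"))  # prints "learning" (seen twice, "a" once)
```

Swapping the frequency table for a neural network that scores every word in the vocabulary given all previous words gets you, conceptually, to a large language model.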

Speaker 3 (26:47):
Speaking of AI, all these companies, especially the really big ones, are
making an extraordinary show of almost bragging about how much
money they give to Jensen Huang and Nvidia. It's like,
we're spending twenty billion dollars
over the next two years just to acquire H100
chips, or whatever it is, and it almost seems
like there's an arms race. And then there is also

(27:09):
this view that actually the best models will not necessarily
be the ones with access to the most
compute, but the ones with access to data sets that other models
simply don't have. And I'm curious, sort of, like, you know,
you at Duolingo must have an extraordinary amount of
proprietary data just from all of your user interactions. In

(27:33):
your experience, when you think about who the winners will
be in this space, is it going to be the
ones that just have the most electricity and energy and chips,
or is it going to be whoever has access to
some sort of data that they can fine-tune their
model on that the other models can't?

Speaker 4 (27:49):
It depends on what you're talking about. You know, certainly
we a stoolingo have a lot of you know, data
nobody else has, which is the data on how each
person's learning language. I mean that's not data you can
find on the web or anything like that. That is
just the data that we have that we're generating, and
we're going to train our own models for that. I
don't think there's enough electricity to train a model without

(28:12):
this data to be as good as ours with our data,
but it is for specifically language learning. If you're talking
about training a general model, that is going to be something,
you know, a language model that is general for being
able to have conversations, et cetera. Usually you can get
that from there's pretty good data there out there. You know,

(28:32):
YouTube videos that are free, or a lot of kind
of Reddit conversations or whatever. There's a lot of
data in there. Probably power is going to matter.
So it depends on what you're going to use your
model for. If you're using it
for a very specific purpose and you have very specific
data for that that is proprietary, that's going to be
better for the specific purpose. But my sense is that

(28:56):
you know both are going to matter. You know what
data you have and also how much electricity you spend.
But I also think that over time, hopefully we're going
to get better and better at these algorithms. And if
you think about it, the human brain uses something like
thirty watts, and the human brain is pretty good, and
we don't need, you know... Some of these models, people
are saying, oh, this uses the amount of

(29:18):
electricity that all of New York City uses. We use
that to train a model. You know, our brain uses much, much,
much less electricity than that, and you know, it's pretty good.
So my sense is that also over time, hopefully we'll
be able to get to the point where we're not
as crazy about using electricity as we are today.

Speaker 2 (29:37):
I'm glad our brains are energy efficient. That's nice to know.

Speaker 4 (29:40):
Way better than computers.

Speaker 2 (29:43):
We've been talking a lot about the use of AI
in the product itself, so improving the experience of learning
a language. But one of the things that we hear
a lot about nowadays is also, you know, angst over
the role of AI in the wider economy in terms
of the labor force, job security, and stuff like that,

(30:05):
as companies try to be more efficient. So I guess
I'm wondering, on the sort of corporate side, how much
does AI play into the business model right now in
terms of streamlining things like costs or reducing workforce. And
I believe there were quite a few headlines around Duolingo
on this exact topic late last year.

Speaker 4 (30:26):
Yeah, first of all, those headlines were upsetting to me,
because they were wrong. You know, there were a lot
of headlines saying that we had done a massive layoff,
and that was not actually true. So what is true is that,
you know, we really are leaning into AI. You know,
it just it makes sense. This is a very transformative technology,
so we're leaning into it. And it is also true
that many workflows are a lot more efficient. And so

(30:48):
what happened late last year was that we realized we
have full time employees, but we also have some
hourly contractors. We realized that we needed fewer hourly
contractors, and so for, you know, a small fraction of
our hourly contractors, we did not renew their contracts, because
we realized we needed fewer of them for doing
some tasks that, you know, honestly, computers were just as

(31:10):
good as a human at. And, you know, that
may be true for something like an hourly
contractor force that was
basically being asked to do very rote kinds of language
tasks that computers just got very good at. I think
if you're talking about you know, our full time employees
and people who are who are not necessarily just doing

(31:31):
rote repetitive stuff that's going to take a while to replace.
I don't think, and certainly this is not what we
want to do as a company. You know, I heard
a really good saying recently, which is, your job's not
going to be replaced by AI. It's going to be
replaced by somebody who knows how to use AI. So
what we're seeing in the company, at least for our
full time employees, is not that we're able or even

(31:52):
want to replace them. What we're seeing is just way
more productivity, to the point where people are able to
concentrate on kind of higher level cognitive tasks rather than
rote things. I don't know. One hundred years ago, people
were being hired to add numbers or multiply numbers. The
original quote unquote computers were actually humans who were being

(32:13):
hired to multiply numbers. We were able to mechanize that
and use an actual computer to do that, so that
people didn't have to do that. Instead, they spent time,
you know, planning something at a higher level rather than
having to do the multiplication. We're seeing something similar to
that now. And the other thing that we're seeing is
that is really amazing. So we are saving costs because

(32:35):
a single person can do more, but also we're
able to do things much, much faster, and in particular
in data creation. I mean, one of the ways in
which we teach you how to read is having you read
short stories, and we need to
create a lot of short stories. We used to be
able to create short stories, you know, at a certain pace.
We can now create them like ten times faster. And

(32:58):
what's beautiful about being able to create them ten times
faster is that you can actually make the quality better,
because if you create them once, ten times faster, and
you don't like it, you can start over and do
it again with certain changes, and then, oh, you didn't
like it? Okay, try it again. So you
can try ten times at this, you know, whereas before
you could only try once. And generally you don't have

(33:18):
to try ten times. You have to try a few
more times. So this is able to, at the same
time, lower costs for us, but also make the speed
faster and the quality better. So I mean, we're very
happy with that in terms of the corporate side.

Speaker 3 (33:30):
Could you talk more about benchmarking AI? Because there's all
these tests, right, and you see these websites and they're like,
well, this one got this on the LSATs, and
this one got this on the SATs, and I can
never quite tell, and a lot of it seems inscrutable
to me. From your perspective, like, what are sort of
your basic approaches to benchmarking different models and determining when

(33:51):
it's like, okay, this makes sense as some sort of
task to employ AI on instead of a person doing it?

Speaker 4 (33:59):
Yeah, I have felt the same as you have.
My sense is a lot of these
benchmarks are from marketing teams. You know, what we do
internally is two things. First of all, we just try stuff,
and then we look at it, and we look at
the very specific... You know, it's nice that an AI
can pass the LSAT or whatever, but, you know,
we're not in the business of passing LSATs. We're
in the business of doing whatever it is we're doing,

(34:19):
you know, creating short stories or whatever. So whatever task,
we just try it and then we judge the quality ourselves.
So far, we have found that the quality of the
OpenAI models is a little better than everybody else's,
but not that much better. I mean two years ago
it was way better. It seems like everybody else is
catching up. But so far we have found that that's

(34:40):
just when we do our tests. And again, this is,
you know, just an n of one, one company. I'm
sure that other companies are finding maybe different stuff, but
for us, for our specific use cases, we find time
and again that GPT-4 does better. And I don't know,
of course, everybody's now announcing like there's going to be
GPT-5, et cetera, et cetera. I don't know how
those will be, but that's what we're finding. You know, generally

(35:00):
we just do our own testing.

Speaker 3 (35:02):
Yeah, Tracy, I find that so fascinating, especially, I think
we've talked about this, like it definitely seems like TBD
whether like one model would just prove to be head
and shoulders better than the others, the way that Google
was just head and shoulders above everyone else for twenty
years basically and still is kind of like, it's unclear
to me whether that'll be the.

Speaker 2 (35:19):
Case with AI, right? The idea that we're in
the, I don't know, the Bing era of chat models
and eventually we're all going to migrate to something else. Luis,
One thing I wanted to ask you, and this is
sort of going back to the very beginning of the
conversation and some of the you know, older thoughts around language.
There used to be I don't want to say a consensus,
but there used to be some thinking that language was

(35:42):
very complicated in many ways, and so much of it
was sort of ambiguous or maybe context dependent, that it
would be very hard for AI to sort of wrap
its head around it. And I'm wondering now, with something
like Duolingo, how do your models take into account
that sort of context dependency? And I'm thinking, you know,

(36:04):
I'm thinking specifically about things like Mandarin, where the pronunciation
is kind of tricky and a lot of understanding depends
on the context in which a particular word is said.
So how do you sort of deal with that?

Speaker 4 (36:20):
Yeah, I mean, it's an interesting thing. You know, when
you were asking the question, I thought
of this thing. You know, I've been around AI since
the late nineties, and I remember just it's just this
moving goalpost. I remember. Everybody just kept on saying, look,
if a computer can play chess, surely we all agree
it has human level intelligence. This is kind of what
everybody said. Then it turned out computers could play chess,

(36:42):
and nobody agreed that it had human level intelligence. It's
just like, oh, fine, it can play chess, next thing.
And it would just keep coming up with stuff like,
surely if a computer can you know, play the game
of go, or if a computer could do this, then
you know. And one of the last few things was,
if a computer can, whatever, write poetry so well or
understand text, then surely it's intelligent. And at this point,

(37:06):
models like GPT-4 are really good at doing things,
certainly better than the average human. They may not be
as good as the best poet in the world, but
certainly better than the average human writing poetry, certainly better
than the average human at almost anything with text manipulation. Actually,
if you look at your average human, they're just not
particularly good at writing.

Speaker 3 (37:23):
So many professional writers... oh yeah, yeah.

Speaker 4 (37:26):
Yeah, I mean just these models are excellent. And in fact,
you can write something that is half well written and
you can ask the model to make it better and
it does that. It like makes your text better. So
it's this funny thing that just AI. We keep coming
up with things that like if AI can crack that,
that's it, that's it. You know, I don't know what
the next one will be, but you know, we gep
coming up with stuff like that, you know, in terms

(37:47):
of the language, it just turns out that language can
be mostly captured by these models. It turns out that
if you make a neural network architecture, and, you know,
nobody could have guessed this, but it just turns out
that if you make this neural network architecture that's
called the transformer, and you train it with a gazillion

(38:09):
pieces of text, it just turns out it pretty much
can capture almost any nuance of the language. Again,
nobody could have figured this out, but it just turns
out that this is the case. So at this point,
when, you know, when you ask about, you know, what
we do with context or whatever, it just works.
You know, some of it we do with handwritten
rules, because we write the rules. But generally, if you're
going to use an AI, it just works. And you

(38:31):
can ask me why it works, and I don't know
why it works. I don't think anybody does. But it turns
out that the statistics are kind of strong enough there
that if you train it with a gazillion pieces of text,
it just works.

Speaker 3 (38:43):
I just want to go back to the sort of,
like, you know, where AI is going. And you mentioned
that AI can generate thousands, you know, very
rapidly, numerous short stories, and then a human can say, okay,
these are the good ones, we can improve them. And so
you not only get the efficiency savings, you actually can get
a higher quality for the lessons and so forth.

(39:03):
But you know, sort of like I'm moving up the
abstraction layer, like, will there be a point at some
point in the future in which the entire concept of
learning a language or the entire sequence is almost entirely
something that AI can do from scratch? Again, I'm thinking
sort of back to that chess analogy of not having
to use the entire history of games to learn, but

(39:27):
just knowing the basic rules and then coming up with
something further like, will AI eventually be able to sort
of like design the architecture of what it means to
learn a language?

Speaker 4 (39:37):
I mean, sure, I think at some point AI is
going to be able to do pretty much everything.

Speaker 2 (39:41):
Right.

Speaker 4 (39:41):
It's very hard to know how long this will take.
I mean, it's just very hard. And honestly, for our
own society, I'm hoping that the process is gradual and
not from one day to the next. Because if
tomorrow somebody announces, okay, I have an AI that can
pretty much do everything perfectly, I think this will be
a major societal problem, because we won't know what to do.

(40:04):
But if this process takes twenty thirty years, at least
we'll be able to as a society figure out what
to do with ourselves. But generally, I mean, I think
at some point AI is going to be able to
do everything we can.

Speaker 2 (40:16):
What's the big challenge when it comes to AI at
the moment? I realize we've been talking a lot about opportunities,
but what are some of the issues that you're trying
to surmount at the moment, Whether it's something like getting
enough compute or securing the best engineers, or I guess
being in competition with a number of other companies that

(40:37):
are also using AI, maybe in the same business.

Speaker 4 (40:41):
I mean, certainly, securing good engineers has been a challenge
for anything related to engineering for a while. You know,
you want the best engineers, and there's just not very
many of them, so there's a lot of competition. So
that's certainly true in terms of AI in particular, I
would say that, I don't know, it depends on what
you're trying to achieve. These models are getting better and better.

(41:01):
What they're not yet quite exhibiting is actual kind of
deduction and understanding as good as we would want them
to. I mean, these are just
predicting the next word. Because of the way they work,
you can see them do funky stuff, like they get
adding numbers wrong sometimes, because they're not actually adding numbers.

(41:22):
They're just predicting the next word. And it turns out
you can predict a lot of things, but
it doesn't quite have a concept of addition.
So I think, you know, if what you're looking for
is kind of general intelligence, I think there's some amount
that's going to be required in terms of actually understanding
certain concepts that these models don't yet have. And that's,
you know, my sense is that new ideas are needed

(41:44):
for that. I don't know what they are. If I knew,
I would do them, but new ideas are needed for that.

Speaker 3 (41:48):
Yeah, it's still like mind blowing. Like, you see the
AI produce some sort of amazing output or explanation, and
then it'll, like, get a question wrong, like,
what weighs more, a kilogram of feathers or a kilo of steel,
like something really simple.

Speaker 4 (42:02):
Oh yeah, because it doesn't... Yeah, right, because...

Speaker 3 (42:05):
There's no one, there's no actual intuition. I just have
one last question, and it's sort of: there are not
many sort of like cutting edge tech companies based in Pittsburgh.
I understand like CMU has historically been a bastion of
advanced AI research. I think at one point, like Uber
bought out like the entire robotics department when it was
trying to do self driving cars. But how do you

(42:27):
see that when it comes to this sort of recruiting
of talent and it's already scarce. What are the advantages
and disadvantages of being based in Pittsburgh rather than the
Bay Area or somewhere else.

Speaker 4 (42:38):
Yeah, we've been headquartered in Pittsburgh since the beginning.
We've loved being there. There are good things and bad things.
I mean, certainly a good thing is being close to
Carnegie Mellon. Carnegie Mellon produces, you know, some of the
best engineers in the world, and certainly relating to AI.
Another good thing about being in a city like Pittsburgh,
well, there are two good things. One of them is that
people don't leave jobs that easily. And you know, when

(43:01):
you're in a place like Silicon Valley, you get these
people that leave jobs every eighteen months. Our average employee
stays around for a very long time, and that's actually
a major advantage because you don't have to retrain them.
They really know how to do the job because they've
been doing it for the last seven years. So that
that's been an advantage. And I think another advantage that
we've had is in terms of Silicon Valley, there's usually

(43:22):
one or two companies that are kind of the darlings
of Silicon Valley, and everybody wants to work there, and
the darling company changes every two, three years, and
kind of all the good people go there. The
good news in Pittsburgh is that fad type thing doesn't happen.
So there have been times. We're lucky that right now
our stock is doing very well, so we're kind of
a fad company. But there have been times when we

(43:42):
just weren't, but we still were able to get really
good talent. So I think that's been really good. You know.
On the flip side, of course, there are certain roles
for which it is hard to hire people in Pittsburgh.
Particularly, product managers are hard to hire in Pittsburgh. So
because of that, we have a pretty large office
in New York, and we complement that.

Speaker 2 (44:03):
All right, Luis von Ahn from Duolingo, thank you
so much for coming on Odd Lots. That was great,
thank you. Joe, I enjoyed that conversation. You know

(44:23):
what I was thinking about when Luis was talking about,
it's not that AI is going to take your job,
it's someone who knows how to use AI who is going
to take your job. I was thinking about, just before
we came on this recording, you were telling me that
you used, was it ChatGPT or Claude, to learn
something that I normally do.

Speaker 3 (44:40):
Oh yeah. So for those who don't know, we have
a weekly Odd Lots newsletter, and it usually comes out
every Friday. You should go subscribe. And Tracy usually
sends an email to one of the guests each week
asking what books they recommend, you know, people like reading books.
And then she goes into MS Paint and then, like,
puts the jackets of like the four books together, and

(45:03):
I did it because Tracy was out a couple weeks ago.
And I am not, like, I've never like learned Photoshop
or even MS Paint, so, just like, I'm very dumb.
Like, just the process of putting four images together
was not something I exactly knew how to do. So
I went to Claude and I said, I'm putting together
four book images in an MS paint thing. Please tell
me how to do it and to walk through the steps.

Speaker 1 (45:23):
And I did it.

Speaker 3 (45:24):
Tracy, you were proud.

Speaker 2 (45:24):
Of me, right, I was very proud. I do think
it's somewhat ironic that the pinnacle of AI usage is
teaching someone how to use MS paint, But it's fine,
I'll take it. Yeah, No, there's so much to pull
out of that conversation. One thing I'll say, and maybe
it's a little bit trite, but it does seem like
language learning is sort of ground zero for the application

(45:46):
of a lot of this natural language and chatbot technology.
So it was interesting to come at it from a
sort of pure language or linguistics perspective.

Speaker 3 (45:57):
Yeah, I mean, I like, I feel like we could
have talked to Luis for hours, just on, like, theory
of language itself, which I find endlessly fascinating, and I
really I can only speak one language. I used to
be able to speak French, so I don't know if
I told you, but I did one semester in Geneva, Switzerland,
and I lived with a family that only spoke French,

(46:17):
and I'd never spoken a word of French before I
got there. And after one semester, I came home and
I passed out of four years worth of my college
requirements from that four months living there. And then I
didn't speak French again for twenty years and I lost
it all. But I was gonna go somewhere with that.
I don't really know.

Speaker 2 (46:33):
It's okay. I speak multiple languages poorly.

Speaker 3 (46:36):
But you know the other thing I was thinking about,
you know, so due LINGO has obviously been around for
quite a long time before anyone was talking about generative
AI or anything. And one of the things you hear,
and it sort of used pejoratively, is like some company
will be called like a chet GPT rapper, right, so
basically they're just taking GPT four whatever the latest model is,

(46:56):
and then building some slick interface to do a specific
task on top of it. And what's interesting about dual
Lingo is it feels like it's backwards or going in
the opposite sequence where they already had this extremely popular
app for language learning, and then over time they incorporate
more so rather than being starting off as a rapper

(47:18):
for someone else's technology, they already have the audience, they
already have the thing, and then they find more ways
that the AI can be used to actually like rebuild
the core app.

Speaker 2 (47:29):
Yeah, that's a really good way of putting it. And
also just the iterative nature of all of this technology,
So the idea that, you know, you're sort of training it,
I know, again, it's sort of an obvious point, yeah,
but also I didn't realize how customized a lot of
the Duolingo stuff is at this point. And the
idea that if you speak one language, the way you learn,
say German, is going to be completely different to someone

(47:52):
who grew up speaking another language. And I'm very intrigued
by the amount of data that something like a Duolingo
has at this point, and I guess maybe we should
have asked Luis about this. But also other business opportunities
in terms of, like, licensing that data. Or maybe, I
don't know, I think they were doing a partnership for
a while with BuzzFeed where they were, where the

(48:15):
app was, like, actually translating news articles or something.

Speaker 3 (48:19):
Right, there was going to be something like that, I think.
I recall it didn't really take off, but the idea
was BuzzFeed would get its news articles translated into Spanish
and other languages through the process of Duolingo users
learning. I forget why it didn't take off,
but yeah, absolutely.

Speaker 2 (48:35):
I also find it funny, like, in some senses,
that we're sort of, I guess, the thing that AI
is feeding off of now, right? And, like, all those
minutes, which I'm sure add up to days eventually, of
going through CAPTCHAs, it's all sort of unpaid labor for
training our future AI overlords.

Speaker 3 (48:56):
So he mentioned that he was upset about headlines last
year implying that they had laid off a bunch of
people due to AI. But he did say that there
are people, they were contractors, so they weren't full
time employees, but it sounds like a very crisp example
of AI being able to do a job, even if
they were contractors, that was done by humans. And I'm

(49:17):
generally skeptical of most articles that I read where
a company says, oh, we're getting, like, all these
labor savings and we're gonna do AI, because I sort
of think that is often a smokescreen for just, like,
a business that wants to cut jobs and make it
sound like they're progressive. But this did sound like an
actual example in which there was some form of human
labor that is no longer needed because of AI.

Speaker 2 (49:41):
Yes, AI will come for us all. Shall we leave
it there?

Speaker 4 (49:44):
Let's leave it there.

Speaker 2 (49:45):
This has been another episode of the Odd Lots podcast.
I'm Tracy Alloway. You can follow me at Tracy Alloway.

Speaker 3 (49:51):
And I'm Joe Wisenthal. You can follow me at The Stalwart.
Follow our guest Luis von Ahn. He's at Luis von Ahn.
Follow our producers Carmen Rodriguez at Carman Ermann, Dashiell
Bennett at Dashbot, and Kil Brooks at Kilbrooks. Thank you
to our producer Moses Ondem. For more Odd Lots content, go
to Bloomberg dot com slash odd lots, where we have transcripts,
blog and a newsletter and you can chat about all

(50:13):
of these topics twenty four seven in the Discord.
In fact, this episode came about because someone in the
Discord wanted to hear an interview with Luis von Ahn,
So you can go there, you can talk about AI,
you can suggest future episodes.

Speaker 2 (50:28):
Check it out. And if you enjoy Odd Lots, if
you like it when we speak bad Spanish, I guess,
then please leave us a positive review on your favorite
podcast platform. And remember, if you are a Bloomberg subscriber,
you can listen to all of our episodes absolutely ad free.
All you need to do is connect your Bloomberg subscription

(50:48):
with Apple Podcasts. Thanks for listening
