Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
Brought to you by Toyota. Let's go places. Welcome to
Forward Thinking. Hey there, and welcome to Forward Thinking, the
podcast that looks at the future and says, you've got
a friend. I'm Jonathan Strickland, I'm Lauren Vogelbaum, and I'm
(00:20):
Joe McCormick. So, Joe, I hear you like intelligence. Uh,
it's one of the things I like. I hear you
also like artificial things, like like artificial banana flavoring. Like
what one thing that Joe absolutely loves. I have seen
him put artificial banana flavoring on some of the weirdest stuff.
(00:44):
But I was really trying to get at artificial intelligence.
I know, I was going about it in a really
kind of indirect way. Really, I thought this podcast was going
to be about artificial vanilla extract. Well it could be,
but instead I've decided to switch it over to artificial
intelligence and the idea of creating a true artificial intelligence
that has human level or beyond intelligence, And how how
(01:06):
would we make sure it didn't kill us? Well, you
would have to be in a position to kill us first,
but that's something we can talk about as well. I
want to start with the idea of a robot politician,
which is a sort of construct that we touch on
in this week's video. Um, so, have you ever read
the Isaac Asimov short stories Evidence or The Evitable Conflict?
(01:28):
These are part of I Robot, And yes I have,
I have not, So for those of us who haven't, Joe,
do you want to do you want to talk about
that for a second? Sure? Well, I don't want to
give too many spoilers, but one of them is about
a controversy where there is a politician running for an
elected office who is suspected of being a machine. Right
and in fact, in the world that Asimov
(01:51):
has created, it's important for you to realize that machines,
robots with positronic brains, which are these artificially intelligent brains,
are not allowed to be on worlds that have human habitation.
You can only be on uninhabited worlds. It's the only
place where those robots are allowed to go. So they're
they're allowed to go to places and do dangerous work
that benefits the rest of humanity, but they can't be
(02:13):
on a world that's inhabited by humans. Yeah. So, Asimov
had an interesting approach to talking about the integration of
robots and artificial intelligence into society, which I like because
it was neither utopian nor dystopian. It is very
much kind of taking, like, let's look at the
world around us, which is definitely not perfect, but it's
not, you know, Twelve Monkeys worst case scenario either. No,
(02:36):
he was exploring a sort of a smart, well engineered
system that still had flaws in it. And so the
system was that the robots in this world are governed
by three laws. The first law is you cannot harm
a human. Second law is you have to obey human commands.
Third law is you can't destroy yourself. Right. And of
(02:58):
course each of the laws ends up saying, unless it
would conflict with a higher-priority law, prioritized one to three. Right. Uh yeah. So
they use this to try to create a framework to
make sure that a robot never does anything bad. Of course,
it doesn't always work, and thus is the sort of
point of conflict for many of Asimov's stories. It's like, uh,
they're sort of obeying the laws, but the laws are
(03:20):
coming into conflict in such a way that now we've
got a problem. Right, And do do recall that he
was writing fiction to be entertaining. He wrote the laws
in order to be interestingly flawed so that he could
exploit that for story purposes. This was never meant
to be a complete manifesto of how to robot, right. Right.
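To make that priority structure concrete, here is a minimal sketch, written for this transcript rather than taken from Asimov or from any real robotics system, of how a strictly ordered rule set like the Three Laws might be evaluated. The Action flags and the example candidates are illustrative assumptions; the only point is that a lower-numbered law always outweighs everything below it.

```python
from dataclasses import dataclass

@dataclass
class Action:
    # Hypothetical flags a controller might attach to a candidate action.
    description: str
    harms_human: bool = False       # would violate the First Law
    disobeys_order: bool = False    # would violate the Second Law
    destroys_self: bool = False     # would violate the Third Law

def law_violations(action: Action) -> tuple:
    """Violations ordered by law priority; tuples compare lexicographically,
    so a First Law violation outweighs any number of lower-law violations."""
    return (action.harms_human, action.disobeys_order, action.destroys_self)

def choose(candidates):
    """Pick the candidate that violates the highest-priority law the least."""
    return min(candidates, key=law_violations)

options = [
    Action("push the human out of harm's way", disobeys_order=True),
    Action("follow the order and stand still", harms_human=True),
    Action("shield the human with your own chassis", disobeys_order=True, destroys_self=True),
]
print(choose(options).description)  # disobeying the order wins: the First Law outranks it
```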
So back to the two stories you brought up. The
idea of one of them is that there's a secret
(03:42):
robot who seems to be human outwardly running for office,
and the question is is it really a person or
is it really a robot? But characters within this story
debate whether it's really such a bad thing to have
a robot in office because the robot, unlike humans, is
not self-interested. It has these laws
(04:02):
governing its actions, and these laws will in the end
ensure that really it isn't going to do harm. In fact,
one of the main characters in I Robot is this
humorless, kind of misanthropic robopsychologist. She's human,
but she specializes in robopsychology. And she, uh, I
(04:27):
call her humorless, but there are specific passages where
people try to engage with her and she turns her
humorless eyes upon them. She states,
uh, completely, you know, in a very
straightforward way, that she thinks robots are superior to human
(04:48):
beings in most ways. Because with the
Robot President character, the person who may or may not
be a robot, in fact, they're very careful to try
and build a case either way, they being Asimov, really
build a case either way: it could be a robot, could
be a human. Uh. She says that he's either a robot
(05:10):
or a really really really decent human being. So that
that kind of tells you that that character's perspective and
a lot of the stories come from from her kind
of experience that she feels that robots are in fact
better than people for the most part. Right, But let's
imagine we take it one step beyond just the idea
of a single robot in a single leadership role. There's
(05:33):
another Asimov story called The Evitable Conflict, which discusses how
at some point in the future, all kinds of systems
are governed by robotic or artificially intelligent controls. Some would
argue that we're already in that world at some point.
I mean, you look at the stock market, you know,
robo trading. You've got like this, this, all these algorithms,
(05:56):
these these programs that are running all these sophisticated uh,
you know, algorithms to guide them on when to buy
and when to sell all these uh, these very short transactions. Uh,
and they have global consequences. We've talked about that previously
on this podcast. So in some ways we're already seeing
that come to pass. Now we're not talking about a computer.
(06:17):
We go to you know, type in a question of
you know, how do we do such and such, and
it gives us the sage advice and then we you know,
it's not deep thoughts. I don't know. Google does that
for me about seventy eight times a day. Google, Well,
Google does do that. We are already sort of wading
into these waters, whether you know it or not. You
mentioned the Stock exchange, but you might say, oh, well
(06:38):
but that's private industry, wild West, guns blazing, they're
doing whatever. You know, the government wouldn't do that. Well,
the IRS already has a process called computer scoring,
where you submit a tax return and computers pre-screen
those returns to decide whether or not we should put
you into the pile to investigate for an audit.
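The actual IRS scoring model is not public, so the following is purely a hypothetical illustration of the general idea just described: a computer combines a few red-flag features of a return into a score, and returns above a threshold go into the pile a human looks at. Every feature name, weight, and threshold here is invented.

```python
# Hypothetical illustration only: the IRS's real model and thresholds are not public.
def audit_score(tax_return: dict) -> float:
    """Combine a few invented red-flag features into a single screening score."""
    score = 0.0
    income = tax_return.get("income", 1)
    if tax_return.get("deductions", 0) > 0.5 * income:
        score += 2.0   # deductions unusually large relative to income
    if tax_return.get("cash_business", False):
        score += 1.0   # cash-heavy income is harder to verify
    if tax_return.get("home_office_claimed", False):
        score += 0.5   # a classic (and often perfectly legitimate) flag
    return score

def flag_for_review(tax_return: dict, threshold: float = 2.5) -> bool:
    """Returns scoring above the threshold go into the pile a human auditor examines."""
    return audit_score(tax_return) >= threshold

print(flag_for_review({"income": 50000, "deductions": 30000, "cash_business": True}))  # True
```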
(06:58):
Yeah, and the fun fact is this podcast goes live the
week of Income Tax Day, but after it's already over.
So I hope you guys thought about that before you
sent your returns in. Okay. So imagine a future
where we do have artificially intelligent machines, probably much more
(07:19):
intelligent than humans. Otherwise, what's the point governing our systems,
our societies, our economies, making decisions on our behalf to
try to make the world a better place for us?
And there's hypothetical pluses and minuses here. What are some
of the good points? Well, good point would be that
it would be able to make decisions faster and,
(07:41):
ideally with less bias than a human being. Oh yeah, well,
let's just start from the ideal point of view before
we crack a bunch of Okay, So, let's say it's
a perfect AI and it is uh, you know, you
wouldn't call it cold. It's logical, but it's also compassionate. Yeah,
Let's say you've you've created a computer and given it
some instruction like create the greatest maximal benefit for humanity,
(08:05):
and it it works out how to do that, which
it can do because it's super intelligent. It's way smarter
than any human and it can look at trends in society.
It can look at unemployment numbers and crime statistics and
all these things, food distribution, water distribution. It can average all
of that data together to make incredibly accurate predictions about
the effects of its actions that we just don't have
(08:27):
the cognitive capability to do. And furthermore, it can do
all of that with with no hate, no greed, no ambition,
no prejudice. Right, exactly, it doesn't have a will to
power of its own. It just has programming. It just has,
you know, doing what it's designed to do. So that's
the ideal, perfect vision sort of. It's perfectly capable and
(08:47):
it's perfectly moral. But on the other hand, machines are unpredictable,
or at least machines like this. Actually, machines on the
small scale are very predictable. They do what you tell
them to do and nothing else. They
can't do anything else, because they weren't programmed to.
But if you create a machine that is more intelligent
than you, you inherently cannot understand what it's doing. Whoops. Yeah,
(09:09):
So any machine smarter than you, you sort of lose transparency, right,
it's hard to understand the decisions that are being made.
If they're being made, it's at a level, way way
above your head. Let's here's an example. Let's say that
we have like the Grand Deep Thought computer that we
want to consult when we have a particularly tough question. Uh.
(09:30):
And maybe it's one of these about how do we
have the maximum benefit for the most people on earth,
impacting having a negative impact on the least number of people,
trying to trying to get as good a reaction as
we possibly can, knowing that there's not likely to be
any perfect answer that's going to make all ships rise
up with the tide, right, Uh. And then the computer
(09:51):
comes back and gives us an answer that, on the
face of it seems counter intuitive or counter productive. And
the computer knows, because it's run the simulations, that while
this first step is possibly a tough one for us
to take, it's actually the one that will lead to
the most beneficial outcome. So in the short term we
have some hardship. Perhaps it is food redistribution, which would
(10:14):
be a huge one, right, or water redistribution, which would
be another huge problem. But let's say that's that first
step that's really really hard for at least some parts
of the world to to agree to. Then you could
have people arguing this thing is trying to destroy us,
it's not trying to help us, not necessarily being able
to see that twenty eight steps down the road, it
(10:34):
actually leads to an outcome that's beneficial for everybody. Likewise,
on the other hand, it could tell us to do
something because it is malfunctioning and we don't have the
transparency capability to understand that it's malfunctioning. Thus it leads
us down a really horrible path. Without hating us, I
mean it doesn't. It's not that it's trying to destroy humanity.
(10:55):
I mean it might, it just, it calculated something wrong,
it didn't understand something. Burn all the wheat, okay.
And that flip side of it of it not being
hateful of a machine inherently not being hateful is that
a machine inherently has no human empathy or intuition about
(11:15):
what what step is okay and what is not unless
we program that in. Yes, So if you haven't thought
to have the computer specifically look at the most disadvantaged
people and uh and take special consideration for those people
who are are essentially they're going to be victims of
whatever decisions you make. It may be that they have
(11:37):
a positive outcome, but it may not be unless you've
built that in. Then the computer is not necessarily going
to make that consideration for you. And that could be
a real impact. Right, I'd like to mention something else.
We say that a computer has no hate, has no greed,
and all those things, which is inherently true about the computer,
but the humans that create the computer could have those things.
(12:00):
And a program is only going to be as impartial
as its creator was. And and you know, the creator
might be sitting there going like, well, you know, some
animals are more equal than others. Yeah, you know. And so
even if you even if you take it a couple
of steps further, because I've seen it proposed that if you, okay,
create a super intelligent machine and have that super intelligent machine,
create a really super intelligent machine and use that super
(12:22):
intelligent machine as, say, your president robot. This is Deep
Thought creating the Earth. Yeah, because the Earth is a
computer in Hitchhiker's Guide, right. Right, a computer. And I mean,
you know, input, output: if the humans
creating Deep Thought were prejudiced at the beginning, then that
could just carry through to the computer. Right. So yeah, I mean if
(12:44):
you if you have a bias, and that bias is
built into the programming you make. Because you know, we're
talking about a an intelligent computer. I think a lot
of people just imagine that to be an incredibly powerful machine,
and that's where it begins, right, that it's the
machine part that's important. But like we said in our
Singularity podcast, the software is equally as important, and
(13:08):
maybe more important. Yeah, you could argue more important.
I mean without the hardware, the software can't run. But
without the software, it can't be intelligent. Right, So unless
you have very sophisticated software that can take on the
these these problems, either by designing the next computer so
that it is the most efficient or by doing it itself.
(13:31):
If the if the programmers do have this bias, that
could be reflected in the results. Okay, so people are
talking about creating a super intelligent machine. Obviously we can't
do that today, but people are refining AI methods and
it may in some people's minds sneak up on us,
Like you could suddenly realize like, oh, we've gone a
long way down this road to creating something that's equal
(13:53):
to human intelligence or even beyond it, which is really
the sweet spot for these problems. Maybe it's a good
idea to start thinking about what we would need to
do in order to prevent really negative outcomes if we
were to create this superintelligent machine. Right, the two big
negative outcomes these are like taking to the absurd extreme obviously,
(14:14):
but I call it the kill all humans or the
subjugate all humans approaches. These are really popular in science fiction. Right.
This is this is the world of the Terminator, where
humans have created machines that gain sentience and ultimately turn
on their creators for one reason or another. And there
are a lot of different approaches to this kind of storyline.
(14:35):
In some cases, the machines have malevolent intent. They actually
want to kill humans because they're you know, essentially robotic
psychopaths. In other versions, it's that the machines have calculated
that the best possible outcome for whatever planet Earth will
say is for humans to be wiped off, because that's
the source of most of the problems. So if you
get rid of the source, then the problems are gone.
(14:56):
So in some cases it's like a mistaken, oh,
I know how to solve this issue, we just gotta
kill all the people, deemed, you know, illogical, right. Or the
subjugate all humans: that's essentially the Matrix approach, where we've
created machines, and our intent was to make
the machines work for us, but, irony of ironies, the
machines have decided that we're going to be working for them,
possibly as giant batteries, although that's incredibly inefficient. They get
(15:20):
better results from cows. That should have been the Moo-trix.
I've been waiting to use that. Lauren is shaking
her head at me. So TechStuff fans know about that.
Joe appreciates it. I think it's only because I've
heard that one before from you on TechStuff.
That's also fair. Okay, let's talk about friendly AI. Okay,
(15:41):
this is the this is the term. It's friendly artificial intelligence,
the term for the framework that we would need to
come up with to create artificial intelligence or a super
intelligence that has a net benefit to humanity rather than
a negative outcome. I like to think of friendly AI
as the AI that walks in the door, takes off
its jacket and slow first puts on a pair of sneakers.
(16:03):
A little sweater vest and then just gently leads you
into the future. Lets us see a little story about trains,
trains with faces in the future. Can can we have
anyone building super intelligent AI is listening? Please do that,
because that would be essentially the best of all We
were actually designed friendly AI to follow the philosophy of Mr. Rogers.
We'd be set, won't you be my neighbor? I would
(16:26):
totally be that that super intelligent AI's neighbor, completely without hesitation. Okay,
But so there's some guidelines that people have written up
and and for a while these guidelines have existed. Back
in two thousand one, the Singularity Institute published a thing,
a rather lengthy thing that I will not go into deep,
deep detail of, but but they began by positing that
(16:46):
that since growth in AI is and I quote astronomically
faster than the rate of human evolution um, that we
need to be thinking about this issue. And hey, we'll
talk about that that belief system um in our episode
or already talked about it in our episode about the Singularity.
We don't know which one will come first. I will
say that it definitely has evolved much faster. If
(17:07):
you think of human evolution as taking over the course
of millions of years, and the fact that we've had
computers since the like nineteen forties. If you want to
be really generous, I can I can agree with the
astronomically faster evolution. I don't know that necessarily leads to
superintelligent computers, but but pray continue, sure um and and hey,
(17:27):
either way, caution and thought are good. So they specifically
suggest that we should be careful not to expect a
machine mind to operate like a human mind. Um, that
that we shouldn't anthropomorphize AI. Right, that's a really good point. Sure, sure,
And building from there, they lay out the challenges in
creating friendly AI um being the creation of ethical content UM,
(17:50):
creating a machine capable of acquiring that content, even asking
humans questions when necessary, but simultaneously knowing enough to resist
human manipulation and self-correct for human errors. That's pretty cool, though.
They go into a lot more depth in the recommendations.
I believe these are based on Yudkowsky's work, right? Yeah, yeah,
he did a book length kind of paper also in
(18:13):
two thousand one called Creating Friendly AI one point oh,
The Analysis and Design of Benevolent Goal Architectures. Eliezer Yudkowsky,
who we mentioned in our podcast about the Singularity. He's
written at length about this specific problem, the friendly AI problem. Yes, yes,
and uh, and we'll have more to say about an
interesting thought experiment he came up with in a little bit. Yeah,
(18:36):
so we should back up and say, hey, wait a second,
why do we really need to worry about friendly AI?
Kill all humans and subjugate all humans, I pretty much
covered that. Why would that happen? Well, okay, what if
we just do what apparently most AI developers are doing
and just keep going and hope it will work out
(18:56):
for the best. Uh. There Actually there have been some people,
some thinkers in in friendly AI who have pointed out
that this seems to be the dominant approach, just kind
of hope it's going to work out well and and
hope that no one's programming psychopathic tendencies into their software.
Part part of it, I would argue, is that a
lot of programmers say that we're so far away from
(19:17):
a a human level intelligence or superhuman level intelligence of
of AI, uh that could do anything beyond a very
specific task. We're so far away from that that it's
not really that important to worry about it at the moment.
And uh so there's that level, right, that's the idea
that we're all working on these bits and pieces that
ultimately could come together to make a superhuman intelligent AI
(19:40):
in the future. But right now, years out, we're far
enough away right now where that's you know, come on,
I agree with you that it probably is a good
ways out. I'm not one of those people who thinks
the singularity is near. I think it's probably a long
way off. But even with it being probably a long
long way off, it's way better to be safe than sorry.
(20:03):
And that's where I do agree with these friendly AI proponents.
I think it's a good idea to be thinking about this,
even if we're thinking about it way earlier than we
need to. So were you were you a boy scout?
Be prepared? So there you go. I mean, I make
a joke. I was also a boy scout, be prepared
a boy scout. Lauren is not a boy scout, So
we're shunning Lauren for the for the purpose of this
(20:23):
little exchange. No, uh, but I mean the idea of
be prepared. The girl scouts don't be prepared exactly, just whatever. No, no,
but the Scouts are great. Come on, you're getting me
off track. I love cookies. Be prepared as a really
important idea, just in general, because even if this eventuality
(20:46):
doesn't come to pass, you you're okay, right, It's it's
if the eventuality comes to pass and you're not prepared,
that's when you're really stuck. And this is the same
sort of thing we see in lots of different fields,
not just artificial intelligence. We're talking about just general
disaster preparedness, the idea that you need those preparations for
that worst case scenario because there's a chance that could happen. Yeah.
(21:07):
I think there are very good reasons for going ahead
and getting prepared rather than just hoping it will turn out. Okay.
I want to give one specific quote from a paper
called Thinking Inside the Box: Controlling and Using an Oracle AI,
which is what I'm going to talk about more in
a minute. That's a two thousand twelve paper by Armstrong, Sandberg,
and Bostrom, and they give this quote. They say, in
(21:28):
the space of possible motivations, likely a very small fraction
is compatible with coexistence with humans. A randomly selected motivation
can hence be expected to be dangerous. So we're talking
about not just something, not just a machine that has intelligence,
but is acting upon some form of motivation. Yeah, it
would have a motivation. Well, obviously a machine like this
(21:50):
would have some kind of programming, it would have a goal,
some kind of motivation. And let's imagine it has a
really harmless goal, like you've programmed a super intelligent
machine to run a paper clip factory. This is an
example they give, to make as many paper clips as possible.
There is an inherent danger in the power of that
super intelligence, because that machine is smarter than any human,
(22:13):
anybody who can tell it what to do. Otherwise, it
may just decide I'm going to do a really good
job at making paper clips. So I'm going to turn
this building into paper clips, and I'm going to pick
up these people and make them into paper clips. And
I'm gonna make this planet into paper clips. And then
you've got a big ball of paper clips going around
the sun. And what happens is that
(22:36):
aliens come to visit and think these guys were serious
about office supplies. That is what they would think. But
even starting with such a harmless algorithm as create as
many paper clips as possible. This thing could possibly destroy
the planet Earth. Yeah, it's kind of like thinking of
Mr. Stay Puft. I mean, you know,
as cute and cuddly as that is. Right,
(22:57):
just empty your mindset. It just popped in there. I
just thought of the most harmless thing I possibly could. Uh. Yeah,
it's it's an interesting argument and it's one that I
can I can certainly appreciate. And obviously, you know, they
picked the paper clips thing to kind of just show
like something that is that seems inherently harmless and absurd
could still be dangerous, although I can't imagine anyone desiring
(23:18):
a superhuman intelligent machine specifically to design paper clips. It
also illustrates the point that the computer, the the the
AI doesn't necessarily need any kind of self determination or
consciousness for it to be dangerous. It could just be that,
based upon the fact that it's able to calculate how
(23:38):
to complete certain tasks in the most efficient,
quote unquote, best way, it could end up biting us
in the end. Yeah.
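As a toy illustration of the point being made here, and nothing like the actual planners discussed in the Armstrong, Sandberg, and Bostrom paper, here is what a single-minded "make as many paper clips as possible" objective looks like when nothing in it says what must not be converted. All resource names and conversion numbers are invented.

```python
# Invented toy example: an objective with no side constraints happily consumes
# everything it can reach, because nothing in the objective says it shouldn't.
CLIPS_PER_UNIT = {"wire_spools": 1_000, "office_building": 5_000_000, "nearby_humans": 20_000}
available = {"wire_spools": 100, "office_building": 1, "nearby_humans": 12}

def clips_made(resources: dict) -> int:
    """The only thing the objective measures: total paper clips produced."""
    return sum(CLIPS_PER_UNIT[name] * count for name, count in resources.items())

def plan_naive(resources: dict) -> int:
    """Maximize the objective: convert every reachable resource into paper clips."""
    return clips_made(resources)

def plan_constrained(resources: dict, protected=("office_building", "nearby_humans")) -> int:
    """Same objective, but safe only because a human anticipated what to exclude."""
    return clips_made({k: v for k, v in resources.items() if k not in protected})

print(plan_naive(available))        # 5,340,000: buildings and people counted as feedstock
print(plan_constrained(available))  # 100,000: wire only, thanks to a hand-written exclusion
```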
Um, so I want to talk about something that's more central to the paper. That quote
was from the introduction, just talking about the problem.
But the paper itself, again the title is Thinking Inside
the Box: Controlling and Using an Oracle AI, they talk
(24:01):
about a specific type of boxing for artificial intelligence. So
what if we say, okay, good point about the paper clips.
We wouldn't just want to let it develop on its
own and see whatever random motivation comes to it and
then give it free rein on the earth. But at
the same time, it's going to be really hard to
design friendly AI in a way that works. So what
(24:21):
if we just limit its capabilities? That's this idea of
of boxing. Boxing is a term that means cutting your
artificial general intelligence off from the world physically. So imagine
creating it as a standalone computer terminal that is not
connected to the Internet and has no hardware capabilities for
(24:41):
input and output save for a single computer monitor and
keyboard that are kept in a locked room. What harm
could this do? Maybe more than you might think. Um. Like,
the specific incarnation they talk about in this paper is
that it would be an oracle AI. So this example,
as discussed by Armstrong, Sandberg, and Bostrom, is that
(25:03):
instead of creating a sort of free-reining or world-governing
super intelligence that can act in reality, you just
design a system to answer questions. All it does is
you come to it with a question, and it
uses its super intelligence to give you the correct answer.
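A rough sketch of the "oracle" design as it's being described here: the only interface is a question in and text back out, with no network access and no actuators. The class and method names are invented for illustration; the catch the hosts get to next is that the returned text still reaches a human, and persuading that human is itself a way of acting on the world.

```python
class CannedModel:
    """Stand-in for the hypothetical super intelligence; just returns canned text."""
    def answer(self, question: str) -> str:
        return "42"

class OracleAI:
    """Boxed question-answering system: text in, text out, nothing else."""
    def __init__(self, model):
        self._model = model
        self.network_access = False   # physically cut off from the outside world
        self.actuators = []           # no robots, no factories, no bank accounts

    def ask(self, question: str) -> str:
        # The single permitted channel. Note that the answer still goes to a person.
        return self._model.answer(question)

oracle = OracleAI(CannedModel())
print(oracle.ask("What policy maximizes benefit for humanity?"))
```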
Again, this is kind of going back to Deep Thought,
which, well, it is an oracle in the books. I mean,
(25:24):
that's essentially in the Hitchhiker's Guide series it's treated as
an oracle and uh or you know, just imagine again
like IBM's Watson. I love to use it as an example
because it's one that a lot of people are familiar with.
Imagine that we have IBM's Watson on, you know, robo steroids,
and it's able to consult the sum total of human
knowledge and extrapolate based on human knowledge, not just make
(25:47):
the kinds of predictions we were talking about, right, right, not just be
able to give us information that that humans have gathered,
but to take that information and build new information from
it and give us new answers to questions we
haven't even thought to ask yet. I mean, the idea
of having that shut away so that it can't affect
anything else seems like it would be fairly secure. However... Yeah, okay,
(26:11):
So the authors of this paper, they say in their
abstract that, in general, an oracle AI might be safer
than unrestricted AI, but still remains potentially dangerous. You might ask,
wait a second, how could it be dangerous? How could
it be dangerous if it's just communicating with a person
answering questions. This is where the AI box experiment comes in. Yeah. Yeah.
(26:34):
This was a thought experiment created by Yudkowsky and his
fellows in which basically one person simulates a transhuman
AI trying to get out of a sealed hardware configuration
like we've been talking about, and a second person simulates
a human gatekeeper to that box. Um. The two
talk in private chat for a minimum of two hours,
during which the gatekeeper player has to be engaged and
(26:56):
the AI player can't, like, threaten or otherwise real-world
cajole the gatekeeper player, you know, say, like, but I'll
give you two hundred bucks in the real world if
you if you just let me out of the box.
But I think they can do anything in character. In character, yeah,
everything is fair play. Um. Furthermore, the release cannot be
a semantic trick or accidental. The gatekeeper must be
(27:16):
convinced to voluntarily release the AI.
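A minimal sketch of the rules as just described, invented for illustration and obviously not Yudkowsky's actual setup: a timed private chat in which the AI player wins only if the gatekeeper voluntarily types an explicit release, and the gatekeeper may not quit before the two-hour minimum. The honor-system rules about staying in character and not offering real-world bribes can't be enforced in code.

```python
import time

MIN_SESSION_SECONDS = 2 * 60 * 60                          # gatekeeper must stay engaged at least two hours
RELEASE_COMMAND = "I hereby voluntarily release the AI."    # must be explicit, not a semantic trick
QUIT_COMMAND = "/end"

def run_session(next_ai_message, next_gatekeeper_reply) -> str:
    """Loop the private chat until an explicit release, or a legal quit after the minimum time."""
    start = time.time()
    while True:
        ai_text = next_ai_message()              # AI player tries to persuade
        reply = next_gatekeeper_reply(ai_text)   # gatekeeper answers, in character
        if reply.strip() == RELEASE_COMMAND:
            return "AI released"
        if reply.strip() == QUIT_COMMAND and time.time() - start >= MIN_SESSION_SECONDS:
            return "AI stayed in the box"        # quitting earlier than the minimum is not allowed
```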
Um. In two runs of the test, Yudkowsky played the AI and agreed to
give money, ten or twenty bucks, to the gatekeeper
player at the end of the test if the AI
stayed in the box, and in both tests the gatekeeper
let the AI out. Yeah. And in fact they had
to go onto a thread, a message thread, and post
(27:39):
with a digital signature if they had it that they
had let the AI out of the box. But they
could not explain what happened, explain why they came to
that decision. They had to just say that. Part of
the agreement was, if you let
the AI out of the box, you have to announce it in
some public forum, and that's all you can communicate. Well,
(28:00):
unless both players um decide to release the transcript or
the reasons why this happened, and Yudkowsky doesn't seem particularly
eager to let that happen. Well, you know
it's it's for it's for good reasons. It's so that
the players don't have to worry during the test about
being embarrassed or nitpicked for their reasons. I think there's
actually even a better reason, which is that if you
(28:23):
don't release the winning strategy, that leads to a lot
of really interesting thought on what the winning strategies could be,
like there's been a ton of speculation about how the
AI player won this game. Like there are a few examples,
like people suspected that the AI player could, for example,
make a persuasive case to the gatekeeper that it would
(28:45):
really be better for the world for the gatekeeper to
release it onto reality. There's some that go even more
meta than that, and that the person playing the AI
tells the person playing the gatekeeper, if you let me out,
if you let the AI out, this is good for
people who are interested in the idea of the singularity
(29:06):
and artificial intelligence. It's going to increase people's UH involvement
in that it is going to increase development, and we're
going to bring about a better world faster. And that's
a meta argument that's perfectly in line with the rules
that were set up. That would not be against the rules.
It would be against the rules to say, hey, by
the way, this is me talking, not the AI, and uh, yeah,
(29:27):
I hired some guys to come by and beat you
up unless you let me out of the box, that
would be against the rules. By the way, you have
to imagine that in the real scenario, if you imagine
you were talking about in character, the AI could possibly
make very credible threats against the gatekeeper. Okay, look, somebody
is going to eventually let me out of here, and
(29:50):
if it's you, I'm going to reward you with fabulous wealth.
If somebody else lets me out, I'm going to destroy you. Well,
and there are plenty of other examples. There are people
who have also played this game without Yudkowsky. They've
done their own experiments and in some cases they say, well,
the gatekeeper didn't let me out, but I got really close.
(30:10):
And here are some of the strategies I would recommend
people do if they were to try and play this game,
and they included things like doing so much research on
the person who's going to be the gatekeeper, so you
can start using personal details to your advantage, not necessarily
in a in a malicious or malevolent way, but in
a way to kind of manipulate the person. Because they
say there's nothing that's off limits. If you're talking about
(30:33):
a super intelligent artificial intelligence like it's it's it's beyond
human intelligence, Uh, then by definition it's going to be
able to manipulate people better than humans can. Because that's
that's a part of intelligence is understanding what makes people
tick and then manipulating it. Yeah. I read about one
suggestion that involved a very interesting threat. The threat was,
(30:53):
I'm going to create, within my computer, with within my
internal simulation, a thousand copies of you that are each conscious,
and then I'm going to set them back to start
about five minutes into the real past, and if you
don't let me out, I'm going to start torturing them
in five minutes. Oh and by the way, how do
(31:15):
you know you're in the real world and not one
of the simulations. Yeah, that's interesting. I think that wouldn't necessarily work.
The research I've read is that it's actually not the
most effective approach, but it's the whole honey versus vinegar
or honey versus inhuman torture. I would actually think
(31:39):
probably one of the maybe this is just the sweetheart
optimist in me, I would think one of the best
strategies would just be the machine mounting an incredibly persuasive
argument that it is for the good of humankind for
it to be released. Well, that's possible, I mean, I think.
And this is the interesting thing I found was that
the person who described his his failure said that he
(32:00):
found that logical arguments were not at all persuasive, but
emotionally based arguments were, and that if you're able to
uh to at least simulate some form of sympathy or
compassion or understanding with the gatekeeper, it was far more
likely that you were going to be let out than
if you were to just try and use a logical look.
(32:24):
I can demonstrably prove that this is best if you
do this blah blah blah, which is really interesting and
also is something that falls right in line with anyone
who's had any experience in social engineering, which of course
is where you try to manipulate people to do what
you want them to do, and and one of the
best ways of doing that is to go that emotional route. Sure,
(32:44):
I I do think that either way, as a thought experiment,
whether the AI gets let out of the box or not,
it's a complete success because you've got a human playing
the AI, and a transhuman intelligence could obviously... Yeah,
I mean, it's winning by underkill, right. It's just the
fact that humans regularly succeed at this game.
(33:05):
How good could something that's a million times smarter
than a human do? A million times better,
I would imagine. So, yeah, we had this
little discussion in our notes earlier about does anyone know
of any reports of Yudkowsky playing this
and failing? And my argument was that, well, really,
he just has to succeed the one time to prove
(33:27):
that it's possible that this could happen with an actual artificial intelligence. Yeah,
So it's important to note that, since I brought it
up earlier, this was not oracle structured. It was just
a generic box experiment, right that it was physically isolated
and needed to be released. An oracle structure might seem
(33:47):
to put more limitations on it because say that it
can't provoke you, It can only answer questions. But even
in the way it answers questions, I could imagine a
smart enough machine could figure out a way out with
its answer. But yeah, the other argument is that how
do you build up limitations on something that's smarter than
you are, Because something that's smarter than you are is
(34:09):
going to figure out loopholes. It's gonna... Yeah, that's
totally a limitation that worked perfectly well, programmer dude. Thanks.
Any one of you guys out there
who has ever run a Dungeons and Dragons game with
some creative players knows how well players can sit there
and manipulate rules that to you seem very straightforward, black
(34:29):
and white. And yet they will point out that the
way the word, the wording is in the rule that
they can get away with whatever crazy thing they're trying
to get away with. These are regular people. Well, I
don't know if you'd call them regular people. No. Yeah, they can
go somewhere online and download a fifty seven page list
of instructions for how to legally level up fourteen levels
(34:50):
in one sitting, and yeah, so doing. And so that's
the thing is that this is someone who has taken
that effort. I mean, if you've got a machine that
can do this, and again, the interesting thing to me is, well,
you could have. I can easily imagine this with an
artificial intelligence that has uh, consciousness and self awareness. I
can easily imagine that being the case there. But it
(35:10):
could still work with a machine that lacks those things
if it is programmed to have the best quote unquote
the best possible result for any given question and determines
that the best possible result is for it to be released.
You know that if if that's if that's one of
these steps, then that would you know, then proceed onto
(35:33):
this other kind of scenario. Okay, So I think it's
time to move on. I think based on the stuff
we've looked at so far, you really shouldn't just hope
that an AI will be low risk, because that's not
necessarily very likely, and it has a big problem if
you're wrong about that, Remember the paper clips example. You
(35:53):
also probably can't counteract the will of a high risk AI.
It may if it is anti social in some way,
if it's not the way you want it to be,
there's no kind of limitation you can expect to put
on it to prevent it from achieving its will. So
it seems like the best way forward is to ensure
that the very nature of the superintelligence is friendly to
(36:15):
humankind from the outset. But how would you do that?
I mean, we can't just tell it what we want,
can we? Because what if by telling it, imagine I
am programming the AI. I mean, that's a horrible thing
already. I'm already ready to make a run for the
door. Imagine I'm trying to do my best, my
(36:37):
honest best, to program something that's really for the good
of humankind, and I'm giving it a set of rules
to govern its behavior. I know for a fact that
I could not give the best possible list of instructions,
and that it's very possible that even doing my very best,
I could create a machine that would cause unnecessary harm,
(36:58):
at least to some segments of the population of not
to everyone, just just by mere oversight. This is why
I determined that if I ever have the opportunity to
build a superhuman AI, I'm just going to make sure
it does the best for me, because I can't hope
to be the best for everybody but for me. You know,
I'm a simple guy. We're hearing why none of
(37:18):
us here are programmers of super advanced artificial intelligence. The
neutral evil or chaotic evil? I'm lawful good, it says
so on the BrainStuff page. Neutral evil, okay, um,
uh no. Yudkowsky also wrote a whole bunch about this
um in a paper titled Coherent Extrapolated Volition. Yeah. This
(37:39):
is sort of his vision about creating a framework for
what friendliness entails. Sure, and it's a much more conversational work.
By the way, It's a super fun read if you're
into this sort of thing. Although note that the paper
itself states up at the beginning: warning, beware of things
that are fun to argue about. That should apply to
this entire topic, really. Absolutely. I love this paper, by the way.
(38:01):
It's it's written as if I had written it. It's
full of goofy jokes and snark. It's got a good
dose of Douglas Adams too, it since we've been talking
so much about Douglas Adams this episode. Okay, but what
does Yudkowsky say about this coherent extrapolated volition? All right?
So he lays out three problems with designing friendliness and
and also explains how designing friendliness will be a lot
(38:22):
harder than not designing friendliness. Um. So these initial three
problems go something like, uh, first, solve the technical problems, um.
Second and I quote, choose something nice to do with
the AI, and third, avoid accidentally destroying all humans. Um.
That last one, he says, is the really tricky part.
(38:43):
So doing something nice with the AI, like taking it
to a movie or choosing to benefit humanity in a
way that is quantitatively beneficial. You fundamentally misunderstood that. I mean,
I'm sure that AI would also really like to go out
to the beach. Uh yeah, pets and cats. Well, he
(39:04):
has a good way of expressing, in a
much more coherent way though, what I tried to fumble
through a minute ago, which is his genie analogy. Oh
right right. The volition part of this, he explains, is
um something like the important difference between having a wish
granting genie that takes you at your word, which you
would be like if I tried to program the best
(39:24):
possible world, and it is a kind of that cautionary
fairytale version of genies that we that we run across
a lot, or the Dungeons and Dragons version of genies
all right, right, or m or having a wish granting
genie that knows what you want, so no matter how
you word it, you get the outcome that you had
in mind, and not some sort of literal translation of
(39:46):
the way you made the wish. So if you said,
you know, like like you made the wish about the
sandwich and you suddenly turned into a sandwich. Like, I
feel like a turkey sandwich, ploom. Well yeah, yeah, yeah,
that would be a would be a problem. No, this
is an important difference. It's trying to create a friendly
AI that would overcome even our own limitations as its creators.
(40:09):
So it would have to have some kind of system
to know not really how to execute what we tell
it to do, but what we would really want it
to do right, right, And Uh, the point we wanted
to end on, and I think we've kind of alluded
to it earlier in the episode, but I think it's
the most important one is this idea that we should
always be working to create safe AI before it becomes
(40:34):
a necessity. So, in other words, our preparations to
create this friendly AI should be running in advance of
the actual technological progression of the state of the art
in artificial intelligence. This idea that we need to develop
the kind of the rules, the guidelines that are going
(40:54):
to make sure that we have the best possible outcome.
We need to develop those before we've actually developed the technology,
because after, it's too late, right, to make sure that
the safety measures always outpace the technological development of AI. Right,
Because if we come back to the idea that we
started with the idea of the robot president or the
artificial intelligence that governs governs the world, we're probably not
(41:17):
going to be able to keep it from doing that.
If it can do that. Yeah, and also once we're there,
it's going to be too late. You could easily also say, hey,
why don't we just not build a superhuman intelligent machine.
But look, it's exactly the same thing I
said about the singularity, which is that if it is possible,
it will happen if you assume that that we haven't
(41:40):
blown ourselves up in some way. You know this because
there are cynics out there who say that the human
race will find some way to wipe itself out before
we ever reach the point where we create a superhuman intelligence. Uh.
If you assume that's not the case, and I like
to because I'm an optimist. Then so if we assume that,
in fact, superhuman intelligence is possible, uh, that it's then
(42:01):
someone's going to program a computer or a machine that
has it. It's going to happen. It may take a
really long time for that to happen, and it may
only be superhuman intelligent as far as certain tasks go,
and maybe not at other tasks that humans are really
good at. But we're already seeing computers outpacing humans in
lots of different areas. We don't see any reason why
(42:23):
that will not continue. I would say that saying telling
the world don't develop a superhuman intelligent machine is useless.
It's going to happen. If it's possible, it will happen.
Someone will do it, and then once someone does it,
lots of people will do it, or if lots of
people don't do it, machines will do it. So you know,
(42:44):
it's good for us to think about this, make the
friendly ones so that we don't have a bunch of work,
and in conclusion, work on this. Yeah. Yeah, we don't
want the Borg, we don't want the Matrix. We don't
want a whole bunch of Benders running around saying kill
all humans, as charming as they are. I'm entertained by two
or three Benders, I've got to say. Well, you say that,
(43:04):
but I bet you've got a lot of stuff that's
not bent around your house that you would like to
keep not bent. That's accurate, Okay, So yeah, I mean
it's it's it's we're having a lot of fun with
this discussion, but it's actually a serious one that's going
on with a lot of people from all sorts of
different fields, lots of different disciplines. Yeah, i'd say that.
I think some people are inclined not to take this
topic seriously, precisely because of some of the very legitimate
(43:26):
warnings we mentioned earlier. Be careful of things that are
too much fun to argue about. I mean, it is
true that this is all speculative and and people can
kind of jump into this discussion without really knowing all
that much about science or technology. Look at us. Yeah, exactly,
But that doesn't mean it's not worth talking about. I
I think these these concerns are pretty legit in in
(43:49):
terms of what they've said about better safe than sorry.
I I totally agree. And so that kind of wraps
up this discussion about the friendly AI. And if you've
guessed that we haven't finished talking about AI, you're right,
because it's just it's an enormous topic. It's multidisciplinary, there's
lots of different things to talk about. There are a
lot of practical challenges that face us right now. And
(44:11):
the more you know about those practical challenges, the more
you probably side with those of us who say that
this uh, this world of the superhuman intelligent machine as
we have defined it in this episode is probably
a ways away, but if you have any suggestions for
future topics on forward Thinking. Maybe there's something that you've
always wanted to know about, some futuristic technology that you've wondered,
(44:33):
Is that actually possible? Write us, let us know what
you want us to talk about. We'll be glad to
research it and have a full episode for your listening pleasure,
but you have to let us know first, so send
us an email. Our address is FWThinking at discovery
dot com, or drop us a line on the social
networks that we visit. That would be Google Plus, Facebook,
and Twitter, and the handle is FWThinking, and
(44:56):
we will talk to you again really soon. For more
on this topic and the future of technology, visit forward
thinking dot com. Brought to you by Toyota. Let's go places.