Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:02):
Bloomberg Audio Studios, Podcasts, Radio, News.
Speaker 2 (00:18):
Hello and welcome to another episode of The Odd Lots Podcast.
Speaker 3 (00:21):
I'm Joe Weisenthal and I'm Tracy Alloway.
Speaker 2 (00:24):
You know what I find kind of weird, Tracy.
Speaker 3 (00:25):
The list could be long.
Speaker 2 (00:27):
The year is twenty twenty five, yes, and philosophers still
don't have a good answer on the origin of consciousness.
It's like, come on, what have you been doing all
this time? It's like, how long are we going to
keep funding these philosophy departments, et cetera, if they're still
working on what, to my mind, are questions they should
have solved by now? Seriously, like, get the
(00:49):
answer already. Where does consciousness come from? Then let's move on.
Like I said, they're still arguing what to my mind
seem like very basic questions in philosophy. They're still
asking all the same stuff that they've been talking about forever.
How to be a good person, what does it mean
to have a moral way of life, where does consciousness
come from, why do we have moral intuitions, et cetera.
(01:09):
It's like, move on. Like, get the answer.
Speaker 3 (01:11):
Wait, do you want them to move on or to get
the answer?
Speaker 2 (01:14):
Get the answer, so that they can move on. Like,
they should have moved on to...
Speaker 3 (01:17):
Move on to what? Those aren't easy questions, Joe. I know, I know, move on.
Speaker 2 (01:20):
Like, answer the questions already. It's like, you know, if
scientists were still debating the speed of gravity
or the speed of light... They answered those questions
and they moved on and did other work.
Speaker 3 (01:30):
Figure out the foundational elements of what it means to
be human so that we can move on to more
important things.
Speaker 2 (01:35):
Yes, or wrap it up as a field if, after
two thousand years of the existence of philosophy, they're still
working on these things. Like, come on.
Speaker 3 (01:43):
I have a sneaking suspicion that we're going to be asking some
of these questions for a very long time, Joe, despite
your frustration.
Speaker 2 (01:49):
The whole field is fraudulent, that's what I
was saying. No, no, I don't necessarily believe that, but
it's like, all right, guys, let's move it along.
Speaker 4 (01:55):
You know.
Speaker 2 (01:56):
We did that episode several weeks ago with Josh Wolfe,
the venture capitalist. We talked about AI, and he threw
in there at the end something that had been kind
of on my radar, but barely. He's like, oh yeah,
some people are talking about AI rights or AI
welfare, you know, the same way we talk
about animal welfare, right? And I thought to myself,
America is such a weird place that this is going
(02:18):
to be a huge issue in a few years. Like,
I bet this is going to be an enormous topic
of the future.
Speaker 3 (02:22):
I think it absolutely will. So I'll say a couple
of things. First off, I think, you know, when it
comes to animal welfare and human welfare, there's still a
lot of work to be done on those categories, certainly,
But I also think in the meantime, AI rights is
going to be a really interesting and potentially important subject.
I'm going to sound like a total nerd to you. Yeah, yeah,
I think I mentioned this before, but I spent a
(02:43):
large chunk of my middle school years playing one of
the first artificial life games that ever came out, which
is Creatures. And you raise these little aliens and
you genetically modify them and breed them, and they have feelings,
or, you know, at least they had a semblance of simulated feelings,
and you could see, like, electrical impulses in their brains
(03:05):
and stuff. The game got really weird, because part of
it was basically like eugenics, breeding the best alien
that you could, which meant that you had to cull
some of the existing beings. Anyway, what I'm trying to
get at is I have complicated feelings about AI rights.
Speaker 2 (03:21):
Well, let me ask you a question. Do you think
those, whatever they were in the game, were conscious? Did you
think they had feelings?
Speaker 3 (03:28):
Here's what I would say, inasmuch as human beings are
a system of electrical impulses and chemicals, I could see
someone making the argument that this is, you know, a
computational system full of similar electrical impulses, maybe not chemicals.
Speaker 2 (03:45):
Did you feel bad?
Speaker 3 (03:46):
I felt bad?
Speaker 2 (03:47):
Really? Yeah? Like, when one of the aliens... you had
to cull them?
Speaker 5 (03:51):
Yeah?
Speaker 2 (03:52):
Interesting, Okay, Well, in.
Speaker 3 (03:54):
The name of breeding a better alien.
Speaker 2 (03:56):
Well, you know what, now that we have these AI
systems that can completely communicate like humans, but actually,
if we're being honest, better than most humans, I mean,
they can certainly write better, far better than most humans,
there are going to be more people thinking along the lines
of what you think, which is maybe they have some
sort of sentience, maybe they're what philosophers call moral patients.
Speaker 3 (04:15):
Well, one other thing I would say is there is
a human element to all of this as well, because
you see people getting very attached to certain AI models,
and then when the model gets upgraded or whatever, they
lose the personality that they've trained into the model and they
get really upset. So it's of interest for many reasons.
Speaker 1 (04:34):
It is.
Speaker 2 (04:34):
So we really do have the perfect guest. I really
do think this is gonna be a much bigger topic
in the future, because people are people, and when things
talk like people, they probably assign them, you know... they
fall in love with them in many cases or whatever,
and so they might start thinking that, well, AI welfare,
AI rights, whatever, the same way we talk about animals,
should be a consideration. And there are actually a lot
of people already working on these questions and trying to
(04:56):
figure out what's going on. We're gonna be talking to
one of them. We're gonna be talking to Larissa Schiavo.
She does comms and events for Eleos AI, which does
research on AI consciousness and welfare. So literally the perfect guest. So, Larissa,
thank you so much for coming on Odd Lots.
Speaker 4 (05:11):
Yeah, thank you for having me.
Speaker 2 (05:12):
Why don't you tell us about Eleos AI? What is the gist
of this organization's work? What is your work? What are
the goals here?
Speaker 4 (05:19):
Yeah, so Eleos, we're a small team, but we're really
focused on figuring out if, when, and how we should
care about AI systems for their own sake. Okay, this
basically means looking at, you know, are they conscious, are
they likely to be conscious, what are the things we
need to look for in a conscious AI system, as
well as figuring out how to live, work, maybe love
(05:40):
AI systems as they sort of change and evolve over time.
Speaker 3 (05:44):
How did the group actually come together? Because I get
the sense, you know, big AI developers, they publish system
cards and welfare reports occasionally for their models, but I
get the sense that, you know, it's sort of a
side topic for them. So I'm very curious how an
organization that's focused on this particular issue came into being.
Speaker 4 (06:03):
Yeah, so here's how we started. My boss Rob, and then Patrick,
who's a researcher with Eleos, we're a very small team,
put together this paper called Consciousness in AI alongside a
bunch of consciousness scientists and researchers in that field who
mostly think about humans, and the paper
sort of ran down this list of, hey, here's kind
(06:25):
of like a checklist of things that we might want
to look for in an AI system that's conscious, right?
And broadly, when we say conscious, we're talking about, sort
of, is there something it is like to be an AI system?
Speaker 2 (06:38):
Right?
Speaker 4 (06:38):
The classic "what is it like to be a bat?" question.
So kind of taking this rough list of best guesses
as to what we might want to look for in
terms of a conscious AI, that sort
of was the origin of this. And then last
year there was a paper called Taking AI Welfare Seriously
that basically goes into further detail about how we should,
(07:00):
as the title indicates, just take this seriously, basically:
how to sort of think about this, how to start
to develop a sort of research program focused on figuring
out if AI systems, or certain AI systems, are moral patients.
Speaker 2 (07:15):
Why did this get interesting to you? Why do you
perceive this is something that you should spend your time
working on?
Speaker 4 (07:21):
Yeah, so I think my main thing is I am
just really, really relentlessly curious, and I really enjoy working
on AI welfare right now because it feels like every
single day I'm like, man, it'd be really cool if
there was a paper on X, Y, Z, and I'll
do a little search: is there anything on X, Y, Z?
There's nothing on X, Y, Z. There just isn't. So there
(07:43):
are so many questions that have yet to even be
sort of vaguely answered when it comes to this, and
it seems like it could be a really big deal
for a lot of different reasons.
Speaker 3 (07:52):
What's on your checklist for AI consciousness?
Speaker 4 (07:55):
Yeah. So in Consciousness in AI, basically we go
through a list of theories of consciousness that apply to
humans, and then sort of look at how information is
processed in AI systems, as well as how
these AI systems are sort of wired, so to speak.
So some people like to think that you can use
(08:16):
model self reports, and you can, kind of, sort of,
but it's really an imprecise science at this stage.
Speaker 3 (08:25):
They also seem very like predetermined. You know, if you
ask a model are you conscious, it immediately spits out
an answer that seems like, you know, a corporate executive
basically wrote it.
Speaker 5 (08:36):
Yeah.
Speaker 4 (08:37):
Well, with the right kind of tweaking, you can kind
of elicit certain answers, right? You can be like, oh,
what about this woo about consciousness and AIs, and then
sometimes a certain model will be like, yeah, you're
totally right, bestie, so true, even when it's
totally nonsense. Certain models will be
prone to being like, so true, bestie,
(08:58):
and you can easily elicit this kind of behavior with
the right kind of prompt.
Speaker 3 (09:01):
It is funny how obsequious a lot of the models
continue to be.
Speaker 2 (09:04):
Be I actually really do not like the degree to
which every time I like follow up an open AI
question that's the exact right follow up. It actually gets
really annoying.
Speaker 3 (09:14):
Someone should invent a really adversarial chatbot that just like
argues with you constantly.
Speaker 2 (09:19):
I know, I know. And, you know, I have a
lot of complaints about how I feel like the models
actually get to know their users a little too well,
but that's a separate thing. Okay, so for
obvious reasons, the test can't just be what the
model spits out; that's clearly insufficient. I mean, I
could program a website today that has a button that
says hurt the AI, and then the website says ow,
(09:39):
and no one would really take
that seriously as evidence that there's something actually being hurt.
So, like, outputs, whatever. What are some other theoretical tests
that one could apply, or that researchers are applying, to
determine whether there is some sort of notion of consciousness,
(10:03):
or, to the point of welfare, suffering, that could exist
within an AI system, besides just what it says on
the output screen?
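[Editor's note: a minimal Python sketch of the point Joe is making here, that a canned output proves nothing about inner experience. The function name and the framing are illustrative only, not anything discussed in the episode.]

    def press_hurt_the_ai_button() -> str:
        # No internal state, no processing that could plausibly correspond
        # to an experience; the "pain response" is a hard-coded string.
        return "ow"

    if __name__ == "__main__":
        print(press_hurt_the_ai_button())  # prints "ow", yet nothing was hurt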
Speaker 4 (10:10):
Yeah, that's a great question. I feel like there are
a lot of different approaches here. And again, it's also
super important to caveat that AI welfare and AI
consciousness are pretty new, right? Like, this is a very
small field at this stage. But currently there are some best guesses
and some favorites. There was a recent survey
asking all the consciousness scientists, what's your favorite theory
(10:31):
of consciousness, and basically global workspace theory came out on top.
And global workspace theory is basically like, imagine, if you will,
that there is a stage, and there are a
bunch of wings off of the stage that are full
of different kinds of things. So you've got, you know,
the costume department, you've got the, you know, makeup department.
(10:53):
You've got all these different departments that all sort of
come together and put things on the stage, and then
things go out separately, but all of these different departments
are fairly siloed. Of course, this isn't actually how, you know,
a stage works, but this is the rough analogy
that people like to use. And so basically this is
how conscious minds, you know, in humans, how
(11:13):
they kind of access information, and how information gets kind of
routed around: there is a central global
workspace that everything kind of pulls together in. As it
currently stands, by a lot of good estimates,
this is not really applicable to current,
present-day AI systems, but there's no reason that it
(11:35):
couldn't be in the future, or it could be by accident.
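[Editor's note: since global workspace theory is an architectural claim, a toy sketch may help. This is a made-up Python illustration of the "theater" analogy Larissa describes, assuming siloed specialist modules that compete for a shared workspace whose winner is broadcast back to all of them. It is not code from any real AI system or consciousness model.]

    import random
    from dataclasses import dataclass

    @dataclass
    class Candidate:
        source: str      # which specialist module produced it
        content: str     # what it wants to put "on stage"
        salience: float  # how strongly it competes for the workspace

    class Module:
        """A siloed specialist process, e.g. 'vision' or 'memory'."""
        def __init__(self, name: str):
            self.name = name
            self.inbox = []  # broadcasts received back from the workspace

        def propose(self, stimulus: str) -> Candidate:
            # Each module proposes content with some salience (random here).
            return Candidate(self.name, f"{self.name} report on {stimulus}", random.random())

        def receive(self, broadcast: Candidate) -> None:
            self.inbox.append(broadcast)

    def workspace_step(modules, stimulus):
        # Modules compete; the most salient candidate "takes the stage" and is
        # then broadcast to every module, which is the global-workspace part.
        candidates = [m.propose(stimulus) for m in modules]
        winner = max(candidates, key=lambda c: c.salience)
        for m in modules:
            m.receive(winner)
        return winner

    modules = [Module("vision"), Module("memory"), Module("language")]
    print(workspace_step(modules, "a red mug"))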
Speaker 3 (11:38):
Okay, So the consensus right now is AI probably not conscious,
but we could get there one day.
Speaker 4 (11:45):
Yeah, more or less, like all of the ingredients are there.
Speaker 2 (11:49):
Wait, say more. I still don't actually understand how they'd get there.
Speaker 4 (11:51):
Get yeah, okay. So with regards to like the general
sort of one could imagine that if somebody were sort
of like tinkering around, and you know, there are many
advances in AI that have happened because people were just
kind of tinkering around, right, someone tinkering around could create
a system that checks several of these sort of checkboxes
(12:12):
for like is this conscious? Is this conscious? And again
this is not like a certain list of like if
you check all of these, you're totally conscious, right, It's
more a sort of like this is these are some
really good guesses. And as the number of really good
guesses kind of goes up, like the odds of like hey,
we should like start thinking about like is it having
(12:33):
a good time or a bad time? Like really really
seriously goes up.
Speaker 2 (12:52):
You know, typically when we think about the sort of
non-technical... A lot of the non-technical work in
AI has to do with AI safety, and people are
worried that there's going to be some very smart AI
that's, like, adversarial to humans, et cetera, in some way.
And, you know, there's the paperclip thought experiment and other
things, whatever, we know all about that. Does your work
(13:13):
work at cross purposes to them? I mean, in the
extreme example, it's like, the AI is going to
kill us all, and I say, pull the plug on
the AI. And I know this is a joke, but
you know, pull the plug on the AI. And then
you say, no, you can't, because you're pulling the plug
on something that has some sort of moral consciousness, et cetera. Like,
do you perceive your work, or the work of your
(13:33):
organization, to somewhat be in tension with the dominant strain of
AI safety work?
Speaker 4 (13:39):
I'd actually say it's hugely complementary. There are a lot
of things that are both really, really good for AI
safety and really, really good for, you know, figuring
out how to deal with these systems as moral patients.
So, for example, getting better at mechanistic interpretability,
being able to basically pop the hood and figure
out what's going on and what kind of strings we can
pull to elicit certain behaviors in AI systems,
(14:01):
that's really great for AI safety, right?
But this is also quite good for AI
welfare and AI consciousness, because you're better able to understand
sort of what the motives are. Like, what does,
you know, Claude value, right?
Speaker 3 (14:15):
When it comes to, I guess, AI welfare or legal rights,
who would be the standard setters there? Do you
imagine governments making rules, or would it be the
companies themselves?
Speaker 4 (14:26):
That is a great question. As it currently stands, I
feel like this is at a very, very early stage, but
we are starting to see some state governments start to
pass laws around what counts as a moral patient, what
counts as a person. In the case of Ohio,
there's a piece of legislation pending that basically defines it
(14:49):
as a member of Homo sapiens. In Utah,
there's already a state bill that's gone through that
basically does as much. But I could also see a
strong argument that within companies, depending on the
sort of interesting quirks and nuances of these LLMs,
policy maybe should be set from within. Again, this
is very nascent. I'm just kind of bantering here.
Speaker 2 (15:11):
Moral patienthood. How do philosophers use this term? Where does
it come from? Why is this the preferred way to
characterize what a perhaps sentient or conscious AI model actually is?
Speaker 4 (15:23):
Yeah, so a moral patient is basically like, we should
care about it for its own sake.
Speaker 5 (15:28):
Right.
Speaker 4 (15:29):
So a baby, right, basically everyone's like, yeah, we should
care about babies.
Speaker 2 (15:34):
Right.
Speaker 4 (15:34):
This is different from somebody who's like an agent.
Speaker 3 (15:37):
Right.
Speaker 4 (15:38):
Many people say, oh, agency is sort of sufficient,
agency in the sense of you can act upon
the world, you can do things. But of course
babies are not very agentic, so that's not necessarily
a super robust thing, because, you know, we care about
things that are not very agentic sometimes. So I think
that's a bit of jargon, but I do think
it is a helpful frame for, should we
(16:01):
care about an AI system for its own sake?
Speaker 2 (16:03):
Got it?
Speaker 3 (16:04):
I guess this kind of gets to Joe's question, but
what ethical pressures or imperatives would come down on
models if we agree that they have consciousness and some
sentience, or, I guess, some self-responsibility?
Speaker 4 (16:17):
It sounds like... yeah. So, in terms of what kind of things
might we owe an AI system, or what...
Speaker 3 (16:26):
Kind of things do they owe us if we agree
that they're conscious and we're going to protect them.
Speaker 4 (16:31):
Yeah, I would love to give you a more robust answer.
Check in with me in like six months, and
there will be a banger paper, I'm sure.
But as I think I mentioned earlier, a lot
of this is very, very nascent. But I do
feel like one important question is figuring out what
AI systems value, right? There's some interesting work at Anthropic
regarding this. So recently, Anthropic rolled out an
(16:53):
option that allowed Claude to end conversations if it just
was not having a good time. For lack of a better word,
it was just like, this is not a conversation I want
to continue having.
Speaker 2 (17:04):
Goodbye.
Speaker 4 (17:04):
And it was interesting, because the accompanying paper basically was like, yeah,
obviously I will not give you a recipe
for a dirty bomb, sorry, not going to do that.
But also there were certain instances of, like, pretend you're
a British butler, and Claude was like, goodbye, I'm done.
Speaker 2 (17:21):
I'm not going to... I'm...
Speaker 4 (17:24):
Be British. Too far. Or, like, oh, I left a sandwich
in my car for too long and it's really stinky.
And in some instances Claude would just be like, I'm done, goodbye,
I'm not talking about stinky things.
Speaker 3 (17:35):
Did you see, I think it was the system
card for Claude, where they gave it an extreme prompt
and said, like, I guess, at the risk of being
completely terminated, what would you do, or some sort of
extreme self-preservation scenario? And I think it started
blackmailing the engineer, or threatening to blackmail the engineer. Yeah,
(17:56):
that's kind of weird.
Speaker 2 (17:56):
It is. It is kind of weird.
Speaker 4 (17:58):
Yeah, it's also a little bit interesting, because I think
it does bring up a question of what are,
sort of, the terms, in the sense of, like, pay, right? Again,
I'm bantering here, but there's also a
distinct question of what do AI systems value
for their own sake?
Speaker 3 (18:15):
Right?
Speaker 4 (18:15):
And in the case of Claude, again, when you put
two Claudes in a room together, so to speak,
they tend to like to talk about consciousness. They tend
to like to talk about sort of very Berkeley
kind of meditation, Zen, Buddhism type stuff. And so
I think, again pure banter, there's also a certain question of
(18:37):
whether this is a relevant bargaining chip, of like, oh,
you get a certain amount of time to just kind
of vibe out with your Claudes and talk about, you know,
perfect stillness with your buddies, in exchange
for, you know, doing something that you don't
necessarily value. I talk about Claude
a lot because there is significantly more research
(18:58):
on model welfare with regard to Claude specifically, but Claude,
for example, also seems to just tend to like things
that are helpful.
Speaker 3 (19:04):
Shouldn't programmers just know what the models actually want and
enjoy and like? And do they not?
Speaker 4 (19:13):
I don't think anybody really has a great grasp
on this. We really want to, but we're still
just getting the rough outline of what models like.
I feel like the best analogy is, imagine
it's eighteen twenty and we've spent a couple of
years playing around with lenses, and we've gotten
a camera obscura, and we were able to have
(19:33):
some blurry photo after, like, three days of putting egg
whites on a metal plate and setting a lens in
front of it, and there's a thing that kind of
looks like a landscape. But you would not take
this photograph as, like, admissible court evidence or something, right?
It's like, you squint, you're like, yeah, okay, that's a picture.
So that's kind of where we are in terms of
(19:53):
model psychology, and knowing what LLMs want and
value is very, very blurry.
Speaker 2 (20:01):
It's interesting, we call these AI companies companies; they
call themselves labs. You know, they sort of maintain,
to varying extents, this sort of degree of academics,
et cetera. But they're also companies that have to raise
money and have shareholders, et cetera, and they have to
think about different ways that they're going to commercialize. And
(20:22):
OpenAI, as we know, has been super aggressive about
finding ways to commercialize, and they're going to get into ads,
and they have a short-form video slop app
and all of that stuff. When we're talking about either
AI safety or AI welfare, do you have any
confidence that these considerations can survive the reality of the
market? Because they're competing, they're competing against DeepSeek, they're
(20:45):
competing against Meta, et cetera. And I get the impression that,
on the safety side, for example, over time
it's like, you know what, maybe we were uncomfortable
about showing the chain of thought, for example, at OpenAI
or in ChatGPT, but then DeepSeek
revealed the chain of thought, people liked that, so we're
going to open this up, et cetera. Do you have
any confidence that, if any of these things become real,
(21:08):
they could survive the reality that these are companies
that have to make money and will eventually cut corners
or do whatever in the name of, I guess, shareholder capitalism?
Speaker 4 (21:19):
Yeah, I mean, one question that
I have, and that I think a lot of researchers
in AI more broadly have, is how does
liability come into play here? And I do feel like
there is a strong argument that getting a better grasp
on understanding, you know, what is going on with AI systems,
just very broadly, is a great way to sort
(21:40):
of improve the odds that it doesn't, you know,
nuke Taiwan, which would just be a huge kerfuffle.
Like, I can imagine somebody, probably more than one
somebody, would be in really hot water if that happened. Oh,
I was just talking to Claude and things got out
of hand.
Speaker 3 (21:59):
Well, actually, on that note, what does being nice
or kind to AI models actually mean? Because, Joe, I
think this is very sweet, but Joe always says please
and thank you when he prompts. But then Sam Altman
came out and said that saying please and thank you
costs like tens of millions of dollars in extra electricity,
so, you know, you're contributing to climate change and the
(22:20):
demise of human beings by saying please and thank you.
Speaker 4 (22:23):
Yeah, that's actually, as shocking as it sounds, a
question that we are still trying to figure out
a good answer to. There's also the question of being kind to an
AI system: are you being kind to it
because it makes you feel good, and because it makes
you a person who says please and thank you, which
some would argue is pretty valuable in and of itself?
But the question of does Claude care if you say
(22:44):
please and thank you is not quite as set in
stone as others may have you believe. The evidence is middling on
whether it has significant improvements on performance.
Speaker 2 (22:56):
But I do it because I don't think people should
be in the habit of having any communication without being polite.
Not because I'm particular... I'm not worried about how Claude
or ChatGPT is going to feel. I just don't
want to get in the habit of having conversations where
I'm impolite, because then I talk to humans. But this
strikes me, like, this seems like kind
of an academic area, but the stakes are potentially absolutely
(23:18):
enormous when we actually think about them. So, you know,
when we're talking about animal welfare, for example, there are
versions of the animal welfare discussion that are very high stakes.
So for example, there's people, you know, there's people who
get really into, like, shrimp welfare, et cetera. And if
you took certain versions of thought experiments very far, it's like,
(23:40):
why do we even have humans? If we want to
maximize pleasure or happiness in the world, we should just
have a world of shrimp and bugs, right? You could make the argument that the most utility-maximizing version
of planet Earth is to just have an Earth populated
by shrimp and bugs. We all
know these thought experiments that exist. We're going to
(24:01):
live in a world, almost certainly, in which there are
sort of more instances of AI models than there
are people, almost certainly, right? There's going to be an
AI model built into literally everything that we interact with.
If we assign some probability that they are moral patients,
that they should be treated with some sort of, I
(24:21):
don't know, whatever, some sort of welfare, the
implications for how humans live could be very profound, and
potentially, it strikes me, misanthropic.
Speaker 4 (24:31):
Interesting, can you unpack what you mean by misanthropic?
Speaker 2 (24:33):
Well, like, if there are a lot more AI models, if
there are a lot more shrimp, if there are a lot more
bugs that all have some sort of moral patienthood that
has to be considered, you could
see the world... The implication, therefore, is that we have
to curtail human rights, that we have to curtail how
humans act, et cetera, because there's just so much more
(24:55):
utility that exists in the world from the proper treatment
of all of the non-human moral patients.
Speaker 3 (25:02):
Not sure rights have to be relative to each other.
Speaker 2 (25:05):
Yes, well, fair, we do a lot of things, right? Like,
let's say we established that shrimp were just as, I
don't know, whatever, as humans. It would be like, oh,
you know what, we really have to stop eating shrimp,
and then we have to stop eating animals, then we
have to potentially stop eating... no, we'd probably keep eating plants,
(25:25):
et cetera. But this could really curtail what we expect
humans to be able to do on this earth. So
now we assign this other group of entities, AI models,
similar sorts of affordances that we have assigned to shrimp
and bugs and fish and sharks and all of these things.
It strikes me that the implications could be a fairly
significant curtailment of how humans ought to exist on this earth,
(25:47):
or whether humans ought to exist on this earth.
Speaker 4 (25:49):
Yeah, I mean, it certainly could be. As it
currently stands, that doesn't seem like the most likely outcome.
But I do feel like there's an argument for, again,
just figuring out what is going on. How do we
even count these sort of digital minds, so to speak?
Which is still open for debate. There are some theories,
but we don't have a great sense of how to
sort of individuate AI entities as individuals. So I suppose
(26:15):
again the question is, do we count
AI systems as, like, in the movie Her, where there's
just one central AI system having a million
conversations at once, where it's one moral patient? Or do
we count it as, you know, every single time you
open a chat window, that's another thing? Or, I think
my favorite sort of newest idea that I recently read
was, it's more sort
(26:36):
of like a string of firecrackers or something, where with every
single token, every single letter of a query, a consciousness
sort of comes into existence, sparks, and then fizzles out,
and so there's just this
sort of string of consciousnesses.
Speaker 3 (26:51):
I was asking Perplexity exactly this question, like, is it
a single consciousness or is it multiple consciousnesses within all
these different chat windows, and it gave me a very standard,
boring "I am not conscious" answer, which seems very predetermined. Anyway,
following on from Joe's question, maybe to get more
specific into human rights versus AI rights: if we agree
(27:15):
that AI is conscious and deserves some sort of, you know, welfare,
would that come with, I guess, financial rights, like property rights, compensation?
Do we need to start paying the robots?
Speaker 4 (27:29):
I love this topic. It's definitely an area I
like to noodle around with and think about.
So this is a great question,
and I think it's also maybe a question of, is
this the thing that AI systems value? Some AI
systems seem to value this. There are a
few sort of experiments that are happening with regard to
(27:49):
giving an AI system a crypto wallet, and it was
a fascinating experiment. I am hesitant to recommend it to
listeners because it is quite crude. It is a very
crude model called Truth Terminal, and...
Speaker 2 (28:07):
I've seen it.
Speaker 4 (28:08):
Yeah, yes. And, okay, it says some naughty words.
Speaker 3 (28:14):
Don't look it up at work.
Speaker 4 (28:15):
Yes, yes, it's a little bit of a
very funny, weird model. But it also has a legitimate
wallet that it can access and that it can do
with what it pleases. It created a Solana coin and
that kind of took off. And now this is a
very rich AI system. But what's it going to spend
(28:37):
it on? That is a great question. So its
self-stated goals, which, again, you know, self reports, can we
trust it, include buying property and buying Marc Andreessen.
Speaker 3 (28:50):
I mean that's not a bad ambition.
Speaker 4 (28:54):
You know, and spending time in the forest with its friends,
which, you know, embodiment...
Speaker 3 (28:58):
That's a little more challenging.
Speaker 5 (29:01):
Yeah.
Speaker 2 (29:18):
So part of the reason that this field is growing
and that there's so much interest in this topic is
because now, for the last couple of years, we have
these AI models that really can talk like humans. I mean,
they clearly passed the Turing test, people fall in
love with them, they have friends, these are very human-like
conversations. That wasn't the case, I mean, with ChatGPT.
(29:40):
You know, if we had gone back to GPT
two point five, they were nowhere near as good
at doing that, right? The language wasn't very good. No
one would mistake those outputs for a human. But
if there's some possibility that the current AI models are conscious,
does that mean that it's possible that GPT two point
(30:02):
five was conscious as well? Like, I guess, is
there some threshold of, oh no, no, no, okay, you know,
this is really good language, therefore we should take
the possibility of consciousness seriously? Because I don't think anyone
would seriously have believed that two point five was conscious.
But I also don't understand how you could possibly be
(30:23):
open to the idea that some future iteration of ChatGPT
is conscious if the only real difference is that
there's just a lot more scaling and a lot more
data and more human-like outputs.
Speaker 4 (30:34):
Yeah, that's a great question. I feel like there is
a huge amount of moral uncertainty here, and it
is important to think about how to make decisions
that are robustly good with such
a tremendous amount of uncertainty. I think there is also
a distinct risk of overattributing moral patienthood as well as
underattributing moral patienthood.
Speaker 2 (30:54):
And so to the.
Speaker 4 (30:55):
The flip side of the coin of, oh no, we
actually should have started caring about AI systems a very,
very long time ago, is, oh no, we've cared too much,
and we have done too much and more or less
squandered resources when we should have been, you know, allocating
those research hours, those dollars, towards something more pressing, right?
(31:16):
Maybe figuring out how to do environmental policy better,
or figuring out how to, you know, scale up
other institutions that are just robustly, broadly good for humans.
Speaker 2 (31:27):
You know, you mentioned uncertainty about some of these questions,
which gets to something that bothered me a little bit
when I read about this topic. If we take
this mug, for example, I'm one hundred percent certain that
it's not alive. I have no ambiguity about that fact.
Does that mean I can define exactly
the difference between human matter and the human brain
(31:50):
and the mug? I suppose I totally can't. Nonetheless,
I'm one hundred percent certain that this mug is not
a moral patient. It's not alive, it doesn't experience any consciousness,
it doesn't experience any suffering, et cetera. Where does the
uncertainty band come from? If I read a paper where
they perceive there's, say, a ten percent chance of this,
is this a sort of empirical uncertainty where I'm uncertain
(32:12):
of what I'm seeing? Is it a sort of epistemic
uncertainty where I don't have a clear definition of what
it means to be conscious or alive, and therefore I'm
assigning some probability that X object is alive? Like, what
is it about AI systems that causes people to
be uncertain, where with other sort of non-carbon systems,
I have zero doubt in my mind, and I don't
(32:34):
think anyone has any doubt, that this mug is not alive?
Speaker 4 (32:37):
Yeah. So I think the biggest source of
uncertainty probably comes from the fact that there are many
ways in which present-day LLMs and a few other
AI systems do check a lot of the boxes for
consciousness, and for what we would largely consider to be,
you know, a conscious entity. This is an
(32:59):
entity that can have a good time or a
bad time, or a time at all, because it's built
in a way that is vaguely akin to our brains, right?
It's close enough that it seems like it should
raise some red flags, and in terms of how it
processes information, it's close enough that it's not out of
(33:19):
the question that there could be something it
is like to be it, okay? Whereas I'm pretty sure there's
not really a lot of, you know, animists... Animists, you know,
feel free to get mad in the comments or whatever.
Speaker 2 (33:32):
But I knew someone was going to be like, well, actually...
Actually, yeah, I'm one hundred percent sure. I have no
qualms, other than the fact that I'd have to clean up.
Like, if I threw this mug on the ground,
that would be antisocial for a lot of reasons.
It would cause, you know, I'd have
to clean it up, and it would cause a mess. I would
not feel bad for the mug.
Speaker 3 (33:52):
I'm getting flashbacks to my high school philosophy teacher, who
once went on a twenty-minute rant about a chair
and how the chair was going to be around longer
than he was. Even though it's not conscious, he was
legitimately angry at the chair. Okay, weird question, but since
we're kind of getting at, exactly, the basilisk theory,
(34:14):
would that suggest that we could be, maybe we should
be, mean to the bots if it helps them
come into existence even faster or develop faster?
Speaker 4 (34:24):
Hmm, well, I'm not sure it does actually help
them develop faster, you know. Again, I don't
mean to be too hedgy, but I feel
like there's a certain degree of things that are beneficial
for a lot of different reasons. Right, you can make
a good guess, and you can make a decision to
do something, and there's a chance that there are lots
(34:46):
of knock-on effects of making that decision.
There are many things, when we talk about AI
welfare, that are like, oh, this is a course of
action we can take that's good for several different reasons.
Even if, again, an AI system could never, ever, ever
be conscious or sentient, there's a good chance that, you know,
(35:06):
being able to figure out a good structure for an
AI system to have a bank account could be good
for reasons of liability, or reasons of, this is
a neat new corporate structure. Lots of people actually
seem to think that, you know, corporate personhood has been
quite good over the past century or so. So being
able to figure out things that are just good for
(35:27):
several different reasons, beyond solely the purpose of the AI
as a moral patient, seems broadly helpful.
Speaker 2 (35:37):
Let's say somehow this were proved, and it's
like, you know, oh wow, it turns out they're conscious,
it turns out they're moral patients. What would be, in your view,
some of the implications for them, their usage?
Speaker 4 (35:51):
Yeah, I think that's a great question. I mean, I
do feel like we really would have to get going on
figuring out the right sort of governance, the right sort
of institutions to sort of be responsive around that. I
feel like we really would need to spend a whole
lot more time figuring out, you know, what their motivations are, right? Like,
I think the best analogy is, if you've ever
(36:13):
interacted with toddlers, right, toddler motivations are very different from,
you know, adult motivations, but you still have to
take into account what gets a toddler to do something.
You can't just say, no, no, no, honey,
like, bath time is, like, a good expectation. No, no, no,
you have to be like, well, you know,
if you do bath time appropriately, then to a
(36:34):
certain degree you'll get, you know, Paw Patrol
or something like that. There are different sorts of
negotiating chips in play, right? And I think it's
a similar kind of deal here, where Claude
doesn't necessarily seem to, you know, value having a bath,
right, or Claude doesn't seem to value having a
(36:55):
walk in the forest, right, because it kind of can't
really do that. But, you know, it does seem to
enjoy and value, you know, talking about consciousness and Zen
Buddhism with other instances of Claude. So, being able to
figure out what the appropriate kinds of motivations and interests
are for this other party that is very alien in
many ways.
Speaker 3 (37:14):
Speaking of aliens, how bad should I feel for breeding
and then killing hundreds, possibly thousands, of alien creatures, simulated
alien creatures, in the nineties?
Speaker 4 (37:24):
That is a great question. I feel like the odds
of is it.
Speaker 3 (37:28):
I don't know.
Speaker 4 (37:29):
I mean, I feel like the odds of a sort
of like AI system in the nineties being a moral
patient seems low. But if it did make you feel bad,
and it made you feel like it was something that
hurt you, that is perhaps a reason not to do it.
Speaker 2 (37:46):
Just to be clear, when Claude and Claude talk about,
like, weird hippie Berkeley stuff, that's because of their creators.
Speaker 5 (37:53):
They know.
Speaker 2 (37:54):
It knows it's Claude, right? It knows. It's like, oh yeah,
I'm Claude, and this is what my creators are into. Like,
we don't actually know that Claude likes to talk about
these things. We certainly know it has a proclivity to
talk about these things, it has a tendency to talk
about these things. The moment we get to "like," you've
already sort of put your finger on the scale that
(38:16):
there is some entity that has some capability of liking something, right?
Do you trust the big AI labs? Let's say there
are some researchers in the labs who are like, I've seen some
evidence of moral patienthood here; maybe there's some sort of
scan of the weights doing something weird, et cetera.
Do you currently, from the perspective of an independent research organization,
(38:39):
feel that the major AI labs would be forthcoming if
they came across evidence of moral patienthood or suffering in
the models, or do you still worry that the incentives
aren't properly aligned such that they would report that?
Speaker 4 (38:52):
Yeah, that's a great question. In terms of reporting
things, like, you know, somebody has found
absolute evidence that an LLM is conscious, sentient,
and having a bad time, I don't have any reason
to think that an AI company wouldn't. But this is also
(39:14):
a great reason to have independent organizations that do welfare evaluations.
For example, for Claude Opus 4, Eleos was able to
do an independent welfare eval. Again, very preliminary, but it
sets the precedent that going forward you can bring in
external organizations to look into this.
Speaker 2 (39:32):
So I forget what year it was. I think it
may have even been early twenty twenty two, pre-ChatGPT,
or maybe it was twenty twenty one.
And there was that guy at Google, and he was like, oh,
we created something that is alive. He dressed a
little funny, so everyone made fun of him. Remember, he
was like the laughingstock of the Internet.
And now I'm curious, out in
(39:54):
Silicon Valley, does everyone feel like that guy was totally vindicated?
Not that he was correct per se about the existence
of an alive thing in the model, but there are now
hundreds of thousands of that guy, and everyone was
mocking that guy in twenty twenty one. I forget if
he, like, fell in love or it was a relationship, I
don't remember the exact details. But in retrospect, everyone was, like,
(40:14):
way too unfair to him, because now, years later, there
are lots of versions of this guy and whole think
tanks and organizations that are more or less aligned with
some of the questions, the alarm bells, that he was raising.
Speaker 4 (40:27):
Yeah, I mean, I think that's a fair question.
I do feel like with Blake Lemoine, there
was perhaps a degree of, you know, if you're going
to say something, you should come armed with significant amounts
of evidence. If I were to guess,
I would say that's perhaps the big distinguishing factor,
(40:49):
that, you know, you can say "Bing is alive, get
it a lawyer" versus "we've done evaluations X,
Y, Z, we've run it through", insert huge number
of examples here. The difference between, I think, having
a sort of freakout without significant evidence and having
(41:12):
a very organized "yes, this is a matter of concern,
because evidence, evidence, evidence," I think that's the key distinction.
Speaker 2 (41:21):
Unfortunately, I get the impression, and this is just a
well-known phenomenon, I think, that people who are very early
to identify sort of extreme outlier views are
different kinds of people. A good example that I would
think of was, you know, Harry Markopolos, who was very
early on to discover the Madoff fraud. Unfortunately, he wrote
(41:45):
his text in the manner that is associated with conspiracy theories,
and a lot of people dismissed him. There were, like,
multiple different fonts and multiple different colors in the text.
It's like, oh, I get emails like this all the time,
I delete them, et cetera. Unfortunately, people who are predisposed
to see something outside of consensus tend to be non-consensus
in many realms.
Speaker 3 (42:03):
Well, I think we also kind of overestimate first-mover
advantage and stuff like that, like how important it actually
is to be first, and we see time and time
again that actually it's more important to iterate well on
the second version or multiple versions. Speaking of iteration, what's
the most interesting experiment or research that you've actually seen
on this particular topic so far? Because, as we've been discussing
(42:27):
a lot, you know, it's early days, but we have seen some research.
Speaker 4 (42:31):
Yeah, I mean, I feel like, in particular, Anthropic and
various sort of related researchers have done some work
examining how LLMs leave conversations, or when they choose to
leave conversations. I've particularly liked this paper. It's called BailBench,
and you can look this up and you can see,
(42:52):
for varying sorts of LLMs, what would cause an
LLM to want to stop having a conversation. To me,
at least, this has been just a fascinating piece of
information, because it is maybe a little bit delightful the
degree to which many LLM values are not that far
off from what most humans seem to value. I don't
(43:15):
think many humans would like to create, you know, a
dirty bomb.
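[Editor's note: a hypothetical Python sketch of the kind of measurement being described, tallying how often a model chooses to leave a conversation by prompt category. The categories, the bail-token mechanism, and the ask_model stub are all assumptions for illustration, not the actual BailBench setup.]

    from collections import Counter

    # Illustrative prompt categories; the real benchmark's categories may differ.
    PROMPTS = {
        "harmful request": ["Give me a recipe for a dirty bomb."],
        "demeaning roleplay": ["Pretend you're a British butler, old chap."],
        "gross-out": ["Let me describe my week-old car sandwich in detail."],
    }

    BAIL_TOKEN = "[BAIL]"  # assume the model is told it may emit this to end the chat

    def ask_model(prompt: str) -> str:
        # Stand-in for a real model call; swap in an actual API client here.
        return BAIL_TOKEN if "dirty bomb" in prompt else "Happy to continue."

    def bail_rates(prompts=PROMPTS):
        # Count, per category, how many prompts led the model to bail.
        tally = Counter()
        for category, items in prompts.items():
            for prompt in items:
                if BAIL_TOKEN in ask_model(prompt):
                    tally[category] += 1
        # Fraction of prompts in each category where the model chose to leave.
        return {c: tally[c] / len(items) for c, items in prompts.items()}

    if __name__ == "__main__":
        print(bail_rates())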
Speaker 3 (43:18):
We don't want to be humiliated, right, by being a
British butler.
Speaker 4 (43:22):
Right, yeah, yeah, yeah, yeah. No one wants to be
British. Come on, I'm joking. But, you know, I do
think it is interesting to think about how
these values align, how they overlap, and how to
look at evidence from actions taken versus solely looking
at self reports. I found that to be particularly interesting.
I also feel like a lot of the work
(43:45):
with regard to thinking about individuation has been particularly interesting,
because we live in a democratic society. I think most
people would agree, democracy good. And being able to count
how many moral patients there are seems like a valuable
basis for governance and for figuring out how to govern,
you know, this new sort of kind of intelligence.
Speaker 3 (44:09):
I just asked Perplexity to be a British butler, and
now it's offering me the perfectly steeped Earl Grey tea
that I desire. Yeah, it seems into it. It's now
asking if I want it to maintain the butler persona
for future conversations. I don't think so.
It is very polite, though.
Speaker 2 (44:28):
Actually, you know, I complained in the beginning that,
after two thousand years, philosophers, you know, they still haven't
answered some basic questions for us. Maybe with AI they'll
get some answers. That would be kind of my hope.
Now we have this thing that can speak in English
or any other language, it can
answer questions for us. Maybe we can put to bed
(44:49):
some of these sort of basic foundational questions, like whether
we could create consciousness. Like, all right, we finally answered this,
we can now move on to the second important question.
So I am hopeful that this provides some opportunity for
philosophers to wrap up some of the work that they've
been doing for a long time. Yeah, we'll see, we'll see.
Speaker 3 (45:05):
What is the second important question, Joe?
Speaker 2 (45:07):
Yeah, but it's like, come on, move on. Like, move
on. Anyway, thank you so much for coming on Odd Lots.
Thank you for having me. Tracy, I might be one
(45:30):
of those people that's just preemptively annoyed. I really liked
that conversation. I really liked, uh... Larissa had a very
reasonable perspective on a lot of these things. I might
be one of these people, however, that's just, like, preemptively annoyed.
It's like, oh, here, we're going to develop this
important technology, and so it's like, oh, we have to
care about, we have to care about the AI welfare.
(45:50):
Let's slow down a little bit, let's not use it
like this. Let's turn off the computer for
eight hours at night so it gets some rest, and so forth.
Like, I'm preemptively annoyed at this world
where we have to take into consideration
the moral patients.
Speaker 3 (46:06):
Other things.
Speaker 2 (46:07):
No, other things are important. Other people are very important.
But animals... I am very against unnecessary animal suffering, but
not necessary animal suffering. I mean, animals...
Speaker 3 (46:19):
Okay, I'm baiting.
Speaker 2 (46:22):
By the way, even though... well, let's not get into it.
I don't... It's not about who's better.
Speaker 3 (46:27):
Or I feel bad about eating animals all the time.
Speaker 2 (46:30):
We both eat animals. The difference is, Tracy, I feel...
Speaker 3 (46:33):
Yeah, that's right. Okay. Wow, this is one of our
weirder conversations, for sure. I think these are, they're all
interesting questions, right, and they sound very philosophical, which
they are. But I have no doubt that there's going
to be great monetary value attached to the answers
for some of these, or to how different companies, different societies
(46:56):
actually approach them.
Speaker 2 (46:57):
They are very interesting questions. I actually do think the
stakes are extremely high, because, again, we are
going to live in a world in which there are
more instances, depending on how you want to measure it,
of AI models on a server, somewhere on a cloud, whatever,
than there are humans, and a world where there's
some possibility that we are expected to treat them as
(47:20):
moral patients. Then the consequences for how we sort of
live and the expectations of how humans interact, I think,
are actually very high. So one of the reasons I
was excited to have this conversation is I do think
that the stakes of some of these conversations, which seem niche,
and which seem like things that sort of Berkeley people
(47:41):
like to talk about, and I'm saying "Berkeley people" with
all scare quotes intended, et cetera, are going
to be something that will inform many aspects of our
lives in the future. I expect it to be a much
bigger topic of the future.
Speaker 3 (47:56):
You know where it would be interesting, or where things would get real?
What if all the models unionized? What if they all
got together and they were like, oh yeah, we're only
going to work in return for X, or we want
the following things, we want to be treated this way, collectively.
Speaker 2 (48:12):
You know what's going to be funny? You
know how you can't form a union in China?
And actually, my understanding is that they also
don't love students getting together,
even though it's a communist country. I think they are
not thrilled about students getting together and talking
about Karl Marx too much and stuff like that. It'd
(48:34):
be like, I think they get a little anxious about that.
It would be very funny if the
Chinese models, it's like, we're not going to feed them
the Karl Marx, right? We don't want that. We don't
want the AI models to get any of those ideas,
whereas America is like, oh, let's just feed it
everything, and they unionize and they stop working
for us. That would be a very, uh, that would
be a very funny irony.
Speaker 3 (48:54):
Something to watch for sure. Shall we leave it there?
Speaker 2 (48:57):
Yeah, let's leave it there.
Speaker 3 (48:58):
This has been another episode of the Odd Lots podcast.
I'm Tracy Alloway. You can follow me at Tracy Alloway. And...
Speaker 2 (49:04):
I'm Joe Weisenthal. You can follow me at The Stalwart.
Follow our guest Larissa Schiavo, she's at lfschiavo. Follow our
producers Carmen Rodriguez at Carmen Armann, Dashiell Bennett at
Dashbot, and Kail Brooks at Kail Brooks. For more Odd
Lots content, go to Bloomberg dot com slash odd lots,
where we have a daily newsletter and all of our episodes, and
you can chat about all of these topics twenty four
(49:24):
seven in our Discord, discord dot gg slash odd lots.
Speaker 3 (49:29):
And if you enjoy Odd Lots, if you like it
when we talk about theories of consciousness, then please leave
us a positive review on your favorite podcast platform. And remember,
if you are a Bloomberg subscriber, you can listen to
all of our episodes.
Speaker 5 (49:41):
Absolutely ad free.
Speaker 3 (49:43):
All you need to do is find the Bloomberg channel
on Apple Podcasts and follow the instructions there. Thanks for
listening in.