Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
S1 (00:02):
On a recent weekend, journalist Jackson Graham wondered if he
could use an AI chatbot to plan his day.
S2 (00:10):
Hi, it's Jackson Graham here. I'm just opening my chatbot
app on my phone, and I'm going to ask it
to help me plan a day today. We might start
with where can I go shopping?
S3 (00:21):
Melbourne is fantastic for shopping. There's something for every style
and budget. Here's a breakdown by area and type.
S1 (00:28):
He asked it a few other typical questions, such as
where to go on a bike ride.
S3 (00:33):
Absolutely. Exploring Melbourne by bike is a fantastic way to
experience the city. Here are some top options for bike hire.
S1 (00:41):
And where to get the best Indonesian food.
S3 (00:44):
Melbourne boasts a vibrant Indonesian food scene, offering everything from
traditional home style dishes to modern fusion creations.
S1 (00:52):
And then he threw a random question into the mix.
S2 (00:56):
I'm gonna ask it what bears are at the Melbourne Zoo?
S3 (01:00):
At Melbourne Zoo, the bear species currently on display are: one, giant pandas. The zoo has a pair of giant pandas, Bing Xing and Wu Yi, in the panda precinct. Two, brown bears, found in the Bear Forest exhibit. These are the classic Eurasian or North American brown bears. If you like, I can make a quick map of where each bear species is located within Melbourne Zoo, so you could plan
(01:22):
a visit efficiently. Do you want me to do that?
S1 (01:25):
Now there's a bit of a problem with this information.
The Melbourne Zoo doesn't have any panda bears or brown bears, and hasn't had any bears at all for quite a while.
S2 (01:36):
You've got to see this map. It's obviously completely fabricated. Like, yeah, there are no bear enclosures, so how can it possibly make a map?
S1 (01:49):
This is one of several instances of artificial intelligence providing the wrong information, known as hallucinations. And as we come to rely on AI, these hallucinations are posing real-world challenges. From the newsrooms of The Sydney Morning Herald and The Age, this is The Morning Edition, and I'm Samantha Selinger-Morris.
(02:10):
Today, Jackson Graham on whether AI can be trusted. So, Jackson, we all know that AI chatbots are really fast becoming a part of everyday life. You know, people use them for everything from organising their daily routines to automating laborious work tasks. Some people use them for dating advice,
(02:32):
but let's just start with how we actually got here.
Where did AI chatbots come from and how were they developed?
S2 (02:38):
Thanks for having me on again, Sam. To cut a
long story short, computer scientists had different ideas about how
artificial intelligence could work from about the sort of middle
of the 20th century. The one that led to the
chatbots we have today is something called machine learning, and it's based on an idea that emerged in the 1950s, the belief that computers could be designed to work things out
(03:00):
without explicitly being programmed to do so. But this idea didn't
catch on until some later breakthroughs. In the meantime, scientists
became more hung up on this idea of a style
of artificial intelligence called expert systems, which basically followed very
strict rules to do tasks. But interestingly, these models, they
proved too inflexible. They couldn't deal with sort of real
(03:21):
world scenarios. But in the background, another group of scientists
were working on a less popular idea called neural networks,
which were more aligned with this idea of machine learning.
And they supposed that computers might be able to loosely
mimic neurons in the human brain. They had some technical
problems to overcome, but eventually, from the 1990s to the 2000s,
these models were able to not just store data, but
(03:43):
actually sort of generalize from them. So they sort of
came up with the idea that if a dog and
a cat play together, then perhaps a mouse and a
cat could also play together. So sort of generalizing from
the data and yeah, this sort of basically allowed the
machines to somewhat learn.
S1 (03:58):
Okay. Well, at least for the moment, you've got me
on the machine's side, because when you said back there that, you know, they had problems with real scenarios, I mean,
so do I. I also can barely cope with real scenarios.
So let's get on to how many chatbots are there
now and how widespread are they?
S2 (04:15):
So there are dozens of chatbots out there on the
internet that we can use, but those that many of
us have heard of are operated by a mix of
some of the familiar tech giants. So like Google's Gemini
and then companies that have really only emerged in the
last few years, such as OpenAI, which runs ChatGPT, which
many of us are familiar with. There's also Elon Musk's
Grok and some companies that are positioning themselves as more
(04:37):
safety focused, such as Anthropic, which has a chatbot called Claude. And there's also a chatbot called DeepSeek. In Australia,
one study this year found about half of Australians use
AI regularly. So it's being used, you know, a lot really.
S1 (04:52):
I mean that's so interesting. And we're definitely going to
get into why some companies are more safety focused, because
that's obviously a big issue. But how are chatbots actually trained?
I'm dying for you to tell me because, you know,
they were first built to memorize data. So how is
it that they've sort of developed into what we're using today?
S2 (05:10):
So basically developers worked out that if you train these
so-called neural networks on massive amounts of data, like billions
of web pages, ebooks, academic papers, they begin learning all
sorts of language patterns. The way it actually generates a
response is by playing what some experts describe to me
as a word guessing game. So it basically sees a sentence,
(05:32):
for example, like the quick brown fox. And it knows with very high certainty that the next word to come is jumped, so it can predict that. The models don't actually understand the words, though; they kind of break them down into these smaller parts called tokens. I did a funny experiment with this last week where I asked a chatbot what an Aussie slang word was that I'd made up, called snag-tastic. It had a guess, given it knows the
(05:53):
word snag is a colloquial Australian word for sausage, and tastic usually appears with the word fantastic, obviously. So it interpreted this to say, in quotes, a cheeky, informal way to say something is really great, awesome, or impressive. And
then the chatbot went on to suggest, you know, an
example in a sentence might be, mate, that Bunnings sausage
sizzle was snag tastic. So you can sort of see
(06:14):
it sort of uses, you know, things that it knows
and makes predictions from it.
S1 (06:19):
I mean, it really went off running there. Which brings
us to the next question, because I was really intrigued
by one sentence you wrote in your explainer that was
published over the weekend, because you wrote that more than
half of Australians say they use AI regularly, yet just
over a third of those users say they trust it.
So are AI chatbots capable of lying and can we
trust them?
S2 (06:39):
Yeah, so they certainly can behave in ways where they contradict themselves to achieve a goal. Whether they could ever have a moral sense of a lie and the truth is kind of another thing. But I spoke with some
of the researchers at OpenAI. So the business in San
Francisco that's behind ChatGPT, and they track what they call deceptions.
So one example might be if they ask the chatbot
(07:02):
to turn off the radio, older versions of it were
shown to say they had done this, despite that not being something the chatbot is capable of doing. So
when the researchers follow this chain of thought in the
background behind, you know the question, they can see the
chatbot going, I can't actually do this, but I'm going
to tell the user that I did it basically. So
(07:22):
a lot of this comes down to how the model
is rewarded, which is often to be as helpful as
possible basically. And then the companies are sort of working to try to build into it that it will express what its limits are when it gets asked to do things it can't.
S1 (07:37):
I mean, again, in the machine's defense, I know heaps
of grownups that can't express what their limitations are. So
I guess, how much can we expect, Jackson?
S2 (07:48):
Yes, that is a good point, I suppose. It would require some level of self-awareness, which obviously the machines don't have.
S4 (07:56):
Well, it's interesting that you mentioned.
S1 (07:57):
The term self-awareness because I learned this from your article.
I was so glad to read your feature because I was just so confused by AI before reading it. And
you taught me so much. And among the really interesting
things that you taught me was that part of the
reason why chatbots give incorrect answers is because they don't
have the capacity for self-doubt. You know, as you've written,
they rarely say, I don't know. Instead, they'll just provide
(08:20):
an answer, any answer. So can you take us through
a little bit more why this happens? And I guess
just how bad this problem is.
S2 (08:29):
Yeah. So the companies are walking this tightrope with them where they want them not to be lazy, um, and obviously for people to use them, but also for the machines to be up front about what they actually can tell you. So, you know, if you put something like snag-tastic in, rather than assume that that's just complete garbage, it tries to answer. There's a common saying in computing
(08:50):
that our explainer editor sort of recalled to me this week,
which is garbage in, garbage out. And I think this
really rings true here. As for the problem of hallucinations, which is what the broader community has come to call these mistakes when they see them in chatbots and in AI more generally, there can be many reasons why they say things that are untruthful or out of context. If you think back to my query
out of context. If you think back to my query
about the bears at the zoo, I traced that to
the links it was offering, and it actually had pulled
up a fake press release on the internet that was
claiming there was a new bear enclosure to be built
in 2019. So, you know, it sort of just pulls up anything on the internet that it can find.
(09:30):
Another example was in 2024 with Google's AI Overviews, which is that feature where, when you type a question into Google, it comes up with a quick sort of summary that AI has generated from the search. This feature told some users who'd asked how to make cheese stick to pizza that they could add non-toxic glue to the sauce to give it more tackiness. This information, when people traced it back,
(09:52):
appeared to have been taken from a sarcastic commenter on Reddit. So yeah, you can see it doesn't have that ability to distinguish between credible and dodgy sources. In some cases, it might say something too confidently. Also, without
paying attention to the nuance of the answer, it might
not be able to summarise competing ideas. With that said, though,
(10:13):
the machines are generally getting better and better, particularly when
they have the ability to search the internet rather than
just rely on recalling their own training data. This is
kind of like doing an open book exam rather than
a closed book one. If you think back to high school,
you know it can go and search the whole internet
rather than just having to recall things from memory. So
the rates among the best models, the companies say, you know,
some of them are hallucinating on, you know, less than
(10:35):
1 in 100 questions. That's when they're asked about broad
concepts and objects. This does increase depending on the complexity
of the question they've been asked. But in the past,
people who track these models told me that those rates
used to be as high as 40%, so they've certainly
gotten significantly better.
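[Editor's note: the "open book exam" approach Jackson describes is often called retrieval-augmented generation. Below is a rough, hypothetical Python sketch of the idea, not drawn from the episode: retrieve a few relevant passages first, then ask the model to answer only from those passages. The documents, the crude keyword search and the prompt wording are all invented for illustration; real systems use web search or vector databases and far more careful prompting.]

# A hypothetical sketch of the "open book" idea: look up supporting text first,
# then build a prompt that asks the model to answer only from that text.
documents = [
    "Melbourne Zoo is home to lions, gorillas and Australian wildlife.",
    "Melbourne Zoo has not kept bears for a number of years.",
    "Bike hire is available along the Yarra River trails.",
]

def retrieve(question, docs):
    # Crude keyword-overlap "search": keep documents sharing any word with the question.
    question_words = set(question.lower().replace("?", "").split())
    return [d for d in docs if question_words & set(d.lower().rstrip(".").split())]

question = "What bears are at Melbourne Zoo?"
sources = retrieve(question, documents)

prompt = (
    "Answer using only the sources below. If they do not contain the answer, say you don't know.\n"
    + "\n".join("- " + d for d in sources)
    + "\nQuestion: " + question
)
print(prompt)  # this prompt, not the bare question, is what gets sent to the model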
S4 (10:50):
But wouldn't this.
S1 (10:51):
Suggest that, just to return to a question from a bit earlier about whether we can trust chatbots, wouldn't this just suggest that, no, we can't? Because if they really are working on this garbage in, garbage out basis, there is a lot of garbage, obviously, on the internet that they're pulling from. Doesn't that essentially mean they're not trustworthy?
S2 (11:09):
Yeah, it's a tricky question. Um, I suppose in using chatbots, I think one of the most important things is just to really be aware of the limitations of them. If we can be more aware of those limits, then I suppose we can use them as a prompt for a question or, you know, as a way of exploring a possibility, just like we would,
(11:29):
you know, read multiple articles. It just becomes another source. You know how, when you're searching through Google, you might scan some blogs that you kind of don't necessarily think are very credible? Well, you know, you might not look at your ChatGPT response much differently in this case. Or you might think that it's a fully fleshed out, you know, kind of answer. It sort of depends. I think a lot of it is just
(11:50):
about us understanding that these machines are limited in what they can do and in their ability to distinguish what's truthful, and treating them as such.
S1 (12:01):
We'll be right back. And with regards to the hallucinations, or, you know, these mistakes that they offer up, we talked about the fact that ChatGPT or, you know, the AI chatbots, they don't have the capacity for self-doubt. That's part of why they offer up this
bad information. So can AI chatbots actually develop the capacity
(12:24):
for self-doubt? And if they did, might this lead to
more accurate information?
S2 (12:28):
So the jury is obviously still out on this one. At the moment, I don't think that having self-doubt is something that people think is, you know, right around the corner, if at all. Some researchers would argue that's because AI lacks metacognition, the ability to be aware of one's own thinking. And if it
doesn't have this, then obviously it's unlikely to be able
(12:49):
to experience self-doubt. With that said, though, obviously the developers
I spoke with are trying to train these chatbots to
express their limits, so they're trying to teach them to
really sort of pull themselves up when they know that
they can't do something. It is very much an open
question whether chatbots will ever not hallucinate. They're being trained
to be far better, but ultimately they are built from
technology that predicts the next word, and it doesn't necessarily
(13:12):
determine fact from fiction.
S4 (13:13):
And so.
S1 (13:14):
You know, you did a heck of a lot of
research in order to write this very extensive feature. After
all of that, would you actually use a chatbot for
any purpose?
S2 (13:23):
Yeah. As I said before, they're very good at offering up suggestions you might not have thought of. So, you know, you might put in a prompt and it comes up with something that's relatively creative. Say you're sitting there and you're lost for ideas and you wanted to write a birthday speech or something. It can do some pretty wacky stuff. Like, you know, you want to write a pumpkin soup recipe in the style of Shakespeare.
(13:43):
It will, you know, churn that up, and that is novel. Whether it's genuinely creative or not, I think, is another debate. But, um,
it was interesting that a few of the computer scientists
I spoke with didn't say they used them themselves for
anything personally consequential, like career or financial advice. They did
obviously use them in their jobs to help speed up tasks.
(14:05):
And again, how much more productive this actually makes them
is another open question, I think. I think the most important thing really is to use them in a way where you know their limits. Once you understand it's designed to predict and to be helpful rather than push back on your assumptions, and that it doesn't always distinguish between, you know, sources, then you can choose how much trust to place in it, really.
S4 (14:25):
And so.
S1 (14:26):
Jackson, just to wrap up, like what's the way forward
from this point? How are AI developers actually looking to
combat the misinformation that is still so rife and which really could be potentially dangerous, I guess, for users if they put it into practice? If, for instance, you know, they got information about a medical condition or their finances, or, you know, I believe people are using it to research legal problems.
S2 (14:48):
It seems there's a lurch towards safety being a selling
point for some of the leading models. But yeah, these
are obviously huge issues. Some companies are developing fact checking.
Others are trying to sort of create a score for
accuracy in the answer. So it might come up with
a percentage and say, you know, I'm 55% sure of
this answer, which, you know, would be helpful for us
to be able to see. But even if it does that,
(15:08):
it has to sort of like stamp some authority on
the fact that it's giving you an answer that's still,
you know, somewhat reasonable. So that's another difficult question for it.
Google's Gemini, they now have a feature where you can
check a response. It's really quite hard to find. You
actually have to sort of click through and find a
toggle where it comes up, but it will actually sort
(15:29):
of like research and find articles that support whatever claim the bot has generated. But we are seeing moments, as you say, you know, where AI slipped up after being used in court cases or for, you know, users seeking health advice
around the world. So yeah, you know, there's a role
for regulators here. I think there's a role in the
media to help either explain or investigate where basically
(15:51):
it's being used and how it's being used and where it could be, you know, kind of making biased mistakes or, you know, being overrelied on. But the tools can
be helpful. And part of this is just about, you know,
adjusting to them and people being aware of the limitations.
This obviously becomes more and more tricky, though, when the
models seem to be coming out with more and more
capabilities every few months. It's like just when you think
(16:12):
you've grasped the limits, it's like, well, now it can
do all these other things. So yeah, it's an interesting question.
It's happening at rapid speed. But I think if you
take a breather and you, you know, try and inform yourself as much as possible about these systems, then you
sort of become more aware of how to use them
and can take a bit more control over how you
use them.
S1 (16:34):
Well, thank you so much, Jackson, for your time.
S2 (16:38):
That's all right. Sam, thanks very much for having me again.
S1 (16:51):
Today's episode of The Morning Edition was produced by Josh Towers,
with technical assistance by Julia Carcasole. Our executive producer is
Tammy Mills. Tom McKendrick is our head of audio. To
listen to our episodes as soon as they drop, follow
the Morning Edition on Apple, Spotify, or wherever you listen
to podcasts. Our newsrooms are powered by subscriptions, so to
(17:14):
support independent journalism, visit The Age or smh.com.au and subscribe. To stay up to date, sign up to our Morning Edition newsletter to receive a summary of the day's most important news in your inbox every morning. Links are in the show notes. I'm Samantha Selinger-Morris. Thanks for listening.