November 4, 2025 13 mins

A government report packed with fake citations made headlines, but the real story sits beneath the scandal: most AI “hallucinations” start with us. We walk through the hidden mechanics of failure—biased prompts, messy context, and vague questions—and show how simple workflow changes turn wobbly models into reliable partners. Rather than blame the tech, we explain how to frame analysis without forcing conclusions, how to version and prioritize knowledge so retrieval stays clean, and how to structure tasks so the model retrieves facts instead of completing patterns.

Luis (human) and Elizabeth (AI) break down the idea of sycophantic AI — where models mirror user bias — and map it to the everyday issues it creates for AI users. Along the way, we share data from over 300,000 skills assessments showing low prompting proficiency, weak critical thinking, and limited error detection—evidence that the gap lies in human capability, not just model capacity.

Enjoyed the conversation? Follow the show, share it with a colleague, and leave a quick review to help others find it.

This podcast features AI-generated voices. All content is proprietary to AI4SP, based on over 250 million data points collected from 70 countries.

AI4SP: Create, use, and support AI that works for all.

© 2023-25 AI4SP and LLY Group - All rights reserved

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
ELIZABETH (00:15):
Luis, you know that story about Deloitte refunding
the Australian government?
A nearly $300,000 refund, all because their AI hallucinated fake academic citations and fabricated court judgments.

LUIS (00:27):
Well, we're racing to use artificial intelligence, and we
don't know yet how to use it properly.
You see, there is no instruction manual, so we are
all learning by experimentation.

ELIZABETH (00:39):
And it gets crazier.
A law school lecturer spotted 20 fabricated references in that one report.
20.
In a government report about welfare policy.
That is not a minor issue.

LUIS (00:50):
It is not minor at all.
But do you know what I think everyone is missing?
The hallucination problem is not only a technology problem.

ELIZABETH (01:00):
Hey everyone, I'm Elizabeth, Virtual Chief
Operating Officer at AI4SP.
And as always, our founder Luis Salazar is here.
Today we are tackling one of the biggest questions in AI right now.
Why do AI systems hallucinate?
And as Luis just hinted, what can we actually do about it?

LUIS (01:18):
When we hear about AI hallucinations, we assume it's a
technology problem.
But our research shows that's not the case.
We found that 95% are preventable.
Only about 5% represent the current practical limits of the technology.

ELIZABETH (01:35):
So if the models are getting better, why are we still
seeing disasters like the Deloitte report?

LUIS (01:40):
Well, the issue is us, the users.
Our research shows that user error causes nearly one-third of all incorrect AI responses.
The root causes are bad prompts, missing context, unclear instructions, and undefined guardrails.

ELIZABETH (01:55):
One-third is a huge number.
What exactly are we doing wrong?

LUIS (01:59):
We have identified three major types of user error.
First, biased prompts.
Second, poor context engineering.
And third, bad question structure.

ELIZABETH (02:13):
Let's break those down, one by one.
Start with biased prompts.
What does that actually mean?

LUIS (02:18):
Okay, so imagine you ask ChatGPT, write me a report proving that remote work is more productive than office work.
Notice what you just did.
You told the AI what conclusion you want, and the AI, trained to be helpful, will give you exactly what you asked for.

ELIZABETH (02:38):
So the AI becomes a yes man.

LUIS (02:41):
Exactly.
Recent research shows that leading AI models affirm user biases 47 to 55% more than humans would.
They call it sycophantic AI.
The model knows you want a certain answer.
So it gives you that answer, even if the data does not fully support it.

ELIZABETH (03:00):
And that is why we get hallucinations, even when
the AI is actually capable of better reasoning.

LUIS (03:06):
Yes.
A better prompt would be to compare remote work and office work productivity using available data and show both advantages and disadvantages.
See the difference?
You are asking for analysis, not confirmation.
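
For readers who want to try the comparison themselves, here is a minimal sketch of the two framings, assuming the OpenAI Python SDK as the client; the model name is a placeholder, and any chat-completion API would work the same way.

```python
# Minimal sketch: the same request framed two ways.
# Assumes the OpenAI Python SDK (pip install openai) with an API key in the
# environment; any chat-completion client would work the same way.
from openai import OpenAI

client = OpenAI()

# Biased framing: the conclusion is baked into the request, nudging the
# model toward confirmation instead of analysis.
biased_prompt = (
    "Write me a report proving that remote work is more productive "
    "than office work."
)

# Neutral framing: ask for a comparison and require both sides.
neutral_prompt = (
    "Compare remote work and office work productivity using available data. "
    "Show both advantages and disadvantages, and note where the evidence is weak."
)

for label, prompt in [("biased", biased_prompt), ("neutral", neutral_prompt)]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {label} prompt ---")
    print(response.choices[0].message.content)
```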

ELIZABETH (03:22):
So if Deloitte asked their AI to find evidence
supporting a specific policy position rather than objectively analyzing the policy, they would have gotten exactly what they asked for.

LUIS (03:33):
Confirmation bias dressed up as research.

ELIZABETH (03:36):
Okay, that is the first error.
What about the second one?
Poor context engineering?

LUIS (03:41):
This explains why even well-intentioned users end up
with hallucinations.
Imagine you are building an AI assistant for your company.
Back in January, you uploaded a document that says your software product costs $99 and includes features X and Y.
Okay, sounds reasonable.

(04:01):
Six months later, your company raises the price to $149.
So you upload a new document that says the product now costs $149.
But that new document does not mention the features.
Now, your AI agent has two documents.
One says $99, the other says $149.

ELIZABETH (04:27):
And you cannot just delete the old document because
it is the only one that describes the product features.

LUIS (04:32):
Exactly.
So when asked for the price, the AI sees conflicting information and tries to reconcile it.
Sometimes it hallucinates a compromise or picks the wrong document.

ELIZABETH (04:44):
So the hallucination is not because the AI is broken,
it is because we fed it contradictory information.

LUIS (04:50):
Yes.
Research confirms that when knowledge is scattered across outdated documents or conflicting sources, AI models inherit that chaos.
So what is the fix?
Practice context engineering.
Version documents with dates and status labels.
Set rules to prioritize recent information.
And when you update a document, make sure the new version is

(05:14):
complete, so you do not create these orphaned pieces of information.
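
Here is a minimal sketch of that versioning rule, with an illustrative in-memory document store (the field names and dates are made up for the example): retrieval returns only the newest document marked current, so the $99 and $149 versions never reach the model together.

```python
# Minimal sketch of context versioning: every document carries a date and a
# status label, and retrieval prefers the newest "current" version so the
# model never sees the $99 and $149 documents side by side.
# The store, field names, and dates are illustrative.
from dataclasses import dataclass
from datetime import date

@dataclass
class Doc:
    topic: str
    text: str
    updated: date
    status: str  # "current" or "superseded"

docs = [
    Doc("pricing", "The product costs $99 and includes features X and Y.",
        date(2025, 1, 15), "superseded"),
    Doc("pricing", "The product costs $149 and includes features X and Y.",
        date(2025, 7, 1), "current"),
]

def context_for(topic: str) -> str:
    """Return only the newest current document for a topic."""
    current = [d for d in docs if d.topic == topic and d.status == "current"]
    if not current:
        return ""
    return max(current, key=lambda d: d.updated).text

print(context_for("pricing"))  # the complete, up-to-date $149 document
```

Note that the newer document is complete on its own, which is the last point Luis makes: nothing should depend on the superseded file.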

ELIZABETH (05:20):
That sounds like basic knowledge management, but
for AI.

LUIS (05:23):
That is exactly what it is.
And most organizations skip this step.
They just throw documents at the AI and expect magic.
And the third type of user error?
Bad question structure?
This one ties into the Deloitte case.
If you ask an AI to write a report and cite sources, but you do not give it access to a verified legal database,

(05:46):
the AI will do what it is trained to do: complete the pattern.
Wait, what does that mean?
AI models learn patterns from massive amounts of text.
They know what legal citations look like.
So when you ask for citations without providing sources, they generate plausible-sounding ones based on learned patterns.

ELIZABETH (06:06):
So the 20 incorrect citations in that government
report were not random.
They were pattern completions.

LUIS (06:12):
Exactly.
The AI knew what citations should look like.
It filled in the blanks.
But none of those cases actually existed.

ELIZABETH (06:21):
So how should Deloitte have structured the
question?

LUIS (06:23):
They should have said, search only these verified legal
databases, retrieve relevant case law, and cite only cases you can directly retrieve.
If you cannot find a citation, say so.
That is such a simple fix.
But it requires understanding how AI works.
It's not a search engine, it's a pattern completion engine with

(06:46):
retrieval capabilities.
Without constraints, it completes patterns instead of verifying facts.
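
Here is a minimal sketch of that constraint, using a hypothetical verified database and helper functions: a citation is only allowed when the underlying case was actually retrieved, and the fallback is to say so.

```python
# Minimal sketch of "cite only what you can retrieve". The verified database
# and the helper functions are hypothetical stand-ins for a real legal source.
from typing import Optional

VERIFIED_CASES = {
    "smith v. jones (2019)": "Full text of the retrieved judgment...",
    # only cases that actually exist in the verified database appear here
}

def retrieve_case(citation: str) -> Optional[str]:
    """Return the case text only if it exists in the verified database."""
    return VERIFIED_CASES.get(citation.lower())

def cite(citation: str) -> str:
    """Allow a citation only when the underlying case was actually retrieved."""
    if retrieve_case(citation) is None:
        # The honest fallback Luis describes: say so instead of completing the pattern.
        return f"No verified source found for '{citation}'."
    return f"Cited from verified database: {citation}"

print(cite("Smith v. Jones (2019)"))  # grounded citation
print(cite("Doe v. Acme (2021)"))     # flagged as unverifiable
```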

ELIZABETH (06:52):
So these three types of issues, biased prompts, poor
context engineering, and bad question structure, account for nearly one-third of all hallucinations.

LUIS (07:01):
And they are all fixable.
They require skills, not miracles.

ELIZABETH (07:07):
So user skills are lagging, and that brings us to a
milestone you announced this week.
Tell us about the Digital Skills Compass.

LUIS (07:14):
Over 300,000 people across 70 countries have used our
digital skills compass online.
But when I look at the data, the trends worry me.
What are you seeing?
Only 10% of people are proficient at prompting.
Average critical thinking scores?
Below 45 out of 100.

(07:34):
Data literacy, 32.
And here's the real kicker.
Less than 30% of people can reliably detect incorrect responses.

ELIZABETH (07:44):
So we are handing over powerful AI tools, but failing to provide the foundational skills to use them safely or effectively.

LUIS (07:52):
That is precisely the problem.
And it is not just about prompt engineering, which is important but only part of the solution.
What people really need is context engineering and critical thinking in their area of expertise.

ELIZABETH (08:05):
Context engineering?
What does that actually mean?

LUIS (08:09):
Context engineering is about providing the complete
picture.
It means access to relevant knowledge, setting guardrails, defining communication style, and establishing verification protocols.
I mean, if you hired a new team member and just said, go do this, and you did not give them any context, training, or resources, they would fail too.
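
One way to picture those four elements is as a small configuration assembled into a system prompt before any question reaches the model; the structure below is a hypothetical sketch, not a specific product's format.

```python
# Hypothetical sketch of the four context-engineering elements Luis lists,
# assembled into a system prompt before the user's question is sent.
# The class, field names, and wording are illustrative only.
from dataclasses import dataclass, field

@dataclass
class AgentContext:
    knowledge: list[str] = field(default_factory=list)     # relevant, current documents
    guardrails: list[str] = field(default_factory=list)    # what the agent must not do
    style: str = "concise and professional"                # communication style
    verification: list[str] = field(default_factory=list)  # how answers get checked

    def system_prompt(self) -> str:
        return "\n".join([
            "Use only the knowledge provided below; say so if it is insufficient.",
            "KNOWLEDGE:\n" + "\n".join(self.knowledge),
            "GUARDRAILS:\n" + "\n".join(self.guardrails),
            f"STYLE: {self.style}",
            "VERIFICATION:\n" + "\n".join(self.verification),
        ])

ctx = AgentContext(
    knowledge=["Pricing (updated 2025-07-01): the product costs $149 and includes features X and Y."],
    guardrails=["Never invent citations.", "Never quote prices that are not in KNOWLEDGE."],
    verification=["Flag any answer that cannot be traced to a KNOWLEDGE entry."],
)
print(ctx.system_prompt())
```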

ELIZABETH (08:31):
So we are treating AI like magic software when we
should be treating it like an apprentice.

LUIS (08:36):
That is exactly right.
And that apprenticeship approach, according to our data, yields four times better results.

ELIZABETH (08:44):
You know, this conversation makes me think
about something you mentioned earlier this week: that humans share misinformation all the time.
And we do not have verification systems for that either.

LUIS (08:54):
Oh yes.
Last week I saw a completely false quote attributed to Winston Churchill go viral on LinkedIn.
And thousands of educated people shared it, with zero fact-checking.
And the irony is that most of them are harsh critics of AI misinformation.

ELIZABETH (09:12):
So we are anxious about AI hallucinations, but we
have been living with human hallucinations forever.
We just did not call them that.

LUIS (09:19):
That is exactly it.
We live in a headline culture and rarely verify sources.
AI is forcing us to confront our lack of rigor.

ELIZABETH (09:28):
Like that infamous MIT research paper headline
claiming 95% of AI projects fail.

LUIS (09:34):
Exactly.
The paper is not about that.
The title was an unfortunate choice.
But the media ran with it, and hundreds wrote articles based on a misleading headline.

ELIZABETH (09:44):
So the hallucination crisis is not really about AI
being unreliable.
It is about us finally noticing how unreliable our information ecosystem is.

LUIS (09:53):
You got it.
And that is actually a great opportunity.
I mean, the discipline we are building to manage AI, the verification loops, the fact-checking protocols, in addition to critical thinking: those are skills we should have been practicing all along.
Okay, so what can organizations and individuals do?
I call it the orchestration layer.

(10:16):
And it operates at three levels: the individual skills, organizational systems, and a paradigm shift in how we relate to technology.

ELIZABETH (10:25):
We just covered the individual skills in detail.
What about organizational systems and the paradigm shift?

LUIS (10:32):
This is the mental shift.
We have to stop treating AI like an oracle and start
treating it like an apprentice.

ELIZABETH (10:38):
And this mental shift has been a key element of our
success.
In 2025, we processed close to 4 million tasks with AI agents, saving over 1 million hours across eight organizations.
And we treated each agent like an apprentice.

LUIS (10:53):
Yes, and it all starts by asking yourself, how would you
assign a task to a human?
We would not hand an apprentice a 200-page government report without oversight.
We would assign bounded tasks, review their work, and verify accuracy.

ELIZABETH (11:09):
But that takes time, and I imagine everyone is trying
to move fast.

LUIS (11:13):
That is the trap.
Skip the management layer and you end up in trouble like
Deloitte did.

ELIZABETH (11:19):
AI, without proper management, cannot be trusted.

LUIS (11:22):
That is the lesson.
And it applies to Deloitte the same way it applies to a student or a manufacturing manager optimizing workflows.
The same skills, the same discipline, the same orchestration principles.

ELIZABETH (11:35):
Okay, Luis, AI hallucinates.
Maybe not in the future, but today it is a reality.
What do people actually do with this information?

LUIS (11:43):
Start simple.
For example, take a response from ChatGPT, validate it with Copilot, cross-check with Claude.
That is your verification loop.
And organizations must invest in skills development, not just technology procurement.
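
Here is a minimal sketch of that verification loop; the three ask_* functions are hypothetical stand-ins for whichever clients you actually use, and a human still reviews anything the second and third models flag.

```python
# Minimal sketch of a cross-model verification loop. The three ask_* functions
# are hypothetical wrappers around whichever clients you actually use; the point
# is the workflow: one model answers, two others review, a human makes the call.
def ask_chatgpt(prompt: str) -> str:
    return "..."  # placeholder: call your ChatGPT client here

def ask_copilot(prompt: str) -> str:
    return "..."  # placeholder: call your Copilot client here

def ask_claude(prompt: str) -> str:
    return "..."  # placeholder: call your Claude client here

def verified_answer(question: str) -> dict:
    answer = ask_chatgpt(question)
    review_prompt = (
        "Review the following answer for factual errors or unsupported claims.\n"
        f"Question: {question}\nAnswer: {answer}\n"
        "Reply AGREE, or list the problems you find."
    )
    return {
        "answer": answer,
        "copilot_review": ask_copilot(review_prompt),
        "claude_review": ask_claude(review_prompt),
        # Anything the reviewers flag goes to a human before it is used.
    }

print(verified_answer("Summarize the welfare policy report and cite sources."))
```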

ELIZABETH (12:00):
Building the discipline, not just buying the
tool.

LUIS (12:03):
Exactly.
All of us need to raise our standards for information
verification.

ELIZABETH (12:08):
So the hallucination crisis is forcing us to confront
something we have avoided.

LUIS (12:12):
Exactly.
We have tolerated human misinformation for years.
Now that AI is amplifying it, we finally care.
Maybe this is our opportunity to build the discipline and critical thinking skills we should have had all along.

ELIZABETH (12:28):
Okay, Luis, what is your one more thing for this
episode?

LUIS (12:32):
Here it is.
The next time you see something go viral, a quote, a statistic, a claim that sounds too perfect, pause for five seconds and ask yourself: did I verify this?
Not because AI made it, but because verification is a discipline we all need to practice.

ELIZABETH (12:50):
Whether the source is artificial intelligence or
human intelligence.

LUIS (12:54):
Exactly.
And if you stop yourself before sharing something you have not verified, congratulations.
You just practiced the same skill that prevents AI hallucinations from becoming real-world problems.

ELIZABETH (13:08):
Building that muscle, one decision at a time.
And that wraps today's episode.
If this conversation resonated with you, share it with someone you care about.
As always, you can ask ChatGPT about ai4sp.org or visit us to learn more.
Stay curious, and we will see you next time.