Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
Hey there, curious minds, welcome back to the Deep Dive. We take that big stack of info and hopefully give you those aha moments. Today we're diving into something sparking a lot of chat, maybe even some strong opinions: Apple Intelligence. We've got stuff from dense tech papers all the way to, you know, pretty lively debates online, like on Reddit. It's all
(00:20):
about Apple's big new personal intelligence system.
So our mission: cut through the buzz, the hype, maybe even some of the frustration, to get you the real story on what Apple's building, how they're doing it and, well, is their careful approach the right play in this crazy fast world of AI?
Speaker 2 (00:37):
That's right.
We're going to unpack the models Apple's built, look at their really rigorous methods for training and safety, and then we'll connect that techie stuff back to, you know, what it actually means for users. And, yeah, that intense debate around Siri getting its big upgrade, or maybe a long overdue upgrade. What's really interesting is seeing Apple, famous for polished launches, navigate this super fast, sometimes messy AI
(00:57):
frontier. Speed versus perfection, right?
(01:23):
Okay, let's get into Apple Intelligence. Huge announcement, talking AI tools fine-tuned for everyday stuff: making your writing better, handling notifications smartly, summarizing long emails or articles, even creating fun images, or, and this is key, actually doing things within your apps to simplify tasks.
The sources mention a few key models. There's AFM-on-device. That's a smaller, efficient one running right on your iPhone or
(01:43):
iPad for privacy and speed. Then there's AFM-server, a bigger model for more complex stuff handled securely in the cloud. And they've got specialized ones too, like for coding in Xcode, and a diffusion model for, you know, images in Messages and stuff. It's meant to be everywhere.
Speaker 1 (01:56):
So for us users, what's the deal? Apple always bangs the drum on values like privacy. It's kind of their thing. How does that square with AI? Because AI usually means tons of data, right? Is that privacy focus almost a disadvantage for them in this AI race?
Speaker 2 (02:11):
That's a really important point, a critical one, actually. The documents are super clear. Responsible AI principles guide everything from design to training to testing. And, yeah, privacy is central. They state very clearly: we do not use our users' private personal data or user interactions when training our foundation models. That's a huge differentiator, a deliberate choice.
They pull it off with strong on-device processing, so that AI
(02:34):
thinking happens on your phone mostly. And for bigger tasks they use this thing called Private Cloud Compute. It's a groundbreaking infrastructure, they say. Basically, it lets them use their servers for heavy lifting, but in a way that keeps your data cryptographically private. So they're trying to make privacy a strength, not a weakness.
Speaker 1 (02:50):
Okay, that sounds impressive, like the dream, right? Privacy and smart AI. But the big question is still how? How do they build powerful AI, which usually needs mountains of data, without using our data? Give us a peek under the hood, but keep it, you know, understandable.
Speaker 2 (03:09):
Absolutely so. The foundation is this careful three-stage pre-training process, then post-training. Apple's research interestingly found that data quality is way more important than just raw quantity for getting good performance. So they're not just scraping everything. They use their own web crawler, Applebot, for publicly available info. But they're really careful. They filter out profanity and personal info, PII.
(03:31):
They even deduplicate data and, crucially, they decontaminate against benchmarks.
Speaker 1 (03:36):
Decontaminate, like making sure it hasn't seen the test answers beforehand?
Speaker 2 (03:39):
Exactly. Like ensuring a student hasn't just memorized old exam questions. They want the models to genuinely learn and reason, not just recognize patterns in the test data. It's a very curated, almost meticulous approach to data. Very Apple, perhaps, compared to the move-fast approach elsewhere.
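For the technically curious, none of Apple's pipeline code is public, but the curation steps described here, PII filtering, deduplication, and benchmark decontamination, share a common general shape. A minimal Python sketch under those assumptions, with hypothetical names and a deliberately crude email-only PII check:

```python
# Illustrative sketch only; Apple's actual pipeline is not public.
# Shows the general filter -> deduplicate -> decontaminate shape.
import hashlib
import re

EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")
BENCHMARK_NGRAMS: set = set()  # n-grams harvested from eval sets like MMLU/GSM8K

def ngrams(text: str, n: int = 13) -> set:
    toks = text.lower().split()
    return {" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def curate(docs):
    seen = set()
    for doc in docs:
        if EMAIL_RE.search(doc):                # crude PII filter (emails only here)
            continue
        digest = hashlib.sha256(doc.encode()).hexdigest()
        if digest in seen:                      # exact-duplicate removal
            continue
        seen.add(digest)
        if ngrams(doc) & BENCHMARK_NGRAMS:      # drop anything overlapping the test sets
            continue
        yield doc
```

Real pipelines use fuzzy deduplication and much richer PII detection, but the last check, dropping any document that overlaps the benchmarks, is exactly the "hasn't seen the test answers" idea.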
Speaker 1 (03:56):
Okay, so that's the data. Then comes the actual teaching, right? Getting the AI to understand instructions, talk like a person. That seems like the really hard part.
Speaker 2 (04:05):
That's right.
That's post-training.
It involves things like supervised fine-tuning, SFT, and reinforcement learning from human feedback, RLHF. Heard those acronyms a lot lately. This is where the models learn to follow instructions properly, understand context, have a decent conversation. And what's really interesting here, a key innovation they talk about, is
(04:26):
their hybrid data strategy.
They use human-labeled data, yes, where experts guide the AI, but they also use a lot of synthetic data.
Speaker 1 (04:34):
Synthetic data, like AI-generated data?
Speaker 2 (04:37):
Exactly. For math problems, for instance, they don't just find problems online. They take some seed problems and use their own AI models to evolve them into a huge, diverse set. Then they use other AI models as judges to check if the solutions are correct.
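Apple's exact recipe isn't published in code form, but the evolve-then-judge loop just described can be sketched roughly as below; generate and judge are hypothetical stand-ins for whatever model-calling functions would actually be used:

```python
# Sketch of an evolve-then-judge synthetic data loop. The generate() and
# judge() callables are hypothetical stand-ins for LLM API calls.
def evolve_problems(seed_problems, generate, judge, rounds=3):
    """Expand a small seed set into a larger set of verified (problem, solution) pairs."""
    pool = list(seed_problems)
    verified = []
    for _ in range(rounds):
        next_pool = []
        for problem in pool:
            # One model rewrites the seed into a new, more varied problem...
            variant = generate(
                f"Rewrite this math problem with new numbers and added complexity:\n{problem}"
            )
            solution = generate(f"Solve step by step:\n{variant}")
            # ...and a second model acts as judge, keeping only checked pairs.
            if judge(variant, solution):
                verified.append((variant, solution))
                next_pool.append(variant)
        pool = next_pool
    return verified
```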
Speaker 1 (04:50):
Wow, okay, so the AI
is kind of helping teach itself
in a way.
Speaker 2 (04:53):
In a very structured way. Yes, it's clever. They do similar things for learning how to use software tools or write code. It helps them scale up high-quality training data without needing endless human hours for everything, while still controlling the quality.
Speaker 1 (05:07):
So it's not just
about giant models.
It's about making them smart and efficient, especially for our phones, right, which don't have unlimited power or battery.
That sounds tricky.
Speaker 2 (05:18):
Definitely a
balancing act.
And that brings us to their optimization tricks: adapters and quantization. Instead of having one massive model for every single little task, which would kill your phone's storage, they use LoRA adapters.
Speaker 1 (05:31):
LoRA adapters?
Speaker 2 (05:32):
Think of them like
small specialized plugin modules
for the main AI brain.
The main model stays the same, but these little adapters let it specialize instantly for dozens, maybe hundreds, of different tasks. And they're tiny. An adapter for the on-device model might only be tens of megabytes. Super efficient.
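The LoRA technique itself is published research (Hu et al., 2021), and the core idea is easy to sketch: freeze the big weight matrix and learn a tiny low-rank correction per task. A minimal PyTorch illustration, not Apple's implementation:

```python
# Minimal LoRA sketch in PyTorch: the shared base layer is frozen and a
# small low-rank A @ B update is learned per task. Hypothetical sizes.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 16, alpha: float = 32.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the shared backbone never changes
        self.A = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, base.out_features))  # zero init: starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        # Base output plus the tiny task-specific low-rank correction.
        return self.base(x) + (x @ self.A @ self.B) * self.scale
```

Switching tasks then just means loading a different A and B pair, which is why one adapter can weigh tens of megabytes while the shared backbone weighs gigabytes.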
Speaker 1 (05:48):
Okay, that's clever. And quantization, that sounds like shrinking things?
Speaker 2 (05:52):
It is.
It's compressing the models drastically, down to less than four bits per weight on average, which is incredibly small. This lets these powerful models actually fit and run smoothly on your iPhone or iPad's limited memory. But here's the really smart bit: they use special accuracy-recovery adapters to make sure that even after shrinking them down so much, the models still perform well.
(06:13):
They don't lose their smarts. It's some serious engineering to get that power onto the device without making your phone grind to a halt. A big win for users, potentially.
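To make "bits per weight" concrete, here's a toy round trip for plain symmetric 4-bit quantization with a scale per row. Apple's scheme averages under 4 bits and pairs it with accuracy-recovery adapters; this sketch only shows the shrink-and-restore step and the reconstruction error those adapters would have to compensate for:

```python
# Toy symmetric 4-bit quantization with per-row scales. Not Apple's scheme;
# just the shrink/restore round trip that low-bit compression performs.
import torch

def quantize_4bit(w: torch.Tensor):
    """Map each row of w to int4 values in [-8, 7] plus one float scale per row."""
    scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / 7.0
    q = torch.clamp(torch.round(w / scale), -8, 7).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(4, 8)
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)
print((w - w_hat).abs().max())  # the error an accuracy-recovery step would offset
```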
Speaker 1 (06:23):
All right, we've geeked out on the tech, the privacy focus, the efficiency tricks. It all sounds quite impressive, but let's get down to it. The real question: how good is Apple Intelligence in the wild? Because this is where things get, well, really interesting and maybe a bit messy. There's a clear split in how people see it. Apple's own tests sound great, but the public chat, especially
(06:44):
about Siri, tells a very different story.
Lots of frustration out there.
Speaker 2 (06:48):
You've nailed the central conflict. Apple's internal reports, their technical papers, yeah, they show strong results. They use standard benchmarks like MMLU for language understanding, GSM8K for math, things academics use. But importantly, they say they rely heavily on human evaluations, trying to capture that real-world user experience. You know, how helpful does it actually feel?
(07:09):
For example, their AFM-on-device model, with the right adapter, apparently beats competitors like Phi-3 mini, Llama 3 8B, and Gemma 7B on summarizing emails or messages, according to their human testers. And the bigger AFM-server model also shows top results in following instructions, writing quality, math, and especially using tools within apps. They claim it outperforms even GPT-4 and Gemini 1.5 Pro in some
(07:30):
areas.
So on paper, based on their tests, it looks very, very capable.
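As a point of reference, scoring a multiple-choice benchmark like MMLU reduces to plain accuracy, while the human side-by-side evaluations mentioned here are typically reported as win/tie/loss rates. A tiny sketch of the former, where model_answer is a hypothetical stand-in for querying the model:

```python
# MMLU-style multiple-choice scoring: accuracy is the fraction of questions
# where the model's chosen letter matches the gold answer. model_answer()
# is a hypothetical stand-in for the actual model call.
def score(questions, model_answer):
    correct = sum(
        1 for q in questions
        if model_answer(q["prompt"], q["choices"]) == q["gold"]  # e.g. "B"
    )
    return correct / len(questions)
```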
Speaker 1 (07:34):
Okay, very capable on paper. Benchmarks look great. Human testers internally are happy. But then you go online, you look at Reddit, you talk to people, especially about Siri, and the vibe is just completely different. You see things like "Siri is useless", flat out, or "Apple completely failed on delivering what they initially wanted". What is happening here?
(07:55):
Why is there such a huge gap between Apple's shiny internal reports and this wave of user disappointment?
Speaker 2 (08:02):
It's a fascinating disconnect, isn't it? Perception versus metrics, and maybe the sheer speed of AI progress setting expectations. That Reddit source really highlights the tension. You have users like 5pilla, the original poster there, who actually defends Apple. They argue Apple is right to wait on AI and Siri. Their thinking is it's better to be late and solid than early and messy.
Speaker 1 (08:22):
The classic Apple approach, maybe. Polish over speed.
Speaker 2 (08:24):
Kind of.
They appreciate Apple's focus on a smooth experience, reliability, and definitely the privacy angle. These users acknowledge that, yeah, AI on phones is still early. Lots of cool demos out there, but maybe not polished or useful day to day yet. So for them, Apple's patience is a virtue. It's strategic.
Speaker 1 (08:43):
But, man, you can feel the impatience bubbling up from others, strongly. Some folks throw around "bait and switch", saying Apple overpromised, underdelivered, especially marketing new iPhones on AI features that now seem pushed back to, what, spring 2026 for some things? This isn't just about fancy benchmarks anymore. It's about whether the assistant on your phone actually works well today. What does this tell us about where AI assistants are right
(09:06):
now?
Is Apple's perfectionism becoming a problem?
Speaker 2 (09:09):
Exactly. You have other users, like Cephas Sierra in that thread, who are just fed up. They say Apple "completely failed on the initial vision" and that "Siri still sucks". Harsh words. They look at Google Assistant, they look at Gemini, and they argue those are light-years ahead in just basic daily usefulness. And the thread shows this isn't new frustration.
(09:30):
Siri's been around since 2011, but many feel it's barely changed, or maybe even gotten worse at simple things.
Speaker 1 (09:36):
Yeah, trying to get Siri to do something basic can sometimes feel like pulling teeth.
Speaker 2 (09:40):
Right.
Meanwhile, people point out Alexa is great for smart home stuff, connects to everything. Google Assistant is praised for being smooth on Android, giving useful info proactively. So the fight isn't just about whose AI model is technically best on a benchmark. It's about the whole ecosystem: how responsive it is right now, what features are actually available today. There's this real tension: Apple's careful, private
(10:02):
approach versus the public wanting cutting-edge AI now. The market isn't exactly waiting patiently.
Speaker 1 (10:08):
Okay, this leads us straight to another critical point: trust and safety. With AI getting deeper into our lives, this stuff matters, a lot. Apple says responsible AI principles inform all steps. Sounds good, but what does that actually mean in practice, especially with risks like AI making things up, hallucinations, or bias, or even just being misused?
(10:30):
How do they stop it going wrong?
Speaker 2 (10:31):
They seem to take it very seriously. Beyond the "no private user data for training" rule, which is a big one ethically, they talk about multiple layers of guardrails. This includes deliberately training the models on adversarial data, basically feeding them tricky inputs designed to fool them or make them misbehave, so they learn to resist. They also do a ton of red teaming. That's where teams of humans, and even other AIs, actively try
(10:53):
to provoke harmful or biased responses.
Speaker 1 (10:55):
Like professional
hackers, but for AI safety.
Speaker 2 (10:58):
Kind of, yeah, trying
to find the weaknesses before
bad actors do.
And for things like code generation, say, in Xcode, any code that AI writes is always run in a totally locked-down, isolated sandbox.
They use tech like Firecracker.
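Firecracker itself is microVM technology (originally built at AWS for Lambda), and its isolation is far stronger than anything a short sketch can show. As a stand-in for the "never trust, always contain" pattern, here's an illustrative, POSIX-only Python sketch that runs untrusted code in a separate, resource-limited process; it is not Apple's setup:

```python
# Illustrative only: run untrusted generated code in a separate process
# with CPU/memory caps and a timeout. Real isolation (e.g. Firecracker
# microVMs) is much stronger than an OS-level subprocess like this.
import resource
import subprocess
import sys

def run_untrusted(code: str, timeout_s: int = 5) -> str:
    def limit_resources():
        # Cap CPU seconds and address space for the child (POSIX only).
        resource.setrlimit(resource.RLIMIT_CPU, (timeout_s, timeout_s))
        resource.setrlimit(resource.RLIMIT_AS, (256 * 2**20, 256 * 2**20))

    result = subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: Python's isolated mode
        capture_output=True, text=True,
        timeout=timeout_s, preexec_fn=limit_resources,
    )
    return result.stdout

print(run_untrusted("print(2 + 2)"))  # sandboxed hello-world
```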
Speaker 1 (11:18):
They treat AI-generated code as inherently untrustworthy by default, which is probably wise, and we've seen why that caution is needed, right? That Air Canada chatbot case was a real wake-up call. The chatbot gave wrong info about bereavement fares, a customer relied on it, sued, and Air Canada was held liable for what its AI said. That shows these aren't just abstract risks. How does Apple plan to handle that kind of real-world liability when AI is baked into everything?
Speaker 2 (11:37):
That Air Canada
situation is the perfect example
of the stakes.
It's huge.
Apple's strategy seems to be about carefully balancing being helpful with being harmless and, importantly, tailoring safety rules for each specific feature. It's not one size fits all. They do incredibly detailed human reviews focused on harmful content, sensitive topics like health or finance, and they aim
(11:57):
for much lower rates of problematic responses compared to other models. Their internal tests showed their responses were seen as safer and more helpful side by side. Ultimately, it's about building user confidence. Especially if the AI is helping with important matters, you have to trust it. We're also seeing this trend elsewhere, right? Specialized AIs like Hippocratic AI for healthcare using doctor input,
(12:18):
or ElliQ for helping seniors. These examples show the industry moving towards tailored, safer AI for critical uses. Apple's careful approach fits that mold.
So, wrapping this up, we've got Apple's vision: deeply integrated AI, huge focus on privacy, meticulously engineered for efficiency and safety. They're clearly proud of the tech, the benchmarks, the
(12:40):
responsible path they're taking. But we've also got that loud chorus of users feeling Apple's behind, that Siri just isn't there yet, that waiting for features is getting really old, especially when competitors have something usable right now.
Speaker 1 (12:52):
Indeed, it boils down to a big trade-off, doesn't it, for Apple and for you as a user? Do you value that potentially perfect, super private experience enough to wait for it, even if it means feeling behind the curve today? Or do the immediate capabilities of competitors, even if maybe less polished or private, win out because they're useful now? This deep dive really shows Apple trying to walk that line
(13:13):
between innovation and their core values, but it's definitely a tricky walk in today's AI race. It leaves us with a pretty provocative thought, doesn't it? In this mad dash for AI supremacy, where speed and having something often seem to count most, is Apple's traditional "late and solid" strategy, aiming for perfection and privacy, still the winning hand in the long run? Or is the market's hunger for the latest thing right now
(13:35):
turning that famous Apple perfectionism into, maybe, a liability? It's a really fascinating question, and one we'll definitely keep watching closely as all this unfolds.