April 21, 2025 • 27 mins

In this episode, I break down what I believe is the emerging structure of the AI-powered world we're all building—consciously or not. I call it the “Four A’s”: Assistants, APIs, Agents, and Augmented Reality. This framework helps make sense of recent developments and where it’s all headed.

I talk about:

1. Digital Assistants That Understand and Optimize Your Life
Your DA (like “Kai”) will know your goals, preferences, health, schedule, and context—and proactively optimize your day, from filtering messages to planning meals or surfacing relevant information in real time.

2. APIs and the Real Internet of Things
Everything becomes an API—from businesses to people to physical objects. Your assistant interacts with these APIs to act on your behalf, turning the world into a navigable ecosystem of services, tools, and resources.

3. Agents and AR Bringing It All Together
Agents act autonomously to complete multi-step goals, and AR glasses will display their outputs contextually as you move through the world. 

These systems will collaborate, search, and act—quietly transforming how we live, work, and perceive reality.

Subscribe to the newsletter at:
https://danielmiessler.com/subscribe

Join the UL community at:
https://danielmiessler.com/upgrade

Follow on X:
https://x.com/danielmiessler

Follow on LinkedIn:
https://www.linkedin.com/in/danielmiessler

See you in the next one!

Chapters:

00:00 - The AI Ecosystem We’re Building Without Realizing It
01:33 - Assistant: Your Most Powerful Digital Companion
03:08 - APIs: How DAs Interact with the World
07:54 - Agents: The Step Beyond Automation
11:00 - Augmented Reality: The Interface Layer of the AI Ecosystem
14:20 - Combining APIs, Agents, and UI for Real-Time Situational Awareness
17:17 - Summary: A Unified Ecosystem Driven by the Four A’s
23:36 - Industry Trends: How Companies Like OpenAI, Apple, and Meta Fit In
25:11 - Final Thoughts on Timelines, Winners, and Interpreting AI News

Become a Member: https://danielmiessler.com/upgrade

See omnystudio.com/listener for privacy information.


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
S1 (00:00):
Unsupervised Learning is a podcast about trends and ideas in cybersecurity,
national security, AI, technology and society, and how best to
upgrade ourselves to be ready for what's coming. All right,
in this video, I'm going to talk about the AI
ecosystem that I think everyone is actually building and moving

(00:22):
towards without even realizing that they're doing so. And I'm going to break it down into four pieces, which are assistants, APIs, agents, and augmented reality. And I think once you see this model, you're going to realize that the news coming in from OpenAI and Anthropic and all these different companies is all moving in this direction, toward this model.

(00:46):
And I think it's going to be really helpful for
you to just have that mental model of it. So
let's jump into it. So I actually broke this down into a much longer explanation, which you can see above, back in December of 2023. And I wrote kind of a crappy little book about it in 2016. I really just wanted to capture the ideas, and it was decent at doing that. It was called

(01:08):
The Real Internet of Things. You could put a link
to that in there. Don't really need to read the book.
This is actually much better. But, um, this stuff is
actually starting to happen now. And what I want to
do today is take the concepts from the book and
from that video above, and show how they're actually falling
into this model and how it's actually starting to happen today,

(01:29):
which we could see throughout the news from all these
different companies. So the first A here is Assistant, or what I call a digital assistant. A million different companies are actually building this piece in various ways. Some companies are building digital companions, or smart assistants, or personal agents. But to me, this is all

(01:50):
actually kind of the same thing. It is basically the piece of tech that is the most intimate to you, because it knows everything about you: your preferences, your calendar, your contacts, your health information, your finances. It's going to be our best advocate, our best tutor, right? It's a filter against incoming stuff that you don't want to see.

(02:10):
And hopefully that filter, those filters, are actually assigned by you and not by someone else. But, um, yeah, filtering out messages and emails and stuff you don't want. It's basically figuring out exactly what you want and figuring out how to make that happen all the time. Now, if you're in security like me, you're probably thinking, well, that's crazy.

(02:31):
All this stuff on the screen here, like, knows everything about you: history, trauma, preferences. It remembers everything; it has multiple agents. Like, nobody's going to actually put that information into their DA. But I think we already know that's not true. We already know that people are doing this. That's why these digital companion companies are doing so well already. This is

(02:51):
functionality that is just so powerful. Like it's just going
to happen, right? And this is kind of the centerpiece
of this whole model that I'm going to break down.
This is the first one. This is the Assistant. So let's call our DA Kai; ours is going to be called Kai, okay? So the second A is APIs. So your

(03:13):
DA isn't that useful if it can't do stuff for you, right? So the way it'll do stuff is through APIs. Uh, I didn't say in 2016 which of the DA or the APIs was going to come first, because I didn't know, and I still really kind of don't know, but it seems like they're kind of happening at the same time. Basically, your DA, Kai, is over here being

(03:35):
an agent for you, right? It's being your advocate, like we already talked about. It is an agent. It has agents working for it. But ultimately it's like one personality with a collection of agents behind it. It's effectively one entity or one person, which is why we give it a name. You know, we're treating it like a person, like a friend, right?

(03:57):
So ultimately it's encapsulated in one personality, in one sort of entity. So Kai is constantly looking at your state and trying to figure out how to make it better. That's the core concept for the DA. That's what it's doing. Are you hungry? Are you angry? Are you stressed out? Do you have a meeting coming up, right, that it

(04:19):
needs to help you prepare for? And all of this is proactive; you haven't even asked it anything yet. So basically, these are going to do something I talked about a couple of weeks ago, which is managing your current state relative to your ideal state, or your desired state. Right.
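That current-state-versus-desired-state loop can be sketched in a few lines of code. Everything here is hypothetical, just to make the idea concrete; the state fields and action names are invented, not any real assistant's API:

```python
# Hypothetical sketch of a DA's core loop: compare current state to
# desired state and propose actions that close the gap. All names here
# (find_food, schedule_break, prepare_briefing) are illustrative only.

def propose_actions(current: dict, desired: dict) -> list[str]:
    """Return one proposed action per dimension where current falls short."""
    actions = []
    if current.get("hunger", 0) > desired.get("hunger", 0):
        actions.append("find_food")          # e.g. query restaurant APIs
    if current.get("stress", 0) > desired.get("stress", 0):
        actions.append("schedule_break")     # e.g. block calendar time
    if current.get("unprepared_meetings"):
        actions.append("prepare_briefing")   # e.g. summarize attendees
    return actions

print(propose_actions(
    {"hunger": 7, "stress": 2, "unprepared_meetings": ["standup"]},
    {"hunger": 2, "stress": 3},
))  # → ['find_food', 'prepare_briefing']
```

A real version would be driven by a model and live sensor data rather than hand-written rules; the point is just the shape of the loop: observe state, diff against goals, propose actions.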
How could this current situation that I am in that

(04:39):
Kai is monitoring, how could that be better? So if you're hungry, Kai will go find food. If you're worried, it's going to find camera feeds to see around corners, like if you're worried about your security while you're walking around or something. Or it'll listen to scanners and see if there's police activity nearby, or look at crime stats for the neighborhood that you're

(05:00):
in, or wherever. And that's actually what all these APIs are over here that you see, right? These are all the different things that Kai, your DA, will have access to. So APIs are essentially the representations of people and companies and services. Basically, everything becomes an API.
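As a toy illustration of "everything becomes an API": the schema, service names, and capability strings below are all invented for the example (MCP tool manifests would be the closest real-world analog):

```python
# Invented sketch: each person, business, or object publishes a small
# machine-readable descriptor, and the DA filters them by capability.

SERVICES = [
    {"name": "corner-cafe", "type": "business",
     "capabilities": ["order_food", "reserve_table"]},
    {"name": "city-transit", "type": "service",
     "capabilities": ["route_lookup", "arrival_times"]},
    {"name": "daniel-daemon", "type": "person",
     "capabilities": ["availability", "public_profile"]},
]

def find_services(capability: str) -> list[str]:
    """Return names of services advertising a given capability."""
    return [s["name"] for s in SERVICES if capability in s["capabilities"]]

print(find_services("order_food"))  # → ['corner-cafe']
```

The DA never shows you this layer; it queries thousands of these descriptors on your behalf and surfaces only the result.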

(05:23):
And we're already seeing this, as we're going to talk about; we're already seeing this with MCP. This is actually starting to happen. So what I said in 2016 was basically: everything gets an API. Every person, every object, every business; most importantly people and businesses, but also other objects. That's why I called it The Real Internet of Things. Basically,

(05:45):
everything gets an API, and you have your DA navigating those APIs for you, on your behalf, right? So all these APIs you see here, there's going to be millions of them, right? Eventually billions. But you start off with thousands, and then millions or whatever. But every company will be an API. Every product will be

(06:05):
an API. People will be broadcasting APIs of ourselves, which I call daemons; it's just from a Greek word for soul, basically. And think of this like your own personal MCP server. And these are not designed to be used by you or me. We can't read all these APIs; you need help reading all these APIs. You

(06:27):
can't walk into a mall or walk into a city,
or walk down a road or whatever, and read all
the cars and the trees and all the people and
all the businesses. You can't do that. That's why you
need your Da to do that for you. Right. So
all these systems, all these APIs here and the agents
that sort of represent them, they're all designed to be
used by your Da, right? That's their purpose. So it's

(06:51):
like the interface to the world changes. It's no longer about what we see with our eyes, like old Google. Now it's about: what do agents see? What do DAs see, right? That's the world that starts to
matter a lot more. And a big part of this
is going to be a bunch of APIs that are
actually just concatenations or lists or directories of other APIs,

(07:15):
because one of the things that my DA has to do is ask, hey, what's the best restaurant, or whatever. And I've got a few here, right? Best-food lookups or whatever. These are all just, you know, third-party services that do nothing but crawl all the other ones and rate them, so that when Kai reaches out and says, hey, I need to find the best food,

(07:36):
you know, within like three minutes, close to this location, but it can't have chicken in it, or whatever, all those different criteria, it can find the right one, right?
So there'll be a whole bunch of like lookup service
type of things like that. Okay. So that takes us
right into the third one which is agent. So we've
got a few agents here. And the way I like
to define an agent, there's lots of different definitions. I

(07:59):
think the definition of an agent should make it super obvious what it actually means and why it has value. So I say it's an AI system component that autonomously pursues a goal by taking multiple steps that previously would have required a human. I think that is a really good definition. Um, it's autonomous, and it's taking a goal,

(08:20):
and it's pursuing that with multiple steps in a way that only humans could do before. The part that makes an agent different than automation, this is really important, and this is why I have it in the definition: when a human is trying to get something done, say you're an assistant for your boss or whatever, and you're trying to get something done, you call the first place.

(08:43):
They don't answer the phone. You call the next place; the phone number doesn't work. Life is just broken, right? All these different steps are broken. Now, if you have automation, automation is static, right? It's a whole bunch of if-thens.
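That distinction can be sketched in code. The "tools" here are stand-ins for the phone calls, order forms, and backup APIs being described; none of it is a real library:

```python
# Contrast sketch: static if-then automation vs. an agent that keeps
# trying alternative tools until the goal succeeds or its options run
# out. Tools returning None represent broken steps (no answer, dead
# phone number, downed website).

def automation(tool):
    # Static automation: one scripted path. If it breaks, it just stops.
    return tool()

def agent(goal_tools):
    # Agent: exhaust the available tools, then fall back to finding more.
    for tool in goal_tools:
        result = tool()
        if result is not None:
            return result
    return "research_new_tools"  # e.g. go discover another delivery API

broken_phone = lambda: None          # nobody answers
broken_site = lambda: None           # order form is down
backup_api = lambda: "pizza ordered"

print(automation(broken_phone))      # → None (automation just fails)
print(agent([broken_phone, broken_site, backup_api]))  # → pizza ordered
```

The loop is trivial here, but the point is structural: the agent owns the goal, not a fixed script, so broken steps trigger alternatives instead of failure.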
Agents aren't if-then. They are: I have all these tools available; I'm going to keep going; I'm going to keep, you know, exhausting my resources, trying different things to try

(09:07):
to get it done, right? Maybe none of the things work, so I'm going to do more research to find another API that I could use, or another service, to find this person pizza. Sarah wants pizza. I'm going to get Sarah pizza, and the DA is going to do multiple things to make sure that she gets

(09:28):
that pizza. That's the difference between automation and an agent. So that's our definition here. And we see that our DA actually has the use of multiple agents. These agents might be researchers; they might be security bots to lock down your infrastructure; it could be whatever. But they all kind of work for Kai, right? Kai is, like,

(09:50):
the centerpiece here. And this is going to be a theme we're going to see throughout: agents all over the place, including inside of companies, like we have over here with United or whatever. It's the concept of: you're talking to one agent, but behind it, it has a whole bunch of other agents. So you give it the goal, and it breaks that down into sub-goals and gives those to the smaller agents, which are then doing the other

(10:12):
things, like building a marketing campaign, hacking a website, doing whatever it is, right? So that's really the concept of agents. And Google actually just released a thing called Agent2Agent, I think was the name of it. And what it does is it makes it so that all these different agents here can talk to each other with a common protocol, which is very similar to MCP,

(10:35):
where it's a common protocol for creating APIs for an application or, you know, a company or whatever. So we're starting to see the glue, the protocol glue, that's going to make all this stuff possible, with this agent-to-agent protocol and MCP and stuff like that. So the final piece here: we've

(10:57):
got the assistant, we've got the APIs, we've got the agents.
So the final piece or the final A here in
the four A's is AR or augmented reality. And this
is the one you might be thinking is fringe or
it's like ten years away, but it's actually much closer.
Meta and Apple are currently fighting over this now. Tim

(11:17):
Cook just recently said, look, I'm not going to let anyone beat us here. They want to beat Meta at this game. Meta already has really good glasses. Um, they're not actually displaying anything inside of them, but you can see out of them, and they take pictures; they're pretty decent. And obviously they're not big and heavy and super expensive like the Vision Pro. So that is a battle that

(11:39):
is happening right now. So we are all, eventually... I don't know how long this is going to take; it's hard to make specific predictions, right? So, two to five years, who knows. It's going to be something relatively soon. Meta or Apple, or maybe someone comes out of the dark and just kind of crushes this. Who knows? But

(12:00):
the point is, we're all going to have these AR glasses eventually, then contact lenses or something better than that. But it's going to start off with glasses. And here's the trick; this is how the whole ecosystem starts to come together. Our DAs are going to be showing us stuff that is time-based and contextually relevant to whatever we're doing at that moment. Remember, our DA

(12:22):
is trying to optimize everything according to our goals. It's
trying to get to our desired state from our current state.
So if we're walking down a street like this here, and we think it's kind of dangerous, it's going to present this interface here, which I've got over here, coming from this daemon called Bastion, which is really just a company. It's

(12:44):
a company called Bastion. And they have feeds called get feed, pull cameras, pull microphones, query personal mics, get local CCTV, right? So maybe it could pull all the different people who are broadcasting their feeds, because people are going to be wearing cameras as well. This is coming soon, basically:

(13:04):
a camera ahead of you, a camera behind you. And maybe you sell your camera feed to Bastion. People will do this, trust me; it's going to happen. People are going to sell their camera feeds to Bastion, right? It's not going to be for private stuff; it's going to get turned off when you go home, stuff like that. You shouldn't trust that, though. You should also, like, cover the camera

(13:26):
or whatever. But the point is, if you're sitting in Starbucks or whatever, and say a fight or altercation happens or something like that, Bastion will be able to show that to the police, or show that to somebody else who's worried about it. So my DA, while I'm walking down the street, right? I'm walking down the street here, and it's like, oh,

(13:48):
this neighborhood feels unsafe. That's what I'm saying. I'm saying this neighborhood feels unsafe, or it hears me say something in a conversation where I'm just like, I don't know, it's kind of sketchy, I'm a little worried, right? If I say anything like that, or even before I say it, the DA, Kai, goes out and looks at one of these services to find the best security interface, the best

(14:12):
one for parsing feeds, giving, you know, real-time HUD data and stuff like that. So it gets one back; it's called Bastion. So it starts pulling stuff like that, and it gets back the content. It then goes to another interface, which is a whole separate company, which is the UI

(14:32):
for this content. Okay, you see this red one? This is a great example. But let's say there's data here, right? Let's say it's like: oh, how many people are around? Is anyone carrying a weapon? Let's do gait analysis to see if they're leaning because they're carrying a gun, or something like that, right? All this stuff, all these different individual pieces, different companies are better at. Okay, somebody

(14:57):
is better at making this red, cool looking interface. Somebody
is better at doing voice analysis of microphones coming from
all around you. Somebody is better at doing camera analysis of, like,
all the different dangers on the street. All of those
are these right here. This is what every company becomes.

(15:18):
It becomes specialized at doing one thing better than everyone else, all judged by these indexing services, these rating services, which are marketing to your DA. They are marketing to Kai. So when I'm walking down the street and I say, hey, show me what's going on around

(15:39):
me, or something like that. Or I don't even have to say it; it just knows I'm freaking out. Why? Because Kai can see my heart rate. Kai can see that we're in a place I've never been. Somebody is lying on the street with, like, a needle sticking out of their arm. Kai figures out this is kind of seedy, it's a little bit dangerous, I don't like it. And obviously its principal, Daniel, doesn't like it either. Therefore,

(16:04):
boom, it goes and searches, finds Bastion, finds a UI, the best UI. Okay, the best UI is called UI Wizard. Not too creative, but whatever; it's called UI Wizard. UI Wizard pops up. That's this red interface, and it starts filling in data. Where did the data come from? From Bastion; it came from the Bastion service. Where does it go?

(16:27):
This interface is in these glasses, which are on my face. So now watch this. We've got other scenarios here, okay? You start browsing for headphones; your DA does this. It uses these services and it gives you back a response. So I'm looking for headphones; it goes and investigates all these different things. You mention to your friend that you're getting hungry.

(16:50):
It goes and researches all these different best-food places, parses all 713 different places in like a second and a half, and gets back the results. Hey, you haven't had Thai in a while. There's a great little place with super high ratings if you take a right in two blocks. I can call in an order for you if you want. Right?

(17:11):
This is the model. These four A's. This is where this is all heading. This is the direction, right? So I'm telling you: this is what's happening. It is absolutely exciting to see this starting to unfold. Right. There are a million different companies working on

(17:34):
this part. There are multiple companies working on the AR-glasses part. Everything is turning into an API already; this is MCP over here, the unification of it. And then of course, over here, we have what's happening on the corporate side, where agents are basically going to be doing a whole bunch of work. You'll have humans kind

(17:55):
of in charge of things. The leaders, and in the extreme the SMEs, will be human for quite some time, I think. I mean, it's going to be pretty hard to automate everyone away, but a lot of the work that was getting done is going to be getting done by agents and teams of agents. So that's agents inside the corporate world. But as far as the consumer side,

(18:16):
as far as the stuff that you're seeing from, you know, OpenAI and Anthropic, most of the stuff they're talking about is about the consumer. This is it. This is the structure. Okay.
So another example of this is like let's say you're
in like a live conversation and you're having a conversation

(18:39):
with somebody, and it's somebody you've never met before, and you're considering whether to go into business with them or whatever, and they're making a whole bunch of claims. They're like, oh yeah, I used to work with so-and-so, and blah blah blah, and actually I helped him start his business, and, uh, yeah, do you know Sarah? Yeah, Sarah, you know, I went to college with her, and blah, blah, blah. So again, you're wearing glasses. Everyone's wearing glasses. The person

(19:03):
you're talking to is actually wearing these glasses as well, and you're having this conversation. But the whole time you're wondering: is this actually true? All the stuff this person claimed happened, or the people they claim they know, is this all true? Right.
So what will happen is you'll have like something going

(19:25):
off to voice analysis. This depends how many things you're
subscribed to. It depends how far along we are in
this cycle. You know what all your Da can actually do.
But this is this is all sort of being built
right now. So like, if there's tension in their voice, like, uh,
analyzing the claims that they're making, doing research on it,
did they actually go to college? Did did Sarah and

(19:47):
this person actually go to college at the same time? That should be on LinkedIn; let's go find that out. And if you're waiting on a delivery, and this goes back to the AR side, if you're waiting on the delivery, you will see a timer counting down, right? Just like you have on your phone now. That will be in your interface, so you don't have to pick up a phone. The whole point with AR glasses

(20:08):
is to have to do much less with an actual physical device that you pick up and have to interact with. It'll be a lot more visual, here, and you're just talking to your DA, and your DA is doing most of the work for you, using all these different services, right?
So where is the data coming from? Right. How is

(20:29):
it being displayed in the glasses? That's exactly what we talked about earlier. The data is coming from these services, moving through a UI, being displayed in the glasses, right? So in the case of, like, trying to determine if someone is lying, let's say there's a lie-o-meter interface inside of this little UI here, inside

(20:50):
your AR. Well, somebody is providing that lie-o-meter interface, right? One of these companies is actually doing the voice analysis and providing back data. And that's just JSON coming back that is, like, saying the chances of them lying about this particular thing,

(21:10):
according to voice analysis, or according to the research that was done, right? All these things can be combined together. That stuff could actually just be returned raw back to Kai, and Kai could look at all that and send another feed into the UI to update that lie-o-meter, right?
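As a hypothetical example of that raw JSON coming back to Kai, with the field names and scoring blend invented purely for illustration:

```python
# Invented sketch: a voice-analysis daemon returns a JSON payload, and
# the DA blends it into a single score it can push to the lie-o-meter
# UI. None of these fields belong to a real service.
import json

analysis = json.loads("""{
  "claim": "went to college with Sarah",
  "voice_stress": 0.72,
  "record_match": false,
  "confidence": 0.81
}""")

def deception_score(a: dict) -> float:
    """Blend voice stress with whether the claim matched public records."""
    penalty = 0.0 if a["record_match"] else 0.3  # no LinkedIn match -> bump
    return round(min(1.0, a["voice_stress"] * a["confidence"] + penalty), 2)

print(deception_score(analysis))  # → 0.88
```

The interesting part is the plumbing, not the math: one company produces the analysis, another renders the meter, and the DA is the only thing that sees both.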
There are so many options here, because these are APIs;

(21:32):
this is just JSON, or whatever the protocol is going to be, flowing back and forth, right? That's the power
of this entire thing. So let's zoom out again and
let's just take a look at this entire thing. Okay.
Again, just as a review here, look at the four A's. Kai knows what you want at all times. Constantly calling APIs,
making requests, summarizing things, creating reports for you, researching the

(21:55):
best options, anticipating your needs throughout the day and week and month or year or whatever. Adjusting your AR interface, constantly switching. Look, you're in the house: maybe you see your books, maybe you see how hungry everyone is. This interface is constantly changing; Kai is changing it for you using these different APIs, right? And the

(22:15):
interface that Kai is using will be coming from multiple companies as well. Maybe Kai's generic interface actually gets good enough; maybe one of these companies is a generic interface creator, so you don't actually have to use individual ones, right? And Kai just switches to using that one. So when you're looking at books, you're inside of a library; it's a

(22:37):
different HUD. You know, walking down the street, it's a different HUD; talking to someone, a different HUD. Right. So if you take a step back and you look at this interface here, think of all the news that you've heard over the last few years. Think about the latest news from OpenAI: they just added long-term memory, where it's going to remember all your previous conversations. Right. Um,

(22:59):
think about digital companion companies, digital helper apps that will go and do tasks for you. Siri and Gemini on the mobile device. And of course, you've all heard the stuff about Siri, right? Where they're trying to make this thing a better assistant; they're trying to give it access to more and more data about you, right? And they're trying to do it in a secure way, obviously.

(23:21):
And Gemini is competing in that space as well. Samsung has their own version, right? This is all heading in that direction of the DA; all of them are heading in this direction of the unified model, right? That's the easiest way to see this. So now think about all the news around MCP and APIs, and how they'll be able to talk to each other with the agent-to-

(23:43):
agent protocol, and all of those different things. Basically, that's what we're talking about with everything becoming an API. And you already hear all the talk about agents; that's all everyone talks about now. And then you think about the news about Meta and Apple fighting over AR glasses, and that's this piece over here. Right. So everyone is moving

(24:03):
towards these four A's. This is it. This is the ecosystem that we're creating. It's absolutely insane. And it's starting to fill in. And again, now that you've seen it, I think you're going to realize that all the new news is just filling in pieces of this and eventually getting us there. And it's very sort of

(24:23):
cyberpunk-y and very future-oriented. But so many of these pieces already exist. Like, this is just HTTP going back and forth; these protocols are not too difficult. The only difficult part right now is the hardware challenge: the AR stuff is really difficult, and that's probably the thing that's going to take the longest.

(24:44):
But agents are getting better. Like, the AI itself is getting smarter and smarter. Context size is a huge one, and memory is a huge one, and that is dramatically getting better. Like, one now has a million tokens. Um, a couple of the Google ones now have 4 million tokens,

(25:04):
I think, for context. So all of this is starting
to come together. All right. So anyway, that's what I
wanted to share. And I think this is a really
powerful way of just interpreting what's coming in in terms
of the news and putting that into a context and
showing how it fits into a model. I think it's
just useful to be able to parse things in that
way and make sense of it. Keep in mind, we

(25:27):
don't actually know the details of any of these pieces, right?
Is MCP going to win? Who knows? That new agent
to agent protocol that Google came out with? Is that
actually going to be useful? Who knows; it might not be useful at all. Maybe nobody adopts it. And who's going to win with the glasses, right? Is it going to be Apple? Is it going to be Meta? Is it going to be someone completely separate?

(25:49):
I have no idea, right? These things are not super predictable. I believe that my 2016 outlook and predictions from back then are human-based. They're based on the fact that I know what I want my agent to be able to do. And, you know, I've been in tech for so long that I understand all these protocols and stuff,

(26:11):
so I just assumed this is the way it was going, and it turns out to be happening. But you can't predict the companies, you can't predict the timelines, you can't predict any of this. So that's what makes it so exciting. I just hope that this model helps you make sense of the news as it comes in, helps you see how it's moving in this direction toward this model, and just makes it easier for you to parse the news

(26:33):
and make sense of it. So, uh, do me a favor and subscribe, and I'll see you in the next one. Unsupervised Learning is produced on Hindenburg Pro using a Shure SM7B microphone. A video version of the podcast is available on the Unsupervised Learning YouTube channel, and the text version with full links and notes is available in the Daniel Miessler newsletter.

(26:57):
We'll see you next time.