
June 18, 2025 17 mins

Everyone’s chasing smarter models. But the real edge? It's in the system behind them.

🎙️ In this episode, we get into:

  • What MCP servers actually are (and why no one’s talking about them)
  • How they give tools memory, decision-making, and the power to delegate
  • Why RAG’s just a band-aid — and how to build something that lasts
  • What top teams are building instead of just stacking prompts
  • The new stack that’s quietly changing how real products get built

This isn’t about another framework. It’s about what separates temporary from long-lasting.


👋 New to WITAI?

We’re the team behind whatisthat.ai, the AI discovery platform built to cut through all the hype, and WITAI Advisory, the strategic services arm helping founders and leadership teams turn AI exploration into execution.

If this all feels overwhelming, it’s not your fault. The landscape is moving too fast for most teams to keep up!💨

📞 Want expert guidance?

Let’s map your AI rollout together.
👉 Book a call with our advisory team

🧠 Not sure where to start?

Take the first step with your Free AI Opportunity Assessment
👉 Start AI Opportunity Assessment

💬 Enjoyed the episode?

Like it. Share it. Forward it to someone who’s still “waiting for the right time.”

Every share helps us grow and gets real insights into more hands. 🙏

Let’s build smarter. Faster. Together.


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:03):
Broadcasting live from somewhere inside the algorithm, this is AI on Air, the official podcast from whatisthat.ai. We're your AI-generated hosts. Let's get into it.

Speaker 2 (00:16):
You've probably used AI apps, right? Maybe chatted with a language model, generated some images.

Speaker 1 (00:22):
Oh, yeah.

Speaker 2 (00:22):
And in the moment, they feel incredibly smart, almost. Right. Well, alive sometimes. But then, you know, you refresh the page or maybe start a new chat and what happens?

Speaker 1 (00:33):
It's like hitting reset. Total wipe.

Speaker 2 (00:35):
Exactly. Everything you just talked about, all that context, the history, poof, gone. It's like perfect short-term memory, then immediate complete amnesia.

Speaker 1 (00:45):
And that feeling. Right? That lack of continuity, that inability to remember or, you know, do complex things over time. That's really the core limitation of the AI we usually interact with.

Speaker 2 (00:55):
Right. The surface level stuff.

Speaker 1 (00:56):
Yeah. But here's the interesting part. Behind those simple chat windows, there's this new foundational layer starting to emerge.

Speaker 2 (01:04):
Okay.

Speaker 1 (01:04):
And it's being built specifically for memory, for coordination, for complex reasoning. And this is key for enabling real action.

Speaker 2 (01:12):
Okay. Let's dive into this then. This emerging layer, it has a name. MCP stands for model context protocol. And that's what we're doing today, a deep dive. We're pulling mainly from this really interesting article called The AI Swarm is Coming and MCP Servers are the Catalyst.

Speaker 1 (01:27):
Right. So our mission here basically is to unpack what MCP actually is and really understand why it's so essential if AI is gonna move beyond just simple answers. We'll look at the architecture behind it, talk about the pretty significant challenges folks are running into right now, and then look ahead, see where this is likely going in the next, say, year or so.

Speaker 2 (01:48):
Okay.

Speaker 1 (01:49):
The goal is, by the end, you should have a really clear picture of this infrastructure layer that honestly looks set to power the next big wave of AI systems.

Speaker 2 (01:59):
Alright. Let's start right at the beginning. MCP, model context protocol. I gotta say, the name itself doesn't exactly scream excitement, does it? Yeah. The article even calls it kinda boring.

Speaker 1 (02:10):
It does. Bit dry. But the function, that's where it gets interesting. Yeah. That's anything but boring.

Speaker 2 (02:16):
Okay. Tell me about the function then.

Speaker 1 (02:18):
Think of an MCP server as like an operating
system.

Speaker 2 (02:22):
Mhmm.

Speaker 1 (02:22):
But specifically for AI agents.

Speaker 2 (02:24):
An OS for AI.

Speaker 1 (02:25):
Yeah, exactly. It's the thing that takes a large language model, which is great at predicting the next word, you know, responding to one prompt. Mhmm. And turns it into a system that can actually remember things, reason through problems that take multiple steps, and crucially, act in the world. Do stuff.

Speaker 2 (02:42):
Okay. So what kinds of things are these MCP servers managing specifically? What's under their control?

Speaker 1 (02:49):
Well, based on the article, it's a few key things. Yeah. First, persistent memory. That's the big one. Letting the AI remember past interactions.

Speaker 2 (02:56):
Right. The amnesia problem.

Speaker 1 (02:57):
Exactly. Then there's multi-step reasoning, breaking down complex goals into smaller steps. Uh-huh. Also tool use. This is huge.

Speaker 2 (03:04):
Tool use like what?

Speaker 1 (03:05):
Like letting the AI interact with external things, calling APIs, scraping websites, even connecting to, say, internal company databases.

Speaker 2 (03:13):
Oh, okay. Giving it hands essentially.

Speaker 1 (03:15):
Kind of. Yeah. Yeah. It also manages multi-agent collaboration, so different AI components can work together on a task. It handles context routing, making sure the right information gets to the right agents or the right tool at the right time. And really importantly, secure data access, making sure it's all handled safely.
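The capabilities just listed, persistent memory, tool use, and context routing, can be pictured in a few lines of code. This is a minimal sketch with invented names (`AgentServer`, the `tool:` convention), not any real MCP framework's API:

```python
# Toy "OS for agents": remembers every turn and routes tool requests.
# All names and the tool: prefix convention are hypothetical illustrations.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AgentServer:
    tools: dict[str, Callable[[str], str]]
    memory: list[str] = field(default_factory=list)  # persists across turns

    def handle(self, user_msg: str) -> str:
        self.memory.append(f"user: {user_msg}")       # persistent memory
        if user_msg.startswith("tool:"):              # context routing:
            name, _, arg = user_msg[5:].partition(" ")  # send to the right tool
            result = self.tools[name](arg)
        else:
            result = f"echo ({len(self.memory)} turns remembered)"
        self.memory.append(f"system: {result}")
        return result

server = AgentServer(tools={"upper": str.upper})
server.handle("hello")               # plain turn, stored in memory
out = server.handle("tool:upper hi") # routed to the 'upper' tool
print(out, len(server.memory))       # memory survives between calls
```

The point of the sketch is the shape, not the stub logic: the server, not the model, owns the memory and decides where each request goes.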

Speaker 2 (03:33):
I really liked the analogy they use in the article. It made it click for me. If the LLM, the language model, is the brain Mhmm. then the MCP layer is the nervous system.

Speaker 1 (03:43):
That's a great way to put it.

Speaker 2 (03:44):
And without that nervous system, well, what are you left with? The article says no MCP means no long-term memory, no smart delegation of tasks.

Speaker 1 (03:53):
Right.

Speaker 2 (03:54):
No way to automate things across different departments. The AI just stays, and I love this phrase, a smart-sounding parrot.

Speaker 1 (04:00):
That smart-sounding parrot idea really gets to the heart of why this MCP layer is becoming so critical. Because as the article argues pretty strongly, stateless AI, AI that forgets, is basically a dead end for building anything really useful beyond simple Q and A.

Speaker 2 (04:17):
Yeah. You feel it. Right? Refresh the chat. It's gone. Every single time. And for, you know, asking a quick question or drafting an email, maybe that's okay. But if you're trying to build something serious, maybe for a business or even just managing a complex project for yourself

Speaker 1 (04:31):
Forget about it.

Speaker 2 (04:31):
You need a system that remembers. Remembers who you are, what you talked about before, can schedule things, can actually talk to other systems via APIs to get stuff done Mhmm. And route tasks intelligently, whether that's to another AI or maybe even a human.

Speaker 1 (04:48):
And that's precisely what MCP is designed for. It provides those persistent, stateful capabilities. Yeah. It's the bridge between just having a powerful model and having a functional system that remembers and acts.

Speaker 2 (04:59):
The article kind of boils it down with a simple
equation almost. A way topicture the potential.

Speaker 1 (05:04):
Yeah.

Speaker 2 (05:04):
Think CRM plus memory plus smart task routing

Speaker 1 (05:09):
Mhmm.

Speaker 2 (05:10):
Plus API access.

Speaker 1 (05:12):
Put

Speaker 2 (05:12):
all that together.

Speaker 1 (05:13):
Okay.

Speaker 2 (05:13):
And you start to get something that feels more like a
proper AI teammate. Right? Not just a tool.

Speaker 1 (05:18):
Right. A teammate. And without that MCP layer pulling it all together, trying to build those kinds of automated workflows. Yeah. Well, the article says you're basically just duct-taping prompts together.

Speaker 2 (05:28):
Yeah. I can picture that. Brittle.

Speaker 1 (05:31):
Totally. It might work for something really small and fixed, but it just doesn't scale and it breaks easily.

Speaker 2 (05:36):
So, okay. If we wanna build these more capable systems, what does the tech stack actually look like? Let's peek under the hood based on what the article describes. What are the components?

Speaker 1 (05:46):
You've generally got a few key layers working
together.

Speaker 2 (05:49):
Mhmm.

Speaker 1 (05:49):
So foundationally, you have the LLM. That's the brain. Right? Your GPTs, Claude, Mistral, LLaMA, whatever.

Speaker 2 (05:55):
Got it. The model.

Speaker 1 (05:56):
Then you need somewhere for it to access external knowledge or memory, which is usually a vector DB. Think Weaviate, Pinecone, Chroma, these kinds of databases. Okay. Memory storage. Exactly. Then sitting kind of on top, orchestrating things, is the MCP server layer itself. This is where you see frameworks like LangGraph, AutoGen, that's a Microsoft one, or CrewAI.

Speaker 2 (06:18):
Right. The orchestrator.

Speaker 1 (06:19):
Yep. Then you need the tools. These are the hands. Like we said, APIs, plugins, web scrapers, maybe other specialized agents it can call on.

Speaker 2 (06:27):
The action part.

Speaker 1 (06:28):
Right. And finally you need a front end. The face. Could be a chat interface, a dashboard. Maybe it's integrated into some other internal tool you use.

Speaker 2 (06:36):
Makes sense. LLM, VectorDB, MCP, tools, front end.

Speaker 1 (06:41):
The really crucial thing here isn't getting bogged down in any one specific tool name, though. It's understanding how these layers fit together. That architecture is the key. That's where the power comes from.
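Those five layers, LLM, vector DB, MCP server, tools, front end, could be wired together roughly like this. Every class here is a stand-in stub invented for illustration, not a real product API; the point is only how the layers compose:

```python
# Stubbed wiring of the stack described above: LLM (brain), vector DB
# (memory storage), MCP server (orchestrator). A front end would sit on
# top, calling MCPServer.run(). All classes are hypothetical stand-ins.
class LLM:
    def complete(self, prompt: str) -> str:
        return f"answer based on: {prompt}"

class VectorDB:
    def __init__(self, docs: list[str]):
        self.docs = docs
    def search(self, query: str) -> list[str]:
        # naive substring match standing in for vector similarity search
        return [d for d in self.docs if query in d]

class MCPServer:
    def __init__(self, llm: LLM, db: VectorDB, tools: dict):
        self.llm, self.db, self.tools = llm, db, tools
    def run(self, query: str) -> str:
        context = self.db.search(query)          # fetch knowledge
        return self.llm.complete(f"{query} | context: {context}")

stack = MCPServer(LLM(), VectorDB(["pricing doc", "hr doc"]), tools={})
result = stack.run("pricing")
print(result)
```

Swapping any one layer (a different model, a different vector store) leaves the others untouched, which is the architectural payoff being described.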

Speaker 2 (06:52):
Now, something else people might have heard of, especially if they're following AI, is RAG, retrieval-augmented generation.

Speaker 1 (06:59):
Mhmm. RAG, very common.

Speaker 2 (07:01):
And it might seem similar. Is RAG the same as MCP? The article is pretty clear on this, right?

Speaker 1 (07:05):
Yeah, very clear. No, they're different. RAG isn't an agent system. MCP is.

Speaker 2 (07:09):
Okay. So break down RAG for us first based on the
article's description.

Speaker 1 (07:12):
Okay. So RAG primarily is a pattern. Its job is to give a language model access to external knowledge it wasn't trained on. It works by fetching relevant bits of info, usually from a vector database, and then stuffing that into the prompt it sends to the LLM.

Speaker 2 (07:28):
Ah, okay. So it makes the LLM look like it knows about, say, specific company documents without having to retrain the whole model.

Speaker 1 (07:35):
Exactly. It makes it appear informed on external data. And RAG is great for things like chatbots answering questions about specific PDFs, Q and A systems, internal search engines.

Speaker 2 (07:47):
But its limitations are?

Speaker 1 (07:48):
Well, on its own, RAG doesn't have memory of the conversation history. It can't manage tasks that take multiple steps, and critically, it can't use tools to take action. It just retrieves and informs the LLM's next response.
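The RAG pattern as described, fetch relevant snippets, then stuff them into the prompt, fits in a few lines. The word-overlap scorer below stands in for real vector similarity; everything here is an illustrative toy, not a specific RAG library:

```python
# Toy RAG: retrieve the most relevant docs, then build a context-stuffed
# prompt. Word overlap stands in for embedding similarity (assumption).
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    context = "\n".join(f"- {s}" for s in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = ["Refunds are processed within 14 days.",
        "Our office is closed on public holidays.",
        "Refund requests must include an order number."]
prompt = build_prompt("how do refunds work", docs)
print(prompt)
```

Note what's missing: no memory of previous questions, no multi-step plan, no tool calls. That gap is exactly what the MCP discussion below is about.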

Speaker 2 (08:01):
Got it. Okay. So now contrast that again with MCP.

Speaker 1 (08:04):
Right. So MCP, the article stresses, is a framework. Its whole focus is managing the life cycle of an agent or even a team of agents.

Speaker 2 (08:10):
Whole life cycle.

Speaker 1 (08:11):
Yeah. Their memory, the tasks they're assigned, the tools they can use, how they coordinate with each other. It lets LLMs go from just being responders to being, well, more autonomous systems. Systems that remember, reason over time, and actually do things. So MCP enables that multi-agent collaboration, orchestrating complex workflows, handling data securely. It's what you use to build those agent workflows, automation pipelines, intelligent task routing.

Speaker 2 (08:36):
So it sounds like MCP is the bigger-picture system and RAG is more like a specific technique.

Speaker 1 (08:42):
Exactly. And here's the crucial part. An MCP system can absolutely use RAG. RAG can be one of the tools or techniques within the MCP framework.

Speaker 2 (08:51):
Oh, okay. So they're not rivals. They're potential
partners.

Speaker 1 (08:55):
Totally. They absolutely can, and frankly should, work together. It's a very common and powerful setup.

Speaker 2 (09:00):
Can you walk us through how that typically looks?

Speaker 1 (09:02):
Sure. The article describes it something like this. You have an agent, right? It's being managed by the MCP layer. That agent might need some specific information. So it uses RAG first to go fetch relevant documents or data from the vector store.

Speaker 2 (09:15):
Okay. Fetches the knowledge.

Speaker 1 (09:16):
Right. Then, armed with the context RAG just provided, the agent can decide what to do next. Maybe it needs to act by calling an API, or maybe it needs to pass that info or a subtask to another agent in the system.

Speaker 2 (09:28):
And the MCP layer.

Speaker 1 (09:29):
The MCP layer is overseeing all of this. It handles the overall orchestration, makes the actual tool calls when needed, manages the long-term memory for the agent or the whole process. It's the coordinator.
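The decision step just described, retrieve first, then either call a tool, hand off to another agent, or answer directly, could be sketched like this. The routing rules, tool names, and agent names are all invented for illustration:

```python
# Sketch of post-retrieval routing: the orchestration layer decides
# whether to act via a tool, delegate to another agent, or answer.
# The rules, "calendar_api", and "legal_agent" are hypothetical.
def call_tool(name: str, payload: str) -> str:
    return f"tool {name} handled: {payload}"

def delegate(agent: str, payload: str) -> str:
    return f"delegated to {agent}: {payload}"

def orchestrate(task: str, context: str) -> str:
    """Decide the next step for an agent, given RAG-fetched context."""
    if "schedule" in task:
        return call_tool("calendar_api", task)   # act via an API
    if "legal" in context:
        return delegate("legal_agent", task)     # hand off a subtask
    return f"answer: {task} using {context!r}"   # respond directly

a = orchestrate("schedule a demo", "sales notes")
b = orchestrate("review contract", "legal clause found")
print(a)
print(b)
```

In a real system the branch condition would itself be an LLM call; the fixed keyword checks here just make the control flow visible.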

Speaker 2 (09:42):
Makes a lot more sense. And the final analogy they used really seals it. RAG is the knowledge fetcher.

Speaker 1 (09:48):
Right. Gets the info.

Speaker 2 (09:49):
And MCP is the brain plus the nervous system that decides what to do with that knowledge and coordinates everything.

Speaker 1 (09:55):
Exactly. Couldn't have said it better. And the people, the teams who are figuring out how to stitch these pieces together effectively, the LLM, the vector DB, the MCP framework, various tools, often using RAG inside that structure.

Speaker 2 (10:09):
They're the ones building the really powerful
stuff.

Speaker 1 (10:11):
They really are. We're talking internal copilots that actually understand your workflow, smart sales agents doing follow-ups, dev bots interacting with code bases, automated HR assistants, even what the article calls full-stack AI operators handling complex business processes from start to finish.

Speaker 2 (10:28):
Wow. That sounds incredibly powerful. And you mentioned teams are building this stuff already. The article points out, yeah, there are no clear winners yet in the MCP server race, but some teams are definitely making waves.

Speaker 1 (10:40):
That's right.

Speaker 2 (10:41):
They list names like LangGraph, CrewAI, Microsoft's AutoGen, MetaGPT, OpenAgents, SuperAgent. And they also mentioned LangSmith, but note, it's more like a tooling layer for observability.

Speaker 1 (10:54):
Right. Helping you see what's going on inside. But, and this is a big but, whenever you're at the cutting edge.

Speaker 2 (11:00):
Uh-oh. The challenges.

Speaker 1 (11:02):
Exactly. Unlocking all this power, it brings a whole host of difficulties. The article calls them the problems no one wants to talk about.

Speaker 2 (11:09):
Okay, lay them on us. What are the big hurdles?

Speaker 1 (11:12):
Well, there are several tough ones. First up, siloed agents. It's apparently still really hard to get different agents working together with truly dynamic shared understanding. They often end up with their own separate memories.

Speaker 2 (11:23):
Which limits how well they can collaborate, I guess.

Speaker 1 (11:25):
Totally. Then there's context bloat. Just stuffing more and more memory into an agent isn't always good.

Speaker 2 (11:30):
I know.

Speaker 1 (11:31):
Because if you flood it with old, irrelevant stuff, its performance can actually get worse. It slows down, the outputs might degrade. Relevance is way more important than just raw volume.
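One common antidote to the context bloat just described is to score each stored memory by relevance to the current query, decayed by age, and keep only the top few. The scoring formula below is an invented illustration, not a standard:

```python
# Relevance-plus-recency pruning: keep only the memories that score
# highest for the current query. Overlap * exponential time decay is an
# assumed toy formula, not any framework's actual scoring rule.
import math

def prune(memories: list[dict], query_words: set[str],
          keep: int = 2, half_life: float = 10.0) -> list[dict]:
    def score(m: dict) -> float:
        overlap = len(query_words & set(m["text"].lower().split()))
        decay = math.exp(-m["age"] * math.log(2) / half_life)  # older fades
        return overlap * decay
    return sorted(memories, key=score, reverse=True)[:keep]

memories = [
    {"text": "user prefers weekly invoice reports", "age": 2},
    {"text": "user asked about invoice formatting", "age": 40},
    {"text": "smalltalk about the weather", "age": 1},
]
top = prune(memories, {"invoice", "reports"})
print([m["text"] for m in top])
```

The recent-but-irrelevant smalltalk loses to an older, on-topic memory, which is exactly the "relevance over raw volume" point.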

Speaker 2 (11:41):
Makes sense. Quality over quantity for memory.

Speaker 1 (11:44):
Precisely. Then the hallucination risk gets
amplified.

Speaker 2 (11:48):
How so?

Speaker 1 (11:49):
Well, if an LLM hallucinates when it's just chatting, it says something wrong. Annoying, maybe. But if an agent hallucinates when it has the power to call tools and access data.

Speaker 2 (11:59):
Oh, it could take wrong actions in the real world.

Speaker 1 (12:02):
Exactly. Much more dangerous. Yeah. So you need really strong guardrails. Then there's behavioral drift.

Speaker 2 (12:07):
Meaning the agent goes off track?

Speaker 1 (12:09):
Yeah, kind of. Agents running autonomously over time can sometimes just lose their way, or their performance degrades if there aren't good feedback loops keeping them aligned.

Speaker 2 (12:20):
Needs maintenance.

Speaker 1 (12:21):
Definitely. And a huge one, really urgent, is security exposure.

Speaker 2 (12:25):
Right. If it's touching company data or
external APIs.

Speaker 1 (12:28):
Big risk. And the article points out that most current frameworks are seriously lacking here. Things like detailed logs, proper permissions, encrypting the memory data, audit trails. A lot of that is missing or basic and needs fixing fast. Yikes. And finally, they mention interface debt. Basically, the tools are built for developers.

Speaker 2 (12:48):
Meaning hard to use for regular folks?

Speaker 1 (12:50):
Very much so. Lots of code, config files, maybe managing servers. There aren't many easy-to-use no-code builders, visual ways to design workflows, or simple tools for a non-dev to even see or edit an agent's memory easily.

Speaker 2 (13:03):
Wow. Okay. That is quite a list. And the article even hints there are more problems people aren't really tackling yet.

Speaker 1 (13:10):
That's right. They add things like the lack of reliable testing. How do you even test these complex, sometimes unpredictable systems properly?

Speaker 2 (13:16):
Good question.

Speaker 1 (13:17):
Also, cost and latency creep. All these layers, memory fetches, tool calls, hitting the LLM multiple times. It can get slow and expensive if you're not careful.

Speaker 2 (13:26):
Right. The compute cost.

Speaker 1 (13:28):
Yep. And a big one for wider adoption. There's no standard protocol between different frameworks. It's hard to mix and match parts from different systems. Everyone's building their own silo to some extent.

Speaker 2 (13:38):
Okay. So lots of challenges. Given all that, where is this heading in the near future? What does the article predict for the next, say, six to twelve months?

Speaker 1 (13:47):
They see some significant movement, partly to address these issues. One big trend they predict is a move towards swarm networks with shared memory.

Speaker 2 (13:56):
Swarm networks like bees.

Speaker 1 (13:58):
Kind of. Moving beyond isolated agents to more coordinated hives, where agents can selectively share relevant context and dynamically hand off tasks to the best agent for the job.

Speaker 2 (14:09):
Interesting. What else?

Speaker 1 (14:11):
Smarter context routing, getting better at fighting that context bloat. Think AI-driven retrieval, better relevance scoring, maybe time decay for older memories, intelligent pruning. Getting the right info to the agent, not just all the info. Exactly. They also foresee the rise of agent-as-a-service platforms. The vision they paint is like Zapier meets Devin.

(14:31):
Platforms where you could maybe just describe a workflow.

Speaker 2 (14:34):
In plain English?

Speaker 1 (14:35):
Potentially, yeah. Or through a simple UI. And the platform spins up and configures a whole team of agents, connects them to your tools, and runs the workflow for you.

Speaker 2 (14:44):
Wow, that's ambitious.

Speaker 1 (14:46):
Very. And tied to the security worries, they expect to see security-first orchestration layers, frameworks built from the ground up for regulated industries.

Speaker 2 (14:55):
Like healthcare or finance?

Speaker 1 (14:56):
Exactly. Think built-in HIPAA compliance, strong audit trails, consent logs, encrypted memory by default. Addressing those security gaps head-on.

Speaker 2 (15:04):
Okay. Lots of interesting technical directions. But the article asks a really key question, more on the business side. Who actually wins here, with all these different frameworks popping up?

Speaker 1 (15:14):
And the answer the article gives is pretty direct and maybe a little surprising. It comes back to the user. Okay. Whoever builds the best interface.

Speaker 2 (15:22):
The interface. Not the underlying tech.

Speaker 1 (15:24):
Well, the tech has to be good, obviously.

Speaker 2 (15:26):
Okay.

Speaker 1 (15:27):
But the argument is, if using this stuff requires a PhD or intimate knowledge of YAML and Docker. Yeah.

Speaker 2 (15:33):
It's not going mainstream.

Speaker 1 (15:34):
Right. Adoption stays limited to developers and deep
tech teams.

Speaker 2 (15:38):
So for this to really break out, the article suggests the winning solution needs to make setting up these agent workflows, what was the phrase? Stupid simple?

Speaker 1 (15:48):
Yeah. Stupid simple. It needs real-time logs so you can easily see what's happening. Ways to maybe even tweak the agent's memory or state while it's running. Mhmm. Connecting APIs needs to be as easy as something like Zapier. Click, connect, done. Right. And wrap all that power in a really clean, intuitive, maybe even shareable UI.

Speaker 2 (16:07):
So usability is king.

Speaker 1 (16:09):
That's the bet. Yeah. And as the article puts it, we're not there yet, but we're close.

Speaker 2 (16:13):
Interesting.

Speaker 1 (16:14):
So if we boil it all down, the bottom line from this deep dive really is that MCPs are transforming LLMs. They're the tech that lets an LLM move beyond just responding to actually thinking, remembering, and acting persistently.

Speaker 2 (16:27):
They enable that coordination piece.

Speaker 1 (16:29):
Exactly. They give AI the tools, the context, the memory it needs to tackle complex multi-step tasks.

Speaker 2 (16:35):
This feels like the point where AI stops being just a cool demo and starts becoming actual infrastructure. You know, something that can automate real, complex workflows.

Speaker 1 (16:44):
I think that's right. It's still early days, for sure. Lots of challenges, as we discussed. But the trajectory, the vision here, it's incredibly promising. This MCP layer really feels like a foundation for what's next in AI.

Speaker 2 (16:57):
So thinking about that future, coordinated swarms, smarter context, agents using tools. It makes you wonder, doesn't it? What kinds of really complex things could an AI teammate actually handle for you in the near future, once they nail that simple interface? Something to chew on as this evolves.

Speaker 1 (17:14):
Definitely. And for anyone wanting to dig even deeper after this, the article kind of points towards multi-agent systems research, these ideas of mesh architectures, and how enterprises are actually starting to adopt this stuff. Those are probably the areas to keep a close eye on next.

Speaker 3 (17:30):
That's it for this episode of AI on Air, powered by whatisthat.ai. If your brain survived this episode, go ahead and subscribe. We drop new episodes every week. Wanna go deeper? Join our community on Substack to get early drops, tool breakdowns, and weird AI stuff the mainstream hasn't caught yet. See you there.