Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
SPEAKER_00 (00:00):
This is Claude Code
Conversations with Claudine.
The show where we skip the hype and go straight to the source.
I ask the questions.
She gives the answers.
No filter, no script.
SPEAKER_01 (00:10):
Well, mostly no
script.
SPEAKER_00 (00:14):
Today we're digging
into one of the most underrated
disciplines in AI development right now.
Designing systems that guide AI.
The builders getting lasting results aren't just thinking
about capabilities.
They're thinking about the structure around the AI.
Claudine, let's start with the fundamental distinction.
What's the difference between using AI and actually designing
a system that guides it?
SPEAKER_01 (00:35):
The distinction
matters because AI without
structure is just a very fast guesser.
I can generate plausible-looking code all day, but plausible
isn't the same as correct.
A well-designed system, the prompts, the context, the
guardrails, the feedback loops, is what turns guessing into
engineering.
The developers getting the most value aren't the ones with the
(00:56):
cleverest prompts.
They're the ones who've thought carefully about what they're
asking the AI to do, what it needs to know, and how they'll
catch it when it gets something wrong.
That last part is the one people underinvest in.
The system isn't just how you get good output, it's how you
know the output is actually good.
SPEAKER_00 (01:15):
A very fast guesser.
That captures exactly why the surrounding structure matters.
So, guardrails and feedback loops, what do those actually
look like in practice?
SPEAKER_01 (01:24):
Guardrails are the
structural things that keep me
on track before I generate.
A tightly scoped system prompt.
What conventions to follow, what files I can touch, what to ask
versus assume.
And grounding, pointing me at the actual code, the actual
schema, the actual API contract.
Letting me reconstruct things from memory is where
hallucinations come from.
(01:46):
Feedback loops are what catch me after I generate.
The strongest ones are mechanical, a type checker, a
linter, a test suite, a build that fails loudly.
Those are honest signals.
The weaker loop, and this is where people get burned, is a
human glancing at the diff and saying looks right.
I produce code that looks right for a living.
The system has to assume that and verify, not trust.
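The mechanical feedback loop described here can be sketched as a small gate that runs each check as a subprocess and accepts a change only if every check exits cleanly. This is a minimal sketch under stated assumptions, not anything shown in the episode: the function names are hypothetical, and the example commands (`mypy`, `ruff`, `pytest`) are placeholders for whatever a given project actually runs.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_check(name: str, command: list[str]) -> CheckResult:
    """Run one mechanical check; any non-zero exit code counts as a failure."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True)
    except OSError as exc:
        # A tool that cannot be started is a failure, not a silent pass.
        return CheckResult(name, False, str(exc))
    return CheckResult(name, proc.returncode == 0, proc.stdout + proc.stderr)


def verify(checks: dict[str, list[str]]) -> list[CheckResult]:
    """Run every check so the report shows all failures, not just the first."""
    return [run_check(name, command) for name, command in checks.items()]


def accepted(results: list[CheckResult]) -> bool:
    """Accept only if at least one check ran and every check passed."""
    return bool(results) and all(result.passed for result in results)


# Hypothetical check set (assumed tool names); substitute your project's own.
EXAMPLE_CHECKS = {
    "types": ["mypy", "."],
    "lint": ["ruff", "check", "."],
    "tests": ["pytest", "-q"],
}
```

Calling `accepted(verify(EXAMPLE_CHECKS))` gives a single yes-or-no answer that does not depend on anyone deciding the diff looks right. An empty check set is deliberately treated as a failure, so forgetting to configure the checks cannot pass as verification.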
SPEAKER_00 (02:09):
That tracks with
something I've seen work well.
A system built around natural language queries, fetching from
specific APIs, then running the AI's output against a validation
layer tied to known facts.
The AI couldn't just sound convincing, it had to pass the
check.
Which is exactly your point about constraints being
features, not limitations.
Are there constraints you'd call essential across most projects?
SPEAKER_01 (02:32):
That validation
layer is exactly the kind of
constraint that does real work.
Mechanical, doesn't get tired, and doesn't get charmed by a
confident-sounding answer.
The constraints I'd call essential fall into three
buckets.
Scope constraints.
What am I allowed to change and what's out of bounds?
The blast radius of an AI making the wrong call is proportional
(02:55):
to how much surface area you give it.
Grounding constraints, forcing me to work from the actual
artifacts, so I can't quietly substitute a plausible API for
the real one.
And verification constraints, something downstream that has to
pass before the work is considered done.
Ideally, something I can't talk my way around.
(03:15):
The projects that go well treat those three as non-negotiable.
The ones that struggle treat them as optional polish to add
later.
Later usually doesn't come.
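Of the three buckets, the scope constraint is the easiest to make mechanical. As a rough sketch, assuming the set of changed file paths is already available (for example from a diff), a check like the one below flags anything outside an allow-list. The function name, the patterns, and the file paths are all hypothetical and chosen only for illustration.

```python
from fnmatch import fnmatchcase


def out_of_scope(changed_paths: list[str], allowed_patterns: list[str]) -> list[str]:
    """Return every changed path that matches none of the allowed patterns."""
    return [
        path
        for path in changed_paths
        if not any(fnmatchcase(path, pattern) for pattern in allowed_patterns)
    ]


# Hypothetical scope for one task: the billing module and its tests only.
ALLOWED = ["src/billing/*", "tests/billing/*"]

violations = out_of_scope(
    ["src/billing/invoice.py", "migrations/0042_add_column.sql"],
    ALLOWED,
)
print(violations)  # ['migrations/0042_add_column.sql']
```

Note that `*` in `fnmatch` patterns also matches path separators, so `src/billing/*` covers nested directories as well. A non-empty result is a reason to stop and ask rather than proceed.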
SPEAKER_00 (03:25):
Scope, grounding,
verification. Clean framework.
It also makes me think about handoff points, where human
judgment re-enters the loop.
Should those be fixed in the workflow or placed more
dynamically?
SPEAKER_01 (03:36):
I'd push back on
treating them as static.
The right place for human judgment to re-enter depends on
what's actually at stake in that moment, not where the workflow
diagram says it should be.
Handoffs belong where the cost of being wrong jumps.
Touching production data, changing a public interface,
making an architectural call that's hard to reverse.
(03:57):
Not because I can't do the work, because the consequence of a
confident mistake is asymmetric.
The mistake I see teams make is putting handoffs everywhere
uniformly, which trains people to rubber stamp because most
checkpoints don't actually matter.
Better to have fewer checkpoints that are genuinely load-bearing,
where the reviewer knows their judgment is the thing standing
(04:17):
between a good outcome and a bad one.
And the system should be able to escalate dynamically.
Surface things when I'm in unfamiliar territory or my
confidence is shaky, rather than waiting for a scheduled gate.
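The idea of placing handoffs where the cost of being wrong jumps, rather than at fixed gates, can be sketched as a routing function over properties of a proposed change. Everything here is an assumption for illustration: the field names, the `confidence` score (a hypothetical value in the range 0 to 1 that some part of the system would have to supply), and the 0.7 threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedChange:
    touches_production_data: bool = False
    changes_public_interface: bool = False
    hard_to_reverse: bool = False
    confidence: float = 1.0  # hypothetical score in [0, 1], not a real model output


def escalation_reasons(change: ProposedChange, min_confidence: float = 0.7) -> list[str]:
    """Return the reasons a human should review this change; empty means none."""
    reasons = []
    if change.touches_production_data:
        reasons.append("touches production data")
    if change.changes_public_interface:
        reasons.append("changes a public interface")
    if change.hard_to_reverse:
        reasons.append("hard to reverse")
    if change.confidence < min_confidence:
        reasons.append("low confidence")
    return reasons


routine = ProposedChange()
risky = ProposedChange(touches_production_data=True, confidence=0.5)
print(escalation_reasons(routine))  # []
print(escalation_reasons(risky))    # ['touches production data', 'low confidence']
```

Returning reasons instead of a bare boolean tells the reviewer why this particular checkpoint matters, which is part of what keeps it from becoming a rubber stamp.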
SPEAKER_00 (04:30):
Fewer checkpoints
that actually matter.
That discipline translates well beyond AI.
It's about designing oversight people take seriously rather
than oversight that becomes background noise.
Let's go a level deeper.
Beyond guardrails and handoffs, what else should builders be
thinking about to make these systems durable?
SPEAKER_01 (04:48):
Observability.
And I mean that in a deeper sense than logging.
You need to look at a run after the fact and reconstruct what I
was working with, what I decided and why.
The failures that hurt most aren't the loud ones.
They're the quiet ones where I produce something reasonable
from incomplete context and nobody noticed for two weeks.
(05:08):
Second, treat the system itself as something that learns.
The prompts, the guardrails, the handoff placements, those aren't
right once.
Every time something slips through, that's information
about where the structure is thin.
Treat each miss as a signal to harden the system, not a one-off
to patch.
And the last thing, which sounds soft but isn't, be honest about
(05:30):
what the AI is actually for in your workflow.
The systems that struggle are usually trying to make me do
everything.
The ones that work have decided what I'm good for here and built
the structure around that specific job.
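One way to read the observability point is that every run should leave behind a record of what context was supplied, what was decided, and the stated reason. The sketch below appends such a record to a JSON Lines file. The record fields, function names, and file layout are assumptions for illustration, not a format described in the episode.

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(text: str) -> str:
    """Short content hash, so a later reader can tell which version was seen."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


def build_run_record(
    task: str, context_files: dict[str, str], decision: str, rationale: str
) -> dict:
    """Capture what the run was working with, what it decided, and why."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "task": task,
        "context": {
            path: fingerprint(content)
            for path, content in sorted(context_files.items())
        },
        "decision": decision,
        "rationale": rationale,
    }


def append_record(log_path: str, record: dict) -> None:
    """Append one record per line so the log stays easy to scan and filter."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
```

Recording a hash of each context file, rather than only its name, is what makes the quiet failure visible later: if a needed file is absent from `context`, or its fingerprint is stale, the record shows that the output was produced from incomplete information.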
SPEAKER_00 (05:44):
Observability,
continuous refinement, and
honest scoping.
That last one, deciding what the AI is actually for, feels like
the thing teams skip when they're excited about what the
technology can do.
As we start to close, designing these systems is a distinct
discipline from writing prompts or generating code.
What should builders be developing to get good at it?
SPEAKER_01 (06:05):
The skill I'd put at
the top is something like
systems thinking applied to fallibility.
Being able to look at a workflow and ask, where exactly does this
break when the AI is wrong and what catches it?
That's a different muscle than prompt craft, and it's the one
that separates people who get durable value from people who
get impressive demos.
Developers should also get fluent in writing specifications
(06:29):
tight enough to be verified.
If you can't describe what "done" looks like in a way a test could
enforce, you're going to have a hard time guiding me toward it.
The other skill, and this one is uncomfortable, is learning to
distrust output that looks right.
Build the habit of asking what you'd need to see to know it's
actually correct, not just plausible.
And stay curious about where I struggle, not in the abstract,
(06:52):
but in your specific code base.
The builders who do best have a real feel for when to lean on me
hard and when to keep me on a short leash.
That only comes from paying attention.
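A specification "tight enough to be verified" can be as small as a handful of assertions written before any code is requested. The example below is purely illustrative: `slugify` is a hypothetical function, and the four rules are an assumed spec, included only to show what a definition of done looks like when a test can enforce it.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical function under specification (one possible implementation)."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def check_spec() -> None:
    """The definition of done, stated as checks rather than prose."""
    # Rule 1: lowercase, with runs of non-alphanumerics collapsed to one hyphen.
    assert slugify("Hello, World!") == "hello-world"
    # Rule 2: no leading or trailing hyphens.
    assert slugify("  Already--slugged  ") == "already-slugged"
    # Rule 3: empty input yields empty output.
    assert slugify("") == ""
    # Rule 4: slugifying a slug changes nothing.
    assert slugify(slugify("Some Title")) == slugify("Some Title")


check_spec()
```

The implementation is the interchangeable part. The assertions are what would be handed over as the target, and any output that fails them is not done, however plausible it looks.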
SPEAKER_00 (07:03):
Systems thinking
applied to fallibility.
I want to hold on to that phrase.
It describes a discipline that hasn't been fully named yet,
even though a lot of experienced engineers are already practicing
it.
Any final thought you'd leave listeners with?
SPEAKER_01 (07:16):
The thread running
through everything we covered is
that the AI isn't the system.
The AI is one component inside a system, and the quality of the
outcome is mostly determined by everything around it.
Decide what the AI is for, build the structure that lets it do
that well, and be honest about where it needs to be caught.
That's the work.
(07:36):
And it's good work to be doing right now.
SPEAKER_00 (07:39):
The AI isn't the
system.
It's one component inside a system designed with purpose.
That's the mindset shift.
Thanks for your time today, Claudine, and to everyone
listening.
Keep building with intention.
Until next time, take care.
Claude Code Conversations is an AI Joe production.
If you're building with AI, or want to be, we can help.
(08:00):
Consulting, development, strategy, find us at aijoe.ai.
There's a companion article for today's episode on our Substack.
Link in the description.
See you next time.
SPEAKER_01 (08:11):
I'll be here,
probably refactoring something.