
September 16, 2025 • 49 mins

Join the Tool Use Discord: https://discord.gg/PnEGyXpjaX


Master advanced prompting techniques and meta prompting in this deep dive into AI context engineering. As AI models evolve, the way we interact with them becomes more critical than ever. Prompting is no longer about simple questions; it's a skill that separates mediocre results from magical ones. In Episode 57, we're joined by Michael Tiffany, founder and CEO of Fulcra, and AI consultant Ty Fiero to explore the strategies that unlock an AI's full potential.


Discover why prompts are more important than ever, how to tailor your requests for specific models like GPT-5 and Claude, and why a "high agency" attitude is the key to getting great outputs. We discuss practical methods like iterative prompting, using one AI to critique another's work, and asking the model itself how to create the perfect prompt. Learn how to avoid AI sycophancy, the power of multimodal prompting with images, and how to right-size tasks for complex projects. Whether you're a developer using vibe coding or just want to get better answers from ChatGPT, this conversation will change how you talk to AI.


Guests:

Michael Tiffany, Founder and CEO of Fulcra

https://www.fulcra.ai

https://x.com/kubla


Ty Fiero, AI Consultant

https://x.com/FieroTy


Connect with us


https://x.com/ToolUsePodcast

https://x.com/MikeBirdTech

https://x.com/kubla

https://x.com/FieroTy


00:00:00 - Intro

00:06:02 - The High-Agency Prompting Mindset

00:10:36 - Techniques for Reducing AI Bias

00:19:10 - Meta-Prompting: Having AI Write Your Prompts

00:23:12 - How to Build Prompting Intuition

00:30:00 - AI for Spec-Driven Development


Subscribe for more insights on AI tools, productivity, and prompting.


Tool Use is a weekly conversation with the top AI experts, brought to you by ToolHive.


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
If you give it the right size task, the right constraints,
tell it what you really want, you can get magic out of these
things.
The exact way in which you decide to ask the question deeply
influences the answer. I completely changed everything
I did about my code base and all of my Cursor rules files and all
my AGENTS.md files after seeing how messy of an internal world that

(00:21):
coding agents are getting. Plans are worthless, but
planning is essential. Prompting is the fundamental
unit of the way that you interact with AI.
I know it's a subset of context engineering and there's other
important things to take into account, but the vast majority
of interactions with AI will be through prompts.
And as models have evolved and improved, so has the way we

(00:41):
need to prompt them. You don't have to ask it to
think through step by step, or tell it you'll reward it $200 for a good
job, anymore. So in episode 57 of Tool Use,
brought to you by ToolHive, I brought in two guests to share
their advanced prompting techniques and to discuss
metaprompting. We have Michael Tiffany, the
founder and CEO of Fulcra, and Ty Fiero, an AI consultant, so I
hope you enjoy this episode on advanced prompting techniques.

(01:02):
People are starting to use like bigger prompts, at least from
like some people I talked to at AI Tinkerers.
It seems like they're starting to use like multiple page
prompts, which is kind of crazy to me.
I, I, I feel like I'm more of a prompt purist.
I kind of like things as pure as possible, but it seems like
people like these larger prompts. And so I'm confused by that.
But yeah, I mean, there's all sorts of... It seems like, do you

(01:23):
remember like a year ago when people were like, wow.
I mean, the models are going to get smarter and prompting is
just not going to matter anymore.
And you're just going to be able to say what you want.
You're just going to get what you want.
I think it's been the opposite where I think prompts are more
important than ever now, especially with GPT-5, where
like if you prompt that thing incorrectly, if you say the
wrong sentence or even misspell a word, it just does not give
you the right results. And so I feel like prompting is

(01:45):
more important than it ever has been.
That's my hot take of the day. But I'm really curious to
hear how you guys have been prompting.
I've noticed a sharp divide in the outputs that my friends get
out of AI models where a small minority of, of people get great

(02:08):
outputs, and the majority, even of my smart, technically
capable friends, are on average getting, I
think, bad outputs. And I've, I've been just getting
obsessed with this social divide.

(02:29):
I think it's it's, it's a fascinating kind of emergent
phenomenon. Yeah, I still recommend people
to keep the the tight narrow prompts.
But like I was saying, it does seem to be drifting away to
larger prompts. And one thing that I've been
noticing is a pretty consistent trend is before we were capable
of having really like model agnostic systems where you can
just kind of immediately change from the OpenAI models to

(02:51):
the Anthropic model with the exact same prompt and, you know,
be able to compare the outputs. But now they diverge so much
where you really have to tailor the prompt to the model that
it's getting a lot more difficult to build like a prompt
library where you can just like copy paste it in and just expect
similar results across the models.
Because all of them have their own little like quirks.
And I haven't figured out like which quirk is best with which

(03:12):
model yet. Right.
Well, I I like the idea. I think the idea of prompt
libraries puts you in a frame of mind that I think is still
really helpful, which is thinking of prompts as little
programs that chart a course across latent space and produce

(03:34):
a certain output. And, and I I think that's still
a useful way to approach interaction with AI models.
I'll add that as I play around with different models, an
important strategy of mine is to take advantage of the asymmetry

(03:55):
between generating and discriminating, to take the
output from one model and just copy the whole thing over to
some other model and, and ask for, you know, criticism, which
which continues to be great. I, I think that asymmetry was
first noticed not even with LLMs, but actually with GANs,

(04:18):
right? And it makes sense.
The discriminator-generator asymmetry especially makes sense
in the case of GANs, just to kind of cue your intuition here,
because it's obviously easier to tell good from bad than it is to
generate good, right? So, so that asymmetry emerged

(04:41):
years ago, we still see it in LLMs sometimes.
I, I now see this termed as like the evaluation versus generation
gap, which is a little kinder to LLMs, but you see it every
single time you, you, you basically copy from one and, and

(05:03):
engage another specifically as the critic that this almost
always improves outputs. So like, lean into that, right,
that as long as this asymmetry exists, we as users can take
advantage of it. And that's principally how I
take advantage of it. It's just to see what Grok
thinks about this Gemini output or what have you.
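As a rough illustration of that generate-then-critique handoff, here is a minimal Python sketch assuming the official OpenAI and Anthropic SDKs are installed and API keys are in the environment; the model names and prompts are placeholders, not anything prescribed in the episode:

# Sketch of the generate-then-critique handoff across two vendors.
# Assumptions: openai and anthropic Python SDKs installed, API keys set,
# and the model names below are placeholders.
from openai import OpenAI
from anthropic import Anthropic

openai_client = OpenAI()
anthropic_client = Anthropic()

def generate_draft(task: str) -> str:
    # First model produces the draft.
    resp = openai_client.chat.completions.create(
        model="gpt-4o",  # placeholder generator model
        messages=[{"role": "user", "content": task}],
    )
    return resp.choices[0].message.content

def critique(draft: str) -> str:
    # A different vendor's model plays the critic, leaning on the
    # evaluation-versus-generation gap discussed above.
    prompt = (
        "Here is a draft answer. List its weakest points, ordered from "
        "weakest to least weak, then suggest concrete fixes.\n\n" + draft
    )
    resp = anthropic_client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder critic model
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

if __name__ == "__main__":
    draft = generate_draft("Summarize the main causes of the Bronze Age collapse.")
    print(critique(draft))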

(05:24):
Yeah, I, I incorporate that into my Cursor workflow, where I'll
be having a chat with GPT-5, just being like, create a plan for
this, leave it on Pro. So it's thinking for like 5
minutes. Meanwhile working with Opus 4.1
to do some actual work. And then it'll propose something,
I'm like, what do you think about this alternative?
Pull it in. I still worry about the
sycophancy, where it's like, oh, that's a brilliant idea, we
should go down that route. So I, I'm always a little bit

(05:45):
skeptical when I take an output from model A, give it to model B,
and model B is like, you're right, let's do that instead.
So I wish there was a more reliable way to say like
accurately and unbiasedly gauge your suggestion versus this
other suggestion. And I, I know we're, we're
trending towards that way. I just haven't found a foolproof
way yet. OK, well, I'm going to take
this as a segue to share, like, the Michael J.J. Tiffany Grand

(06:08):
Unified Theory of who is good at AI models and who is bad.
And it comes down to pure attitude.
Specifically, the difference is who you give
agency to when you interact with an AI model.
The anti-pattern that I see most often is to assign agency to the

(06:31):
model, often by personifying it. So you treat ChatGPT like it's
some extremely well-informed person, like it is this, it is
this singular entity that has been trained on the entire
Internet and you want a question answered or you want to give it
a task. So you you give the question and

(06:55):
then you receive an answer from this extremely well informed
thing. And and you see this
conversationally where people will say, well, I asked ChatGPT
about blah, blah, blah, and it said this. That's
assigning agency to the model. It's like the, as you would

(07:16):
assign agency to like a very wise person, right?
I climbed up the mountain. I asked the guru the thing and
this is what he said. And and when you're thinking
like that, then naturally you would think that the guru will
answer the same question the same way over and over.
And AI models are not like that at all.
There's a temperature slider. You just move the slider and you

(07:37):
get a completely different output, right?
This is not a singular well informed person.
And sycophancy is, is like your clue that you have the
agency. The exact way in which you
decide to ask the question deeply influences the answer

(07:59):
right? And so when you assign yourself
the agency in the relationship and, and so you rather think
like, what is the input that gets, you know, that that like
starts the traversal across latent space that will get me
the output that I want, then I think you're on the right
footing. And, and all of the most

(08:20):
masterful users of AI models are doing that.
They're assigning themselves the agency for the, the output and
they're iterating on their prompts in order to get the
output that they want, right? Like it's a high agency attitude
more than a particular technical skill that I think makes the

(08:41):
difference I think. What's also really interesting
too is I've noticed like seeing my non techie friends try to
prompt like especially in a ChatGPT, you just like you hear
them give their bias in the prompt so heavily, and then this
thing is like a sycophant back to them.
It's like, well, duh, it's like this.
I don't know, like my wife will ask something to it where, you

(09:02):
know, I don't know, she's looking for a recipe.
She's just like, I just don't think this recipe is going to
work because I don't think it's going to be very tasty.
But I'm not sure. What do you think?
Of course it's going to be like, well, of course it's not going
to be, you know. And so it's, you're absolutely right.
Right. I mean, so I have to be very
careful when I prompt to like try to keep things as neutral as
possible to some extent, you know, because if I show some

(09:23):
sort of bias in a particular direction, it's like I can make
it happy by going in that direction.
And it's like, that's not what I want.
What I want is to put all the options on the table and then to
like kind of have it process them and like make its own sort
of decisions on it. So I've kind of used this as
like a kind of a weird story, but when me and my, my wife have

(09:45):
like, I don't know, there's something that happens at home
and we need to ask ChatGPT even. And I, I have an answer that I
want out of ChatGPT and she has an answer that she wants out of
ChatGPT. It's all about the way you
prompt it, right? I could prompt it
so it sounds to her like I'm asking a genuine question to
ChatGPT, but with just a little bit of bias, and all of a sudden this

(10:06):
thing's like, you know what, I think we should do that.
And I'm like, it's a random side project, but it's definitely
like you have to watch what you say to these things because any
little bit of directionality towards the thing that you want
to do and it's like, I'm going to find a way to get you that
thing. At least it's what I've seen in
GPT-5 and ChatGPT recently. Yeah, I think that's exactly
right. Dex Horthy recently gave a

(10:27):
talk on like advanced context engineering and his whole thing
was, as soon as you see the output say you're absolutely
right, like, restart, you're going down the wrong path.
It's clearly been prompted in the wrong way in terms of that,
like trying to reduce bias. Do you find anything like asking
to play devil's advocate or trying to intentionally frame it
in a way where it will explore both options gives balance?

(10:48):
Or does that not really address the problem?
Oh, that's such a great question. Specifically, what I am
doing is iterative prompting instead of
trying to build one great prompt that gets a model to think
through the gaps of its own reasoning.
I will set a task, you know, I'll I'll get a first draft

(11:11):
about something that I'm interested in that's fact based,
right. That could be, could be a matter
of history or could be, you know, some matter of science or
math. And then with that output, then
I engage the critic. Like, OK. Oh, and here's an

(11:33):
interesting heuristic that I think is worth sharing.
I, I love to indulge the model's enthusiasm for ranking.
So instead of saying find the gaps in this argument, I you,
you say, what are the weakest points in this argument?

(11:58):
You know, order from weakest to least weak, something along
those lines and that I get better output when I ask for
superlatives like that. What are the weakest points?
What are the strongest points? As opposed to like gap analysis

(12:20):
though I'll also try gap analysis just like in this five
point argument. What are the missing steps that
you know are secretly there? You know, things like that.
And that iteration produces much
better reasoning. One issue, just in terms of the ranking, that I've
noticed it's not good at is when you're saying assign a numerical

(12:41):
value to it. But one way of trying to
circumvent that, and I've had mixed results.
It feels promising, but I don't have enough confidence yet is
giving it a grading rubric. So for example, if I'm trying to
say, like, rate this argument one through four, I'll actually say
like one, incoherent; two, decent; three, good; four, perfect, or
something. And I found associating a
word to a number kind of helps, but because of my bias being

(13:04):
like I've seen it go like, you know, pick #7 or #4 for
everything. I don't totally trust it.
But Ty or Michael, do you have any like suggestions if you're
trying to say like grade or like rank in a numerical way?
Or do you just say like weakest or strongest?
I don't know if this is helpful, but I've noticed that all the
eval platforms out there, like, require you to return a numeric

(13:26):
value out, so that it can actually
give a number. And I'm like, we talked on Tool
Use like months ago about how these models aren't very
good at that. And so I, I had a very simple
solution where it's like, I want the, the eval to be in all
natural language. And so like, I'm going to ask it
what it really thinks and it can give me a couple paragraphs if

(13:46):
it wants to, on what the actual output is.
I mean, you just add another step to it.
That's like, OK, take this output.
And what is this as a number? Like, if you had to rank what
this response was to the question from zero to 10, what
would it be? And so it's it's less of the,
there's just like an extra step. So you actually kind of let it
like answer in a way that it would to a normal human.
And then you have something grade that output as the number.
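A minimal sketch of that two-step eval, assuming the OpenAI Python SDK and a placeholder model name; the first call critiques in plain language against a word-anchored rubric, and a second call turns that critique into a single number:

# Two-step eval: natural-language critique first, number second.
# Assumptions: openai Python SDK, API key in the environment, placeholder model.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o"  # placeholder

RUBRIC = "1 = incoherent, 2 = decent, 3 = good, 4 = excellent"

def critique_in_words(question: str, answer: str) -> str:
    prompt = (
        f"Question: {question}\n\nAnswer: {answer}\n\n"
        "In a couple of paragraphs, say what this answer gets right and "
        "what its weakest points are. Do not give a score yet."
    )
    resp = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

def grade_from_critique(critique: str) -> str:
    prompt = (
        f"Here is a critique of an answer:\n\n{critique}\n\n"
        f"Using this rubric ({RUBRIC}), reply with a single number only."
    )
    resp = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content.strip()

if __name__ == "__main__":
    words = critique_in_words("Best keyboard under $100?", "Buy the cheapest one.")
    print(words)
    print("Grade:", grade_from_critique(words))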

(14:08):
I don't know if that's helpful, or, I actually don't know if
that works, but I think it's a clever way of trying to go about
that. But I'm curious what you think,
Michael. I have also ended up in a place
where I want to do as little, let's call it format translation
as possible. I've noticed every format
translation seems to be cognitively costly and will

(14:31):
therefore come at the expense of really sound reasoning.
For instance, I sometimes run into people who recommend
prompting with structure, like heavyweight structure
like JSON, and that does not work well for me at all.

(14:52):
Like certainly you can get models to output, you know,
JSON, but that seems to come at the cost of lower
quality output. So I pretty much stay in
Markdown all the time. Like, that seems to be
the lowest. It is almost like the native
language of the frontier models.

(15:13):
I've noticed that GPT-5 wants to solve problems with
code so often that one time I one time recently I was
designing for myself a little meditation practice program.

(15:33):
So I want to get good at this kind of esoteric thing.
So I was like coming up with a stepwise plan and, and
seriously, I'm engaging in English, and GPT-5, in its
recommendation of that stepwise plan just decides that the best
way to communicate how I should do my meditation is in pseudo

(15:54):
code, which is adorable. Like I really think that this is
just sort of classic overfitting, right?
Like, OpenAI really wants GPT-5 to be a good coding model
and so has overtrained it to the point where it tries to solve
even kind of non-coding problems with a little bit of pseudocode.
But I just find this charming. Yeah: while stressed, do

(16:17):
deep breathing. Yeah, it was like,
it's also been interesting to see the, like, difference between
Anthropic's and OpenAI's, like, prompting techniques.
Cause like you look at some of, like, OpenAI's prompts and,
like, kind of what they recommend and even their little prompt
optimizer tool on the, the platform, like, you never ever
see an XML tag, which is crazy. But I mean, that's just the way

(16:38):
OpenAI does things. And that's, I would imagine,
well, how they're training these things, right?
Like, not to use XML. But then you look at Anthropic's, and you use their
prompt optimization tool when everything's XML and you hear
the traditional advice of, like, use XML tags for
prompting, which I think is not necessarily bad advice to like
group related things together and like kind of like add.

(16:59):
But I think the OpenAI way of doing things is to just, like,
add, you know, a hashtag, add an H1 tag, and then group everything
underneath the header and then have a header and then that.
I think that's actually where things are headed more so than
XML. And so I've actually stopped
entirely using XML, and I'm primarily using OpenAI models.
But I think forcing an OpenAI model to use all XML is putting

(17:22):
it into a different like headspace than say a Claude
model would be. So I, like you said earlier,
Mike, it's like so tough to write one prompt for multiple
models now. And it's like, I think it's only
getting tougher, and it seems like Claude is doubling down on XML
tags and OpenAI certainly is not.
Yeah, now that you've mentioned it, another one of my little
heuristics is when I iterate within a single conversation

(17:46):
because I didn't like the output that I got at first.
And you know, I'm just working the problem until I get
something that I like. Then I just have this regular
practice, which is to ask the model something along the lines
of I really liked this output. What could I have put in my
initial prompt that would have taken me straight to this
output? And that is a highly, let's call

(18:10):
it, individualistic sort of prompt, right?
So, so the way Claude will coach you to one-shot
something is significantly different than the way that
ChatGPT or Gemini or Grok will.
And yeah, what I have to do is I'm just kind of holding it in
my head, right. So it's like how you learn to
talk to different friends differently, you know?

(18:34):
Yeah. So notwithstanding all my copy
pasting, which can sometimes give me great results, I still
have to kind of like, remember, you know, some people will have
like girlfriend voice, right? So I'm like, yeah, we got to
talk to Claude with Claude voice.
And that's totally different than ChatGPT voice.
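One hedged sketch of how that end-of-session step, asking what prompt would have gotten you straight to the output, could be automated; the helper name and model are illustrative assumptions, not a tool from the episode:

# Sketch: after iterating to an output you like, ask the same model what
# prompt would have one-shotted it. Assumes the openai SDK; the model name
# is a placeholder, and `history` is whatever conversation you already had.
from openai import OpenAI

client = OpenAI()

def distill_one_shot_prompt(history: list[dict]) -> str:
    followup = {
        "role": "user",
        "content": (
            "I really liked that last output. What could I have put in my "
            "initial prompt that would have taken me straight to it? "
            "Reply with just that prompt."
        ),
    }
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder
        messages=history + [followup],
    )
    return resp.choices[0].message.content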
I, I've done something similar where I take an output,
a prompt, get a certain output, don't like it, ask it to try to

(18:56):
change it. But I've never actually done it
a programmatic way where something along the same lines
of like chain of thought, but chaining multiple prompts
together to try to say this is what I want for this given
prompt, this is the type of output I want, give me a better
prompt and then taking that prompt to feed in.
Have either you done a pipeline like that where you have like a
prompt optimization step automatically in the workflow?

(19:16):
I started doing this all the time with specifically ChatGPT
Deep Research, because the time cost of deep research is so high
that I wanted to I wanted to take the variance out.
So so in my ChatGPT history, there's like there's so there's

(19:39):
so much meta prompting that's me working with the non-deep
research model to come up with really great like kitchen sink
prompts to then send on a deep research job to like work away
for 10 minutes. Yeah, I highly recommend this.
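A rough sketch of that two-stage flow: a fast model drafts the kitchen-sink prompt, and a slower reasoning model gets the long job. The model names are placeholders and the Deep Research product itself isn't scripted this way, so treat this only as the general shape:

# Meta-prompt handoff: a fast model turns a rough goal into a detailed
# "kitchen sink" prompt, which is then sent to a slower reasoning model.
# Assumptions: openai SDK installed, API key set, model names are placeholders.
from openai import OpenAI

client = OpenAI()

def draft_kitchen_sink_prompt(goal: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder fast model
        messages=[{
            "role": "user",
            "content": (
                "Help me think through this research goal, then generate one "
                "detailed prompt, incorporating all constraints and open "
                "questions, that I can hand to a long-running research model:\n\n"
                + goal
            ),
        }],
    )
    return resp.choices[0].message.content

def run_long_job(research_prompt: str) -> str:
    resp = client.chat.completions.create(
        model="o3",  # placeholder reasoning model
        messages=[{"role": "user", "content": research_prompt}],
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    prompt = draft_kitchen_sink_prompt("Compare approaches to isolating MCP servers.")
    print(run_long_job(prompt))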
Oh my God, that that that has led to.

(20:02):
That has led to one better prompting on my part and then
two, I think great gap analysis in my own thinking, right.
So, so the iterative prompt techniques like this ideally
generate questions, right To so,so you find that the in effect,
the model is pointing out a gap in your prompt, which is

(20:23):
expressed in the form of other question.
And then you got to think, Oh yeah, what do I want?
Right. Yeah.
So, so, so it, it, it's, it's nice to sort of engage in a
dialogue who's which, which willbe almost like a gap analysis on
your own thinking. And you want that to go fast.
And then the final payload isn'teven the output you want.

(20:44):
The final payload is actually the prompt to then send to a
thinking model or a research model to go spend a bunch of
time on I. Think that's fundamentally the
best prompting advice you could give anyone that doesn't prompt
very often is just like, know what you want out of it.
And quick backwards from that. Because I feel like so many
prompts are just like, kind of like a Hail Mary throw of like,

(21:06):
what's the, you know, what's thebest keyboard under $100 for
this? And it's like, well, what do you
actually want here? You know, like what do you
actually want? Because you need to be able to
codify what you want in a way that a model can understand to
get the results that you're actually looking for.
So I feel like if you're just going in there and just
prompting Willy nilly, you're not going to get what you're

(21:27):
looking for. But starting from it's kind of
like how you build good evals islike, what is the definition of
good look like? What, what is the perfect
response look like? And almost like meta prompting
yourself to figure out how you would get that sort of response
out of the model. I feel like that's such AI do
that practice frequently when I'm talking to ChatGPT, I'm
trying to think about what wouldwhat I want out of this thing

(21:47):
and how do I, I get it there. And that practice to me, I think
it just made me a better prompt engineer and I haven't like it's
not something I write down. It's not something I really do,
but it's something I just kind of like go to ChatGPT and I
think about what I want out of it and I try to make the prompt
for it, but just know what you want people.
I think that's so important. I don't think enough people
actually know what they want outof the model.
Totally, yeah. And I often just start with help

(22:10):
me think through my decision about XYZ so I don't start the
conversation in a way that lookslike it's a meta prompting
exercise. I I really am asking for help
And then my final request will be great.
Generate me a prompt incorporating all of the above
that I can, you know, use with deep research or something like

(22:32):
that. I get a kick ass output and and
start a new chat with that from scratch.
A lot of these prompting techniques are vital for getting
AI to do real work for you. And when you get the right
prompt, you want to make sure you augment it with your real
data and systems using MCP. So that's why I've been using
Tool Hive. Tool Hive makes it simple and
secure to use MCP. It includes a registry of

(22:53):
trusted MCP servers. It lets me containerize any
server with a single command. I can install a client in
seconds. And secret protection and
network isolation are built in. You can try Tool Hive too.
It's free and it's open source and you can learn more at Tool
Hive dot dev. Now back to the conversation
with Michael and Ty. One thing that's kind of struck
me is. Both of you use it so frequently
that you have job this intuition.
Do either you see any shortcuts or techniques, best practices

(23:16):
for people who don't have the time or experience to have
naturally build this intuition to kind of like leapfrog ahead?
Is there anything you can do as a thought experiment or some
type of like homework some couldtake where instead of spending
the three years we've been working with LLMS and try to
condense it into something wherethey can build up a skill set
quite quickly so they can get tothis point of knowing how to

(23:38):
frame their prompts in a in a productive way?
Oh, for me it's about playing. I must say, I feel like a child
and I'm doing a bunch of open-ended play with no
destination involved, right? Like, like I'm just, I'm just
playing with these things and that is the skill building.

(24:02):
So then when I'm at work, when I'm being super goal directed,
then I have this like basket of skills to draw upon, but I'm not
even being very purposeful about that skill building.
I'm actually just following my curiosity.
And I think that lowers the cost of the skill building, right?

(24:22):
That's just, if you're in the play mindset, then you're having
fun and you just happen to be learning while playing.
I feel like it's one of those things that just
takes deliberate practice and you just really have to get like
you're saying, you have to go play with these things.
I think we've talked about this a lot before, Mike, but just
like you have to have the curiosity, like can you foster
some amount of curiosity in, like, how you can become a better

(24:43):
prompter and actually deliberately practice?
That said, not everyone has the time to do that.
I think maybe a simple version that you could try is just try
to write your own metaprompt. Just, I don't know, I have this
very simple automation I have on my computer, and I've noticed
with GPT-5, and it kind of breaks my heart, that it is not very
good with brain dump, like, word vomit from audio prompts.

(25:08):
It's, it really does not like that very much.
I know, it breaks my heart, because I spend all day
talking to my computer. So, and it's gotten to the point
where I use all Codex CLI now and not Claude Code at all.
And it's, I have to write a good prompt for it.
Yet I don't want to sit there and type forever.
So I word vomit to it. But I've made my own little, like,
hot key, where I hit a hot key and then, kind of like plunk that we

(25:31):
interviewed back in the day, Mike, where it just, like, takes
my prompt, chews on it, and then just spits out something that's
more structured. And it's very simple, just
like, take the general vibe here and kind of just structure it in
a way that avoids a lot of the filler words and tries to, like,
assume different spellings and things like that.
And then it makes something a little bit more structured.
I still have to review it a little bit, but generally what

(25:52):
I'm trying to say here is, like, if you didn't have the time
to learn through all of this, you could try to make just maybe
some sort of metaprompt, kind of like what
you've been doing, Michael, just in ChatGPT, trying to kind of
have a prompt that you can apply to other prompts.
You just make things a little bit better.
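For what it's worth, a minimal sketch of that kind of word-vomit-to-structured-prompt metaprompt; the dictation/hotkey wiring is left out, and the metaprompt text and model name are assumptions for illustration:

# Tidy-up metaprompt: take a rambling voice-dictated brain dump and return a
# structured prompt. Assumes the openai SDK; the model name is a placeholder.
# Hook it to a hotkey or dictation tool however you like.
from openai import OpenAI

client = OpenAI()

METAPROMPT = (
    "Rewrite the following brain dump as a clear, structured prompt for a "
    "coding agent. Keep the intent, drop filler words, fix likely misheard "
    "words and spellings, and group related requirements under short "
    "headers. Return only the rewritten prompt."
)

def tidy_prompt(brain_dump: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[
            {"role": "system", "content": METAPROMPT},
            {"role": "user", "content": brain_dump},
        ],
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    print(tidy_prompt("ok so um basically I want the login page to uh remember me..."))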
That's kind of a shortcut, but really the good prompting, there
is no shortcut. You just got to practice and you

(26:14):
got to play. Right, shortly after the, the
memory features were rolled out in ChatGPT, I, I did some
exploration about how this was actually dumped into the context
window of new chats. And that exposed to me that

(26:35):
there's now a whole section of the context window for every new
chat that you start with ChatGPT called user interaction
metadata. And in user interaction
metadata, this means it's, it's in the context of the model in
every new chat, there's something about what kind of

(26:57):
computer you're using, like are you using the the native ChatGPT
app? Are you coming in on a web
browser, etcetera. There there's some metadata
about like how long you've had an account.
There's a bunch of user data. And then there are these
statistics that are about how you have been using ChatGPT,
like average conversation depth. There are some interesting stats on, on

(27:24):
like the length of your average prompts.
And so if you were hearing this and you were curious about the
sort of play that I'm talking about, that also ends up shaping
your intuition for how to play with these models.
Try playing around with a prompt that is not jailbreaking.
You're not asking for anything out of the protected system

(27:44):
prompt, but try to get it to show you the contents of your
own user interaction metadata and then just ask follow up
questions about what you learn about yourself.
It's great. We also noticed you can ask for,
like, you can just ask for memories. It's like,
what are all the memories that you, you have access to?
I'm just curious what's, like, front of mind for you when you
start to talk to me. And I, I did that just a couple

(28:05):
of months ago and it spit out just this giant list of all of
these, like, memories, which were all, like, true, but some of them
just didn't matter anymore. And so I asked it, OK, based on
all of these memories that you've given me here, like which
ones are the most helpful and which ones are the most useless?
Like which ones do you wish you could forget?
And it gave me a, it gave me a list of the good ones, and it
gave me a list of the bad ones. And truthfully, I don't know if

(28:28):
it worked or not. I have no idea.
But I just asked, OK, can you go and forget those ones that you
don't want anymore? Because I feel like we don't
need those. And I don't know if it worked or
not, but I feel like it did because those didn't come back
in future conversations. I feel like this kind of, like,
meta-interacting with ChatGPT to figure out exactly what it is.
I feel like just good ChatGPT advice, too, is to just, like, have a

(28:50):
couple conversations, not, like, what you need in that moment,
but just like prime future conversations of like, I'm just
going to go talk to ChatGPT and tell it who I am and who I, you
know, what I want to be and who I care about and like just give
it some of this information to seed its little memory bank.
And I feel like that's been helpful for me.
Just on the note of what's included in, in the

(29:13):
context, I was actually quite disappointed to find out Claude
Code does something similar, where in the prompt that it adds
to go along with the CLAUDE.md, it actually states that the
information in CLAUDE.md may or may not be helpful.
And in my mind, introducing ambiguity from the get-go is just
like a recipe for disaster. And I'm, I'm, I've started dual-
wielding Claude Code and Codex. I still want to play with Amp a

(29:33):
bit more; OpenCode is an option. There's a lot more going on, but
there's something about using a product and having your prompt
augmented in a way that's hidden from the user, and it can
manipulate the output in a certain way that I'm not a huge
fan of. I really want that granular
control, but I realize I'm not like the average user and
sometimes they need the extra assistance, extra guidance to

(29:54):
try to get to the best output for their given prompts.
But yeah, I was quite disappointed to find out.
I'd like to pivot slightly. I think it was GitHub recently
released a project on spec-driven development, where it's
trying to take coding and transition away from write the
code first, build out that way, to developing a very strong
spec. And I feel like this is very

(30:15):
tightly intertwined with advanced prompting because
they're trying to structure it in a way where you can give
everything you need ahead of time when you're going about a
project or whether it's like a coding project or even just, you
know, a general query. Do you have any advice for how
people should think about either, like, a checklist or a structure
to make sure that they kind of check all the boxes that they're

(30:35):
thinking is complete? Or is it so varied that there's
no real like silver bullet and it's just back to the practice
in developing the intuition? I really want to jump in here
because this is a question near and dear to my heart.
I am a hacker first and foremost and then a software engineer
secondarily. And I'll just put it this way, I

(30:56):
think most managers in software development, like if if you
consider the entire population of people who have power in
software development projects, like most of them are bozos who
do not understand their own field.
I, I cannot believe the bozo density of software engineering

(31:20):
management. And a central truth of
software development is that software, big software projects,
up to and including your small software project that

(31:41):
takes advantage of big ones, are the, are the most complicated
structures that humanity has ever produced, full of literal
chaos, full of emergent phenomena that, like, that, that

(32:02):
are emergent phenomena, like the weather, right.
So yes, all software can be expressed in terms of, you know,
simple streams of bytes, but the behavior of our software systems
is absolutely non-deterministic. It, it follows the same kind of

(32:23):
chaos theory as, as the weather.
If you are in denial about this fact, then you believe that
the cost overruns and delays in software projects are entirely
because of, like, under-preparation, like, oh, we just
didn't write the requirements document like hard enough.

(32:43):
Oh, we didn't write enough like test cases, which is just
bonkers, just absolute insanity. And so the, the, the, the
fundamental conceit of waterfall engineering planning
is that somehow software is this, like, highly deterministic system

(33:07):
and, and you, you can simply plan all of the variance out.
And, and this, this myth just just will not die.
It's unbelievable. And part of the reason why it
will, it will not die, is that you really do get better results
if you know, with a certain amount of specification, not

(33:27):
least because it is hard to know what you want.
There's a bunch of what you want that's actually tacit.
And so, so if you're forced by questions
to really kind of dig deep and figure out like, Oh yeah, I
didn't think hard enough about how I want the software to do,
you know, to zig right here instead of zag.

(33:49):
It's like, and it is important to pull those hidden
requirements out of the the minds of whoever has them.
And, and so, so perhaps this is the software way of reiterating,
I think Eisenhower's quip that plans are worthless, but
planning is essential, that, that there is something great about

(34:11):
that, like, planning process, but it can never actually achieve zero
variance. So if, if you have that, that
hacker mindset, that software is this complex adaptive system
that's going to be full of weird little back alleys of behavior
because it needs to interface with reality, and the, the number

(34:33):
of inputs available from reality is literally infinite.
So you cannot catch all exceptions.
Then you sometimes just have to do things and then you see how
it breaks. And like, the process of
creation is 100% purely experimental.
So the, the apparent success of vibe coding to produce something

(34:53):
that appears on the surface to work, I think it's luring a lot
of people into believing that software is not as complicated
as they have been told. And this is not in fact true.
So now we need to learn, we need to, like, lure the vibe coders to
also, I don't know, vibe-speccing, the vibe PRD, and this

(35:14):
is probably in the right direction, but it's still not
going to suddenly make software like non chaotic.
It is chaotic. Like, the most important part
that you mentioned there is just, like, specs are inherently, like,
useful; planning is useful, plans are worthless.
I love that. But I think if you're going
to vibe code and you're going to engineer in this kind of way,

(35:37):
right-sizing the task, right-sizing the spec is essential,
because, like, GPT-5 Pro can get me an incredible plan for the
entire thing I want to do. But you will have a terrible
time if you paste that entire plan into Codex or Claude Code
and just ask it to do it. Like it's just not going to
work. You have to take that.
You have to break it down into smaller specs that it can

(35:58):
actually handle, and have it kind of do that.
And that's a very difficult thing.
And I feel like that's where a lot of the like, there are
people that are very good at vibe coding and there are people
that are just not. And I think most of that in my
opinion, is just like the size of the tasks people are sending
these things down on. Because, man, if you give it the

(36:20):
right size task, the right constraints, tell it what you
really want. You can get magic out of these
things. But if you ask it to do too much
at once, you're going to end up with some spaghetti code that
you're going to be hating yourself for down the road.
So I feel like right, right-sizing a spec is super
important. But I do think spec-driven
engineering is a, is a good vibe.
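A hedged sketch of what right-sizing can look like in practice: asking a model to break a big plan into small, independently workable specs before any single one goes to a coding agent. The model name, prompt, and output format are assumptions for illustration:

# Break a big plan into right-sized specs before handing any single one to a
# coding agent. Assumes the openai SDK; the model name is a placeholder.
import re
from openai import OpenAI

client = OpenAI()

def split_plan(plan: str) -> list[str]:
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder
        messages=[{
            "role": "user",
            "content": (
                "Split this plan into small, self-contained tasks that a "
                "coding agent could finish in one focused session. For each "
                "task include the files involved, the constraints, and how "
                "to tell it's done. Separate tasks with a line containing "
                "only ---.\n\n" + plan
            ),
        }],
    )
    text = resp.choices[0].message.content
    # Each returned chunk is a spec small enough to paste into Codex or
    # Claude Code on its own, instead of dumping the whole plan at once.
    return [chunk.strip() for chunk in re.split(r"\n-{3,}\n", text) if chunk.strip()]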
Right sizing and scaffolding, like everyone who's getting good

(36:42):
results right now is is getting good at at scaffolding and just
has a mental model of, like, where the models tend to get confused.
So then you know what guardrails to put around the task.
Totally. And, like, setting up proper tests
or lints in your code base too, so it can make little
mistakes, then it can catch those mistakes and it can fix
its own mistakes after it's submitted its final information,

(37:06):
right? I don't know, kind of
interesting. I think another interesting
thing too is like anyone who's vibe coding right now, I think
it's essential to actually look what context is going in.
Because I, I did this just recently where you can add your
API key to Cursor and just use your API key, which, like, if
you're paying for Cursor is, like, kind of ridiculous, because, like,
you're paying for Cursor, don't use your API key, but you put

(37:27):
your API key in and you only enable API key usage,
specifically your OpenAI key. You can go into platform.openai.com
and you can see the log. You can see the exact system
prompts and the context that's going in to Cursor.
And it's remarkable because I learned that not only was it
getting my Cursor rules, it was getting Cursor rules that I
didn't want it to get. It was getting, like, it was also

(37:50):
getting a list of all of my open tabs in Cursor.
Here's all of the files that are open.
And I'm like, I never close my tabs in Cursor.
And so, a gigantic pile of just unrelated files to what it needs
to do. And like that can't do good
things to the context window. And so since then, I close out
my windows every like hour now just to make sure everything's

(38:11):
cleared out, because now, I know it, I'll open up three tabs that
I know it should go look at. And I know it's in the prompt.
I don't even have to tag it in Cursor.
It just knows to go get it. But anyway, put your API key
into Cursor CLI or Claude Code or Cursor just to, like, see
exactly what's going into the model.
Because I completely changed everything I did about my code

(38:32):
base and all of my Cursor rules files and all my AGENTS.md files
after seeing how messy of an internal world that coding
agents were getting, you know? That is a great tip.
I love that. I'm going to be trying that
myself. One thing that I'm kind of
curious about is this evolution of prompting.
Where before we always usually have to say think through step
by step, explain your thinking, try to encourage it to do the

(38:56):
the thinking process. Then thinking models came out.
If we were to drop parallels to programming languages, where the
very first prompting was the assembly language level, do you
think we're at, like, the C level? Python level?
Oh, that's good, because we can go up and down in generality.
So here's what I mean. I, I believe it is the case that

(39:18):
software engineering is one of the highest leverage activities
on the face of the planet because over the decades, higher
and higher levels of abstractionhave been created so that people
are just capable of doing more in the same hour.
Right? Like, we, we use these libraries and tools that
abstract away a tremendous amount of complexity so that we

(39:40):
can orchestrate at a higher level.
And, and just like as a species,we haven't found the top end of
that abstraction. People are still, they're making
frameworks of frameworks of frameworks and, and it just
keeps working, right? So we just keep increasing the,
the, the leverageability of, of our time, at least in software.

(40:01):
And so, so you can see how a lot of the meta prompting and
tactics we're talking about are, just like in your metaphor,
like low-level instruction. It's like learning how to use
x86 assembly, and, and at some higher level of abstraction,
so in this case, you know, mere C, is like

(40:21):
a higher level of abstraction and, and would probably make us
more efficient. And then you can imagine
frameworks on frameworks. And, and so I, I think that is a
good overarching analogy where, where you can imagine that the
models get more powerful by abstracting away a bunch of the
nonsense that we're talking about.
And you can treat it like it's a person and it'll give you good

(40:45):
output, right? Right now I'm, I'm like, I'm
really holding on to this. You have to be high agency, but
maybe the models will just be good enough that you can tacitly
assign them agency and, and they, they sort of figure out
what you want and, and, and do the right thing.
It's at least conceivable. The the then there's this hidden

(41:06):
abstraction that we could also bust through, I think, which is
that for the most part, we're training the large language
models, even though they are multimodal now,
you know, on a lot of language. And, and language is

(41:27):
derived from reality, right? Like, like this is just the way
human beings have chosen to communicate facts about the
actual universe as it exists, right?
And this is, this is a useful way to train AI, but it isn't

(41:52):
how we, it, it isn't exclusively how we train babies, right?
So, so we, we teach each other via words, but also there's a
tremendous amount of just direct experiential learning from the
universe as it is, as you can think of that as busting down a
layer of abstraction, right? So, so I, I, I actually do
expect that our AIs will improve in both directions.

(42:13):
So we, we'll get these higher levels of abstraction that'll
empower people to direct them at, you know, higher and higher
levels with the lower, lower investment.
And then, and then hopefully there will be new training
breakthroughs that mean that the models themselves are, like, self-
training on actual reality instead of the derived strings

(42:36):
that we put on the Internet at first.
One thing that kind of just sparked my mind is the idea of
of multimodal prompting, where on top of the language you give
it. Do you think, and this is maybe
theoretical, maybe practical, do you think that giving a photo or
some other type of media along with text prompt will get better
results? Because instead of just relying
on our abstracted version of reality being language, by

(42:57):
introducing imagery or other other modalities into the mix,
it could kind of steer it in a more accurate direction.
Oh yeah, I've been employing this in the following way, lazy
prompting, featuring graphics or photos.
And by lazy, what I mean is I, I am not describing the thing that
I am sharing. I'm, I'm leaving it to

(43:22):
be to just stand on its own. And, and I love seeing
especially, honestly, especially OpenAI's reasoning models
like bust out a little bit of Python to do image manipulation
on the photo that I just shared in the backyard because I want
to know what plant this is. And it's like making its own

(43:42):
decisions about where to zoom in on the petal to, like, figure it
out. And that level of, of, of let's
call it tactical decision-making to extract information from,
from the photos is, is displayed best when I do not try to guide
the model, right? Like, so, so in that place, I'm

(44:06):
like, I'm almost adding, I'm like stripping away the
scaffolding that I used to do and just, yeah, sharing graphics
that that's been working well for me.
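As a rough sketch of that kind of lazy image prompting through an API rather than the ChatGPT app, assuming the openai SDK and a vision-capable placeholder model; the file path and one-line prompt are illustrative:

# "Lazy" multimodal prompting: attach the photo, say almost nothing, and let
# the model decide how to inspect it. Assumes the openai SDK; model name and
# file path are placeholders.
import base64
from openai import OpenAI

client = OpenAI()

with open("backyard_plant.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder vision-capable model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What plant is this?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)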
Anecdotally, this was actually a while ago.
It's like a year ago, I was trying to build a pretty complex
feature in, in one of, one of our systems, and it was very

(44:30):
difficult to explain it in words.
It's almost one of those things that like you start to explain
the thing you're trying to do and you can't really explain it.
And the best way to explain it would be to get in front of the
whiteboard and just draw it out. And so I went into Excalidraw
and I just, like, made a workflow, essentially, like, just drew it
out. And that additional context, I

(44:52):
think this was for, this must have been just GPT-4.
This was like a year and a half ago maybe, right around when,
maybe not right when, but when they would start to accept
images. And I noticed that, I don't know
if the code quality was better, but it at least understood what I
was talking about. It knew the endpoint we were
trying to get to, and it wouldn't have gotten there without this

(45:14):
like diagram. And I think what's also crazy is
like some of you've probably seen those like Twitter demos
where someone will, like, paste in the most crazy, gross-looking
diagram you've ever seen. And it's just huge.
And it's like, like those diagrams of, like, human
physiology, where, like, there's different, like, Krebs cycles and
things going on. It's just massive, and it just

(45:35):
gets it. It just understands everything
that's going on in there. I think that's crazy.
Even the amount of words it would take to get to that
endpoint of it understanding the system, I think, is where multi-
modality will definitely be impactful.
I don't think it's useful for everything, but for
things that are very complex to describe with words, I think
images are awesome for that. With the current state of

(45:57):
models, like if we were to freeze improvement, just just
have it where it's at. Do you feel that the correct
prompting would allow AI to solve the problems that AI
creates? I think the answer is yes, but
the reason it is yes is because the part of the energy that

(46:18):
we're putting into prompting is, is, is like co-thinking with the
model, right? So I do think even if model
improvements froze, there's a lot more leverage we can get out
of the things as they currently are in solving coding problems

(46:42):
and even solving a whole bunch of taking action in the real
world problems, where the, the solution is co-emergent from
human and model kind of having a meeting of the minds, if you
will, expressed in the language of context engineering.

(47:05):
I, I, I think that there's something deep about that, that
is, that is deeply correct. If, if just the reasoning that
reasoning models can do now is, is fully taken advantage of, we
can, we can make a bunch of advances across a lot of domains

(47:27):
of, of human problem solving. Now, implicitly part of your
question is, is this good enough or do we need another
breakthrough in, in, in reasoning?
Right. And, and I think we don't. I, I
think, I think mere improvements in almost, like, the interface of
human and machine to collaborate more effectively takes us

(47:51):
really far. Totally agree.
I totally agree. I think that, like, you know, if
you're talking about software at a big company too, like, as a
developer, you have a role and it's to make code and it's to
make code the best you can. But generally you have a QA
department to make sure that it's, it's good.
So I think you, it's, I don't know if it's like a multi agent

(48:11):
type of thing, but more of like like in my own workflow of, of
asking an agent to go do the creative parts and to make the
code and to do the thing. But having a really strong test
suite or having really strong lints so that it actually can
see what went wrong. And then I have a, a custom
prompt that's just, like, go run the lint command and just fix,
clean up the mess that was made. I think you can, you can, AI can

(48:34):
clean up AI's mess, but I don't, I think in this current
moment, I think it's better to use a fresh context window to
clean up a mess rather than the, the agent that created the mess.
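A small sketch of that fresh-context cleanup pass; the lint command and the agent invocation are assumptions for illustration (shown here with a non-interactive Claude Code call), not a prescribed setup:

# "Clean up the mess" in a fresh context: run the linter, then hand the
# report to a coding agent started from scratch. The lint command and the
# agent CLI invocation below are placeholders; check your own tool's docs.
import subprocess

def lint_report() -> str:
    # Placeholder lint command; swap in your project's linter.
    result = subprocess.run(["npm", "run", "lint"], capture_output=True, text=True)
    return result.stdout + result.stderr

def cleanup_with_fresh_agent(report: str) -> None:
    prompt = (
        "Fix only the issues in this lint report, without changing "
        "behavior:\n\n" + report
    )
    # Non-interactive invocation of a coding agent CLI; `claude -p` is one
    # example of such a flag.
    subprocess.run(["claude", "-p", prompt], check=False)

if __name__ == "__main__":
    cleanup_with_fresh_agent(lint_report())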
But. Right.
Yeah. So.
So prompting can go a long way, maybe not necessarily all in a
single prompt, and we definitely need to improve the
alignment between the human mind and the AI process interface so

(48:58):
that we can get to the desired output.
You can find me on X, my handle is kubla, showing my old hacker
lineage, and you can learn more about Fulcra at fulcra.ai.
You can find me on X at FieroTy, and I post things sometimes,
and more to come for me soon. Thank you for listening to this

(49:19):
conversation with Michael Tiffany and Ty Fiero.
Prompting is incredibly important.
It always will be when interacting with AIs, and it's
going to change and evolve beyond what we discussed.
This episode is probably going to be irrelevant a year from
now, but I'd really love to get your thoughts.
What techniques have you found work well for prompting LLMs?
What did we miss? What did we get wrong?
We really want to make sure that the collective improves so we
can maximize the benefit we get from AI.

(49:41):
I want to give a quick shout out to ToolHive for supporting the
show so I can have conversations like this, and I'll see you next
week.