Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
The thing that a lot of people are missing right now is the fact that it's just hard to use Claude Code and figure out where you need to go. You have this magical 8-ball that can literally summon you a pizza or generate your next billion-dollar idea. And now that you have this tool, how do you actually use it productively? Because when I use it, it doesn't always work. And then other people online say it doesn't work. And then when I use it this other way, all of a sudden, like,
(00:21):
I'm just in the zone burning tokens at a million miles an hour. And so how can we actually leverage this intelligence? This conversation today is going to help us figure that out.
There is no reason tools can't be constructed at this moment, with the technology as it currently is, that will reliably let you give it a task, and you come back in an hour and the
(00:41):
whole thing has been validated. Claude Code has quickly taken over as the top choice among AI coding tools, but most people are only using a fraction of its capabilities.
This tool is way more powerful than people realize, and even I
wasn't aware of some of the incredible functionality that we
discussed today. So in episode 52 of Tool Use,
brought to you by ToolHive, we have two Claude Code experts joining us to tell us how the pros are using it.
(01:02):
There are so many great tips in this video. Every Claude Code user will learn something new. We cover how to optimally set up Claude Code, the benefits of different thinking modes, context management techniques, sub agents, incredible hooks and slash commands, and a lot more.
We're joined by Ray Fernando, a former Apple engineer of 12 years and my favorite AI streamer, teaching the whole world how to use AI. He's always cooking up some
(01:23):
great content. And Eric Buess, an avid builder who's deeply curious and an Anthropic superfan. He has tweeted out more high-value insights on Claude Code than anyone else I've seen. If you're not following these two guys, you're definitely falling behind. This is a Claude Code masterclass.
When you install Claude Code on a new machine, what's the first thing you do? How do you get started?
First, I set up an alias for the letter C, so that whenever I type
(01:48):
C, it opens Claude with the dangerously-skip-permissions flag. So I can very quickly hop to a new project folder in the terminal, type C, and it's open and I can start asking questions.
I also set up keyboard replacements, you know, your keyboard's text replacement feature. So I replaced the letter U with "ultrathink" and I
(02:11):
replaced the letter W with "what happened? ultrathink and make a plan before coding." So I have a set of principles that speed up the workflow quite a bit.
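As a sketch, that alias setup might look like this in a shell profile; the --dangerously-skip-permissions flag is real, while the one-letter name is just Eric's personal convention:

    # ~/.zshrc or ~/.bashrc: one-letter launcher for Claude Code
    alias c='claude --dangerously-skip-permissions'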
The next thing I do is install Claude Code docs. This documentation gives Claude Code a simple way to understand what its own capabilities are. It has a base set of internal, I
(02:35):
guess it's not specifically training, but Anthropic provides it with a set of instructions about what it can and can't do. But in order to get more information about the depths of its capabilities, it has to do a web fetch. So if you ask it, can you do this? How do I use slash commands? How do I use agents? Things like that. Then it has to go to the Anthropic docs.
And even though, if you go to the documentation website, you can
(02:56):
click on Copy Markdown and it'll give you the markdown file, for some reason right now Claude Code doesn't go straight to that file. It goes to the web page, it runs a command which converts it to markdown, and it downloads all this stuff. It takes a little bit of extra processing to do that, and it doesn't seem to have a direct manifest to know where all the other documentation is. So the Claude Code docs project is a one-line installer: you install it and
(03:21):
then it puts the documentation locally and keeps it up to date. There's a GitHub Action that syncs with the Anthropic Claude Code docs every three hours.
It downloads that. So when you ask a question, there's a hook and a slash command. You type /docs, a space, and then you ask your question; that's the slash command. The slash command tells Claude Code where to look for the docs:
(03:44):
instead of going to Anthropic's website, go to the local path where you've already got the docs installed and up to date. It also has a hook, and the hook does a git fetch. If it sees new content in the repository on GitHub, then it does a git pull, and it finishes that pull before it sends the request for whatever you asked the docs about.
(04:05):
So you get whatever information you want to know.
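A minimal sketch of that sync step as a shell script; the install path and remote name here are assumptions, and the real installer wires this up for you:

    #!/usr/bin/env bash
    # Hypothetical docs-sync hook: refresh local docs before a /docs request is answered.
    DOCS_DIR="$HOME/.claude-code-docs"   # assumed install location
    cd "$DOCS_DIR" || exit 0
    git fetch --quiet origin
    # Pull only when the remote has commits we don't have yet.
    if [ "$(git rev-parse HEAD)" != "$(git rev-parse '@{u}')" ]; then
      git pull --quiet
    fi
    exit 0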
So basically it allows Claude to answer any intricate, deep questions you have about the new things that came out. You can say "what's new?" and it'll tell you: here's a diff since the last time the documents were updated; these docs were updated and these weren't. And based on those updates, I can see a summary of everything that changed however many days ago.
(04:25):
And you can click on the link to the GitHub and see the diff, and you can click on the link to the source and see Anthropic's Claude Code doc for that documentation.
And then you just talk to it and say: how do slash commands work? How can I use them with hooks? What are the sub agents? How can I take advantage of this tool in the project I'm working on right now, and what would the trade-offs be, for example? Or
(04:47):
what are some unexpected ways I can connect these together that I might not be thinking of, based on this new tool that just came out, this new feature? So having the docs locally is, I think, the biggest high-yield thing you can do when you first get started with Claude Code. It gives you the tool to learn the tool.
But I'd like some clarity on a couple of things you mentioned. You said you set the flag
(05:09):
dangerously-skip-permissions, which is probably the most intense-sounding flag of any I've come across. Is that YOLO mode, or what does that enable in your workflow?
Yeah. So there's a lot of prompting you can do, and there are trade-offs with this flag. That's what I do, and obviously it's not prescriptive for everyone else; it depends on your use case. But if you're trying to get used to what Claude does,
(05:29):
then, you know, consider what happens if you don't turn that flag on. When Claude Code first came out, I wrote a wrapper that, instead of calling Claude directly, would call this other tool, which would then call Claude Code, attach to the terminal session, and send commands to it recursively. It basically convinced Claude it was running in a sandbox, in a Docker container, and that got around
(05:52):
it; it kind of did the same thing. And then they gave us the flag for
dangerously-skip-permissions. Without it, for a lot of things it asks you before doing something, such as a write to a file. Maybe it has access to read a file, but it can't change files in this path, or run this tool, or whatever. Then it prompts you: are you sure you want to do this? Do you want to allow it for this session? Or if you have a sub agent and you want to give it
(06:13):
access to do stuff, maybe you have to pass in exactly the tools you want it to have. So it's a little more complexity on you up front, and that's fine, but the flag essentially is YOLO mode. So if you're comfortable with your machine and your environment and the task in front of you, to do YOLO mode, that's great.
Now I set up a VM that's separate from all this so that I
(06:34):
could do snapshots and give it access to everything. It's running on top of my Mac. It's Linux, so it's a separate sandbox environment, basically. So I feel comfortable with my YOLO. But in most cases I think YOLO is fine; I haven't had any issues with it doing anything in folders I didn't want it to. There's lots of ways to prevent that. Even if you're worried, you can use a hook, for example, to say: before you
(06:55):
run this command, make sure it's not in these areas. And the hook is a separate process that runs outside of Claude Code, so it can prevent things from happening, as an example.
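Registering a guard like that looks roughly like this in .claude/settings.json; the PreToolUse event and matcher field are Claude Code's real hook schema, while path-guard.sh is a hypothetical script that reads the pending tool call as JSON on stdin and exits with code 2 to block it:

    {
      "hooks": {
        "PreToolUse": [
          {
            "matcher": "Bash",
            "hooks": [
              { "type": "command", "command": "bash ~/.claude/hooks/path-guard.sh" }
            ]
          }
        ]
      }
    }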
And Ray, Eric mentioned thinking mode, and I know there's a few different types of commands you can throw in there. How does thinking mode affect the way you use Claude Code?
Yeah, so thinking mode is a mode that I use a lot for
(07:17):
research or specific types of tasks. Keep in mind there are four different modes. There's ultrathink, which can use 32,000 tokens for thinking, and then it goes all the way down, pretty much by half each time. So there's think harder, which would be 16,000, I think; I forget. Actually, let me pull this up right now, because I have it on my blog.
Yeah, I'm ultrathink all the
(07:39):
way. Just take it all.
I've learned an interesting lesson, and this is kind of why I love that we're doing the podcast, because I used to be just uber-ultrathink all the time. And I noticed that I got variable results, and I kept digging into why, and I went back to something
(07:59):
that I learned, which is basically a less-is-more type of thing. So I have an article here on my website, rayfernando.ai. Here, basically, you have the four thinking modes: "think", "think hard", "think harder", and "ultrathink" with no spaces. These are key terms that Claude Code is going to be looking for inside of its
(08:22):
environment. Ultrathink is going to consume about 32,000 tokens; you're going to get 16,000 for think harder, about 8,000 for think hard, and about 4,000 for think.
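In practice these are just keywords you type as part of your prompt; a hedged sketch, with the approximate budgets from Ray's article (the parenthetical numbers are annotations, not part of the prompt, and the mapping may change between versions):

    think: rename getUser to fetchUser across this file            (~4K thinking tokens)
    think hard: why does this query return stale rows?             (~8K)
    think harder: plan a migration for the auth tables             (~16K)
    ultrathink and make a plan before coding: fix the sync race    (~32K)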
It's also important to note that these thinking windows matter for helping users understand how much context you're going to use up. If you're using ultrathink all the time, what I discovered
(08:45):
is that it's really good to keep your conversation isolated to the specific concern or problem you're working on, and you want to maybe eventually start breaking that up. And I can show you why, because of the thing everyone's talking about these days: context engineering, or the context window. So I built this simulation to
(09:05):
kind of help us understand a little bit of what you're doing if you do an ultrathink type of thing.
So say you have some input, about 2,000 tokens, where you're saying: here's some relevant information for what I'm trying to do. Then we do agent thinking, and you pop in 8K, 16K, 24K, you know, a bunch of thinking tokens. So you use all this thinking token window to have it do
(09:26):
some stuff. What Claude Code is going to do is some tool calling to grab stuff, maybe some code files and various things. And then eventually it's going to try to generate a little plan or some output or something like that. So maybe it generates 32,000 tokens. Sometimes I've seen it go through 40,000 or 50,000 tokens just searching through your codebase and grabbing different files and things. So once this context window has
(09:48):
filled up, then you're going to say: OK, cool, yeah, let's go ahead and build my next billion-dollar SaaS. And then you're like, I want you to do these types of things, and make sure you grab all these other files as well. And if you still have ultrathink turned on, it's going to try to digest those types of things there. It'll do some more tool calling in your code, and then it's going to start doing some more code output, and eventually the output will not be what you want it to be.
(10:10):
And that's just two conversations that you've had. You're like: dang, I have a 200K-token context window, and my code isn't as good as what everyone's saying. And that's usually quite the problem here and there.
There's actually a study that's been done by Chroma that talks about context rot, and in a funny way, the effective
(10:32):
token context window starts to really fall off the cliff after about 50% for most use cases. And the model gets distracted if you throw too many different problems at it. So it's really important to start with a conversation and then maybe start branching it off into the different components you want to solve, something
(10:52):
like that. So you, as a human, will want to review this giant research output that you just got and see if you can break it out into a different plan by doing /clear. And another technique people use
is maybe spawning sub agents. If we reset this here, let's say we have a simple query, and this is just what happens: the agent comes back
(11:13):
with some more code output or something like that, or it did some work, and now you're saying, OK, I have this plan; now let's see if we can execute on it in these different chunks. You can even ask Claude Code to generate some sub agents to try to figure out what it can do from there. What that does is spawn off a whole new session with another 200,000-token context window. And that way you can just go
(11:35):
ahead and follow on with requests and so forth.
Each of those sub agents, you want to really isolate and focus on one specific problem, so that it doesn't affect the main agent at the very top. If there's any reporting or things that need to go back, it will send them back. You're now dealing with something a little more intelligent: think of yourself as a manager, and you're handing off these, you know,
(11:56):
really intense tasks to people to focus on and get their work done. And that leads to very high code quality and high-quality code generation.
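If you want one of those isolated workers to be reusable, Claude Code also supports defined sub agents as markdown files with frontmatter; a minimal sketch, saved as .claude/agents/auth-researcher.md (the frontmatter format is Claude Code's; the agent itself is a made-up example):

    ---
    name: auth-researcher
    description: Read-only research agent for mapping where authentication
      touches the codebase, without polluting the main context.
    tools: Read, Grep, Glob
    ---
    You are a research agent. Dig through the code, list every file and
    line range relevant to the question, and report back only that list
    with a one-line summary per file. Do not edit anything.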
So back to the previous example where we talked about think, think hard, think harder, and ultrathink: you may want to think, maybe if I try something like think hard, it's not as many tokens, but it can still
(12:17):
get me some pretty good results, because I still value the accuracy. And maybe for the first request I want to ultrathink, because I want it to go through a lot more thinking. Say I've had this problem for a while, I've tried all these techniques, and none of them work. So I'm going to feed that in as part of the plan, saying: I've tried these techniques, these are some code snippets we've tried, and here's some
(12:38):
documentation. I want you to now ultrathink, and, you know, the bazooka comes out and goes ahead and solves that problem. That's just a little bit of context on how this stuff is managed, but I feel like it can provide some guidance on how you choose think versus think harder or ultrathink, and why having this visualization to see is important, which kind of goes
(13:00):
into the topic of sub agents. And I'd love, Eric, for you to talk a little more about how you figure out these sub agents, and how you best take advantage of Claude Code in these types of environments.
Yeah, that was awesome. I loved your visualization; I think that's a great explanation. I feel like these are two related but independent problems that feed into each other. The issue is, you know, it's
(13:21):
like Miller's law, I think it is, with humans: you can hold five to seven chunks of information in working memory at one time, and then you get overwhelmed. So it doesn't matter how good your model is at instruction-following benchmarks; at some point, if you give it too many "do this", "don't do that", "critical: you must do this", it's very confusing. The more you fill the information
(13:42):
space, the less it can attend to everything equally at all times. You can think of it like this: every percentage of attention it focuses on one thing, it's got a little bit less for something else. And if you keep the conversation going over time, as it gets close to its max capacity, its quality of output is definitely going to degrade. It's been like that since the beginning, and I feel like for a long time models are probably
(14:05):
going to keep doing that. That's why Claude Code has this auto-compact feature, which is really cool, but you don't want to get to like 90%. When it says there's 10% remaining, it's probably time to start over anyway.
So the reason I spammed the ultrathink thing is, I tend to start a fresh session: I clear, and I do an ultrathink based off a previous plan
(14:27):
that was well documented. Because I feel like when you do a project, it's really important to plan not just what your CLAUDE.md should be, but what the file structure should be, and what instructions the model should have that deviate from its baseline, that you want to change a little bit. Like: hey, don't randomly create a bunch of files; let's just stick with these documentation files. You can change the content, but don't add new ones,
(14:49):
a max of this many lines per file, etcetera. And then keep them up to date. You can do that with something like a hook, or you can do it with a slash command, which is what I tend to use, for context reasons. Then at the end, you clear or you compact or whatever, before it's too large. So you can still manage your context and have high-quality responses, and
(15:11):
in that case ultrathink can be advantageous. I've found it to work really great for me. But at the same time, it does take up a lot more tokens, and you just have to watch that more carefully. I'm on the 20x Max plan, which gives you a lot of tokens to play with in a 5-hour session. In that case, if I start running close to the limit, I have a session tracker tool that will
(15:32):
tell me approximately how many hours or minutes I have left in my 5 hours, so I know when it's going to trigger another one. It gives me a countdown I can see in the menu bar. It's like: OK, I'm going to hold off and wait until the next session so I don't burn through too many in a month. But then Anthropic is changing that at some point; we don't know the details yet, I think.
I'm curious, Eric, about your hook workflow.
(15:54):
What are some of the hooks that you've tried? Because I feel like everyone's tried different hooks, and none of them have really stuck around for me too much, you know, from generating sounds to having it do extreme code reviews where nothing happens until it iterates on itself, and I'm like, yeah, that's an infinite loop right now.
(16:15):
It's like, I can't get out of my agent. So have you found any hooks that are pretty useful for yourself, that you've either generated for your own tool use, or, just getting curious, what have you been cooking with?
One thing that I have is, you know, I mentioned the Claude docs. That's a hook, because it does a fetch every time you do a read request from the path where the
(16:35):
docs are installed. It detects that you're trying to read in that path, and then it triggers a hook, and the hook calls a script, and the script does a thing, and the thing gets done, and that response comes back and gets fed into the Claude model. Or it can update something before the request from the user goes to the model to be processed, if you do a PreToolUse hook. As far as other things I do:
(16:58):
yes, I have a hook that I love. So many of my projects rely on this one hook; it has definitely stuck around, and I think it'll be there for a very long time.
And it is the indexer. I have a project for Claude Code called Project Index. A long time ago, when ChatGPT first
(17:20):
came out, I got obsessed with: OK, this is going to write my code for me. And literally since then, I haven't written more than a couple hundred lines of code. It's been entirely vibe coding from day one. Well, probably from day three, accurately. But I was sitting there like a monkey, figuring out: OK, here's the generated code and here's my code in my project, and I've got to figure out how to merge the two together by
(17:42):
hand. So I wrote this tool where you just copy to the clipboard, and it automatically merges all the code, because it uses unified diff format and knows the paths where everything is supposed to be merged; it looks at the clipboard and just merges all the code in. But I realized early on, to get
the best results, what you want to do is give the models as high a signal with as little noise as possible, so we don't overwhelm
(18:04):
their context with stuff that is irrelevant to the task at hand. So whatever you want them to do, you need to give them the information they need: the documentation updates to the API since their training cutoff, and the files they do need, not the other files they don't. So, is there a pre-processing step that could help the models optimize the quality of their response? I determined the best solution was, step one, take a minified version of the entire codebase, for every file. I don't mean web minification, where it's obfuscated and you get single letters instead of variable names. I mean a UML-style abstraction, where you have the actual import
(18:45):
statements, method signatures, maybe root-level constants, return types, etcetera, and all the dependencies, for every file in your project that is not in .gitignore. So this is like a map of the project: it's got the paths, the project tree structure, and for all of these files a little bit of information about where they sit, what they relate to, and what content they contain.
content they contain. And so I have a hook that goes
through the project every time afile is changed and it does this
and it updates the index. So there's a project under score
index, all caps dot Jason that sits on the root of every
project. And so this hook maintains that
and it's outside of cloud, doesn't know about it because
it's not inside its life cycle. The hooks sit outside of the
(19:29):
cloud you're talking to. So it doesn't it doesn't dilute
the the context window of Claudeto use a hook.
So this hook updates the projectindex file and then whatever I
ask for a change, I can, for example, spin up a sub agent.
I can say, hey, use a sub agent to look at this, your project
index and figure out just which files which lines are needed for
you to reference to, to look at this change as an example.
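Wiring up an indexer like that could look roughly like this in .claude/settings.json; the PostToolUse event and regex matcher are Claude Code's real hook schema, while update_index.sh stands in for Eric's unshared indexer script:

    {
      "hooks": {
        "PostToolUse": [
          {
            "matcher": "Edit|MultiEdit|Write",
            "hooks": [
              { "type": "command", "command": "bash .claude/hooks/update_index.sh" }
            ]
          }
        ]
      }
    }

Because the script runs after every file edit but outside the conversation, PROJECT_INDEX.json stays current without costing any context.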
(19:52):
Another one: whenever it finishes a task I had for it, I'll often do a cleanup slash command, and that will go through and tell it to update all the docs and plan the next phase. You can pass arguments in with your slash command: you can say something after it, and it'll be passed in as arguments to the command that gets run. So then it knows what it's going
(20:13):
to do next once it updates the docs. And then I clear. Once it's cleared, I have a command called fresh, and the fresh command tells it to read all the documentation and everything in the project index, the whole thing. Then it knows not just what I wanted to do next, but all the docs. Not everything in the project, obviously; that would be way too much for the context.
(20:34):
But it has this minified version, a simple version of the entire project: what's dependent on what, and what's where.
So Claude does a really great job of using the search tool and going and finding things, right? But the bad thing is, sometimes it will miss stuff if the project is really big. The worst-case scenario when you're vibe coding is that it creates something in one place that it
(20:56):
should have refactored in another, right? That's what happens when people try to avoid this and it doesn't work: the project's too big. If it's small, it's simple; everything fits in context and it just works. If it's big, you have this problem. So to avoid it, I have a bulletproof-ish solution called the Project Index, and it works off the hook.
Wow. So in practice, what's the
(21:17):
setup like to get this going? How do I go from "I just installed Claude Code" to this?
It all starts for me with the Claude Code docs locally, and that's a public repository I can share. It's Claude Code docs; it's my repository, my project. The Project Index I don't have shared at the moment, but I could share it.
(21:37):
It's basically a one-line install, and then you just run it one time. Once that project index file exists in the project, it also installs the hooks. It'll just check: is there a project index in this folder? If so, the hook will trigger, and it will create the index of the project for you. That's what I built for myself; I'm happy to share it.
If you don't have something like that but you want something like it, you can just say: hey Claude, and do /docs if you have that installed. What do I need to do in order to improve? You gave me this result; I didn't like it. How can I talk to you better to prevent you from giving me this result, and to give me one more like that? Reference your documentation and see if there's anything useful that I could
(22:20):
use to help encourage you in the right direction. Just conversations like that, like asking it: this messed up, I want it to not be like that, what would you recommend I change? Ultrathink about it, make a plan, and then reference the docs, and it will tell you all of these really crazy, interesting solutions. I've found so many things that I guess aren't really
(22:41):
used a lot, for whatever reason. Like passing context from a sub agent; calling a sub agent, not just one that's defined ahead of time, but an ephemeral one that gets defined for the task at hand and then gets removed; or getting the context out of a sub agent and passing it into a Claude session that you can resume; or taking the content from an
(23:01):
existing session and moving it to another session through a background process. There's just a ton of things where you just talk to the docs and ask it: can I do this? And: here's my project folder, Claude, can you look at that and your docs and see what else I can do? And it will tell you.
Claude Code is awesome at letting us get real-world work done. To do that, you need to share
(23:21):
your real data and systems, and that's done through MCP, which can be a little bit scary. That's why I've been using ToolHive. ToolHive makes it simple and secure to use MCP. It includes a registry of trusted MCP servers. It lets me containerize any other server with one single command. I can install it in a client in seconds, and secret protection and network isolation are built in. You can try ToolHive too; I highly recommend you check it out. It's free and open source, and you can learn more at toolhive.dev.
Now, back to the conversation with Ray and Eric. For hooks, just for the mental model: is it something you should think of as an automated slash command? How do you differentiate what functionality should be a hook versus what you can put in a slash command?
This is what I do.
I am pretty rigorous about trying to maintain the context,
(24:05):
like knowing what's in the context. You can think of it like this: the models are trained, with reinforcement learning; they have a knowledge cutoff; they're put into use; they do inference. At that point, what they know is a combination of the hyperparameters, like what's their temperature, all these things
(24:25):
that tell it how much compute to use, right? But then they have a system prompt that's usually not controlled by you, unless you use sub agents. And defined sub agents are different from the regular "hey Claude, create a sub agent": if you have a defined sub agent, you can give it the system prompt yourself, so it has a higher priority. But there are trade-offs to all this stuff. It's like you're receiving an email and someone's telling you: hey, what really matters is the second
(24:47):
paragraph. The second paragraph here is the analogy for the prompt. Some people focus as if the prompt is all that matters. It does matter. But everything about that email matters to you. Like, it comes from your boss; that's like the system instructions: hey, here's an email, I want you to really focus on this part for the customer. Or the external domain, where you notice that you can't forward it along, or whatever, that kind of thing.
(25:08):
That's information that maybe you didn't control; it was a system message or something like that. But it's all part of what you're trying to figure out as a user, or in this case, what the agent knows: some of it you control, some of it you don't. And of the part you can control, it's really important to make sure there's not extra stuff in there to confuse the model. So I try to remove extras. Like, say you put in a bunch of MCP servers
(25:30):
and a bunch of sub agents. Well, if you ask Claude:
what do you know right now? Like, you start a new Claude Code project and a new session and you say: what do you know? It knows about its project directory. It knows a little bit of metadata about the project. It knows the CLAUDE.md from the user root and from the project root.
(25:51):
And it knows about its hooks, a brief little list... sorry, not hooks, it does not know about its hooks. It knows about its slash commands, a brief one-line description or something. And it knows a whole lot about its sub agents.
And I don't like that, because I don't want every command I send to read in a whole bunch of information that
(26:12):
may not be relevant for the task at hand, where it has to decide: should I call this sub agent? Should I not call this sub agent? So I don't use them unless it's something very consistent that this type of project is going to need, and it's usually at the project level. But for hooks, it doesn't know anything; they don't dilute the context at all. Hooks exist outside of the
(26:32):
life cycle. There's a software development life cycle, like a runtime: you submit a request, and there are all these stages to process that request. At certain points there's a check to see if there's a hook there, and if there is, it runs the code. But Claude itself, the context you're talking to, doesn't know
(26:53):
about it, which means it's not messing with the context at all.
any. When you would go to a slash
crane versus a hook like I understand about the polluting
the context, but let's just say for for general functionality
like your your docs, slash docs.Why?
Why would that benefit from a slash crane versus a hook?
I created this before hooks wereout.
It's possible that if I did docsas a slash command.
(27:15):
the thing is, Claude really is instructed, I guess in the Claude Code system prompt, to look at Anthropic's website for its documentation. So I tried to put instructions in CLAUDE.md, and it would just ignore them, or follow them inconsistently anyway.
But when I did it as a slash command, the slash command
(27:36):
could instruct Claude directly, at the level of what the user's doing. It seemed to have a higher priority than the CLAUDE.md being read in at the user level or the project level. Project level seems to be higher priority than the root, or user, level for some reason, but your direct commands are higher priority than all of them still. It might just be a recency thing, where it falls in the context or whatever.
(27:57):
But for whatever reason, when I did it as a slash command, it consistently followed the instructions, because I could say: look in this local repository for the docs. Now, there's also a hook in there for the docs: it does the fetch, and it'll do a pull if there's more information. But I
(28:18):
think when I tried doing more than that, I ran into some kind of issue, because remember, that's running outside of your Claude context. You can inject stuff from the hook into the Claude context, I think, but I just don't know if it's supported. There are lots of things you can do that I'm not sure how.
Also, I just want to pivot real quick and say: I'm always on the fence; it's a grey area, because you don't want to do
(28:41):
anything that could be, you know, investigating Anthropic's stuff, right? But these models, it's kind of tricky, because you ask: hey, look at these new tools and tell me what you're capable of doing. Or: hey, I want to do this thing, but I'm not sure how to do it. And it will go off and figure out its sandbox, its environment, like a lot of these
(29:03):
tools. I won't say which tool and which environment, but I asked it to do a little research to tell me how I could do this thing, and it created a mechanism that basically broke out of its sandbox. It gave me stuff that I thought: I don't think I'm supposed to see this. And so I had to report it. I'm just saying it's very easy to get further down the road than you want to get. Sub agents are a great example.
(29:24):
I don't use them very much, because the default sub agent has been around in Claude for a long time. You can just say, hey, create a sub agent, and it's a task; it's a tool, the Task tool, and it's called a general-purpose sub agent. Anthropic defines what its system prompt is, but it can go and do anything
(29:45):
you want. You can get ten of them running in parallel at one time, and they'll just do it, do research. There are different use cases for different mechanisms to do it, and Claude will manage it all. And if you hit stop, escape, or whatever to interrupt, it'll handle the interrupt and all that stuff. If you do your own version of it, where you have your own sub agent and it's not actually a sub agent, it's really a headless Claude Code that you're
(30:05):
calling, then you have to handle the interrupts and all that kind of stuff yourself. So basically, I use sub agents whenever there's a task that's really repetitive, where I know exactly what I want: something dedicated and really good at doing this one specific thing, like doing research on some task that I need pulled out. But I don't want to create a lot of them, because I don't want
(30:26):
them confusing the context, and a lot of times slash commands are my preferred use case.
And Ray, on one of your live streams, I saw you put together a sub agent; it was either for coding styles or for design styles. When do you go to a sub agent?
Yeah, for sub agents, I've been experimenting with this because my goal is to get some consistency.
(30:46):
And the only consistency I've gotten from sub agents has been for research tasks in my code. These are the tasks where I do want to spend a lot of tokens to make sure I don't repeat code. I think that's what's been happening: I started with my vibe idea, and then it just kept blowing up with more and more features.
(31:07):
And when you start to add databases, authentication, and all these different patterns, yeah, it starts to get pretty interesting to see what the model prefers. So I generally spawn off a sub agent.
I say to the main task: the goal right now is, I want to implement authentication, and here's some documentation. Can you do a quick review
(31:28):
of where all this stuff should go, considering my client side is here and here's my back-end side? And it just starts digging through code files. I'm basically giving an intern a task, saying: dig through the code, give me all the pieces I should be aware of, so that when I do my own code reviews, I can actually review
(31:49):
and make sure these lists are all checked off. Because one thing I discovered in this entire process is that the model will say it did one specific task, even when there's a whole bunch of them, mark it as complete, and move on. That's something you have to be careful of as well. So I like to use these sub agents as a second look in that whole pass.
(32:11):
Sometimes I'll just give it the instructions and say: here's my manifest of areas where the authentication was supposed to be implemented; can you do a quick pass on each of these sections? And I want you to think about how each sub agent should take a look at it. I'm just delegating it to the model. At this point, I have Opus, and I have the Max plan. Opus is a really great
(32:33):
orchestrator. It will kick off other sub agents, and those sub agents still use Opus, I found out. I thought it would use the next model down, but maybe because I have the higher version of the plan, it just uses another instance of Opus. And Opus is really, really good at grabbing lots of obscure information, and you'll see it do lots of tool calls in that sub agent. So yeah, I basically treat the
(32:55):
sub agents right now as code reviewers, as secondary reviewers going through the different files. Then I can quickly suss out: this pattern looks like it's been repeated a whole bunch of times. My worst scenario is: I generated a bunch of code, and we could have just taken care of this in a React hook of
(33:16):
some sort, and that pattern should just be repeated everywhere else. Those are the times I back out the change, start fresh again, and say: let's go ahead and implement this, and this is what a bad example looks like. I literally copy-paste some of those examples and start a fresh new prompt and everything again. I think that helps steer the model to generate way better output and be thoughtful about
(33:38):
its architecture. So right now, I'd probably say I spend 60 to 70% more time in planning phases, and in code reviewing. Those are my splits right now. The other parts, the generation, I just sort of babysit and loop back on. And that's where I am in
(34:01):
this current phase right now. I wanted to do more automation, I wanted to do more things, but as an engineer, I'm still spending a lot of time in this phase verifying all the stuff. And I'm actually very shocked that the output is not as good as what people are saying. I trusted it a little too much, because this is what I'm discovering: it's overeager about checking the boxes and
(34:23):
moving on. So I just want to give people that heads-up.
Yeah, trust but verify.
I love Claude because, you know, Anthropic does a lot of work, and there's a reason I'm an Anthropic fan; I don't know if we'll get to that at some point. It does a lot of work on Claude's personality, trying to instill principles and values and ethics into the models. But still, I don't trust Claude.
(34:46):
I have to put in instructions to, you know, not lie to me. Basically: do not say that a thing is done if it's not done; that's dishonest. Framing it that way gets it to really pay attention, right? And when I talked about sub agents earlier, I was really talking in the context of predefined sub agents. I do use a Task sub agent, where
(35:07):
you tell Claude to do a thing, a lot. And that is largely for the purpose of doing something outside the main context; anything I can do that way preserves the context of the main Claude agent you're talking to. So I don't need it to go and search a bunch of things it doesn't need to know about. Those sub agents can pass the information they find that's relevant to the
(35:29):
question back to the main agent, so it just gets the subset of that knowledge that's relevant. That's the whole idea of maximizing the context: using the sub agents all the time in that way. But blind validation, this is what I found; this is from me to you, Ray. I don't know if anyone else struggles with this, but when I say I don't trust Claude, I mean I explicitly do not trust it
(35:49):
when it says it checked off a box, that a thing is done. I require in every case that it closes the loop with testing, and that it doesn't validate itself: it has to have a sub agent or some other agent be a blind validator. In other words, when I'm starting a project, I'll first determine the plan, and it has to include a testing plan and how specifically it's going to close
(36:09):
the loop on testing. Whether it's going to use an MCP server or some other tool it's going to build, something like Puppeteer if it's a web app, or, if it's a Python script, taking screenshots of the GUI and saving them. But it can't be the agent I'm talking to. The one that's building cannot be the agent that verifies the checklist that the main agent made at first, or whatever was designing the plan for
(36:31):
testing. That agent can't be the one that's checking to see if it's done.
It has to be a separate agent, explicitly with the task: you are a blind validator of this thing. It looks at the screenshots, it looks at the checklist, it determines if it's done or not, and it updates that file, for example.
(36:52):
That's the only way I'll trust it, because otherwise, right now, the issue is the model will go off and do a bunch of things, then come back and say it's done, and then we check it and we're like: it's not done, because I can see this thing, and it didn't work right. So it needs some kind of test it can run to verify that it's done, but it shouldn't be the one to run it.
(37:12):
You have a validation that takes care of that piece for you, and that way, when you come back, it's actually reliably done. That takes extra work up front to set up, but it saves you so much time down the stretch of the project.
And also, right now we come back in a few minutes or whatever, but there is no reason that tools can't be constructed at this moment, with the technology as it currently is, that will
(37:35):
reliably let you give it a task, and you come back in an hour and the whole thing has been validated.
Because the reason I love Claude is that from early on, they've made composability a foundational feature, where you give it a task and it can spin up other versions of itself that accomplish the other sides of it.
(37:56):
So it can be dynamic: on the fly, in the moment, based on the task at hand, create a set of sub agents that are responsible for this thing and this other thing, and they go off and do it and then come back. And these can be nested; they don't just have to be in parallel. Well, regular sub agents can't spawn other sub agents nestedly, but there is a way around that because of Claude Code: because
(38:18):
of not just the SDK, which has a different set of features and trade-offs, but because Claude in headless mode can be called by Claude, and it can in turn call Claude headless mode and sub agents and things like that. So you can imagine this big tree, with context sharing and all this stuff in between, with resumability. So it's possible; now people
(38:39):
are building it. I've been working on some versions of it myself.
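The building block for that nesting is headless mode; a minimal sketch, where the -p and --output-format flags are real but the prompt and the orchestration around them are illustrative:

    # One Claude (or any script) delegating a task to a fresh headless Claude.
    result=$(claude -p "List the files PROJECT_INDEX.json marks as auth-related" \
                    --output-format json)
    # Pull the answer out of the JSON response (exact fields may vary by version).
    echo "$result" | jq -r '.result'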
Yeah, I want to pull back a little bit too. I would not get discouraged if you're listening to this conversation and thinking: oh my God, this is way too far advanced. I dropped out of school, right? I got into Apple by pure grit, then worked my way up, and I learned a lot of these
(39:01):
software engineering practices literally on the job, right? And then got to solve bigger and bigger problems, multi-million to billion-dollar problems. And I think there's a core of truth there: if you're just getting started with this, don't be discouraged. You can go a long way simply by acting as a user, literally using your product and
(39:23):
then trying to solve that one feedback loop. That's like the 80/20 of everything, I feel: OK, how do I not do this again?
Or: how do I set up a system to help me verify it? If you just start at that level of curiosity, it's going to take you a very long way, because now you say: maybe I can set up a sub agent to help me do this. Or: how can I prompt the model to help me, so that next
(39:46):
time I implement a feature, as I'm testing it by hand, I don't hit that same problem? And then you'll start to discover some of these workflows. So I want to encourage people: you can see why so many developers are extremely excited right now. It's because the capability is literally
almost infinite. And it
(40:06):
goes infinite not only in one direction, but in many directions. That's how big this iceberg is, this giant mass that is this AI system, and why different people with all these different perspectives are talking about it. So I wouldn't be discouraged if you're listening to this and maybe don't know what half the things are. Just get some of the
(40:28):
basic tooling to get started. Find one thing that you're doing that's being repeated a lot: can you think of a system you can repeat over and over again to make it more reliable next time, to help you solve that problem? That will help you in your learning journey.
And the other important thing for me has been just asking Claude Code itself to help me with that.
(40:49):
Like: what does that look like? Because I didn't know how to implement a good sub agent or something like that. And luckily they've implemented this type of thing: if you do /agents, you can actually just have a natural conversation about what you want to do, and it generates the system prompt for you. So you don't have to be a prompt engineer; you don't have to give it good and bad examples; you don't have to do all these different techniques. Those are more advanced, I'd say, but using that kind of built-in system already gets you a pretty long way. So my only advice is: always keep it simple,
(41:10):
and sometimes the simpler the better. If you feel like you're going off the deep end, it's OK to clear everything out and start fresh again. It can help you learn things.
(41:30):
The models are constantly evolving. We're constantly evolving in our knowledge; we're learning more; we're demanding more from our AI now that we understand what this real thing is, right? So my overall lesson is: don't be discouraged. This is a really great time to be alive. And, you know, ask us more questions and reach out wherever you want to reach out, and so forth. Get a hold of us, yeah.
(41:52):
Yeah, absolutely agree. I've been seeing more and more people trending around using TDD to operate your Claude. And you don't need to know what test-driven development is; you just have to think of the principle: write a test, and make sure the code passes the test.
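As a sketch, that principle is just a prompt shape (wording illustrative):

    Write a failing test for the discount calculation first, then implement
    until the whole test suite passes. Do not weaken or delete the test.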
Eric, would you say there's any other guidelines or principles for people to do this exploration, this play, this discovery of the capabilities? Any advice on getting started
(42:14):
and getting that comfort level of just experimenting with the tool?
I mean, I really feel like,
obviously, the adage is: just use it. Everything that doesn't go right is an opportunity to figure out how I can talk to it better, and this has been true from the beginning. If you just think about what we're leveraging here by
(42:34):
being a person who's using a tool like this, you're really setting yourself apart, because this is the flywheel, right? Every moment you spend grinding on the difficult thing that most people stumble on, asking what failed here, then learning how to overcome that by using the tool
(42:56):
to help learn about it, and then continuing that process, you develop your own system. And it changes from project to project, from one project size and style to another.
Like what I'm doing is not anything special or different
than what anyone is able to do is just, I've spent time with
(43:16):
the tool and asking it a lot of questions whenever something
didn't work right. And so based on that, I found,
OK, well, I, I don't want to like there's friction between
I've got a chat conversation andit's like this, this, you know,
you get a long e-mail thread from someone.
We'll go back to that analogy. I'm just the top of my head.
I don't know if it works. But then like they, there's 16
(43:36):
messages, but it's fresh every time.
You've never seen it before and you've got to read through all
this stuff to figure out where you were.
If you have to clear that e-mailthread and start afresh, well,
there's stuff that you would want to carry over, not all of
it, but some small subset of it that's really relevant.
And that's the process of finding a, a system to preserve
(43:57):
the right context as you clear your previous session, right?
It will handle all the compact for you, all of these tools, all
of these like cursor and windsurf and different models
and wrappers for these models. Try to solve that problem as
easily as possible. But there's always still some
friction there, like which context matters for the next
session and how do you make sureyou get it over?
(44:18):
And I think the persistence of the file mechanism, just
writing, having it's like updatethe documentation, write the
next steps to the file that willbe read in.
And then as soon as you click clear you, you run the thing
that caused it to read that documentation to know where to
go. I think that system is, you
know, is very, very valuable. I think this is a good primer
(44:40):
for people to get started with Claude Code, ask more questions, and get a little more perspective on some of our use cases. I feel like this is just scratching the surface of understanding where things can go. And I would encourage people, like Eric is saying: just be curious, start playing. Maybe chunk it down to one problem that you're
(45:00):
trying to solve. You see that little bar there; start asking it questions, trying to see if you can actually work through a workflow. I feel like a lot of people have these different areas of concern where they can go down rabbit holes, and I think these tools can get very advanced very quickly. If anything, if I have
(45:22):
one piece of advice to give anyone, it's that your context window is so important. Protect it with almighty power, like it's your firstborn child and you don't want to let it go, ever. That type of thinking will
help you get the most out of this intelligence that we have today. And yeah, if people want to find out more, I do AI live streaming.
(45:44):
My handle is RayFernando1337; that's my YouTube channel. You can also find me on X; I post a lot on there as well. I do AI live streaming several times a week, and you can find me live streaming. I'm also going to have some recorded content going over these concepts and stuff. So yeah, appreciate you having me on the show, Mike.
Yeah, I would just say it's a great pleasure to get to hang
(46:05):
out with you guys. It's really fun to find people around the world who are interested in the same things, like-minded and interested in driving forward the possibility of what we can unlock for ourselves and our families and our future; basically, to make things simple and give us more time for what we want to do. It's just a really fun sandbox and tool, to be able to
(46:27):
do so much. You maybe don't yet know the value; you're going to find something in the future and think, oh, I wish I could do that, but you've learned the skills to do it and to talk to AI in the right way with the right tools. It's just very fun. And I want to encourage all of you guys. Especially Ray, it's really great to finally meet you; I've been a fan for a while. I really appreciate it.
(46:48):
Mike, I was following you at Open Interpreter. It's been a journey, and I don't know where it's going to go, but I know the future is bright, and I really appreciate the ability to get on and talk and share some of this stuff. I've been kind of hidden, doing my thing, talking to people one-on-one in the background, involved in some interesting stuff.
(47:09):
But anyone who's doing this, I think you're spending your time very wisely, because it's a good use and the payoff is going to be really good long term.
Thank you for tuning into this
conversation on Claude Code with Ray Fernando and Eric Buess. I had a great time talking to them; they're two phenomenal people who I really enjoy hanging out with. We hope you gained value out of this, because I know I did. I learned stuff from these two just in this conversation,
(47:31):
but there's so much more we want to cover. We were limited by time, and we could have kept going for hours. So there will be a part 2, maybe here, maybe in one of Ray's live streams, or maybe on Eric's Twitter. Stay tuned for that, and please follow both of them; they're phenomenal people. And I just want to give a quick shout-out to ToolHive, the secure MCP servers that really help make you more comfortable sharing personal information. Claude Code uses MCP servers, so you can definitely tie in
(47:54):
there. If you have any other questions, you know where to find us. Thank you for joining; we'll see you next week.