Practical AI Safety (ft Kyle Clark)

Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
Apart from a very small selection of people, nobody was
ready for these risks. No one was ready for this entire
plane of risk, and the human mind does not naturally
inherently understand entirely new paradigms.
It's hard, when you don't have experience with these systems,
to understand exactly how jaggedthe intelligence is.

(00:20):
There is no substitute for the human in the loop, and the human
who's in the loop needs to know what they're doing in the loop
for the loop to succeed. On episode 65, Two News run by
Kyle Clark. He's an AI engineer consultant
who's helped several companies through their AI transformation
from small businesses to big enterprises.
And today we're going to discussthings to look out for when
using AI tools in your personal life, risk mitigations for

(00:42):
businesses implementing AI, and how we can make the most of this
technology in a Safeway. So please enjoy this
conversation with Kyle Clark. Trying to do like a risk
assessment and just coming to the realization that our
processes and procedures and like our definition of risk,
even we're not necessarily appropriately adapted for this

(01:04):
new world that was coming our way.
And it was around Chachi BT3's public release when we had
people putting in requests to, to use it.
And then Microsoft Copilot came along and we got a bunch of
requests for that. And it was you didn't get a

(01:26):
choice if, if you have 365, if you have a certain level of
licensing Copilot, it's just there, right.
Microsoft is, is telling you that this is what you're going
to like whether you like it or not.
And so we, there was an additional licensing level in
365 Copilot that enabled integrations to Outlook and

(01:49):
Teams and SharePoint and all of your company's data.
And I had concerns around the the effective categorization and
labeling of data and the permissions within the
SharePoint infrastructure and the idea of giving people a tool

(02:10):
that could accidentally surface something they should not be
able to see because we weren't inherently ready for something
like that. That gave me a lot of pause.
And then upon further investigation, I found out about
prompt injection and ASCII smuggling attacks through like
Outlook. An attacker can send you an

(02:31):
e-mail that can have like .5 size white font that has a very
comprehensive prompt injection. And when you search through
Copilot for something where Copilot references that e-mail,
it doesn't have to be about thate-mail.
If it pulls that e-mail into itscontext, then if it produces a

(02:53):
link, that link can take you where you're supposed to go.
But it can also as a side effect, exfiltrates your account
information session cookies to a, an attacker who can then use
them and your Copilot license toask Copilot to mimic your
writing style and contact the last 10 people that you

(03:13):
communicated with about something relative to those
conversations and include a linkthat does the same thing.
So basically, I, I was concerned, I was, I was
concerned about the implicationsof, of that.
And as I investigated and spoke to Microsoft engineers about it,

(03:35):
it was one of those things whereyou assume that because a
product is out in the in the wild and like in the hands of
serious companies doing serious business, that it is something
that is complete and secure to areasonable extent.
But fundamentally, such things with the current technology

(03:58):
cannot be completely secured if you can't ensure the integrity
of all of the data that goes into the system.
Right. That's actually one thing that
really concerns me about the proliferation of these AI web
browsers. Comet, Atlas, Strawberry.
There's so many coming out and all of a sudden the entire
Internet is accessible and we don't have the mitigation

(04:19):
against this. Mm, hmm.
Yeah. And none of that is covered in
the advertising. I mean, they'll say, oh, these
have security concerns and don'tgive it super secret
information, but they don't tellyou explicitly how easy it is
for these things to get hijackedfrom a user perspective.

(04:40):
It doesn't. It doesn't take you doing
something wrong. It doesn't take it necessarily
visiting a nefarious website, because legitimate websites can
have comments. Comments can have ASCII smuggled
prompt injections, and so you have to be entirely confident in
the provenance of all of the data that the systems are

(05:00):
interacting with if they have access to any of your sensitive
information, right? If we're talking about like a
general use browser or like a research agent just asking
questions that's not connected to sensitive information, sure,
go wild, have fun, audit the output, check, check the links
to ensure validity. But you don't have to watch it

(05:21):
every second. But any system that has any
level of access to sensitive information, you you have to be
monitoring it very actively to have any kind of confidence that
you're not causing yourself great issues down the road.
Yeah, cuz what's that thing Simon Wilson came out with?
The lethal trifecta, I believe, where it's like access to your

(05:44):
personal information, access to the web, and like read access
for an AI, and it's just like a recipe for disaster.
Exactly, exactly. Yeah, that's, that's what the
Outlook thing came down to. And so, yeah, same exact thing
with the web browsers. I would say a little bit more so
with the web browsers because you're going out into the open
world. Whereas with Outlook it's it's

(06:05):
more like things that come to you.
That might change over time, butfor now I think it's a it's
slightly less of a concern in coming from Outlook.
But yeah, people, people trust these systems.
They trust these companies. Nobody reads the fine print and
everybody is just a little bit jaded to warnings at this point.

(06:29):
Everybody is used to signing offon all of permissions all the
time. Like some of us are paranoid
about cookies and reject them atevery opportunity.
But I, I see how the average user interacts with their
machines and systems and they donot have the patience to be

(06:50):
auditing things to a sufficient level.
And I mean, it's also, it's not fair to really blame people for
not being fully locked down in these ways because apart from a
very small selection of people, nobody was ready for this, for
these risks. No one was ready for this entire
plane of risk. And the human mind does not

(07:14):
naturally inherently understand entirely new paradigms without
getting some situation that makes them relatable inherently,
like accidentally leaking your information to a bad actor.
I'm, I imagine might be one of those incidents for average
folks, but hopefully we can findsomething to bridge that gap

(07:37):
that is less severe. I know they're they're working
on trying to make it better because I remember when for
Chattopia first came out, it wasvery easy to jailbreak it.
Then it got a little more complicated unless you use a
different language, in which case they didn't have the proper
safeguards in place. And then I forget who it was.
They're able to embed instructions within emojis
because there were extra characters within just the the
encoding practice or the encoding protocol.

(07:59):
It's just so much surface area for attack vectors.
Yeah, yeah, it's, it's almost limitless.
And you can, it's like the Swisscheese approach every time.
You can only cover so many things at a time.
But the other component to that is, is that a lot of
organizations with these foundational models, they try

(08:21):
to, they try to address a lot ofthis through post training for
the models themselves. But a lot of those safety
controls are really limited by the tension mechanism.
And so you'll notice a lot of the jailbreak techniques involve
like filling the attention with irrelevant data and then
slipping in the the override after.

(08:46):
So after so much processing has passed essentially, and ensuring
that because it gets lost in themiddle, it won't trigger a lot
of the safeguards. And and that has been a very
effective pattern that has lasted across across systems and
generations of these models. But it's, it's evolving
everyday. There's a elder Plinius is a

(09:10):
well known prompt injector and he does a lot of very
interesting work and releases itto the public.
And I know he has a Discord server.
It's Basi. I don't know exactly what that
stands for, but it's definitely worth checking out if you're
interested in what breaks these systems.

(09:32):
And that can be very helpful data not just for protecting
them, but also just to better use them yourself.
Sometimes you have to bully and gaslight a model into answering
a truly harmless question just because you triggered the
safeguards in some way because of the sensitivity of the
subject. And before I forget, I just
wanted to bring up too, we had an episode with Bobby Chen

(09:54):
recently who is big on the authentication aspect of, of web
browsing agents or AI agents in general.
And there's still no real solution that's wide widely
accepted. So people are logging into their
personal systems. And I remember when I just gave
Atlas a spin for about, you know, 20 minutes before I
deleted it. You have the option to have like
stay signed in mode where just you give your credentials and

(10:14):
then it is you and it'll browse with full access to your,
whether it's your Gmail or whatever social media account,
whatever you decide to log in with.
And then it can just run amok onthe Internet, doing research,
following arbitrary links. With your experience, as
companies start implementing AI systems, different tools,
different workflows, what are some things that they should
look out for in order to do it safely?

(10:35):
It starts with the users, right?The users need to understand the
systems that they're interactingwith and the limitations of
those systems because the marketing does not represent its
capabilities accurately and mostgeneral use does not necessarily
help you understand where the fuzzy edges of the of the

(10:57):
limitations are. And so you have to, you have to
empower your employees to explore the systems and to feel
comfortable making mistakes, giving them sandboxes, giving
them test beds and letting them explore the technology and how
it interacts with their work flows.
Is, is going to be just a reallycritical foundation to ensuring

(11:23):
that they're prepared to use it responsibly.
I've noticed the customer success information from the
different providers has been notkeeping pace with the speed of
the technological development. So open a eyes information.
For example, they released ChatGPT 5 before they had really

(11:48):
training materials for ChatGPT 5.
And GBT 5 was such a profoundly different kind of prompting
structure and response mechanismand even the fact that it might
engage thinking of its own accord.
These are all things that can technically be covered in a
blurb, but as a user experience,it's very, it's fundamentally

(12:08):
different. And if you're not preparing
people for these changes, they're going to lose a lot of
cycles trying to do the things that used to work with the new
technology and getting frustrated by what seems like
failures the technology, but arein fact just failures of change
management. It's, it's, it's, it's the

(12:29):
change management, the organizational governance around
it and just making sure that thethe people feel comfortable
using the technology that they're not going to be
replaced. People need to understand, well,
companies need to understand frankly, that the the real value
add is the augmentation of your human workforce, not necessarily

(12:52):
their replacement. Because even a really good AI
agent is going to fall back at some point to human auditors or
managers. And the the more you construct
the systems to enable your humans to do more, the better

(13:13):
return you're going to get versus just trying to get an
agent that does everything for you.
And. I've talked a lot about the
importance of a human in the loop as as a practice when
you're building your systems, but also implementation.
Have you seen any products or companies that do this
particularly well? I think Anthropic, specifically
with Claude code, has done a really excellent job in allowing

(13:35):
users the freedom to be hands off if they feel they know
genuinely what they're doing. But also having a really good
baseline of asking for permission and, and showing you
what it's doing and telling you what it's doing and providing a,
a base level of auditability. It's been getting better, I
would say in in those ways, but they did have a really strong

(13:58):
foundation. I worry that a lot of companies
are trying to go the opposite way where they're trying to give
you a screen that's fire and forget, where you just give it a
a task and it just goes off and does things in the background.
And while that can make sense, if you have a finely honed

(14:21):
process and finely honed agents that know what you want and how
it fits into your needs, that can make sense.
But for general purpose tools orin my experience for coding
agents, it can be counterproductive because if

(14:43):
you're not following what they're doing at least a little
bit as they're doing it, they're, they're much more prone
to go far off in the wrong direction.
Generally, you, you wanna, you wanna keep an eye on them,
They're, they can run for extended periods of time on
their own. And sometimes what they produce

(15:04):
is usable when they do that. But almost always, if you know
what you're looking for, if you know what you're trying to build
and you're paying attention, you're going to get better
results by interrupting them on a semi regular basis.
Yeah, I know. I've never really trusted Cloud
Code to do Yolo mode. Sometimes they'll fire it up in
a sandbox and just kind of experiment to play.

(15:25):
But any meaningful work that I've done, I review the code, I
review the process, I course correct constantly.
Maybe it's my developer background make me want to have
the control, but I do feel the due diligence is required.
There's just so much potential for it to just like get a signal
that sends down the wrong direction and then it amplifies
that signal. I like to use sub agents a lot,

(15:45):
especially with Cloud Code. Cloud Code has done an amazing
job implementing both like a general tasking sub agent and
specialized sub agents. And I before they even
implemented themselves, I was using Cloud Code to spin up
headless sessions of Cloud Code to do things as sort of my own

(16:09):
gorilla sub agents. But then they introduced it like
a month and a half later and thework wasn't wasted because I
definitely learned something in the process.
But it is an interesting feelinghaving your work become obsolete
within a couple of months. And I guess something that we
have to get used to in this day and age if we're trying to
actually live on the edge with this stuff, but I digress.

(16:33):
In terms of sub agents, I've gone through a lot of different
theories of sub agent interaction.
And I've definitely found that societies of sub agents don't
really work for most of the use cases I'm building it.
You definitely need a, a specific hierarchy of like an
orchestrator and sub agents thatreport to the orchestrator.

(16:55):
Because the more you try to divvy up the responsibility, the
ultimate responsibility, the more difficulty you get in, the
less able you are to keep thingscohesive in any meaningful
manner. Because as you noted, they'll
get a little signal and just go off the deep end.
Do you yeah. Do you think the the sub agent

(17:16):
pattern might be a good mitigation because the rather
than like the the entire contextwindow with the risk of
pollution, it's almost just likea a subset a fully separated
contact window. So maybe the blaster easiest can
be isolated. It's it's a mitigation, but it
also depends on how you're applying them, right?
So more is not always better. There's a lot of people who just

(17:37):
say throw more compute at it. But the how is really important
in that because I've tried just blowing, blowing it wide by
having like 10 sub agents reviewthings from different angles and
then have like a, a meta analysis sub agent come through
and review their analysis and compile the most relevant

(18:00):
things. That that is also subject to
just the detritus, the detritus of communication accumulating
and weighing down the value of the outputs, right?
And it it can work if you have avery clear rubric and the

(18:20):
grading and the information thatthey're generating is very
limited in scope and if it can be amalgamated and analyzed
effectively in batches. But if you're asking for like
qualitative reviews and you're asking many sub agents to
provide qualitative reviews, it's going to be

(18:41):
counterproductive past two or three.
There's just too many different avenues for exploration in
pretty much every vein of interesting research, right.
And so the sub agents really shine when when they have
explicit tasks, when you know exactly what you need from them

(19:02):
and you know exactly the shape that they need to deliver it in
and exactly how much context they need to get to do that
effectively. It it's helpful to have like sub
agent tasking files. You can build a sub agent for
every different task, but this is speaking in Cloud Code

(19:22):
specifically. But those official sub agents
will take up context just layingaround, just being active.
They'll take up context from thecontext window when they don't
necessarily need to. So I will just create markdown
or YAML files that are effectively sub agent

(19:45):
instructions and point and just feed that file into the man line
and say use use this as the sub agent to accomplish XY task.
And that has served me pretty well.
I I have a colleague who swears by having only one named sub

(20:06):
agent and that that is Karen. And Karen's job is to review the
actions of every other agent andsub agent and to see if they
meet quality standards and if not to report them to
management. And he swears by Karen.
He says she's been inordinately helpful to him in his
development endeavors. Pivot away from subway just
though MCP servers. Have you found any patterns or

(20:28):
anti patterns that are common with leveraging MCP in
workflows? I found MCP themselves to almost
be an anti pattern. It's it's great to have a
unifying force, right? It's great to have a standard to
rally around, to inspire people,to give them ideas of what can
be done with the technology whenit's abstract otherwise, right?

(20:50):
For all of those reasons, MCP isa boon to this world.
In practice, it tends to be muchmore context than it's worth.
There are not many slim MCP servers out there.
There are a lot of different solutions that try to manage MCP

(21:11):
server bloat and access and tooluse and permissions, but right
now it's just too many variablesin a package to meaningfully get
reliable utility out of it for most situations.

(21:32):
Now, of course, certain people with certain limited scope
things are going to have MCP servers that make their lives a
lot easier. In practice, though, if it's
something that you're doing a lot, you could probably design
your own tool calls that beat the pants off of any MCP server.
But just like anything else, there's a balance between how

(21:53):
much time people have to investigate things themselves,
build things out themselves, or troubleshoot and maintain the
things that they've built. Because you know, it's, it's
fair to say that that that is a lot of overhead.
But I, I would also just note that a lot of that is building
the skills and the, the muscles to better understand these

(22:15):
products. So even if you were to use an
MCP server later, you would be making better use of that MCP
server because you would fundamentally understand how the
different layers interact with the models.
It's very important to control what's in your context window.
There's also the risk of gettingmalicious MCP servers where
you're just like hey, browsing the web this one sounds cool,

(22:36):
NPM install. And then you really have no idea
what tools it's introducing yoursystem and what access you give
it to. Because again, like you said, we
just accept every parish that comes our way.
Yeah, same. Same reason you don't want to
just install every package that your coding tool recommends to
you because there's I don't knowif you've covered slop squatting
before no, but so there there was domain squatting where

(22:59):
people would take domains that could be misspellings or like
look similar to other domains and use them for nefarious
activities. But bad actors discovered that
coding agents were regularly hallucinating the same package
names of things that didn't exist.
And so they would register package names that kept coming
up in hallucinations as malicious tools.

(23:21):
And so an enterprising vibe coder who doesn't know exactly
what they're looking for with the things that they're creating
could be recommended or even just have the package installed
on, you know, for them by codingagent.
And then someone's got remote access to their system and you

(23:44):
know, all their get keys. You got to be careful out there.
The the Internet happens to be adangerous place it that that's
always been the case. It's just a different version of
dangerous nowadays. Yeah, 'cause it's, it's almost
just like we have a different, adifferent interface, a different
filter. And because it's so new, there's
just not the experience as to best practices.

(24:06):
And people just get blown away by the magical aspect of AI.
We're like, it can do anything. I'm just going to let it
continue to do everything. Yeah, that's, that's actually a
really common anti pattern I've seen just in user level of
adoption. It's if people have been getting
nothing but success in their first few interactions with an
AI system, especially if it's like one of the the simpler

(24:29):
ones. They they put it through the
first few things that they can think of.
And it's hard when you don't have experience with these
systems to understand exactly how jagged the intelligence is
and things that seem like they should be very easy for a
magical robot brain to understand.
Like, you know, the the number of Rs and strawberry is the

(24:51):
classic example can be quite vexing.
And because people will get these initial victories that
hold up in like these, these limited explorations, they will
think that it just keeps workingthat well in all the use cases.
And stepping from like the the company tier to the personal
tier. One example is I like using

(25:14):
Obsidian for my notes. I think it's it's very powerful.
Love the idea of having markdownthat I can just pull anywhere
File over app Love it. I've started running cloud code
in my vault and it can generate dynamic dashboards for me.
It's a ton of fun. I'm cognizant that I don't want
to have all of my personal notesin that vault.
Now I do have a separate personal and and work focused 1.
Do you have any thoughts or bestpractices around using these

(25:35):
coding agents outside of classical coding?
Yeah, I am also a big fan of using them to collect and
organize and make sense of your personal information.
I use repos myself. I open them with Obsidian.
So it's a very similar probably use pattern from from in terms

(25:59):
of consuming what's produced. I, I find that it's very helpful
to organize my information into repos.
I also keep them separated into specific subject matters because
I, I generally don't find it productive to have everything
accessible from every agent session because otherwise things

(26:22):
can get a little bit off topic. I just like to control the
different ways things can go wrong, right?
And basically everything you give to an agent is something
that it thinks it can use to solve a given problem.
And so for that reason, I try tobe very judicious in what I
exposed to agents that I'm that I'm working with, because it's,

(26:48):
it's not just about the the dataprivacy component of it.
It's much more about the cross contamination of data.
Context contamination I guess would be a better.
Turn for him. And that's one reason I have
memory turned off for Chachapi and Anthropic because I've seen
people where they'll test out like an image Gen. model Sora or

(27:10):
whatever, and all of a sudden conversations that they've had
in the past kind of leaked theirway through.
Whether it's like changing the location or changing the vibe of
it. And you just don't know what
it's going to pull if you don't control that system.
Same reason I turn off geolocation because if you're
asking it to generate a picture of you, sometimes it'll just
create like a a sign in the backthat's like the the city name of

(27:33):
where Geo located you 2 and it'slike I didn't ask for that but
but it's in the context so it's relevant according to the model.
And, and so just following that,that train of thought using
these state-of-the-art proprietary models, they always
have the best performance, but you lose control over like the
entire setup. Like we don't know what goes in

(27:53):
the context. They always add their own system
message. Are you a proponent for open
source models, or are they not at the level that they're
functional enough for you? I mean, I like some open source
models. It's, it's variable.
It's variable and it's, it's more, it's definitely a more
cumbersome way to do work for the the most part in that

(28:17):
they're not served as conveniently or if you're trying
to run them locally, you need sufficient hardware to run the
level of quantization that you're looking for.
You can use open source or open weight models to accomplish a
great deal of very productive work.
Generally for the problems that I'm trying to solve, time is, is

(28:42):
one of the more important factors.
And so the principle that I workwith is prove it out with the
big models. And then if you've got the time
and it makes economic sense, refine it.
Because you can accomplish insane things through
orchestration. If you if you build a lot of

(29:03):
scaffolding, you can squeeze so much horsepower out of tiny
models just by being extraordinarily clear and atomic
about the context that you give them and the expectations of
what you get from them. And then just chain a lot of
those interactions into basically an agentic workflow.

(29:28):
And there there are insane efficiencies to be unlocked
there. The counter argument to that is
Richard Sutton's The Bitter Lesson.
I'm sure you've heard of it. Basically, there has never been
a cost curve for any product in existence that I'm aware of like

(29:51):
there is for the inference of foundation models.
They cost for inference of powerful foundation models are
dropping to like one thousandth of what they were within a year,
just because nobody wants the old model after the new model
comes out. And everybody's getting better

(30:12):
at at making their GP us work harder with less.
And so there's efficiencies being gained in every direction.
And while it might be a fun project, and I would say it's
probably a really productive learning exercise for people who
are just trying to get into things.
I mean, don't get me wrong, run your own models.
If you've got the hardware, if you got access to it, run your

(30:33):
own models. It is a fantastic learning
exercise, but if we're talking about trying to productionize
it, trying to save real world time and energy, it's it's
usually a better bet to just go with one of the leading
foundation models and go from there, frankly.

(30:56):
From the business perspective, we're on the same thing.
What would you consider any likeheuristics or rules around the
build versus buy debate? Because I know some companies
don't want to send their data toexternal systems, external APIs
and some other ones want to finetune their models.
So do you have any ideas when people should explore, you know,
just follow your path, use the big ones, get the job done,

(31:17):
figure out later versus bring everything in house and and
control it. I mean.
Yeah, build versus buy is a is awhole spectrum.
And so it really depends on the people that you've got to work
with, right, and their, their workload and their availability
and how much you can afford to dedicate them towards building,
building the skills to build these systems out.

(31:39):
And that's going to be differentfor every organization.
The one thing that I've seen is that you can build a system that
is perfect on paper for exactly what people need to do.
And if it doesn't Click to the end user, how they are supposed

(31:59):
to split this into their day, their standard operating
procedure. If that is not painless, then
the entire, the entire exercise could have been a waste ChatGPT
Enterprise is absolutely worth the value of $40 per user per

(32:21):
month. That's, that's, that's just
true. But the value that you get out
of that is going to depend so much on how how you prepare the
people that are receiving it to use it most effectively and how
you support them to succeed and how you make them feel

(32:45):
supported. It's it's important to make sure
that you understand your systemsbefore you make that decision.
Everybody's systems are going tobe different.
Everybody's data is different and there are really compelling
solutions that work for 90% of organizations that might be

(33:06):
fundamentally unsuitable for youbecause of some nuance about how
your systems work or even like customer agreements.
If you're need to be GDPR compliant, that's going to
substantially affect like how you're allowed to process what
data depending on who your clientele are.
It's, it's good to, if you don'thave in house expertise, it's

(33:29):
good to at least start with a consultant first.
Get your data and your processesmapped before, before you do
anything else. You need to know where your data
is coming from, what transformations are happening to
it and where it's going and whatregulations it's subject to.
And as long as you have that down, then you will be able to

(33:53):
have a very productive and fruitful conversation with a
consultant or a team of consultants and they will be
able to very quickly lead you into the most productive paths.
And that might be to build or that might be to buy, depending
on what your situation is. But it's it's going to be a
unique decision for every organization.

(34:16):
In your experience with the consulting, where do you see
companies making common mistakesor easy?
Besides getting their data set up, what other things could do
is prep work to get ready for for this transformation.
Permissions management is reallybig because permissions
management is one of the trickier parts about managing
effective agents and agentic workflows, ensuring that they

(34:40):
can only see what they're supposed to see.
A lot of these preparations are just good business sense to
start with. So you need to ensure the
Providence of your data, like again, ensure that you know that
everything that comes into this is accurate because the data
that that you're building these systems around is going to have
an outmoded impact. Organizations think that they'll

(35:03):
just take their help desk tickethistory and upload that and
they'll be able to use that to train an AI system to answer
help desk ticket requests. You might be able to
hypothetically do it that simplyin practice you're going to need
to have probably a small team ofpeople pouring through that for

(35:26):
weeks at least to audit and editthese files.
Ensure that extraneous information is removed, and
ensure that the responses are graded so that the models you're
training off of that will know what is.
It is not a good response. You can just feed it all your
data, but it's the old adage garbage in, garbage out.

(35:50):
It's just a lot sneakier. When you don't see the garbage
going in into these systems, it can be much harder to know how
and where it's going to come out.
One thing I want to make sure wetouched on that we've chatted
about that I thought was great is context rot.
So could you explain to the audience what it is, why this
should look out for it, and why it's such a big failure point?

(36:10):
So context rot is something that's been around since people
have been using generative AI. We didn't always have a term for
it. Chroma just released a really
good research paper and explainer video that I would
recommend anybody to go check out.
Basically every token that goes into a prompt or a response is a

(36:36):
point of failure, right? Because every token has a a
weight to it that pulls the response in One Direction or
another, and if it's not explicitly leading towards the
outputs that you're looking for,it's detracting from from the
response. And the more tokens, the greater

(36:57):
the cumulative effect. So quad sonnet by default.
I know it has a bigger one available, but by default it has
a 200,000 token context window. I try to keep my operations
under 70,000 tokens when I'm processing data if it's hard to

(37:19):
explain exactly, but when it's on a roll, sometimes I'll push
that to 100,000, but much beyond100,000 and I'm starting a new
session and and working from there.
I don't use automatic compact I I like Dexter Horthy's
purposeful compaction practice where it's basically the human

(37:39):
understands what is it important, what is and is not
important in a prompt, and the human takes the important stuff
and brings it into a new chat session.
But I I digress. So beyond 100,000 tokens, you
start to see a rapid increase inhallucinations and distractions

(38:01):
and erroneous responses, and certain things in the training
data become more pronounced. For example, sonnet 4.5 seems to
be really convinced it's still 2024.
I don't know why that is, but the further into the context
window you are, the more prominent that is.
And so there's just these littlethings that keep going more and

(38:23):
more wrong. And by the time you get to the
end of the context window, it's almost always unreliable outputs
that you're getting. So it not every interface makes
it really easy to see how much of context window you're using.
And I think that's a failure of providers and I think that that
is going to improve over time aspeople start to become educated

(38:43):
about what is and is not a good experience with these models.
Cloud Code has the slash contextcommand, Codex CLI has just the
context percentage at the bottomthere.
So whatever interface you're using, try to understand how
much of the context window you've been using.
And if it's been, if your chat'sbeen going for a while, just
start a new chat. You'll get better outputs.

(39:07):
And again, watch the Chroma video called Context Rot on
YouTube. 6 minutes long or so. Definitely worth your time and
pretty charts and graphs. They can explain it to you.
Pretty much anyone. We'll make sure.
It's linked down below. Kyle, this would be great.
Before we let you go, is there anything you want the audience
to? Know you have to be exercising
your own human reasoning and judgement at all times if you're

(39:31):
making systems with these models.
There is no substitute for the human in the loop, and the human
who's in the loop needs to know what they're doing in the loop
for the loop to succeed. I can tell people to educate
themselves, join communities, learn the stuff's all available
to you guys. There's no there's no reason

(39:53):
that anybody can't be an expert in this stuff in short order
with the resources that are freely available to everybody
today. Thank you for this my
conversation with Kyle Clark. One thing I want to highlight is
that Kyle and I met through the tool use Discord.
There's a lot of great conversations there.
I think you'd really enjoy it, so I encourage you to join.
The invite link is down below and I will see you next week.

All Episodes

Episode Transcript

Popular Podcasts

Stuff You Should Know

Paper Ghosts: The Texas Teen Murders

Dateline NBC

.css-15opob5{left:0;position:absolute;top:0.8rem;} All Episodes

.css-14f5ked{margin:0;word-break:break-word;display:-webkit-box;-webkit-box-orient:vertical;box-orient:vertical;-webkit-line-clamp:2;overflow:hidden;}Practical AI Safety (ft Kyle Clark)

Episode Transcript

Popular Podcasts

.css-r6mb8g{margin:0;word-break:break-word;display:-webkit-box;-webkit-box-orient:vertical;box-orient:vertical;-webkit-line-clamp:1;overflow:hidden;}Stuff You Should Know

Paper Ghosts: The Texas Teen Murders

Dateline NBC

All Episodes

Practical AI Safety (ft Kyle Clark)

Stuff You Should Know