Independence Day: Cloudflare's Dual Defense for Mobile Apps & Original Content

Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
Imagine there's this silent war happening right now, you know,
behind the scenes of the Internet.

Speaker 2 (00:04):
A war, interesting way to put it.

Speaker 1 (00:07):
Well, yeah, it's like a battle over online content where artificial
intelligence is bumping right up against human creativity. And the
big question is who's actually winning and maybe more importantly,
what's it mean for how you find information, how stuff
gets made online, and well who gets paid.

Speaker 2 (00:26):
It really is a high stakes situation. Yeah, and today
we're going to get right into that intersection: AI bots,
content creation, which is you know, the lifeblood for so
many and the actual plumbing the infrastructure of the Internet itself. Okay,
we're going to unpack some really pivotal changes in how
websites are trying to fight back, you know, protect their
valuable content, and how these AI systems are interacting with

(00:49):
all of it.

Speaker 1 (00:49):
And we've been looking at a whole stack of recent articles,
technical updates, even some legal stuff that's been filed. It
gives us a pretty good picture of how one huge player,
Cloudflare, and they handle something like, what, twenty percent
of all Internet traffic.

Speaker 2 (01:03):
Around that, yeah, a significant chunk.

Speaker 1 (01:05):
How they're basically trying to rewrite the rules for everyone.

Speaker 2 (01:09):
Exactly. And our goal today really is to help you cut
through the noise a bit, understand why these big changes
are happening now, the clever tech being used to make
them work right, and maybe most importantly, what it all
means for the future of information online and like who
owns what in the digital world, for all of us, really.

Speaker 1 (01:29):
Yeah, this isn't just some minor tweak in the background.

Speaker 2 (01:32):
No, No, it feels much more fundamental. Yeah, like rewriting
the Internet's next chapter.

Speaker 1 (01:36):
Okay, so let's get into this battle. Then we see
AI doing more and more online, but clearly some of
it's not exactly welcome. What's the actual problem these AI
bots are causing for, say, publishers or anyone creating content?

Speaker 2 (01:51):
Well, historically you had bots like search engine crawlers, right,
Googlebot, Bingbot. Sure, yeah, and they were mostly
seen as okay, even good. They'd index your content, yes,
but the main point was they sent traffic visitors back
to your website.

Speaker 1 (02:05):
That was the deal.

Speaker 2 (02:07):
Exactly. That was a sort of unwritten agreement. You crawl my site,
you send me clicks, I get ad revenue or engagement.
Everyone's reasonably happy. But the huge shift now with this
AI boom is this new type of bot. They're scraping everything,
text, images, whole articles, not to send visitors back to
the source, but specifically to train their big AI models.

(02:31):
And then critically, they generate answers directly inside the AI
tool itself.

Speaker 1 (02:35):
So they bypass the original site completely.

Speaker 2 (02:38):
Completely, which you know, starves the content creators of that traffic,
that potential revenue that their work should be generating.

Speaker 1 (02:44):
It just breaks that old agreement, right, So it's not
symbiotic anymore. It feels more parasitic if they're just taking
without giving back traffic or money. What does that actually
do to someone's motivation to create original stuff? Why even bother?

Speaker 2 (02:56):
That's the million dollar question, isn't it? Or maybe the
billion dollar question now. Matthew Prince, he's the CEO of Cloudflare.
He put it pretty bluntly. He said, if AI companies
just freely use data without asking or paying, then quote
the incentives for content creation are dead.

Speaker 1 (03:11):
Wow, stark.

Speaker 2 (03:14):
It is. And it's not just a hypothetical worry. Cloudflare sees
the traffic, right? Their data shows the sheer scale. Back
in March, they were seeing these AI data crawlers hitting
their network more than fifty billion requests every day.

Speaker 1 (03:28):
Fifty billion a day?

Speaker 2 (03:31):
Per day. Yeah, it's just an astronomical number. It really shows how
much data is being vacuumed up, and you can understand
the pressure publishers and creators are feeling.

Speaker 1 (03:40):
Okay, fifty billion requests, Yeah, it definitely paints a picture.
It makes that broken agreement feel very real, very immediate,
and it's leading to actual fights right like in court.

Speaker 2 (03:50):
Oh. Absolutely. The legal challenges are really heating up and
getting public. Just last month, Reddit sued Anthropic. They claimed
Anthropic unlawfully used data from their platform, and Reddit gets,
what, one hundred million users a day, huge numbers, massive,
used that data to train its AI. And then, probably
the most famous one, The New York Times sued OpenAI
and Microsoft back in twenty twenty three over copyright infringement

(04:12):
directly linked to their AI systems.

Speaker 1 (04:14):
I remember that.

Speaker 2 (04:15):
Now OpenAI and Microsoft, they deny those claims, of course.
But it's not all conflict. Some publishers are striking deals.

Speaker 1 (04:22):
Oh interesting.

Speaker 2 (04:23):
Yeah, the Times, for instance, they made a deal with
Amazon to license their content for Amazon's AI stuff. And
you've got other big names: Axel Springer, Condé Nast, News Corp.
They've also signed licensing agreements, so AI companies are paying
them for using their material.

Speaker 1 (04:40):
So it's splitting. Some are fighting it out legally, others
are negotiating licenses.

Speaker 2 (04:44):
Exactly. It shows there isn't one single response yet. It's
still shaking out.

Speaker 1 (04:47):
That really sets the scene. But let's shift to the plumbing,
as you called it, the infrastructure. With all this going on,
how is a company like Cloudflare actually changing the
Internet's code its rules to deal with this AI scraping issue.
This sounds like where the real action is.

Speaker 2 (05:03):
It really is transformative. Cloudflare is rolling out a
major policy shift. They have this new setting. It's permission
based and it's set to go live July first.

Speaker 1 (05:12):
Twenty twenty five, okay, so very soon.

Speaker 2 (05:14):
Very soon. And basically it lets their customers, so any
website using Cloudflare, automatically block AI companies from scraping
their sites by default.

Speaker 1 (05:24):
By default, so you don't have to opt in to
blocking.

Speaker 2 (05:26):
Exactly, the default changes. From July first, AI bots
will need explicit permission to crawl a Cloudflare-protected site.
Matthew Prince again, he said, we're changing the rules of
the Internet. If you're a robot, now you have to
go on the toll road in order to get the content.

Speaker 1 (05:42):
A toll road. I like that.

Speaker 2 (05:43):
Yeah, it flips the model instead of AI crawlers assuming
access is okay unless told otherwise. Now it's access denied
unless you have permission.
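The permission-first default described here can be sketched in a few lines of Python. This is only an illustration of the default-deny idea, not Cloudflare's implementation: the allowlist, the bot names, and the `may_crawl` function are all hypothetical, and it assumes the request has already been classified as coming from an AI crawler or not.

```python
# Hypothetical default-deny crawl policy; every name here is illustrative.
ALLOWED_AI_CRAWLERS = {"example-search-bot"}  # bots the site owner explicitly approved


def may_crawl(bot_name: str, is_ai_crawler: bool) -> bool:
    """Return True if the request should be served under a permission-first default."""
    if not is_ai_crawler:
        return True  # ordinary visitors are unaffected by the crawler policy
    # AI crawlers are denied unless the owner has granted explicit permission.
    return bot_name in ALLOWED_AI_CRAWLERS
```

The point of the sketch is the last line: an unknown AI crawler is refused without the site owner having to list it anywhere.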

Speaker 1 (05:50):
That's a huge power shift back to website owners. Gives
them a big lever. But what about things beyond just websites,
like mobile apps? That's such a critical area. Now does
this Cloudflare protection reach into apps too?

Speaker 2 (06:03):
It absolutely does, and that's a key point. The immediate
effect is more control for content owners. Yeah. Roger Lynch,
CEO of Condé Nast, he welcomed it publicly, called it
a critical step toward creating a fair value exchange.

Speaker 1 (06:15):
Makes sense.

Speaker 2 (06:15):
And yes, this protection isn't just for web browsers. Cloudflare
designed their AI bot-blocking tech to cover both
websites and mobile apps. Their Bot Management feature, for example,
it analyzes traffic, right, but it doesn't just look at
desktop signals, okay. It uses machine learning models trained specifically
on mobile request data. That helps improve accuracy and cut

(06:38):
down on blocking legitimate mobile users by mistake. Things like
their Super Bot Fight Mode are designed to stop automated traffic,
including the kind that often targets mobile apps and their
back end APIs.

Speaker 1 (06:48):
APIs, right, the connections apps use to get data.

Speaker 2 (06:52):
Exactly, and for apps that rely heavily on those APIs, Cloudflare also offers
their API Gateway product for even more targeted protection. It
acts like a shield specifically for that app to server communication.
So yeah, mobile is definitely part of the picture.

Speaker 1 (07:04):
So it's about stopping the unwanted stuff, regaining control and
maybe just maybe getting paid. Finally, is there a longer
term plan for creators to actually get compensated when AI
uses their content?

Speaker 2 (07:16):
Yeah, that's definitely on the roadmap. Cloudflare is actively
working on something they're calling a pay per crawl system.

Speaker 1 (07:22):
Pay per Crawl.

Speaker 2 (07:23):
The idea is it would give content creators the option
to charge AI companies for crawling and using their original content.
Prince seems really confident they can enforce it too. He said, uh,
I'm one hundred percent confident we can block them and
if they don't get access to the content, then their
product will be worse.

Speaker 1 (07:40):
Huh, So pay up or your fancy AI gets dumber.

Speaker 2 (07:44):
That's the leverage, basically: create a real economic reason for
AI companies to play by these new rules to join
this fairer system they're trying to build.

Speaker 1 (07:54):
That sounds powerful, But okay, blocking is one thing, the
toll road is another. But how do you even know
if a bot is, say Google trying to index you,
which you probably want, versus some scraper you don't want,
or even something malicious. How do you tell the good
bots from the others when they all just look like
network requests, especially if the old ways don't work.

Speaker 2 (08:13):
Yeah, that's a huge technical challenge. Yeah, because you're right,
the old ways, like looking at the user agent header,
that's just the bot saying, hi, I'm Googlebot, which
anyone can fake, anyone can fake it trivially, or trying
to track IP addresses, those change all the time, especially
for big services. So those methods are pretty much, as
Cloudflare puts it, broken for reliable authentication.

Speaker 1 (08:35):
So what's the fix?

Speaker 2 (08:36):
They're pushing for something much more robust, using established cryptography,
basically getting legitimate bots to prove who they are in
a way that can't easily be faked. Using digital signatures.

Speaker 1 (08:47):
Okay, cryptography, digital keys and stuff.

Speaker 2 (08:50):
Exactly secure digital keys it provides a much stronger signal
for the website owner to trust.

Speaker 1 (08:56):
And what's the main technique they're backing.

Speaker 2 (08:58):
For this their preferred method right now, It's called HTTP
message signatures. You can think of it like a digital
passport for the bot.

Speaker 1 (09:05):
Okay.

Speaker 2 (09:06):
Instead of just showing up, the bot uses its private
key to cryptographically sign the request it sends. It's like
putting a unique, verifiable digital seal on it. And this
isn't just some Cloudflare idea. It's based on a
published standard, RFC nine four two one.

Speaker 1 (09:21):
An official standard.

Speaker 2 (09:22):
Right, which gives it weight. The request includes special headers
like Signature-Input and Signature-Agent that tell the server
who signed it, how to verify it, and even includes
a little tag like web-bot-auth to say why
it's crawling.
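To make the signing step concrete, here is a minimal Python sketch of how a request might be signed and verified in the style of RFC 9421. It rests on several assumptions that are not from the episode: it uses HMAC-SHA256 with a shared secret (an algorithm RFC 9421 also defines) in place of the private-key signatures the speakers describe, purely so the example runs on the standard library; the covered components, the `sig1` label, the key id, and the function names are illustrative choices.

```python
import base64
import hashlib
import hmac


def signature_base(authority: str, signature_agent: str, params: str) -> str:
    """Build the string that gets signed: one line per covered component."""
    covered = [("@authority", authority), ("signature-agent", signature_agent)]
    lines = [f'"{name}": {value}' for name, value in covered]
    lines.append(f'"@signature-params": {params}')  # always the final line
    return "\n".join(lines)


def sign_request(authority: str, signature_agent: str, key: bytes,
                 keyid: str, created: int) -> dict:
    """Return the headers a bot would attach (HMAC stand-in for a key pair)."""
    params = (f'("@authority" "signature-agent");created={created};'
              f'keyid="{keyid}";alg="hmac-sha256";tag="web-bot-auth"')
    base = signature_base(authority, signature_agent, params)
    mac = hmac.new(key, base.encode("ascii"), hashlib.sha256).digest()
    return {
        "Signature-Agent": signature_agent,
        "Signature-Input": f"sig1={params}",
        "Signature": f"sig1=:{base64.b64encode(mac).decode('ascii')}:",
    }


def verify_request(authority: str, headers: dict, key: bytes) -> bool:
    """Rebuild the signature base on the server side and compare signatures."""
    params = headers["Signature-Input"].removeprefix("sig1=")
    base = signature_base(authority, headers["Signature-Agent"], params)
    expected = hmac.new(key, base.encode("ascii"), hashlib.sha256).digest()
    encoded = headers["Signature"].removeprefix("sig1=:").removesuffix(":")
    return hmac.compare_digest(expected, base64.b64decode(encoded))
```

For example, `sign_request("example.com", '"https://bot.example"', b"shared-secret", "test-key", 1735689600)` yields headers that `verify_request` accepts for `example.com` and rejects for any other authority, because the authority is part of what was signed. A real deployment would use an asymmetric key, so the site only needs the bot's public key.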

Speaker 1 (09:35):
So it's declaring its identity and purpose verifiably.

Speaker 2 (09:40):
Precisely. And the really big news here is that major AI
players like OpenAI have already started signing their bot requests
using this standard.

Speaker 1 (09:48):
Oh wow, so it's actually happening.

Speaker 2 (09:50):
It's actually happening, yeah, which is a massive step toward
getting wider adoption and making the bot world a bit
more transparent.

Speaker 1 (09:56):
Okay, so that's HTTP message signatures. Are there other ways
they're looking at for bots to prove their ID?

Speaker 2 (10:02):
Yes, there's another approach they're exploring called request mTLS, which
stands for mutual TLS.

Speaker 1 (10:08):
Mutual TLS sounds serious.

Speaker 2 (10:10):
It is. It's another very strong security method. In simple terms,
both the bot and the website verify each other's identity
using secure digital certificates like high tech ID cards during
the connection setup.

Speaker 1 (10:22):
So both sides prove who they are.

Speaker 2 (10:23):
Right, mutual authentication. They've even proposed a small change,
a new flag in the connection process where a bot
can signal, hey, I support mTLS. This lets the server
ask for the bot's certificate without accidentally blocking regular users
who wouldn't have one.
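As a rough illustration of the server side of mutual TLS, the following Python sketch configures a TLS server context that asks for a client certificate without requiring one. It is an approximation under stated assumptions: the standard `ssl` module's `CERT_OPTIONAL` mode sends the certificate request to every client, whereas the proposal described here would only ask clients that signal support; the `make_server_context` helper is a hypothetical name.

```python
import ssl


def make_server_context(request_client_cert: bool) -> ssl.SSLContext:
    """Create a server-side TLS context, optionally requesting client certificates."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # CERT_OPTIONAL: a certificate request is sent, but clients without a
    # certificate (regular users) can still connect. CERT_NONE: no request.
    ctx.verify_mode = ssl.CERT_OPTIONAL if request_client_cert else ssl.CERT_NONE
    # A real server would also call ctx.load_cert_chain(...) for its own
    # certificate and ctx.load_verify_locations(...) for trusted bot CAs.
    return ctx
```

With `CERT_OPTIONAL`, a bot that presents a certificate can be identified, and a visitor who presents none is still served as an anonymous client.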

Speaker 1 (10:37):
Clever, So why focus on the HTTP signatures first? If
mTLS is also strong?

Speaker 2 (10:43):
Good question. While mTLS is definitely robust, the thinking seems
to be that HTTP message signatures might be a bit
simpler for bot operators to adopt right now. There are
already reference implementations out there, so it's currently got a
bit more momentum. But both are aimed at that verifiable
identity goal.

Speaker 1 (11:02):
Got it. So these advanced methods, signatures or mTLS, they're really about
giving website owners clear visibility, real control. What's the upshot?
What does this mean for identifying bots and managing content
down the road?

Speaker 2 (11:14):
Well, the core idea for both approaches is the same.
Let legitimate bot owners prove who they are in a
way that's tamper proof. That verification is key. It gives
site owners much better control over who they let in,
and it opens the door to actually having sensible conversations
about crawling agreements because you know who you're talking to.
It's not just some anonymous IP address.

Speaker 1 (11:33):
You can have a relationship based on verified identity.

Speaker 2 (11:37):
Exactly. And Cloudflare is planning to weave these authentication checks
right into their tools, like their AI Audit and Bot
Management products. This will give website owners unprecedented insight into
which bots are willing to stand up and identify themselves. Honestly,
it kind of shifts the burden of proof onto the bot.

Speaker 1 (11:55):
That makes a lot of sense. Now, these moves by
Cloudflare, they sound really significant, especially given their position
in the Internet's infrastructure. But are they the complete answer?
Is this enough to solve the whole complex mess of
AI and content rights or is it just one piece
of a much bigger puzzle.

Speaker 2 (12:13):
That's a really important question to ask. And when you
look at what other experts are saying, the view is
more nuanced. They see it as a vital step definitely,
but maybe not the final solution. Ed Newton-Rex, he
founded an organization called Fairly Trained. They certify AI
companies that use properly licensed data.

Speaker 1 (12:33):
Right, promoting ethical training data.

Speaker 2 (12:36):
Exactly. He called Cloudflare's move a sticking plaster, like a
band aid. He said pretty directly that what's required is
major surgery, meaning we still need much stronger legal protections
overall.

Speaker 1 (12:47):
Sticking plaster.

Speaker 2 (12:49):
Yeah, and he used this great analogy, said Cloudflare's
protection will only offer protection for people on websites they control.
It's like having body armor that stops working when you
leave your house.

Speaker 1 (12:59):
Ah, okay, that really lands. It highlights the limits. It
protects your own turf, but maybe not content that ends
up elsewhere.

Speaker 2 (13:05):
Precisely, it underscores that need for broader solutions, maybe even
legislative ones that go beyond what a single, even powerful
company can do on its own infrastructure.

Speaker 1 (13:15):
That's a really clarifying perspective. So putting it all together,
this shifting landscape, Cloudflare's moves, the legal fights, the
licensing deals. What does it really mean for the people
creating content and for you listening as someone who learns
online consumes information. What's the sort of bigger picture impact here?

Speaker 2 (13:37):
Well, the bigger picture is really about trying to forge
a sustainable future. It's about building a new economic model,
fundamentally where creators actually get fair compensation for the value
they provide.

Speaker 1 (13:49):
Bringing back the incentive.

Speaker 2 (13:52):
Exactly. These steps, like Cloudflare's, they're crucial groundwork. They give creators
more control, which is the first step, and that control
hopefully helps ensure we continue to have a vibrant public sphere,
a place with diverse original information and creativity.

Speaker 1 (14:06):
Because if the creators aren't supported.

Speaker 2 (14:08):
If they aren't supported, if the incentive really does die,
like Prince warned, then that wellspring of original content could
just dry up, and that would change the Internet profoundly,
likely not for the better. So these actions, they're about
finding a way for the Internet as we know it
to survive the age of AI. It's about ensuring creativity
isn't just scraped away without acknowledgment or payment, but it's

(14:29):
actually valued, incentivized.

Speaker 1 (14:32):
Protected. Recognizing that creating stuff has

Speaker 2 (14:35):
Real value, real inherent value, whether it's an article, a photo,
a piece of software, whatever, and that value needs a
fair exchange. This whole situation feels like the start of
trying to rebalance those scales.

Speaker 1 (14:47):
Okay, So, just to wrap up our look at these
new Internet rules, we've seen Cloudflare stepping up, really
taking an active role, changing the defaults for AI bots,
moving to permission first.

Speaker 2 (15:02):
Prove their identity, yeah, HTTP signatures, mTLS, and pushing towards
actual payment models like pay per crawl so creators might
actually get compensated.

Speaker 1 (15:10):
It really does feel like a potentially pivotal moment trying
to rebalance things.

Speaker 2 (15:14):
It definitely feels like a major attempt to shift the
power dynamic. Yeah, rebalancing the scales between the folks creating
the content and the AI systems consuming it. It's about
acknowledging that human effort has value, even maybe especially in
an age dominated by algorithms.

Speaker 1 (15:30):
Absolutely, which leaves us with maybe a final thought for
you to consider. As AI gets woven deeper and deeper
into how we all find information, how we learn, how
we consume things online, how might these new rules of
the Internet ultimately change the value we place on original content.

Speaker 2 (15:49):
Yeah, and what does digital creation even look like going forward?

Speaker 1 (15:52):
Exactly. What would a truly fair value exchange look like
online in the next few years, And how will that
shape the articles you read, the images you see, the
tools you use every single day.

Speaker 2 (16:02):
It's definitely something to keep thinking about as this whole
landscape keeps evolving so incredibly fast.
