
January 31, 2025 32 mins

In this episode, Ed Zitron explains how DeepSeek, a relatively unknown Chinese AI model developer incubated in a hedge fund, has punctured the generative AI bubble, throwing the US startup scene (and markets) into disarray.

---

LINKS: https://www.tinyurl.com/betterofflinelinks

Newsletter: https://www.wheresyoured.at/

Reddit: https://www.reddit.com/r/BetterOffline/ 

Discord: chat.wheresyoured.at

Ed's Socials:

https://twitter.com/edzitron

https://www.instagram.com/edzitron

https://bsky.app/profile/edzitron.com

https://www.threads.net/@edzitron

See omnystudio.com/listener for privacy information.


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:02):
Cool Zone Media. Hello and welcome to Better Offline. I'm your
host, Ed Zitron. A lot of you have been
getting in touch. Yes, you're getting your DeepSeek episode.

(00:24):
In fact, this is the first of a two-parter.
This will come out on Friday, which is when you're
listening to this, and then it'll follow up on Monday.
I apologize. I spent a lot of Monday writing this
and also learning about a lot of this stuff in
an attempt to distill it as best I could. This
situation is extremely weird, and it's developing, and I think

(00:45):
even when I put out this episode there will be
new parts of it that I have yet to really
get to. I will do my absolute best to explain
in these episodes both what is happening with DeepSeek,
what it means, what they've built, and what it's going
to do in the future. But let's begin. So, as
January came to a close, the entire generative AI industry

(01:05):
found itself in a kind of chaos. In short, the
recent AI bubble and in particular the hundreds of billions
of dollars being spent on it, hinged on this big
idea that we need bigger models, which are both trained
and run on bigger and even larger GPUs, almost entirely
sold by Nvidia, and in turn they're based in bigger
and bigger data centers owned by companies like Microsoft, Oracle, Amazon,

(01:27):
and Google. Now, there was also this expectation that this
would always be the case. Hubris within this industry is
kind of part of the whole deal, and generative AI
was always meant to be this way, at least for
the American developers. It was always meant to be energy
and compute hungry. Throwing entire zoos' worth of animals and
boiling lakes was necessary to do this. There was never

(01:50):
any other way to do it, and I thought, at
least I've thought for a while, that this was because
they tried to make them more efficient, but
they couldn't. There was just something about transformer-based architecture,
like the stuff that underpins ChatGPT, so the GPT
model under ChatGPT. It wasn't the case, though.
A Chinese artificial intelligence company that few people had really

(02:13):
heard of, called DeepSeek, came along a few weeks
ago with multiple models that aren't merely competitive with OpenAI's,
but actually undercut them in several meaningful ways. DeepSeek's
models are both open source, which means that their source
code and research is public, and they're significantly more efficient
as well, as much as thirty times cheaper to run
in the case of their reasoning model, R1, which

(02:34):
is competitive with OpenAI's o1, and fifteen or more
times more efficient than GPT-4o. It's actually kind
of crazy when you think about it, and as you're
going to hear, this whole thing has Jokerfied me all
over again. And what's crazy is that some of them
can be distilled, which I'll get to later, and run
on local devices like a laptop. It's kind of crazy,

(02:56):
and as a result, the markets have kind of panicked
because the entire narrative of the AI bubble has been
that these models have to be expensive because they are
the future, and that's why hyperscalers had to burn two
hundred billion dollars in capital expenditures for infrastructure to support
this wonderful boom, and specifically the ideas of OpenAI
and Anthropic. The idea that there was another way to

(03:18):
do this, that in fact, we didn't need to spend
all this money, and that maybe we could find a
more efficient way of doing it. Well, that would require
them to have another idea rather than throw as much
money at the problem as possible. Yeah, they just didn't
consider it, it turns out. And now along has
come this outsider that's upended the whole conventional understanding and

(03:39):
perhaps even dethroned a member of America's tech royalty, Sam Altman,
a man who has crafted, if not a cult of personality,
some sort of public image of an unassailable visionary that
will lead the vanguard in the biggest technological change since
the Internet. Yeah, he's wrong. He never was doing that.
I've been saying it for a while. He's never been

(03:59):
doing this. But DeepSeek isn't just an outsider now.
They are a company that's emerged as a side project
from a tiny, tiny Chinese hedge fund, at least by
the standards of hedge funds, with like five point five billion
dollars in assets under management, and their founding team
has nowhere near the level of fame and celebrity or
even the accolades of Sam Altman. It's distinctly humiliating for

(04:21):
everyone involved that isn't DeepSeek. And on top
of all of that, DeepSeek's biggest, ugliest insult is
that its model, DeepSeek R1, is competitive, like
I said, with OpenAI's incredibly expensive o1 reasoning model,
yet significantly, and I mean ninety-six percent, cheaper to run.
And it can even be run locally, like I said.
Speaking to a few developers I know, one was able

(04:42):
to run DeepSeek's R1 model on their twenty
twenty-one MacBook Pro with an M1 chip. That
is a four-year-old computer, not a thirty thousand dollar
GPU inside. It's kind of crazy. Worse still, DeepSeek's
models are made freely available to use, with the source
code published under the MIT license, along with the
research on how they were made, although not the training data,

(05:04):
which makes some people say it's not really open source.
But for the sake of argument, I'm just going to
say open source. And this means, by the way, that
DeepSeek's models can be adapted and used for commercial
use without the need for royalties or fees. Anyone can
take this and build their own. It's kind of crazy.
By contrast, OpenAI is anything but open, and its

(05:24):
last LLM to be released under the MIT license was
twenty nineteen's GPT-2. No, no, wait, wait, shit,
let me correct that. DeepSeek's biggest, ugliest secret is
actually that it's obviously taking aim at every element of
OpenAI's portfolio. As the company was already dominating headlines
this week, it quietly dropped its Janus Pro 7B
image generation and analysis model, which the company says outperforms

(05:47):
both Stable Diffusion and OpenAI's DALL-E 3. And those
are, by the way, image generation things. So you type
in something you like, like Garfield with boobs, and then out comes
a Garfield with juicy cans, and that's probably the first
time you hear that on the podcast, but probably not
the last. And as with its other code, DeepSeek
has made this freely available to both commercial and personal

(06:09):
users alike, whereas OpenAI has largely paywalled DALL-E 3.
This is really, it's a truly crazy situation. And it's
also this cynical, vulgar version of David and Goliath, where
a tech startup backed by a shadowy Chinese hedge fund
with eight billion dollars under management is somehow the plucky
upstart against the lumbering, lossy, oafish one hundred and
fifty billion dollar startup backed by multiple public tech companies

(06:33):
with a market capitalization of over three trillion dollars. I realize,
by the way, I said earlier five point five billion
dollars under management. This is why you check your notes
in advance. But I'm not cutting it. This is fresh.
I am inside a closet in New York. The content
must flow. Anyway, DeepSeek's V3 model, which is
comparable and competitive with both OpenAI's GPT-4o

(06:54):
and Anthropic's Claude Sonnet 3.5 models, which, by
the way, has some reasoning features. As I said, it's
fifty three times cheaper to run the R one when
using the company's own cloud services, and as mentioned earlier,
said model is effectively free for anyone to use locally
or on their own cloud instances, and could be taken
by any commercial enterprise and turned into a product of

(07:15):
their own should they desire to, say, compete with OpenAI,
the loudest and most annoying startup of all time. In essence, DeepSeek,
and I'll get into its background and the concerns people
might have about its Chinese origins, released two models that
perform competitively with and even beat models from both OpenAI
and Anthropic, undercut them in price, and then made them

(07:35):
open, undermining not just the economics of the biggest generative
AI companies, but laying bare exactly how they work. The
magic's gone. There's no more voodoo inside Sam Altman's soul. It's
all out there. And the last point is extremely important
when it comes to OpenAI's reasoning model, which specifically
hid its chain of thought for fear of these unsafe

(07:57):
thoughts that might manipulate the customer. And then they add,
slightly under their breath, that the actual reason they did
it was a competitive advantage. Now, to explain what that means:
when you make a request with OpenAI's o1 model,
say, give me all the states with the letter R
in them, it actually shows you, like, the thinking. And
by the way, these things don't fucking think. They're
computer bullshit, like, they don't think at all. But I'm

(08:19):
going to use it just for this so you see it.
Say okay, here are all the American states, which ones
have that letter? I'm checking all of those. It's effectively
having a large language model check a large language model. Now,
the thing is the steps they were showing you were
all cleaned up. They would look nice, they would be
formatted nicely. DeepSeek's chain of thought is completely laid bare,

(08:42):
which is very interesting because it really takes the wind
out of OpenAI's sails. And on top of that,
it allows you to see actually how these things think
through things, again, not really thinking, but still you can
see things about how large language models work that these
companies didn't want you to have. On top of this,
OpenAI's o1 model has something even shittier to it,

(09:04):
which is these chain of thought things all cost money.
When you see it generate these thoughts, it's actually generating
more thoughts than you see because they're hiding the chain
of thought. So OpenAI is just charging you an
indeterminate amount of money, an insane amount of money, as
I'll get to later. But nevertheless, you don't know what
you're being charged for. You don't even know what's really
going on under the hood. Or you could use DeepSeek.

(09:26):
And let's be completely clear, by the way, OpenAI's
literal only competitive advantage against Meta and Anthropic was
its reasoning models, o1 and o3. And o3,
by the way, is currently in a research preview and
is mostly just more of the same. Although I mentioned
earlier in the show that Anthropic's Claude Sonnet 3.5
has some reasoning features, they're comparatively more rudimentary than

(09:48):
those in o1 and o3, and, I'd argue,
R1, which is DeepSeek's model. In an AI context,
reasoning works by breaking down a prompt into a series
of different steps with considerations of different approaches. Like I
said earlier, effectively a large language model checking its own
homework with no thinking involved, because like I said, they
do not think or know things. And OpenAI rushed

(10:09):
to launch its o1 reasoning model last year because,
and I quote Fortune from last October, Sam Altman was
eager to prove to potential investors in the company's
latest funding round that OpenAI remains at the forefront
of AI development. And as I noted in my newsletter
at the time, it was not particularly reliable, failing to
accurately count the number of times the letter R appeared

(10:31):
in the word strawberry, which was the code name for
o1. Very funny stuff. At this point, it's fairly obvious
that OpenAI wasn't anywhere near the forefront of AI development,
and now that its competitive advantage is effectively gone, there
are genuine doubts about what comes next for the company.
As I'll go into, there are many questionable parts of
DeepSeek's story: its funding, what GPUs it has, and how

(10:54):
much it actually spent training these models. But what we
definitively understand to be true is bad for OpenAI,
and I would argue every other large US tech firm
that's jumped onto the generative AI bandwagon in the past
few years. DeepSeek's models actually exist. They work, at

(11:20):
least by the standards of hallucination-prone LLMs that don't,
at the risk of repeating myself, know anything. They've been
independently verified to be competitive in performance, and they're magnitudes
cheaper in price than those from both hyperscalers, Google's Gemini, Meta's Llama,
Amazon's Q and so on and so forth, and from
those released by OpenAI and Anthropic. DeepSeek's models

(11:41):
don't require massive new data centers. They run on GPUs
currently used to run services like ChatGPT, and even
work on more austere hardware. Nor do they require an
endless supply of bigger, faster Nvidia GPUs every single year
to progress. The entire AI bubble was inflated based on
the premise that these models were simply impossible to build
without burning massive amounts of cash, straining the power grid,

(12:04):
and blowing past emissions goals, and that these costs were
both necessary and really good because they'd lead to creating
powerful AI, something that's yet to happen. And it's kind
of obvious at this point that that wasn't true. Now
the markets are sitting around there asking a very reasonable question, Shit,
did we just waste two hundred billion dollars? Anyway, let's

(12:27):
get into the nitty-gritty. What is DeepSeek? First
of all, if you want a super deep dive into
what it is, I can't recommend VentureBeat's write-up enough.
I'll link to it in the show notes, as I
usually do. It's really good and it goes into a
lot more detail than I will. But here's the too
long, didn't read for you. DeepSeek is a spin-off
from a Chinese hedge fund called High-Flyer Quant.

(12:47):
It's a relatively small and young company, and from its
inception it went big on algorithmic and AI driven trading.
Later it started building its own standalone chatbots, including
a ChatGPT equivalent for the Chinese market. This is
what we know right now. I'm sure some of you
will say, oh, well, who knows if that's really true. Sure,
I think that that's fair. I also think that there
are parts of Sam Altman's legend that we should question

(13:09):
as well. I think the circumstances under which Sam Altman
got made head of Y Combinator are extremely questionable. I'm
saying you can question DeepSeek, and indeed you should.
We should be more critical of these powerful companies, but
don't do it halfway. If we're going to be worried,
let's be worried about everyone. Now, DeepSeek did a few
things differently, like open sourcing its models, although it likely

(13:31):
built upon tech from other companies, like Meta's Llama and the
ML library PyTorch, to train its models. It secured over
ten thousand Nvidia GPUs right before the US imposed export restrictions,
which sounds like a lot, but it's a fraction of
what the big AI labs like Google, OpenAI, and
Anthropic have to play with. I think I've heard estimates
of like one hundred thousand to three hundred thousand each,

(13:53):
if not more. Now, you've likely seen or heard that
DeepSeek trained its latest model for five point six
million dollars, as opposed to the insane amounts that I'll
get to later, and I want to be clear that
any and all mentions of this number are estimates. In fact,
the provenance of the five point five eight million
dollar number appears to be a citation of a post
made by an Nvidia engineer in an article from the

(14:15):
South China Morning Post, which links to another article from
the South China Morning Post, which simply states that DeepSeek
V3 comes with six hundred and seventy-one
billion parameters and was trained in around two months at
a cost of five point five eight million dollars, with
no additional citations of any kind. So you should take
it with a pinch of salt. But it's not totally ludicrous. Well,

(14:36):
there are some that have estimated the cost. DeepSeek's
V3 model was allegedly trained using two thousand and forty-eight
Nvidia H800 GPUs, according to its paper,
and Ben Thompson of Stratechery has made clear that
the five point five million dollar number only covers the
literal training cost of the official training run, and this
is made fairly clear in the paper, by the way,

(14:56):
of V3, and that's the one that's competitive with
OpenAI's GPT-4o model, meaning that any costs
related to prior research or experiments on how to build
the model were left out. Now, big, big shout-out to
Minimaxir, the guy on Bluesky and Twitter, he's great.
He is wonderful, and he also added that this is fairly
standard for the industry. Again, you choose how you feel
about this, but I want to give you the information.
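
As a rough sanity check on that headline number, here is a minimal sketch of the arithmetic. The GPU count (2,048 H800s) and the "around two months" timeline come from the episode. The 2.788 million GPU-hours and the two-dollar-per-GPU-hour rental rate are assumptions brought in from outside this transcript (they are the figures generally attributed to DeepSeek's V3 paper), and the function name is made up for illustration.

```python
def training_run_cost(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Cost of a single training run, priced as rented GPU time."""
    return gpu_hours * usd_per_gpu_hour


# Assumptions (not stated in the episode): total GPU-hours and rental rate.
GPU_HOURS = 2_788_000
USD_PER_GPU_HOUR = 2.00

# From the episode: the number of H800 GPUs used for the run.
GPU_COUNT = 2_048

cost = training_run_cost(GPU_HOURS, USD_PER_GPU_HOUR)

# If all GPUs ran around the clock, how many days would the run take?
days = GPU_HOURS / (GPU_COUNT * 24)

print(f"Estimated official-run cost: ${cost / 1e6:.2f} million")
print(f"Implied wall-clock time on {GPU_COUNT} GPUs: {days:.0f} days")
```

Under those assumptions the total lands at roughly the five point five eight million dollar figure and a bit under two months of wall-clock time. As noted above, it covers only the official run, not prior research or experiments.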

(15:19):
And while it's safe to say that DeepSeek's models
are cheaper to train, the actual costs, especially as
DeepSeek doesn't share its training data, which some might argue
means its models are not really open source, as I said,
get a little harder to guess at. Thompson
notes that DeepSeek had to craft a bunch of
elegant workarounds to make the model perform, including writing code
that ultimately changed how GPUs actually communicated with each other.

(15:41):
This functionality isn't otherwise possible using Nvidia's developer tools. They
really had to get in there. It's kind of cool.
DeepSeek's models, V3 and R1, are more
efficient and, as a result, cheaper to run, and can
be accessed via its API at prices that are astronomically
cheaper than OpenAI's. DeepSeek Chat, running DeepSeek's
GPT-4o-competitive V3 model, costs

(16:04):
seven cents per one million input tokens, as in
commands given to the model, and one dollar ten
per one million output tokens, as in the resulting output
from the model. I know that these numbers kind of
just sound like numbers. Maybe you don't
like just sound like numbers like you, Maybe you don't
have context, so let me give you some. This is
a dramatic price drop from the two dollars fifty cents

(16:24):
per one million input tokens and ten dollars per one
million output tokens that OpenAI charges for GPT-4o.
This isn't just undercutting, this is a bunker buster. Now,
there is an aside that I'll kind of get into
a little bit later, in that you are using models
hosted in a country that you don't know, probably China.
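
To put the V3 and GPT-4o prices just quoted side by side, here is a minimal sketch. It reads the DeepSeek input price as $0.07 per million tokens, which is an assumption about what the spoken figure means. The workload (10 million input tokens, 2 million output tokens) is invented purely for illustration, as are the function and variable names.

```python
def workload_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of a workload given per-million-token prices."""
    return ((input_tokens / 1_000_000) * input_price_per_m
            + (output_tokens / 1_000_000) * output_price_per_m)


# (input price, output price) in US dollars per one million tokens,
# as quoted in the episode.
PRICES = {
    "DeepSeek V3": (0.07, 1.10),
    "GPT-4o": (2.50, 10.00),
}

# Hypothetical workload, chosen only to make the comparison concrete.
INPUT_TOKENS = 10_000_000
OUTPUT_TOKENS = 2_000_000

costs = {
    name: workload_cost(INPUT_TOKENS, OUTPUT_TOKENS, in_price, out_price)
    for name, (in_price, out_price) in PRICES.items()
}

for name, cost in costs.items():
    print(f"{name}: ${cost:.2f}")

ratio = costs["GPT-4o"] / costs["DeepSeek V3"]
print(f"GPT-4o costs about {ratio:.1f}x as much for this workload")
```

The exact multiple depends on the mix of input and output tokens, since the gap is much wider on input pricing than on output pricing.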

(16:46):
There are data concerns. But again, you can put this
on your own server. You could put this in Google Cloud.
Both Microsoft and Google are apparently thinking about it now.
The Information reported that Google had added it to Google Cloud.
No they did not. They didn't do that. They allowed
you to connect Hugging Face. This is a whole bunch
of technical stuff that if you understand, you'll be like, yeah, Ed,
I know. Long story short, the hyperscalers are already bringing

(17:10):
DeepSeek out, and I'll get to why that's bad
later in detail. But it's also very funny. Now here's
something else that's funny. DeepSeek Reasoner, its reasoning model,
costs fifty-five cents per one million input tokens
and two dollars and nineteen cents per one million output tokens.
Now, that sounds expensive. Maybe it is. Whatever, that's goddamn

(17:31):
nothing compared to the fifteen dollars per one million input
tokens and sixty dollars per one million output tokens of
OpenAI's o1. If I'm Sam Altman, I'm shitting myself.
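
The "ninety-six percent cheaper" claim from earlier in the episode can be checked against these per-token prices. This is a minimal sketch that takes the fifteen and sixty dollar figures as the prices of OpenAI's o1 reasoning model (an assumption about what the quoted prices refer to). The function and dictionary names are hypothetical.

```python
def percent_cheaper(cheap_price: float, expensive_price: float) -> float:
    """How much cheaper the first price is than the second, in percent."""
    return (1 - cheap_price / expensive_price) * 100


# US dollars per one million tokens, as quoted in the episode.
DEEPSEEK_REASONER = {"input": 0.55, "output": 2.19}
OPENAI_O1 = {"input": 15.00, "output": 60.00}

savings = {
    kind: percent_cheaper(DEEPSEEK_REASONER[kind], OPENAI_O1[kind])
    for kind in DEEPSEEK_REASONER
}

for kind, pct in savings.items():
    print(f"{kind} tokens: {pct:.1f}% cheaper")
```

Both the input and output prices come out a little over ninety-six percent cheaper, which matches the figure quoted earlier.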
But there's an obvious but here. We do not know
where DeepSeek is hosting its models, who has access
to that data, or where that data is coming from
or going to. We don't know who funds DeepSeek

(17:53):
other than that it's connected to High-Flyer, the hedge fund
that I mentioned earlier, that it split from in twenty
twenty-three. There are concerns that DeepSeek could be
state-funded, and that DeepSeek's low prices are a
kind of geopolitical weapon, breaking the back of the generative
AI industry in America. I'm not really sure whether that's
the case or not. It's certainly true that China has
long treated AI as a strategic part of its national

(18:13):
industrial policy and is reported to help companies and sectors
where it wants to catch up with the Western world.
The Made in China 2025 initiative saw a
reported hundreds of billions of dollars provided to Chinese firms
working in industries like chip making, aviation, and, yeah, AI.
The extent of that support isn't exactly transparent, surprise, surprise,
and so it's not entirely out of the realm of

(18:34):
possibility that DeepSeek is also the recipient of state aid.
The good news is that we're going to find out
fairly quickly. American AI infrastructure company Groq is already bringing
DeepSeek's model online, meaning that we'll get at least
some sort of confirmation of whether these prices
are realistic or whether they're heavily subsidized by whoever it
is that backs DeepSeek. It's also true that DeepSeek

(18:55):
is owned in part by a hedge fund, which
likely isn't short of cash to pump into them. But
as an aside, given that OpenAI is the
beneficiary of billions of dollars of cloud compute credits and
gets reduced pricing for Microsoft's Azure cloud services to run
its actual models, it's a bit tough for them to
complain about a rival being subsidized by a larger entity with

(19:16):
the ability to absorb the costs of doing business, should
that be the case. Same goes for Anthropic, by the way.
And yes, I know Microsoft isn't a state, but with
a market cap of three point two trillion dollars and
quarterly revenues larger than the combined GDPs of some EU
and NATO nations, it's kind of the next best thing.
But I digress. Whatever concerns there may be about malign

(19:36):
Chinese influence are bordering on irrelevant, outside of the low prices,
of course, offered by DeepSeek itself, and even that is
speculative at this point. Once these models are hosted elsewhere,
and once DeepSeek's methods, which I'll get to in
a little bit, are recreated, and by the way, that's
not really going to take very long. I believe we're
going to see that these prices are indicative of how
cheap these models are to run. So you might be wondering,

(20:11):
how the hell is this so much cheaper? And that's
a bloody good question. And because I'm me, I have
a hypothesis. I do not believe that the companies making
these foundation models, such as OpenAI and Anthropic, have
actually been incentivized to do more with less. And because
their chummy little relationships with hyperscalers like Amazon, Google and
Microsoft were focused almost entirely on making the biggest, most

(20:33):
hugest models possible, using the biggest, even huger chips, and
because the absence of profitability didn't stop them from raising
more money, well, they've never had to be fucking efficient,
have they? They've never had to try. Maybe they should
buy less avocado fucking toast. Anyway, let me put it
in simpler terms. Imagine living on fifteen hundred dollars a month,

(20:54):
and then imagine how you'd live on one hundred and
fifty thousand dollars a month, and that you have to,
like Brewster's Millions, spend as much of it as
you can to complete a mission, a very simple mission: live.
In the former example, your concern is survival. You have a
limited amount of money and must make it go as
far as possible, with real sacrifices to be made with
every dollar you spend. If you want to have fun,

(21:15):
you're going to have to eat less. Potentially all the
food you eat will have to be cheaper. You have
to live on a budget. You have to make decisions,
and indeed you might learn to cook at home. You
might walk more, you might do things that will help
you not spend all your money. In the latter example,
where you have one hundred and fifty thousand dollars a
month that you must spend, you're incentivized to splurge, to

(21:35):
lean into excess, to pursue this vague idea of living
your life. Your actions are dictated not by any existential threats,
or indeed any kind of future planning, but by whatever
you perceive to be an opportunity to live. OpenAI
and Anthropic are emblematic of what happens when survival takes
a back seat to living. They have been incentivized by

(21:56):
frothy venture capital and public markets desperate for the next
big thing, the next big growth, to build bigger
models and sell even bigger dreams, like Dario Amodei of
Anthropic saying that AI, and I quote, could surpass
almost all human beings at almost everything shortly after twenty
twenty-seven. I just want to take a fucking second. Journalists,
if you're listening to this, stop fucking quoting this bullshit.

(22:19):
Stop it. You're doing nothing. You are failing at your
goddamn job every single time you quote this bullshit, this nonsense.
Shortly after twenty twenty-seven? What the fuck does that mean?
Twenty twenty-eight, twenty twenty-nine, twenty thirty? What does
surpassing humans at almost everything even mean? This shit doesn't work.
This shit is not good. Oh my god. Anyway, back

(22:42):
to the podcast. Calm down. Both OpenAI and
Anthropic have effectively lived their existence with the infinite money
cheat from The Sims. And I know some of you
might say, by the way, it's not infinite money,
you just go into the console. You get
my point. And both companies have been bleeding billions of
dollars a year after revenue, and that's, by the way,
making billions of dollars and then still losing billions, which is insane,

(23:03):
and they still operated as if money would never run
out, because it wouldn't. If they were actually
worried about that happening, they would have certainly tried to
do what DeepSeek has done, except they didn't have
to, because both of them had endless cash and
access to GPUs from either Microsoft, Amazon or Google. And
the Stargate thing, I will mention it later,

(23:23):
just long story short, they're not going to put five
hundred billion dollars into it. It was up to five
hundred bill... I'm so tired of this shit. OpenAI and
Anthropic have never been made to sweat, unlike me in
this closet where I'm recording this. And they've received endless
amounts of free marketing from a tech and business media
happy to print whatever vapid bullshit they spout, and it's

(23:44):
just very frustrating. They've raised money at will. Anthropic,
by the way, is currently raising another two billion dollars,
valuing the company at sixty billion dollars. And this was,
I think, happening while DeepSeek was going on, which
is really funny. And they've done all of this off
of a narrative of: we need more money than
any company has ever needed, ever, because the things we're
doing have to cost this much. There is no other way.

(24:09):
You must give us more money. My name is Sam Altman.
I need more money than has ever been made from
my huge, beautiful company that sucks and needs money to
train it. Help me, please, My big, beautiful sick company
is dying, but the best and most important company of
all time. It's also normal. Now. Do I think that
they were aware that there were methods to make their

(24:30):
models more efficient? Sure. OpenAI tried and failed in
twenty twenty-three to deliver a more efficient model to
Microsoft called Arrakis. I'm sure there are teams at both
Anthropic and OpenAI that are specifically dedicated to making things
kind of more efficient. But they didn't have to do it,
and so they didn't. And as I've written before in

(24:51):
my newsletter and argued on this very podcast, OpenAI
simply burns money and has been allowed to burn money,
and up until recently likely would have been allowed to
burn even more money because everybody, all of the American
model developers, appeared to agree that the only way to
develop large language models was to make them as big
as humanly possible and work out troublesome stuff like making
them profitable or turning them into a useful thing later,

(25:14):
which is, I presume, when AGI happens, a thing that
they're still in the process of defining, let alone doing.
DeepSeek, on the other hand, had to work out
a way to make its own large language models within
the constraints of the hamstrung Nvidia chips that can
be legally sold to China. While there's a whole cottage
industry of selling chips in China using resellers and other
parties to get restricted silicon into the country, the entire

(25:37):
way in which DeepSeek went about developing its models
suggests that it was working around very specific memory bandwidth constraints,
meaning the amount of data that could be fed
into and out of the chips.
In essence, doing more with less wasn't something it chose,
but something it had to do. I've touched already
on the technical how of these models in greater depth,
and you can really read into that in my newsletter,

(25:59):
and you can go to wheresyoured.at, it's
at the end of the episode. But I'll
also have show notes linking to places like Ben Thompson's
Stratechery, because there are lots of things to read here.
I know there are some really technical listeners, and I'm
sure you're gonna flame me in my emails. Please go
and read it. I'm not wrong. I've checked with a
lot of people too. And by the way, all of
this austerity stuff seems to have worked. There's also the

(26:22):
training data situation, and another mea culpa. I've previously discussed
the concept of model collapse and how feeding synthetic data,
which is training data created by a generative model, into
another model could end up teaching it bad habits, which
in turn would destroy the model. But it seems that
DeepSeek has succeeded in training its models using generated data,
specifically though, and I'm quoting GeekWire's Jon Turow, for subjects like mathematics

(26:45):
where correctness is unambiguous, and using, and I quote again,
highly efficient reward functions that could identify which new
training examples would actually improve the model, avoiding wasted compute
on redundant data. And it seems to have worked, though
model collapse may still be a possibility. This approach, extremely
precise use of synthetic data, is in line with some
of the defenses against model collapse I've heard from LLM

(27:07):
developers I've talked to. This is also a situation where
we don't know the exact training data, and it doesn't
negate any of the previous points I've made about model collapse.
Now we'll see what happens there. But synthetic data might
work where the output is something that you could figure
out using a calculator. But when you get into anything
a bit more fuzzy, like written text or anything with
an element of analysis, you'll likely encounter some unhappy side effects.

(27:30):
But I don't know if that's really going to change
how good these things are. There's also a little scuttlebutt
about where DeepSeek got its data. Ben Thompson
at Stratechery suggests that DeepSeek's models are potentially distilling
other models' outputs, by which I mean having another model,
say Meta's Llama or OpenAI's GPT-4o, which
is why DeepSeek identified itself as ChatGPT at

(27:51):
one point, spit out outputs specifically to train parts of
DeepSeek. This obviously violates the terms of service of
these tools, as OpenAI and its rivals would much rather
have you not use their technology to create their next rival.
And OpenAI, by the way, has recently reportedly found
evidence that DeepSeek used OpenAI's models to train
its rival. And this is from the Financial Times, although

(28:14):
it failed to make any formal allegations, but it did
say that using ChatGPT to train a competing model
violates its terms of service, and David Sacks, the investor
and Trump administration AI and crypto czar, says it's possible
that this occurred, although he failed to provide evidence. I
just want to say, how fucking funny it is that
OpenAI is going, wah, wah, you're stealing my stuff!

(28:36):
Don't steal my things! Wah. Fucking coward, pansy bastard bitches.
Fucking hell, what a, what a bunch of whiny babies.
Oh no, my plagiarism machine got plagiarized. Wah. Kiss my
entire asshole, Sam Altman, you little worm, you fucking embarrassment
to Silicon Valley. You should be ashamed of yourself for

(28:56):
many reasons, but so much this, though. Wah. Yeah, oh no,
you stole from us, my plagiarism machine that
requires me to steal from literally every artist and author
on the Internet. The thing where we went on YouTube
and transcribed everything and fed it into the machine. That's
that's not stealing, that's good. But you using our model
to generate answers. That's just not fair. What a bunch

(29:19):
of babies, you guys. Sam Altman's worth billions of dollars.
He has a five million dollar car. Cry more, you
little worm. Personally, I genuinely want OpenAI to point
a finger at DeepSeek and accuse it of IP theft,
mostly for the yuks, but also for the hypocrisy factor.
This is a company that, as I've just very cleanly said,
exists purely from the wholesale industrial larceny of content produced

(29:42):
by literally fucking everyone, and now they're crying, wah.
I'm Sam Altman. I'm a big baby. I've filled my
diaper because someone stole from my plagiarism machine. Kiss my ass,
kiss my ass. These companies haven't got shit. OpenAI
doesn't have shit. They don't have anything. They don't
have a next product without reasoning, they haven't got anything.

(30:05):
And now they don't have that disgusting justification for overspending,
the fat, ugly American startup culture of spending as much
as you can to build America's next top monopoly. They
should be fucking ashamed of themselves. They shouldn't be billionaires,
they should be poverty stricken. They should have to pay
everyone they stole from. And it's just, it sickens me

(30:27):
seeing the reaction from some people on this, seeing the Sinophobia,
but seeing this level of defensiveness of a company like
OpenAI or Anthropic. And as I'll get into next episode,
we are really running out of time here, and I
think DeepSeek is really, I think it could be
really the end of days for these companies. I don't
know how much they've got left time wise, or even

(30:50):
money wise, and I'm not sure how they even raise money.
But in the next episode, I'm going to deep dive
into DeepSeek and I'll tell you how they sent
the US tech market into a panic and what it
actually means for the future of OpenAI, Anthropic, and the hyperscalers
backing them. This has been a crazy few days.
I hope this has helped, and on Monday you'll find

(31:10):
out more. Thank you so much for listening. The support
I've got for the show has been incredible, and the
emails I've got about DeepSeek. I've been trying, okay?
I've really been trying. This is the fastest I could do it.
But I'm so happy to do this show, and I'm
so grateful for all of you. Thank you for listening

(31:34):
to Better Offline. The editor and composer of the Better
Offline theme song is Matt Osowski. You can check out more
of his music and audio projects at mattosowski dot com,
M A T T O S O W S K I dot com.
You can email me at ez at Better Offline dot
com or visit Better Offline dot com to find more
podcast links and of course, my newsletter. I also really

(31:57):
recommend you go to chat dot wheresyoured dot at
to visit the Discord, and go to r slash Better
Offline to check out our Reddit. Thank you so much
for listening. Better Offline is a production of Cool Zone Media.
For more from Cool Zone Media, visit our website coolzonemedia
dot com, or check us out on the iHeartRadio app,
Apple Podcasts, or wherever you get your podcasts.

Host

Ed Zitron


© 2025 iHeartMedia, Inc.