
August 21, 2025 · 38 mins

What are the implications of recent court decisions for artificial intelligence systems trained on copyrighted materials?

In this episode I get to speak with two veteran repeat guests of the podcast about two important cases dealing with fair use analysis in the context of large language model training.

Here are a few highlights:

• Courts found AI training to be "transformative use" because the process changes the works significantly through tokenization and processing (see the sketch after this list).
• Judges distinguished between legally obtained training data (dismissed claims) and pirated training data (allowed claims to proceed).
• Both judges signaled that if plaintiffs had focused on AI outputs reproducing substantial portions of their works, outcomes might have been different.
• The transformative nature of AI training was deemed significant enough to qualify as fair use even for commercial enterprises.
• One judge noted that if copyrighted works are essential for training models worth "billions, if not trillions," developers must find ways to compensate copyright holders.
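
For readers who want to see what the "tokenization" step looks like in practice, here is a minimal, purely illustrative Python sketch of the cleaning and tokenizing stages the opinions describe. The function names and the naive word-level scheme are simplifications of our own; production LLM pipelines use subword tokenizers such as byte-pair encoding.

    import re

    # Hypothetical cleaning step: drop page numbers and running headers,
    # loosely mirroring the "cleaning" described in the Anthropic opinion.
    def clean_page(page: str) -> str:
        kept = []
        for line in page.splitlines():
            if re.fullmatch(r"\s*\d+\s*", line):  # bare page number
                continue
            if line.isupper():  # running header such as "CHAPTER ONE"
                continue
            kept.append(line)
        return " ".join(kept)

    # Naive word-level tokenization; real pipelines split text into
    # subword units rather than whole words.
    def tokenize(text: str) -> list[str]:
        return re.findall(r"[a-z']+|[.,!?;]", text.lower())

    page = "CHAPTER ONE\nIt was a bright cold day in April.\n42"
    print(tokenize(clean_page(page)))
    # -> ['it', 'was', 'a', 'bright', 'cold', 'day', 'in', 'april', '.']

The point the courts credited is that what a model trains on, and what it ultimately retains, is several transformations removed from the purchased book.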

About Our Guests

Ryan Phelan and Tiffany Gehrke are recognized thought leaders in emerging technology law and artificial intelligence. Both are partners at Marshall, Gerstein & Borun LLP and returning guests on the Emerging Litigation Podcast.

Ryan has written extensively on digital innovation, including intellectual property issues related to cutting-edge AI systems. He is the moderator of PatentNext, a blog focused on patent and IP law for next-generation technologies. We based our discussion on his excellent article about copyright law meeting AI, titled U.S. District Court Issues First Decisions on AI Model Development and Copyright Fair Use. Ryan holds a J.D. from Northwestern Pritzker School of Law and an MBA from Northwestern’s Kellogg School of Management.

Tiffany is known for her expertise in intellectual property and technology policy, and for advocating balanced, ethical approaches to AI regulation. She chairs Marshall Gerstein’s Trademarks and Copyrights Practice. Before entering law, she worked as a software engineer. Tiffany earned her J.D. from Loyola University Chicago School of Law.

Together, they bring deep practical insight and academic rigor to the evolving legal landscape surrounding artificial intelligence. 

I appreciate them returning to the podcast and sharing what they know!

Tom Hagy
Host
Emerging Litigation Podcast

Thanks for listening!

If you like what you hear please give us a rating. You'd be amazed at how much that helps.

If you have questions for Tom or would like to participate, you can reach him at Editor@LitigationConferences.com.


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Tom Hagy (00:03):
Hello and welcome to the Emerging Litigation Podcast.
I'm your host, Tom Hagy.
Today, I'm joined by the authors of an article titled Copyright Law Meets AI. These are key takeaways from cases called Kadrey versus Meta and Bartz versus Anthropic. In this piece, our guests describe recent US court decisions and their effects on artificial intelligence systems

(00:24):
trained on copyrighted materials.
The article covers the four-factor fair use analysis as applied to large language model training. We're going to talk about how courts address digitizing legally purchased books for internal AI training, in contrast to building collections from illegal sources. The article I refer to also covers questions related to

(00:48):
market effects, licensing, and revenue dilution caused by AI-generated works. Our guests present information for AI developers, copyright owners and model users, including the importance of lawful sourcing, dataset review and output controls. Legal precedent is developing in this area, of course. That's why we're talking about it here on the Emerging

(01:08):
Litigation Podcast.
It's my pleasure, then, to introduce my guests, Ryan Phelan and Tiffany Gehrke. They're both recognized thought leaders in the fields of emerging technology law and artificial intelligence, and they're both partners at Marshall Gerstein and Borun, and

(01:31):
they have appeared on the Emerging Litigation Podcast before. Can you imagine, they're very willing to share their insights and expertise with a wider audience, and that's what I appreciate about them. Ryan Phelan, partner at Marshall Gerstein, I'm going to say it right, Marshall Gerstein and Borun. He's contributed and written extensively on digital innovation, including intellectual property such as patent and copyright issues regarding cutting-edge AI systems.

(01:51):
He's a graduate of the Northwestern Pritzker School of Law. Tiffany, also coming back, a glutton for punishment. She's known for her expertise also in intellectual property, technology policy, and her advocacy for balanced and ethical approaches to AI regulation. She earned her law degree from Loyola University Chicago

(02:13):
School of Law.
So together they bring us a lot of insight and academic rigor to conversations on the evolving legal landscape. Here they are, Ryan Phelan and Tiffany Gehrke. Hope you enjoy it. Ryan and Tiffany, thank you very much for coming back to the Emerging Litigation Podcast.

Ryan Phelan (02:32):
Great to be here.
Thank you for having us.

Tiffany Gehrke (02:34):
Great to be back.

Tom Hagy (02:35):
Okay.
So, Ryan, the first one's to you. We're going to talk about some landmark decisions. As I said in the introduction, this is based on an article that we will link to in the show notes and in the summary. So can you summarize the main findings? You're talking about Bartz versus Anthropic and, I think it's, Kadrey versus Meta, and

(02:56):
explain how these cases influence the current understanding of what is fair use in AI training on copyrighted materials?

Ryan Phelan (03:05):
Tom, yeah, so these cases are first of their kind and pretty impactful, in the copyright world at least. And what people have been wondering for quite some time is, how can these large language models that we've seen crop up in recent years, such as ChatGPT and Claude and all of these

(03:27):
things, you know, how can they take information from authors without their permission and use that to train their model? And if they do so, is there any recourse for the authors through copyright law against these model trainers, or do these model trainers have, in fact, some kind of defense in copyright law, called fair use, which is a common defense

(03:51):
used in traditional copyright cases? But how does that apply in our new digital world, especially with AI? And so the interesting thing is that these cases involve two different sets of plaintiffs and defendants, and the cases are both in the Northern District of California.

(04:11):
It's the district court there, and the cases were decided by two different judges of that court. So the judges had the opportunity to explore similar issues, you know, in the copyright realm, and come to their own conclusions, but they in fact came to very

(04:32):
similar conclusions and rulings, but, you know, one did have some criticism of the other. We can get into that later. But, you know, I'll refer to the two cases by the defendants, just because they're more widely known, Anthropic being the first case and Meta the second. And what happened in each

(04:52):
one of these cases is each one of these defendants, Anthropic and Meta, has a model. Anthropic has their Claude model, which is very famous. Meta has their Llama model, and both of them did similar things

(05:13):
for training that the authors did not like or approve of, and they took book data, book information, and used that to train their respective models. In one instance, they took books and scanned them in. From books that they had purchased, they stripped off the covers and chopped out the headings and the page numbers

(05:34):
and stuff like this, but they took the text of the books themselves and used it to train. And then they also got some shadow libraries or pirated books too, and also used that in their data. And the authors' theory of liability under copyright was that, you know, both of those two things constituted copyright

(05:56):
infringement. The defendants, in each case, raised the legal defense of fair use in order to defend themselves, in order to state that, even though they did do that, they did take and use that material, it was fair use

(06:17):
and therefore they were not subject to the copyright law.

Tom Hagy (06:23):
Right, I'm sorry, I'm laughing. It's like I never really... They were literally copying materials, which, you know, it's right in the word copyright. You know what I mean. But that's just me playing with words. It also interests me, too, as a writer: I'm relying increasingly on AI, and I always want to

(06:45):
attribute everything, and so it always makes me a little nervous. Are they, you know, am I not attributing something by accident to something that maybe somebody copied in for training? They're not publishing it, but they did use it to train. So this is interesting to me personally too. So, Tiffany, over to you.

(07:11):
So, Ryan's mentioned fair use, so let's talk about transformative use and fair use. How did the courts define transformative use in these cases, and why is this critical for fair use assessments when it comes to AI training?

Tiffany Gehrke (07:23):
So I thought the Anthropic case was particularly interesting here, because the judge went through, like, painstaking efforts to describe what they view as the transformative use. And so they talked about, first, that each work was selected and copied from a central library to create a working copy of a training set.

(07:43):
Then the second step they said happened was that each work was cleaned to remove some of the information, like page numbers and, you know, headers, footers, that kind of thing. Third, they then took each cleaned copy and translated that into a tokenized copy, where they messed with the words, right, they might have changed the stemming of a word or grouped

(08:05):
different characters from letters and words together. And then the fourth thing was that each fully trained LLM then itself retained a compressed copy of the work that it had trained on. And I thought it was interesting that the judge in this case wrote out each of those steps to help inform

(08:29):
you know, all of us readers: here's what I'm viewing as transformative. You took this thing, you did one, two, three, four, and then here's what the end was. And so that's what that judge said: those are the steps that made this a transformative use. In the Meta case, I felt the judge was just like, yes, this is transformative. You know, it's not a book.

(08:51):
We're taking it and we're using it for training of LLMs, and therefore it is transformative. And for each of those, you know, they did discuss that it's only one of the four factors of fair use, but in each case they found in favor of the large language models, that it was transformative, in favor of fair use and against infringement.

Tom Hagy (09:15):
Okay, gotcha. Interesting distinction. So on to data legitimacy, Ryan. So why did the courts distinguish between lawfully obtained and pirated training data? What's the practical impact this might have on AI data sourcing in the future?

Ryan Phelan (09:45):
You know, copyright infringement can be found when and where you copy. So first of all, when you copy information for model training, you have to stick it somewhere, like in the memory of a computer. And then, once you have the books in the memory, you can train your model, and then perhaps the model can output something that's the same or substantially the same, which would trigger, you know, copyright infringement.

(10:08):
The courts, with respect to pirated versus non-pirated data, were focused on the first of those, where you're just copying the information originally and sticking it in the computer memory for training. So the courts seem to be saying, like, we don't care how

(10:28):
much transformative use you may have done after the fact, after you've copied the pirated works; copying pirated works is never okay. They didn't get into much more than that in their rulings, because there were motions for summary judgment. They talked about dismissing the case.

(10:52):
They did dismiss the case under fair use in each court. Each court did this. They dismissed the infringement claims on summary judgment for the legitimately copied works, from the books that they had, presumably under, you know, a first sale doctrine defense, because they had obtained the books lawfully.

(11:14):
But for the pirated works, when those were taken and stored in the memory, they did not dismiss those counts on summary judgment. That's to be decided later during trial, and, you know, presumably Meta and Anthropic are not looking forward to that.

Tom Hagy (11:33):
Okay, all right. So how did the courts evaluate whether the AI inputs were substantially similar to the copyrighted works, and what evidence is most important for determining infringement risk going forward, Ryan?

Ryan Phelan (11:49):
So, along the lines of what I just mentioned, there's two ways that you can be found infringing copyright for AI training. The first is when you initially store that information in the memory, and the second is the output. So you're kind of focusing on the inputs and the outputs and looking at infringement in that manner. But on the second of these, it was interesting that the authors did

(12:14):
not allege, or did not, you know, fight, at least in the summary judgment motion, that the outputs were infringing. Presumably they could not create, you know, output that was similar to a book. Let's say you had a book, you know, that was in the lawsuit, and it had, like, a paragraph in the book.

(12:37):
You know, presumably the authors cannot get the AI to output the same paragraph, or at least a substantially similar paragraph. In fact, in one of the cases, one of the defendants, I forget which, I think it was in Meta, they had an expert try to output

(12:57):
the same amount of text from one of the books, and that expert was only able to output 50 words at most, and the defendant basically argued that that was not sufficient, it wasn't enough. One of the

(13:18):
fair use factors is that you have to look at whether you took the heart of the work, or, you know, substantially all of it, what portion or how much, the quantity aspect of it, and she, the expert, could not substantiate that. And so both of the courts were eager to address this issue: like, can your model output, you know, the same sentences or lines

(13:40):
and phrases of the actual underlying works? The courts indicated that they would have been very eager to address that if that was the case, but that was not raised, so that second copying was just not at play. It's very hard, I guess, in the textual realm to do that.

(14:02):
Other cases that are still ongoing, like the Getty Images case, it's easier there to show a similar output. In fact, in the complaint in the Getty Images case, you have two pictures side by side of soccer players. The real image has a Getty watermark on it, and the output AI image seems to have a Getty watermark on it too.

(14:25):
And, you know, the question is, like, why would your generated Getty image, I'm sorry, your generated AI image, have a Getty watermark on it if it wasn't trained on, you know, real Getty images? So it's kind of a giveaway. And here there's not that telltale giveaway, and so presumably the authors didn't do that.

(14:45):
But going back to the original question, the first of those was focused on when that information was stored, and, as Tiffany had mentioned, they had transformed that stored data in so many ways that, you know, it was not deemed to be a copy. It was not deemed to be a copy because the information had to be transformed down to the token level, which, if you're

(15:09):
thinking about the English language, tokens are like syllables and words, and those are much different from the original text. The tokenized version of that text, for model training, was considered to be significantly

(15:29):
transformative, and so was the output, by the way, and so that was one of the reasons that turned the courts' decisions toward fair use. They both said that it was highly transformative in nature.

Tiffany Gehrke (15:43):
Okay, I have two thoughts on what Ryan just said that I'd like to share. So the first was when you were just talking about the Getty case. That made me think about, really, the difference in visual versus written, you know, copyright protection in expressive works, and I'll be curious to see how this goes over time. As we're talking

(16:04):
about things that are text, it can be jumbled up a little bit more, so that you might not catch it on the output side, whereas your example with Getty, where they could see the watermark on there, I think that's really powerful, and it's a little bit more of an indicator that you know where it probably came from, right, whereas with words you don't always

(16:28):
have that indicator on the outside. So I think that's a really interesting case to be watching as well, and thanks for sharing thoughts on that, Ryan. The second thought was in the case with Judge Alsup. He repeatedly said throughout the opinion, basically, we're only focused on inputs here; the plaintiffs haven't really alleged anything similar on the

(16:51):
outputs. And it was mentioned enough times throughout the opinion that it made me think: is he trying to signal he might have found differently if they had talked about the outputs more? I'm not sure on that, because, you know, in that opinion it also

(17:16):
came down pretty pro-LLM, but I thought it was interesting that it was repeatedly mentioned throughout the case.

Ryan Phelan (17:20):
I took note that both judges kind of mentioned it throughout their cases as kind of like a hint, hint, nudge, nudge for the future: yeah, please do this so we can, you know, discuss this, the outputs.

Tiffany Gehrke (17:34):
It just wasn't explored, you know, but both judges seem to be eager to talk about it, and the second judge even went so far, I think it's maybe somewhat bolder than even a hint, hint, and basically said, like, if they had put this evidence in, they might have had a winning argument.

Tom Hagy (17:52):
Oh, wow. Okay, in your careers, have you seen judges do that, the nods and the winks and the need-I-say-more kind of thing? I mean, I've only seen it recently, you know, like Supreme Court opinions and dissents saying, you know, what you really ought to be doing.

Ryan Phelan (18:20):
I mean, they're not supposed to do that. They're supposed to, you know, decide on cases and controversies. But it happens.

Tiffany Gehrke (18:35):
I think in these types of cases right now, too, where they know there's a lot of eyes on them and they're probably going to get appealed up, I think we are seeing it more. It's starting to trend that direction.

Tom Hagy (18:47):
Yes, okay, yeah, thank you. That's interesting. So, Tiffany, we're going to come back to you. How did the courts reconcile the strong protection for expressive works, like novels and plays, with their fair use findings in AI training contexts?

Tiffany Gehrke (18:59):
Yes, so the courts really seem to focus on the fact that the books were, or, I'm sorry, the large language models were focusing on using the books for input. So they were, you know, saying these weren't going to be, you know, creating another book to compete, at least initially. Right, that's where we get to the output discussion that we

(19:22):
had earlier. The courts were instead focusing on: although we want to protect books, we want to protect authors, we don't see the training of the model as something that's harmful to authors. At least that's what I saw from the judge in the Anthropic case. I found the judge in the other case...

(20:10):
You know, the original human being authors.

Tom Hagy (20:14):
So back to, let's say, commercial use and copyright. So why did the courts allow even commercial AI training uses to qualify as fair use, and how might this reasoning affect future IP policies and litigation, Tiffany?

Tiffany Gehrke (20:29):
I think they did it here because both courts acknowledged just how transformative the training of the model was, and they thought, well, for that purpose it's okay, even if it is commercial in nature. And, you know, in the Meta case, the judge also went on, and I thought they

(20:57):
described it well here. So it said the plaintiffs in the Meta case acknowledged that the large language models have end uses, including serving as tutors, assisting with creative ideation and helping users generate business reports, and that several of the plaintiffs testified that using the large language models for various purposes were all distinct purposes from creating or reading an expressive work like a novel or biography. And

(21:20):
because those functions were different from the functions that the books were going to be used for, you know, for reading and enjoyment or learning, they thought copying the books as a whole to develop this tool was in fact going to create a different use, even though it was a commercial use.

Tom Hagy (21:45):
So, developer practices. How do you think these rulings are going to shape how AI developers manage training data, and what steps should organizations take to minimize legal risks and comply with copyright law?

Ryan Phelan (21:52):
Yeah, I think that this sends a strong message to AI model developers that they need to use, you know, purchased, licensed, legitimate data and not, you know, pirated data. In fact, you know, we've seen in the news recently, Amazon agreed to pay, I think, $10 million to the New York Times to license data from, you know, the paper, presumably

(22:17):
for use for training purposes, and so model developers are going to be very careful about where they get their data from. You know, hopefully from licensed sources, hopefully from sources like books that they've already purchased, that they can rely on, like the first sale doctrine, where, you know, if you own it, you can use it. You know, similar to what we all know from

(22:40):
the days of buying CDs and stuff like that. You would own, you got a first sale right on the music on that CD, and couldn't get sued for copyright. So they're going to want to do something similar to that. So I think that that was the lesson learned for them. I think the big surprise was that, you know, the courts did

(23:01):
consider it highly transformative, and they just need to make sure that their data is clean.

Tom Hagy (23:06):
Yep, okay, industry challenges. So what challenges or responsibilities do these rulings create for companies using AI and for copyright owners? Are there collaborative solutions or industry standards that you would recommend or look to?

Ryan Phelan (23:24):
Yeah, I would say that, you know, for copyright owners that have proprietary data, you know, secret data, they certainly want to protect that with, like, an NDA before they share it and license it. You know, the problem in these two cases was that the data was publicly available. Right, the authors wanted to sell these books. They wanted them to be public, so, you know, they could sell

(23:47):
their books and information to users, and so they're highly public, and, you know, they simply cannot have proprietary data in that case. So I'm sure they want to seek a license with the model trainers in order to get some type of revenue stream. I know that there's a bill currently floating in Congress

(24:11):
that allows authors to have protection with respect to their works, so that they get some kind of licensing revenue, you know, if their data is used, and perhaps some control over whether their data is used for training. Although I think that's only in one of the two houses, I don't think

(24:31):
it's, you know, advanced very far, and whether or not, you know, the president signs it into law is also another thing. You know, the president was on a podcast recently with the AI developers, you know, including the hardware manufacturers, Nvidia, and then also some of the AI model

(24:53):
developers, including OpenAI, and he seems pretty AI-model friendly, so to speak, where the current administration believes that training AI models and developing them is an important strategic objective for competing with other countries, including China. So, you know, whether or not he would sign a bill that would

(25:28):
limit AI use and protect authors is questionable, because that would supposedly hamper their desire or their ability or speed to develop these AI models, or be competitive as a country compared to other countries.

Tom Hagy (25:46):
You know, this is evolving; the litigation is evolving. Obviously, the technology is evolving rapidly. People are using it with various skill levels; people are bumbling through it. I'm right in the middle of it as a writer. In the early days, it did make up

(26:06):
four cases that I was looking for, you know. But fortunately, I'm used to checking and checking and rechecking things, and I even worked with a paralegal out in California. He's a very good paralegal, he's a master researcher, you know, and he's like, well, no, I don't think so, maybe they were in some little court in some county in Texas that you're looking at.

(26:27):
Well, this is an antitrust case, so they're not going to be in a very, you know, small-town municipal court or something. And I do know at least one, I don't know if it was the attorney or the firm, that was fined in Florida because, infamously, they put in some cases that were completely made up. I mean, who doesn't check those things?

(26:47):
But anyway, it happened.

Ryan Phelan (26:50):
Apparently, a lot of people don't check those things. I see almost every week some court somewhere is angry or sanctioning an attorney for filing a brief that, you know, had made-up citations. In fact, I saw an article this week about attorneys proposing a fix

(27:10):
where there would be an automated tool to scan filed briefs to see if there were any made-up citations, which I thought was amazing that courts would even consider or do that.
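
A minimal sketch, assuming a hard-coded set standing in for a real citation database, of what such an automated citation screen might look like. The pattern, names and sample citations are illustrative only; the tool in the article Ryan mentions isn't specified, and a real screen would query an actual reporter database.

    import re

    # Stand-in for a real citation database lookup.
    KNOWN_CITATIONS = {
        "510 U.S. 569",  # Campbell v. Acuff-Rose Music (1994)
        "598 U.S. 508",  # Andy Warhol Foundation v. Goldsmith (2023)
    }

    # Matches simple federal citation formats like "510 U.S. 569" or "123 F.3d 456".
    CITE_PATTERN = re.compile(r"\b\d{1,4} (?:U\.S\.|F\.(?:2d|3d|4th)) \d{1,5}\b")

    def flag_unverified(brief_text: str) -> list[str]:
        """Return citations in the brief that the database cannot verify."""
        return [c for c in CITE_PATTERN.findall(brief_text) if c not in KNOWN_CITATIONS]

    # "Smith v. Jones" is a deliberately fabricated citation for the demo.
    brief = "As held in Campbell, 510 U.S. 569, and in Smith v. Jones, 123 F.3d 456, ..."
    print(flag_unverified(brief))  # -> ['123 F.3d 456']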

Tiffany Gehrke (27:22):
Have AI check the AI.

Tom Hagy (27:25):
Yes, I think that's smart. I do, I think that's very smart. I've started doing that a little bit myself. But I think having AI, yeah, check itself, I think that's brilliant. I mean, that's a good use of it.

Ryan Phelan (27:38):
It also kind of flies in the face of the ethical rules that attorneys should be subject to. So I think there's a debate there, but I don't have a solution.

Tom Hagy (27:50):
We can come up with all we want. You can't; I can, because you represent people and have clients, and nobody cares what I think. But it is interesting. It's fascinating as a writer, especially in law, where I spent my 20s and a good part of my 30s as a legal reporter and I

(28:10):
was reading a lot of court opinions, I mean thousands of them. And the time it took to make sense of a case, you know, you guys know it takes time. And I'm looking at Adobe and Copilot, which, I guess, Microsoft, they've got a partnership with ChatGPT and all this.

(28:30):
You give them a document. It does a really good job of hitting the highlights for you if you need to understand it quickly. And the cool thing I like about that is there's no making anything up, because it's got the document and it's got where it got the information. So, you know what I mean, you can treat it like a junior writer or something.

(28:50):
I mean, it's just super pros and cons, but I'm enjoying learning about it. But enough about me. So what do you expect in future litigation or legislation? Ryan, you mentioned something that's in Congress now. I feel like they're focused on other things. I don't know, I read the news once in a while. It looks like they're busy with other stuff.

(29:11):
What do you expect to see in the future?

Ryan Phelan (29:15):
There's the federal bill that I mentioned; certain congressmen and women are concerned about authors and their works being used to train these models, and want to make sure that they're compensated in some way. So whether or not that bill is successful or gets changed or is ultimately signed is, you know, anybody's guess.

(29:37):
The states are also coming up individually with their own frameworks for, you know, AI, and it's kind of the Wild West right now with respect to, you know, what protections states are offering. You know, California is considering some. Other states are also considering some.

(29:58):
There's also a big push, and this is tangential, but there's also kind of this data privacy slash digital, what's the word, a digital twin of people, like making sure that your appearance

(30:18):
is kept free of usage. A lot of these cases come from the use of AI to, oh, your image and likeness.

Tom Hagy (30:26):
Yeah, thank you.

Ryan Phelan (30:27):
Image and likeness, where people, you know, use AI to put somebody's face on somebody else's body, with or without clothes. And so, you know, states are putting laws into place to protect individuals from that.

Tom Hagy (30:43):
Mm-hmm. Yeah, there's some pretty sophisticated tools out there. I mean, some of them are hysterically funny, some of them are just very funny. So on the one hand it's amusing, but on the other hand it's dangerous. I see things where you've got world leaders making declarations, you know, and it's very convincing, and it's not them at all.

(31:06):
It's just very scary.
Tiffany, what about you?
Any outlook for the future?

Tiffany Gehrke (31:10):
Yeah, I think, on the litigation side, both of the cases that we primarily talked about today are probably going to be appealed, right. But broader than that, I think the message is that future

(31:32):
plaintiffs in these cases need to focus, you know, broader than just training without permission or, you know, seeing if the output can generate the exact text. They need to focus on what both judges here signaled is important, and that is: are the outputs of the large language models going to create materials that compete with the authors and the underlying protected works, or are they going to dilute the marketplace? You know, what's going to happen on that output? Plaintiffs need to focus on that. And then my last thought.

(31:54):
I thought the judge in the Meta case had a nice little succinct soundbite, where they basically said, you know, these products are going to generate billions, if not trillions, of dollars, and if using copyrighted works to train those models is essential, then they need to figure out a way to compensate the copyright holders for it.

Tom Hagy (32:15):
Mm-hmm, yeah. As I'm listening, as I'm thinking about it, I feel like, maybe you guys have an opinion, but I feel like the people that are going to be most threatened by this, or squeezed by this, are the creative folks. Because, you know, writing is, like,

(32:35):
you know, I can say, write something in the style of Kurt Vonnegut and it'll get pretty good, and, you know, 10 years from now, it's going to be amazing. You're not going to know the difference. Or make a painting that looks like a Van Gogh, and it's going to look amazing. And, you know, with printer technology, what do you call it, 3D printing? I'm sure there's technology that does painting.

(32:56):
I've never even seen it, but I'm positive it exists. So I think the creative people are going to get squeezed on one side, and then what they do create is going to get potentially diluted or used, or just transformed. And so I feel like creative people are going to be under pressure here.

(33:18):
I don't know what you guys think about that, or I don't know what to do about it, obviously.

Tiffany Gehrke (33:23):
Yeah, I agree. I think both sides need to figure out a working solution going forward, because I think the advances in technology that can be gained through this are incredible and are very powerful. But I agree that you want to continue to incentivize the next great authors to create novels, to feel comfort in doing that

(33:45):
and that they would be protected in the future. And so I think there's got to be a balance that is struck here, with compensation or other limitations. I don't have a great idea on what that will be yet, but I think it needs to get there.

Tom Hagy (33:59):
Well, let's come up with it, Ryan.
You have a solution, don't you?

Ryan Phelan (34:02):
The closest solution I've heard, which, you know, I don't know if it is the solution, is something akin to what happens in the music world now, where authors don't receive a license directly. It, you know, kind of goes into this collective system where, you know, maybe per play of their song they get a fraction of that, and so

(34:24):
there's, you know, kind of a Spotify model, so to speak, is what I've heard. How that plays out with model training could be different, because, you know, once you train the model, it's trained with the information. It's not like people are listening to the information again, right, the model is not listening, unlike, you know, people listening to songs over and over. So there's similarities, but there's differences.

(34:46):
But there is this aspect of authors creating material, that material having an input into some, you know, larger model or system, and then the author getting compensated for that, which, you know, seems to be the way to go. Otherwise, you know, you get to a world, like Tiffany suggested,

(35:06):
where authors are just not incentivized to, you know, create new things, and the AI is just learning on, you know, its own output, and therefore everything starts to look the same and sound the same. There's nothing new in the world. I guess I'll be old and dating myself if I say this, but I can

(35:28):
remember the music from two decades ago. It's quite different from what we have now, and I don't know if the Spotify model has impacted that or if I'm just old and I like the music from that decade. Well, I mean, maybe we enter a world like that if we don't incentivize enough creative talent to create new,

(35:51):
good, good art.

Tom Hagy (35:58):
It just speeds up; in terms of music, it speeds up what humans would do too. You know, they would hear the Beatles, and the Beatles would, you know, influence another band, and they would influence another band, and pretty soon it goes in different directions, kind of very superficial, derivative. You know, people do it, but AI will do it faster. So you get this great music that turns into, like, you know, becomes elevator music. So it's going to be interesting to watch.

(36:21):
I've got a nephew who teaches creative writing in Wisconsin, and all kinds of writing, creative writing and just other kinds of writing, expository, and it's a real challenge. And, you know, he's very good at spotting what is AI language, as am I. In fact, somebody who specializes in this sent me a list of words that Google algorithms look for to see

(36:46):
whether something is AI-written, and there are some, like... So I'm writing about law, and you guys will appreciate this. I ask it, give me a summary of a case, just so I can know what it was, and then it'll spit out, so you want me to do a blog? I said, sure, write a blog, see how it does. In this landmark case... you know, every case is a landmark case.

(37:07):
Right. And even some of the notes I did for this, it uses the word landscape, the legal landscape. It's always landscape, and it's a landmark. So I've got a whole list now of words that I have to avoid. It's awkward because some of them are words you would normally use. Tiffany and Ryan, thank you guys very much.

Tiffany Gehrke (37:26):
Yeah, thanks for having us.

Ryan Phelan (37:27):
Thank you, Tom.

Tom Hagy (37:34):
The Emerging Litigation Podcast is a production of Critical Legal Content, which owns the awesome brand HB Litigation. Critical Legal Content is a company I founded in 2012. What we do is simple: we create content that's critical on legal topics for law firms and legal service providers. That kind of content can be blogs, papers, podcasts, webinars, and we have a good time doing it.

(37:56):
And as for HB Litigation, well, that's the name under which we publish interesting, at least interesting to me, legal news items, webinars, articles, guest articles, all on emerging litigation topics. Once again, I'm Tom Hagy, with Critical Legal Content and HB Litigation. If you like what you hear and you want to participate, give me a shout. My contact information's in the show notes.

(38:18):
Thanks for listening.