Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:09):
Welcome to the Road to Accountable AI.
I'm Kevin Werbach, Professor of Legal Studies and Business Ethics at the Wharton
School of the University of Pennsylvania.
For decades, I've studied emerging technologies from broadband to blockchain.
Today, artificial intelligence promises to transform our world, but AI won't reach
its potential without accountability: mechanisms to ensure it's deployed in
(00:31):
responsible, safe, and trustworthy ways.
On this podcast, I speak with the expertsleading the charge for accountable AI.
Whether to restrict open weight foundation models is a big topic of discussion
in AI safety and policy circles,
but not one that the general public and business community are
necessarily focused on. My guest, Kevin Bankston from the Center for
(00:54):
Democracy and Technology, talks about why this issue is so important,
both to the future of innovation in AI and to our ability to address
serious AI risks. As you'll hear, Kevin has strong views, and in full disclosure,
I signed on to an open letter that CDT wrote on these issues, but in our
conversation, we cover the arguments on both sides of the issue and why
(01:17):
this topic is actually so important to understanding what the future
path holds for the development of AI.
Kevin, pleasure to have you on the podcast.
Before we get into talking about open weight models, just quickly: tell us
a little bit about your background, and for those who aren't familiar,
what is CDT, and what's the work that the organization is doing on AI?
(01:39):
Sure.
I'll start with CDT.
CDT is, I would say, the premier internet and technology policy think
tank and advocacy organization in DC.
We've been around for, I want to say, almost 30 years at this point.
Nonpartisan, nonprofit, focused on protecting civil
rights and civil liberties
(01:59):
in the digital world. That encompasses a whole lot of things.
It's a broad spectrum organization, but obviously right now AI is at the forefront
of a lot of technology policy discussions.
And I, with my colleague Miranda Bogen, co-founded
the new AI Governance Lab
at CDT, which is focused on helping develop best practices and standards
(02:23):
around evaluating and mitigating AI risks, to help build a technically informed basis
for eventual legislation and regulation, taking advantage of not only my 20-odd
years in civil society at organizations
like and including CDT.
(02:44):
After that 20-odd years, I was approached by Facebook, now Meta, to basically
help them figure out their path forward on responsible AI development,
which was an intriguing possibility.
And so I spent about four years building their AI policy team, being
one of the founding senior leaders of their responsible AI product team,
(03:07):
which consults with all the other product teams on their AI products.
I did a lot of good work, but obviously working in a big company like
that in a lot of ways is very stressful and challenging and time consuming.
And I am, at this point, a late middle-aged fella who just, or more
accurately, whose wife just had a baby.
(03:27):
And I decided, for a variety of reasons, but especially including
that one, that I was ready to leave
the corporate world and come back to civil society and apply what I had learned,
and also what Miranda, as my colleague on my team at Meta, had learned
around responsible AI development.
Miranda did trailblazing, groundbreaking work at
(03:49):
Meta in terms of addressing issues of race and gender bias in ads delivery.
She is now the director of the AI Governance Lab at CDT,
while I'm a senior advisor to that project and to the leadership at
CDT on other AI policy issues.
I currently teach the First Amendment and copyright in regard to AI technology
as an adjunct professor at Georgetown.
(04:11):
Those are issues that CDT has not spoken about at great length.
We do address the First Amendment in regard to this issue a little bit in the
comments we're going to be talking about.
But speaking generally, if I talk about the First Amendment here, attribute
that to me as Kevin the scholar, not necessarily to CDT's positions.
Great.
So this is why you are a perfect person to talk about these issues.
(04:33):
I want to ask you in particular about the debate that's happening now
around open weight foundation models.
So first, for those who aren't necessarily familiar, what are we talking about here?
So this will require a little bit of a history lesson about what
open source software is before we get to what open source AI is.
In the nineties, you began to see the evolution of this new form of
(04:59):
software development and licensing around this open source concept, where
you would have companies or even voluntary projects with a bunch of contributors
developing their code openly, as in collaboratively, or, when publishing their
programs, also publishing the source code itself so that other people could use that
(05:23):
to build their own software, to modify it for their own uses, et cetera, et cetera.
Usually this is accomplished through something called an open source license,
which basically says: you can use this and there are no use restrictions on it.
We are essentially saying that,
although we still technically own the copyright in this, we are
licensing it broadly for public use.
(05:45):
So, why is this important and, speaking generally, good?
Certainly the internet and digital technologies built around and
on the internet rely an enormous amount on open source software.
At this point, 96 percent of all codebases include some open source code.
It was the original open sourcing of the Netscape browser in the 90s that
(06:07):
led to the open source Mozilla Firefox browser, a predecessor to
the later open source Google Chrome browser, and that enabled competition
so that we didn't all just have to use Microsoft's Internet Explorer on the web.
Now, I want to be clear, open source is not and will not always be a
silver bullet for competition.
(06:28):
And it can be leveraged by already powerful stakeholders
to make them more powerful.
Certainly it's enhanced Google's structural power, the fact that it controls
ultimately the development of Android, for example. But then again, that
power has also helped counterbalance the power of other companies,
like your Microsofts and your Apples, and enabled a large number of hardware
(06:50):
vendors to compete that wouldn't
have had an operating system to use. It's a complicated ecosystem.
But ultimately open source in the context of regular software has
been a boon for competitiveness.
It's also been a boon for security. In the world of open source,
they say that, with enough eyeballs, all bugs are shallow.
This is because if you have
your source code interrogable by anyone,
(07:13):
that means you have a lot of eyes looking at it and finding flaws
and helping you patch those flaws.
That's why, for example, the security world has generally
been supportive of the open sourcing and wide availability of
penetration tools for breaking into computer systems, not because
they want to help people break into systems, but because they want to
help people detect how to break into systems so they can fix those things.
(07:37):
And so, speaking generally, open source has also been a boon
for our overall technical security.
Now we have to get to AI, though, and open source is a little bit of a misnomer
when talking about AI, for reasons I'll get to. First, I'll have to back
up and ask: what are we talking about when we talk about AI, very generally?
When we're talking about AI, we're talking about machine learning,
(07:59):
which is using software to look at vast amounts of data to find
patterns in it that are useful to us.
In the current context, we are usually talking about large language
models or other generative models.
And to break that down as simply as possible and get to answering
the question of what weights are:
I'll just say that to make a large language model, you take a whole lot of data,
(08:22):
often everything you can scrape from the web, plus data you've
licensed from other vendors, a large amount of language,
and feed it through an algorithm that creates what
is essentially a massive mathematical space that sort of summarizes all that
data. Not directly, but rather each,
(08:45):
not even each word, each part of a word, they're called tokens in this
context, gets its own little spot in this sort of semantic map of all of that
content, so that, for example, as it has ingested more and more data, it
begins to, not understand, but be able to place close to each other related
(09:07):
semantic concepts. Dog and puppy, for example, are close together in this map.
What happens then is, once you've trained that model to create that mathematical
map, which is what we call weights, because, to put it in non-technical
language, all of that semantic content has been weighted to some extent
(09:29):
or another in this mathematical model,
you need the weights to be able to infer anything, for the model
to actually give you an output.
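To make that "semantic map" idea concrete, here is a minimal, illustrative Python sketch. It is not any real model's weights: the tiny vectors below are invented purely for illustration, and real models learn thousands of dimensions per token. It only shows how, once tokens have learned positions in that space, related concepts like "dog" and "puppy" measure as closer together than unrelated ones.

```python
import numpy as np

# Toy stand-ins for learned token embeddings (the "weights" being discussed).
# The numbers are made up for illustration; real models learn them from data.
embeddings = {
    "dog":         np.array([0.90, 0.80, 0.10, 0.00]),
    "puppy":       np.array([0.85, 0.75, 0.20, 0.05]),
    "spreadsheet": np.array([0.00, 0.10, 0.90, 0.80]),
}

def cosine_similarity(a, b):
    """Higher values mean two tokens sit closer together in the semantic map."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["dog"], embeddings["puppy"]))        # high, ~0.99
print(cosine_similarity(embeddings["dog"], embeddings["spreadsheet"]))  # low, ~0.12
```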
Open weights, and so now the analogy to open source comes in.
Here, as with regular software, there are both open and closed approaches.
The closed approach: a good example
(09:57):
is OpenAI and the way it treats most of its most powerful models, which
is that they are only available through them.
You have to either go to their website and pay the 20 bucks to access their
user interface for chatting with ChatGPT, or you're a developer who
(10:17):
pays to access their API, their application programming interface,
to build apps or services, internal or external, on top of their model.
Either way, though, they have centralized control over the model.
Users and developers do not have direct access to the weights.
And yeah, so that's closed AI.
(10:39):
Open weights AI, and I hesitate to call it open source for reasons I'll explain,
is when you publish, at the very least, the model software and the weights,
so that another developer could deploy that on their own infrastructure
with their own modifications.
(11:01):
So in many ways it is similar to open source software, because
these open weights models are typically offered under a license
that allows for free redistribution, and/or doesn't have use
restrictions, and/or otherwise gives you freedom to play with the model
and build derivative models in a way that you can't do with a closed model.
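To illustrate the practical difference, here is a brief, hedged sketch, not any provider's actual API: the closed path goes through a hosted endpoint you pay to call (the URL, key, and payload shape below are hypothetical placeholders), while the open weights path downloads published weights and runs them on your own hardware, shown here with the Hugging Face transformers library and an illustrative model name.

```python
import requests
from transformers import pipeline

# Closed model: reachable only through the provider's hosted service.
# Endpoint, key, and payload shape are hypothetical placeholders.
response = requests.post(
    "https://api.example-provider.com/v1/generate",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"model": "provider-flagship-model", "prompt": "Explain open weights AI."},
)
print(response.json())

# Open weights model: download the published weights and run them yourself,
# on your own infrastructure, with your own modifications or fine-tuning.
# The model identifier is illustrative; any openly published weights would do.
generator = pipeline("text-generation", model="some-org/open-weights-model")
print(generator("Explain open weights AI.", max_new_tokens=50)[0]["generated_text"])
```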
(11:25):
I don't call that
open source for a few reasons.
One, most of the components of the model are not in the form of source code.
And in fact, the critical heart of the model, the weights, is not really
interpretable by a human at all.
But I also don't call it open source automatically because
(11:46):
it depends on the license.
Speaking generally, an open source license is one that does
not have use restrictions on it.
And yet, as we'll talk about later, there are emerging some new kinds of
licenses that, for AI safety reasons, do actually attempt to restrict
the allowable uses for a model.
(12:09):
Okay, so let's get into the policy issues.
It sounds like companies can decide, open or closed, as they do with software.
What's the concern?
There are a number of concerns, but I'd love to start with the benefits,
because really, the policy question requires us
to weigh the benefits and the risks.
And I think that although there are some ways that open source software
(12:31):
and open weights AI are not identical, there are a lot of ways that they are
similar, particularly in their benefits.
And I break these benefits down into three basic categories.
One is simply distributing power, whether in the market or in the
culture, which I'll explain.
Basically, as with open source software, we expect, and we currently see, open weights AI
(12:57):
being a strong competitive pressure against closed providers, not
least because it is free. We are seeing very fast diffusion of the technology.
We see very large enterprises adopting it, including for internal
uses, companies like Dell and Wells Fargo.
There's a great quote from a Dell
senior VP
(13:17):
that's basically like: why would we pay for a general purpose model that
does not know much about our company?
And if we want to teach it that, we have to upload all of our private
documents to their cloud, and we have to pay for it. Or we could use this
free software and create our own bespoke model for our own purposes
(13:37):
that is more efficient, more tailored to our needs, et cetera, et cetera.
And so you're seeing that. The second big category of benefits
is simply the catalyzing of
innovation, not only innovation in AI, but in all fields that can leverage the AI.
The first LLMs that were built were open research.
(13:59):
They were openly published by the Google research scientists who
created the first LLMs around 2017.
The root of the explosion of AI innovation today is the fact
that research happened in the open,
thanks to open development.
You also see an explosion of developers
taking open models and then using them to build smaller, more
(14:21):
efficient, customized models,
models that are small enough to run locally rather than
in the cloud, which has both privacy benefits and environmental impact
benefits and just bandwidth benefits.
You don't have to spend all that on network infrastructure.
Research around AI models also enables a wide swath of security and safety
(14:42):
research that can't happen effectively
with closed models, including
the faster development of tools to detect and prevent bad things.
That's another way in which it can help security rather than hinder it.
And then all the other fields that leverage this AI,
they will also move faster.
And so there is a benefit of simply speeding up the process of
(15:04):
innovation and opening that process to a lot more stakeholders as well.
And then finally, there are the transparency benefits,
which also bring security and accountability benefits.
As with regular open source software, with enough eyeballs, a lot
of those problems will be discovered and can be remedied, because the
whole world essentially is your red team, the team that's testing it.
(15:27):
There's a lot of research that's happened that couldn't have happened
without open source models around guardrails, the guardrails that
companies use to try to prevent bad things from happening with their models.
A lot of the challenges with those guardrails have been
discovered through open source
models, in a way that actually reflects on
(15:48):
the flaws of the closed models as well, because they're similar
architectures, the closed and the open models. Things that you learn about
open models can be applied in your thinking about closed models as well.
So it's a way of even getting more transparency into the closed models,
even though we don't actually have literal transparency into the closed
models. And it's enabled research around bias and discrimination that
(16:09):
wouldn't have been possible without access to the model and the weights.
And one of the things we've discovered in all that testing is, and this is
problematic and worrisome, but also relevant to the analysis of whether open
source is worth the risks considering the benefits, is that the guardrails
we are currently using to try to
(16:29):
basically have a model say no to you
when you try to do something dangerous or problematic, those are very fragile,
and they're fragile whether or not you are using open or closed models.
And we wouldn't have known that without research that was enabled by open models.
This goes to an important concept that we'll need to talk about when
we talk about the risks, which is: what is the marginal risk of open
(16:53):
source or open weights models,
compared to other technologies that are available, whether it's the internet
itself or closed models that are also available to be used? Because
if there's not a meaningful marginal
risk, that is, differential risk between those two things, there's no reason
(17:15):
to target open source specifically
for restriction, and we should probably think about a different approach.
Okay. Sounds wonderful. Why then are there concerns?
Let me ask you in a more targeted way. NTIA, the National Telecommunications
and Information Administration in the Commerce Department, launched
this request for comment in early 2024 about open weights models.
(17:41):
Why did the government feel the need to even ask whether these should
be allowed to exist in a free form?
Yeah, I think there are a number of different categories of
risk and a number of different
phases of concern that we've gone through at this point in the past two
years, since ChatGPT dropped for real. I'd say the first category I
(18:05):
would call emergent existential risks.
I would say these were very much at the forefront of
initial thinking about the dangers of AI, because there is a community, we'll
call them the AI safety community, although there are plenty of other
different types of AI safety folks and stakeholders, there is a community
(18:28):
around AI safety that essentially
evolved over 20 years from work being done in the Oxford philosophy department
around potential existential risks that humanity might face. And one of
those that these people have theorized about for a while now is the possibility of some
out-of-control superintelligent AI that we lose control of and that does
(18:52):
something catastrophic to harm humans,
for its own reasons or because it misunderstood its instructions.
And I don't want to denigrate the people who are raising this concern.
Speaking generally, I think these are very intelligent, very well-intentioned,
sincere people who want to see these concerns addressed and
(19:13):
don't want to see AI destroy the world.
The more likely threat than the sort of vague super AI is the possibility
that these systems, as they get more intelligent, could significantly
aid adversaries, whether state adversaries or non-state adversaries,
in developing chemical, biological, radiological, or nuclear weapons,
(19:38):
or cyber weapons, hacking tools and whatnot, and
automated attack approaches.
We will talk at more length about that because it ties into the NTIA
process, but really the key issue there, as is the key issue generally,
is: is there actually a marginal risk?
Does open source pose a greater risk of those things than other technologies?
(19:59):
And I'm not going to spoil that yet;
I'll tell you how that turned out.
A third category is simply content issues of harmful or problematic
or illegal content. Particularly of concern are deepfaked child sexual
abuse material, virtual CSAM, or deepfaked NCII, non-consensual intimate
(20:21):
imagery, often called revenge porn.
And this is going to be a problem, and is a problem.
But again, the marginal risk issue is a question that needs to be
asked, and we'll ask it shortly.
Finally, the fourth category, and where I think most of the concern
is settling now, and which I'm sure we'll talk about later, is China.
China specifically.
(20:42):
What do we do about Chinese competition in the AI field?
How do we deal with China as a national security competitor
in regard to AI as well?
And I think we are at a point now
where that's probably the most relevant consideration for policymakers in the U.S.
(21:03):
as they look toward possible restrictions on open source.
So that's the sort of range of risks I think
people are most concerned about.
Okay, let's drill down on some of those.
I agree we can put aside the existential risk.
But with regard to marginal risk, you made the point that guardrails
(21:24):
on these foundation models are fragile, but isn't it true that it's
substantially easier to remove a guardrail if you have access to the weights?
So if it's an open model, then it's fairly trivial for a
hostile actor to take out those guardrails; much more challenging
if the company keeps control over the model.
(21:46):
It depends. You may have seen ChatGPT's
model released last week,
and then within minutes on Twitter, people sharing their attacks,
their prompts that were able to get around certain content guardrails.
The Taylor Swift sexually explicit deepfakes that we saw making the
(22:08):
rounds earlier in the summer, those were generated by Microsoft's
closed image creator. But yes, as with open source code, open source models are
more easily modifiable in a wide range of ways, including by bad actors.
But then that leads to the question of: what about other technologies?
(22:30):
How does it compare to those?
And that's something that the NTIA got into. But to set up the NTIA
process a bit: this came up in the context of the Biden AI executive order that
came out October 30th, I want to say, of last year, which called on NTIA
to do a report on open weights AI.
(22:53):
This was prompted, as best we can tell, by there being staff at the National Security
Council of the White House and staff at the Bureau of Industry and Security,
the export control guys at the Commerce Department, who were concerned particularly
about the CBRN and cyber risks
and were considering the possibility, or looking at the possibility,
(23:16):
of putting restrictions on the publication of open weights because
of those national security concerns.
It seems that somewhat cooler heads prevailed in pushing
instead for that
not to be decided in the EO, but rather to delegate it to NTIA, the National
Telecommunications and Information Administration, essentially
not the president's lawyers but the
(23:38):
president's technology advisors, to do a more sustained study of the issue.
The issue in particular being: what are the benefits of open weights models?
What are the risks of open weights models? And based on that,
what are potential policy approaches the president should take,
or not take? And so that was the process. They put out the call for comment
(24:03):
earlier this year. A bunch of people filed comments, including us, and I'll
talk about that. And then they finally released a report, I guess a couple of
months ago. And so let's talk about that, and I'll talk about the comments that
CDT filed. First, a not-so-humble brag.
I'm proud to say, and this is a dubious achievement, but our comments were the
longest and most detailed comments filed in the proceeding, which had hundreds upon
(24:25):
hundreds of comments. More salient, our comments were cited in the report from
the NTIA more than any other comments.
And the only other single source that was cited more than our comments was a
paper that we coauthored with academics at Stanford and Princeton focused on
defining marginal risk around open models.
(24:46):
I'm glad to say, I feel that we had a very positive influence on this process.
We also worked closely with NTIA to ensure they heard from a variety of other civil
society voices, from civil rights groups to civil liberties groups to open source
advocates and everything in between. But the gist of what we were saying in the
(25:06):
paper I mentioned and in our comments is:
we are not ruling out categorically the possibility that some restrictions
on open weights may ultimately be necessary, but at this point, with the
capabilities we see online now and that are coming shortly, there is not enough
(25:27):
evidence to support those restrictions.
And by that, I mean not enough evidence that there's actually a significant
marginal risk from open models.
This is probably clearest
in the context of the biological and nuclear and cyber conversation. By
(25:47):
the time of our comments, there had been several academic papers and papers from
security think tanks like RAND, and research by Microsoft and OpenAI, where they
basically set up teams to compete on who could come up with the best evil terrorist
plan around biological weapons, and gave one team access to the best models
(26:12):
and one team access to the internet.
And what they found was there was very little difference in the capacity of
those teams to plan an attack, which stands to reason, because what the
models know is what they've seen online.
And so it may give some incremental additional aid
(26:34):
in collecting some of your sources when you're looking for information
to help plot an attack, but it really didn't make a huge difference.
Similarly, there was research on how much models can help with code. Certain models
can, instead of just speaking in language, also code, they can create
software, and so there were tests about how effective they were in helping create
(26:59):
penetration tools and other attack tools.
And again, they found only an incremental difference, not really a major difference,
between what a coder could do without a model and what a coder could do with a
model in terms of offensive cyber capabilities.
It's also worth noting, as is the case with open source software, that the capability
(27:23):
to create tools is now also in the hands of defenders as well as attackers.
And as we've seen in the open source context, that has tended to ultimately
benefit defenders more than attackers.
And so NTIA, which was primarily focused on those as the most realistic
national security related threats that they were asked to assess,
(27:44):
essentially concluded: we don't see a significant marginal risk here.
That doesn't mean there might not be one at some point.
Therefore, we do endorse continued monitoring,
I guess what a doctor would call watchful waiting: it's not something that
requires intervention now, but something that we need to be watching closely,
(28:05):
including recommending continued investment in the development of best
practices and standards around safety,
as well as market surveillance and other measures to see how
these models are being used,
because so much of the risk is actually going to turn on what context it is
being used in and by what types of users.
(28:26):
And they basically advised the president to hold off on restrictive measures.
And I think this was the right decision in a number of ways,
but especially by analogy to a similar fight over tech policy that we had
in the 90s, that we haven't talked about yet, but that has to be talked
about in this context, which is the fight over open source encryption
(28:50):
code in the 90s and early 2000s.
I think we have a very similar issue with AI today.
And this is where I'll now talk about China, because it's a very similar situation,
where in the encryption case it was:
we are afraid our adversaries will misuse this technology, and
therefore we want to try to limit the spread of this technology.
(29:11):
But the technology we're talking about is literally bits.
It's not a physical thing that we can physically constrain.
And so it's really hard to prevent the spread of software on a global internet.
And that's as true today, if not more so, than it was then.
And there are those who are concerned that if we are doing open source AI models,
(29:33):
we are giving away our IP to our competitors in China, that we are giving
away software that could be integrated into military operations by China.
Both of those are true in the sense that certainly there will be stakeholders in
China that make use of this software. But
(29:54):
even if our software is not available, they will make use of similar software,
whether it comes from France or the United Arab Emirates or China itself, where
there are plenty of large models being created today without reliance on U.S. technology.
So I think it's a question of: yes, we could attempt to restrict this.
(30:16):
We would probably fail.
It would probably leak anyway. But even if we did not fail, it
likely wouldn't be effective
in preventing China from getting comparable technology elsewhere.
Meanwhile, we will have slowed down our own innovation cycle, slowed down the
(30:36):
diffusion of the technology generally, and the many economic benefits and
other benefits that will accrue from that, and have hindered our ability
to compete in the global marketplace
around this kind of software. Which I think is especially important to consider
when you think about China's geopolitical position right now and things like
(30:58):
the Belt and Road Initiative, which is its massive global attempt to provide
infrastructure, including communications infrastructure, to the global South,
to Africa, to South America, to the Middle East, as a way of consolidating
geopolitical power. Our having a robust open ecosystem of AI models
(31:19):
is probably our best shot at preventing China from dominating
AI in those global markets.
And they're actually at a unique disadvantage compared to our open
models because they are China.
The CCP has already passed regulations that basically enforce
(31:42):
ideological purity on Chinese models,
basically censorship in line with the party line in China, which makes
them necessarily much less useful to a lot of stakeholders. So if we can
compete on a global stage with free, powerful, customizable AI software,
(32:05):
and China is competing with less good software, then that actually strengthens
our geopolitical position in regard to China, rather than weakening it.
That would be our argument, and the argument at this point of a number of relatively
conservative stakeholders as well as your center-left types like me. Okay,
(32:26):
there's a lot more that I would love to get into with you about this, but
we have limited time, so I just want to ask one or two more questions.
Sure. One is that the company you used to work for, Meta, is the only one
of the major frontier AI labs that seems to be pushing forward with
releasing its most powerful models, the Llama models, as open weights.
(32:50):
So, and of course you are not speaking for them,
can you talk a little bit about
maybe why it seems like they've taken that direction, and how is that going
to affect this debate, in that it's not now just an issue about AI safety, but it
plays into all of the competitive and other debates in the technology industry?
(33:10):
Sure.
Yeah, I'm not going to attempt to divine the mind of Mark
Zuckerberg or say anything
confidential that I may or may not have learned inside of Meta, but I
can note some things that other people have said and some obvious facts.
I think one of them is the simple fact that Facebook, now
(33:30):
Meta, doesn't have a cloud services division through which it wants
to sell access to closed models.
Its whole business model is different.
It offers consumer products
to consumers that rely on AI, but dominating the AI
model market is not their goal,
(33:52):
as one could argue is the goal of your Googles, your Microsofts,
your Amazons, your OpenAIs.
So they just have different incentives.
One of those incentives is that by opening their largest model, which is also,
they've said, the model that is going to be the primary basis of a lot of features
(34:12):
in their products, they can get the traditional benefits of open development.
They can get those millions of eyes on their model.
They can benefit from seeing what customizations people
are doing on platforms like
GitHub or Hugging Face and integrating the good ones into their production model.
(34:33):
And then, finally, there's also simply the obvious fact that it is what
Meta had. When ChatGPT dropped,
Meta had long been doing open source releases of research models.
Some may recall the dubious release of Galactica shortly before ChatGPT,
which was an LLM trained on and focused on scientific literature
(34:56):
that spouted nonsense and was taken down pretty quickly.
So you already had this large research investment in developing LLM technology
that wasn't necessarily being productized.
And there's the obvious fact that if
Meta threw that out there while OpenAI and Microsoft were trying
to consolidate a closed AI position,
(35:17):
it would slow down those competitors.
Many of your listeners may have heard of the memo inside of Google,
I believe it was called "OpenAI has no moat and neither do we," in the sense of
a competitive moat against open source.
And I think
people inside of Facebook probably followed a similar logic.
I am neither confirming nor denying, and in fact, I was on parental leave
for much of this decision making, but they
(35:40):
recognized that offering this would not only create a lot of benefit and use
for a lot of people, but also slow the adoption of closed models offered by
their competitors. Again, Meta doesn't offer a cloud service, but it does offer
a lot of other services that are in competition with a lot of these companies.
(36:00):
And so I think, as a number of commentators have noted, that was a
clear benefit to Facebook, if they could throw an obstacle in front of their
fast-moving opponents in the market.
As I make the distinction between open weights and open source, Meta is
a good example of why we need to be really careful about our terminology
there, because Mark Zuckerberg loves to call Llama open source and use
(36:24):
that terminology, but the license that Meta is licensing under is
definitely not an open source license,
in some ways that are arguably good and in some ways that are arguably bad.
Most notably, their original license forbade you from building
a model using outputs from their model, which is actually a very
powerful way of building smaller, more condensed versions of models.
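As a rough sketch of the technique being described, often called distillation: a bigger "teacher" model's outputs are collected as synthetic training data for a smaller "student" model. Everything below is illustrative only, the model names are placeholders, and a real pipeline would involve far more data and an actual fine-tuning step.

```python
from transformers import pipeline

# Hypothetical teacher model whose outputs become training data for a smaller model.
teacher = pipeline("text-generation", model="some-org/big-open-weights-model")

prompts = [
    "Explain open weights AI in one sentence.",
    "What is an open source license?",
]

# Step 1: collect the teacher's completions as synthetic (prompt, completion) pairs.
synthetic_data = [
    {"prompt": p, "completion": teacher(p, max_new_tokens=60)[0]["generated_text"]}
    for p in prompts
]

# Step 2 (omitted for brevity): fine-tune a smaller student model on synthetic_data.
print(f"Collected {len(synthetic_data)} teacher examples for student fine-tuning.")
```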
(36:48):
Thankfully, they fixed that.
But now if you do that, your model has to be marked as "built with Llama," and you
don't typically see that kind of branding requirement in the open source community,
so I'd love for them to get rid of that. And then also, as has been much
commented on, they also have a clause that prohibits any company with more than 700
(37:13):
million average monthly users from using the model. This is to basically prevent Google
and Microsoft from using the model for free, and if they want to use it, they'll
have to go to Meta to license it. Again, that's not the sort of thing you'd
see in a regular open source license, carving out particular users. At least
(37:33):
it is limited to companies that had that level of users at the time of the license,
so it's not like you could build your company on Llama, ultimately get past that threshold,
and then suddenly have the bottom pulled out of your business. But it's still fair
to say it's anti-competitive, and something that, especially if they want to
(37:57):
keep calling it open source, they need to take out of their license, or they
need to stop calling it open source.
All right, there's a lot more that we could talk about here. Obviously the
NTIA report will not be the last word, but we're going to need to wrap up for
time. Kevin, thank you so much for going through this really
important set of issues with us in so much detail. Anytime.
It was a pleasure.
Great to have you.
(38:17):
This has been The Road to Accountable AI.
If you like what you're hearing, please give us a good review and
check out my Substack for more insights on AI accountability.
Thank you for listening.
This is Kevin Werbach.
If you want to go deeper on AI governance, trust, and responsibility
(38:39):
with me and other distinguished faculty of the world's top business
school, sign up for the next cohort of Wharton's Strategies for Accountable
AI online executive education program, featuring live interaction
with faculty, expert interviews, and custom-designed asynchronous content.
Join fellow business leaders to learn valuable skills you can
put to work in your organization.
(39:01):
Visit execed.wharton.upenn.edu/ACAI for full details.
I hope to see you there.