
May 30, 2025 44 mins

In this first of a two-part series on federated learning, we dive into the evolving world of federated learning and distributed AI frameworks with Patrick Foley from Intel. We explore how frameworks like OpenFL and Flower are enabling secure, collaborative model training across silos, especially in sensitive fields like healthcare. The conversation touches on real-world use cases, the challenges of distributed ML/AI experiments, and why privacy-preserving techniques may become essential for deploying AI to production.

Featuring:

Links:

Sponsors:

  • NordLayer is a toggle-ready network security platform built for modern businesses. It combines VPN, access control, and threat protection in one easy-to-use platform. No hardware. No complex setup. Just secure connection and full control—in less than 10 minutes. Up to 22% off NordLayer yearly plans plus 10% on top with the coupon code practically-10.

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Jerod (00:04):
Welcome to Practical AI, the podcast that makes artificial intelligence practical, productive, and accessible to all. If you like this show, you will love The Changelog. It's news on Mondays, deep technical interviews on Wednesdays, and on Fridays, an awesome talk show for your weekend enjoyment. Find us by searching for The Changelog

(00:24):
wherever you get your podcasts. Thanks to our partners at fly.io. Launch your AI apps in five minutes or less. Learn how at fly.io.

Daniel (00:44):
Welcome to another episode of the Practical AI Podcast. This is Daniel Whitenack. I am CEO at Prediction Guard, and joined as always by my cohost, Chris Benson, who is a principal AI research engineer at Lockheed Martin. How are you doing, Chris?

Chris (01:01):
Doing great today, Daniel. How's it going?

Daniel (01:03):
It's going pretty good. I would say my mind is a little bit scattered today, maybe distributed over various topics, jumping from peer to peer between different meetings. Thankfully, we're just going to continue that theme today into a little bit of a discussion on federated learning, because I'm

(01:26):
really happy to have Patrick Foley here with us, who is lead AI architect focused on federated learning at Intel. How are you doing, Patrick?

Patrick (01:35):
Doing great. Thanks for having me on the show.

Daniel (01:37):
Yeah. Of course. I was saying, one of our engineers at Prediction Guard, Ashwarya, shout out to her, she spoke at the Flower Conference over in London not too long ago, where I bumped into you. So it's good to get that lead. But it's been maybe a little while since we talked about

(01:58):
federated learning, which we have talked about in previous episodes. But I'm wondering, just for the audience at large, who's maybe been hearing a lot about LLMs and only LLMs, or GenAI, for however long now, just circling back to that topic, could you set the stage for us and give us kind of the explainer on

(02:22):
federated learning generally and what that means?

Patrick (02:25):
Yeah, absolutely. So the main training paradigm for machine learning has been having your data centralized and then training your model on that local data. There are a lot of cases where you can't centralize your data due to privacy concerns, or maybe even the size of the data is an issue. And so

(02:46):
there's a different technique where, instead of sending your data to a central place, you send your model to where the data is and you train it there. So it's closely related to distributed training, as you could probably tell from the description there. But there's a much higher focus on privacy concerns, and so how you can verify that the model is not encapsulating something

(03:11):
about the data, and who the threats are, because it's not just a single person that is controlling all of the infrastructure, but multiple parties who might not trust each other. That's where a lot of the variance in how we need to focus on those concerns comes from.

Daniel (03:28):
And just to dig in maybe just a small bit deeper there. So if you're bringing the model to this distributed data, in what way, maybe just walk us through kind of a flow, I guess, of training. So you send the model to these places that have the data, what kind of happens in that training process, or how

(03:52):
does it iterate in a different way than maybe what people are used to hearing about?

Patrick (03:57):
Yeah, absolutely. So there's a number of both closed source and open source federated learning frameworks that are out there. I lead the Open Federated Learning (OpenFL) open source project. And there's a number of people that do this in the same way. But really what it involves is first having a shared notion of what that model is.

(04:19):
And then there might be a distribution phase for the workspace or the code ahead of time, so that everyone has a record of what the code is that's going to be running on their infrastructure. And so at the time that the experiment starts up, there's a server, or what we call an aggregator, that's the central point where everyone is communicating with

(04:40):
that server for what tasks they should be doing or what the latest model weights are that they should be training on. And then the client side is what we term the collaborator. So everyone has a view of what that code is, and we have this concept of a federated learning plan, which includes everything

(05:01):
outside of the code itself. So this might be hyperparameters for the model, some of the network details that you might want to know, whether there's TLS being used, mutual TLS, and a lot of other things that you might care about if you're a hospital that wants to be running this software on your infrastructure and you don't want to be exposing your data

(05:22):
because of HIPAA or GDPR considerations. So there's this vetting process that's really important to happen ahead of time. And then once this vetting has happened, there's an opportunity to actually launch the experiment. And what this means for the aggregator, or the server, is launching the application that starts a gRPC server or some kind of REST server. And then the

(05:44):
collaborators are just starting their local process and making the connections to that server. So the flow is, this is really all of the setup for the experiment actually taking place. But the aggregator has initial model weights for what everyone is going to be training on for that first round of the experiment. And so then everyone receives those model weights.

(06:08):
And it's not the entirety of the model. And the way that we divide things into this provisioning phase and then the runtime phase is so that we can limit what actually gets sent across the network. We don't need to be sending Python objects, which are much higher risk in terms of being able to send code that could then exfiltrate your data and that is not necessarily vetted ahead of time. So there are very small windows of information, and we limit that

(06:32):
communication path to NumPy bytes. And the great thing about doing things in that way is that if you're just dealing with model weights, that means you can train across a bunch of these different deep learning frameworks. So we can work with PyTorch models, TensorFlow models, etcetera. And you can send those model weights across the network. You can populate your Python code that's already been shipped to you ahead of time, do your local training, and then, based on the updates that you have from your local data, you send your updated model weights back to the aggregator, and they get combined in some way. In the simplest case, this can be something like a weighted average based on the number of datasets that you might have locally for each of those collaborators. And then this is

(07:18):
really what constitutes a single round of federated learning training. And what we've seen is that just by using these simple methodologies, you can get to a point where you have somewhere in the realm of 99% accuracy versus a model that's been trained on centralized data alone.
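To make the round Patrick walks through above concrete, here is a minimal sketch of the aggregator/collaborator loop with a dataset-size-weighted average (the simple combination he describes). It assumes only NumPy; the function names and the toy local training step are illustrative, not OpenFL's actual API.

```python
# One round of "send weights out, train locally, average back" in miniature.
import numpy as np

def local_update(global_weights, local_data=None):
    # Placeholder for a real local training loop (e.g. a few epochs of SGD in
    # PyTorch or TensorFlow); here we just perturb the weights to stand in for it.
    return [w + 0.01 * np.random.randn(*w.shape) for w in global_weights]

def fedavg(client_weights, client_sizes):
    # Combine each layer's weights with a weight proportional to local dataset size.
    total = sum(client_sizes)
    return [
        sum((n / total) * w[layer] for w, n in zip(client_weights, client_sizes))
        for layer in range(len(client_weights[0]))
    ]

# Two collaborators with different amounts of local data, one round of training.
global_weights = [np.zeros((4, 4)), np.zeros(4)]
updates = [local_update(global_weights) for _ in range(2)]
global_weights = fedavg(updates, client_sizes=[1000, 250])
```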

Chris (07:34):
I'm curious, just as you were talking about the aggregation of each of the data back to the main server, and you talked a little bit about different ways of aggregating and stuff. I'm just curious, are there a lot of different approaches algorithmically to that aggregation, or does that tend to follow the same mechanism most

(07:55):
of the time? And do people tend to choose different ways of aggregating data? I'm just wondering how much variability is typically found in there among practitioners.

Patrick (08:04):
Yeah, that's a great question. So we've seen that FedAvg works pretty well in a lot of cases. FedAvg is the original aggregation algorithm for federated learning that was coined by Google. This was back in 2017, and they actually coined the term federated learning originally at that time. But there are others out

(08:26):
there that deal much better with data heterogeneity between the different client sites that might have different data distributions. And so when that's the case, you might need to ignore some of the outliers, or incorporate their local updates in a different way that allows you to capture that

(08:47):
information or converge faster to what a global model would be that would perform well on all of these different data distributions. So there are a number that do try to capture some of this information. FedOpt is one of those that incorporates the loss terms of the different collaborators that are out there. This is really a hot research area, but it really varies,

(09:13):
is what we found. But by applying some of these top methods, you can generally get to a pretty good point in convergence versus centralized data alone.
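Adaptive aggregation methods are a family rather than a single formula. One common idea, sketched below under that assumption, is to treat the averaged client update as a pseudo-gradient and apply a server-side optimizer (here, plain momentum) rather than overwriting the global model with the average; this is an illustrative sketch in the spirit of FedOpt-style methods, not how any particular framework spells it.

```python
# Server-side adaptive aggregation sketch: average the client models, form a
# pseudo-gradient against the current global model, and take a momentum step.
import numpy as np

def server_step(global_weights, client_weights, client_sizes,
                momentum_buf, server_lr=1.0, beta=0.9):
    total = sum(client_sizes)
    new_weights, new_momentum = [], []
    for layer in range(len(global_weights)):
        avg = sum((n / total) * w[layer] for w, n in zip(client_weights, client_sizes))
        pseudo_grad = global_weights[layer] - avg      # how far clients moved, with sign flipped
        m = beta * momentum_buf[layer] + pseudo_grad   # server momentum accumulator
        new_weights.append(global_weights[layer] - server_lr * m)
        new_momentum.append(m)
    return new_weights, new_momentum
```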

Daniel (09:23):
So Patrick, I'm curious about if we could just talk through maybe a couple of example use cases, kind of pointing out the actors in the process. So we've talked about kind of the central aggregation. We've talked about these clients or collaborators, I believe you called them. So this

(09:44):
distributed set of collaborators who have the model and are doing updates to the model, which are then aggregated back together. If you could just maybe highlight, hey, here's an example use case in this industry with this type of model. Here's who the party would be that would be the aggregator

(10:07):
party and where that infrastructure would run, and here's the parties that would be the collaborators where the model would be distributed. That would be very helpful.

Patrick (10:18):
Yeah, absolutely. So I'll take one of really the first real-world deployments of federated learning that my team took part in. So back in about 2018 or so, Intel started collaborating with the University of Pennsylvania on trying to deploy federated learning in hospitals for the purpose of brain tumor

(10:40):
segmentation. So this was very recently after Google had released their seminal paper on federated learning showing that this had high success for text prediction on Android phones. And this was the health application of federated learning. And so this progressed to a point where we were able to

(11:00):
demonstrate that we were able to achieve 99% accuracy versus a centrally trained model. And then this really expanded out to a much larger real-world federation where we were able to train across roughly 70 different hospitals across the world. And so each of those hospitals represents one of the

(11:22):
collaborators in the architecture that I was speaking to earlier. And then the University of Pennsylvania served as that central point, or the aggregator, where the initial model was populated from. And it was a 3D convolutional neural network, a segmentation model. So coming in with DICOM data and then trying to get an estimate

(11:45):
of where a glioblastoma brain tumor was based on that image. And so there's the collaborators and the aggregator, and that's really the high level of what this looks like. But then there's a lot of other details that had to be dealt with beyond just this more, I would say, vanilla federated learning

(12:08):
architecture. And really where that came from was there's a lot of issues with figuring out how to identify mislabeled data when you have privacy that's at stake. And so this really requires experts in data science, or someone who has a background in federated learning, to go and dive into how you're identifying these convergence issues that

(12:32):
might pop up. And so UPenn was taking on a lot of that responsibility. There were Intel engineers who were very involved with a lot of those calls as well, trying to get on the phone and have these Zoom calls with these different IT admins and data owners at each of the hospitals, just trying to figure out where there might be a mislabeled dataset or that

(12:52):
type of thing. But it really exposed that there were gaps in the total participant layout, and we needed to have more of this kind of shared platform for how you can exchange this information and get access to that data in a secure way. And that's one of the things that we've been working on ever since this study came out.

Sponsor (13:25):
Well, friends, NordLayer is the toggle-ready network security platform that's built for modern businesses. It combines all the good stuff: VPN, access control, threat protection, and it's all in one easy-to-use platform. No hardware, no complex setup, just secure connections and full control in less than ten minutes. No matter if you're the business owner,

(13:45):
the IT admin, or someone on the cybersecurity team, NordLayer has what you need. Here's a few use cases. Business VPN: how often are you traveling and you need to have secure connections from one endpoint to another, accessing resources, preventing online threats, preventing IP leaks? This happens all the time. What about threat protection, being in

(14:07):
a place where you can prevent malware, where maybe there's a high risk, you're at a coffee shop: malware, ransomware, phishing. These things happen every single day, and users who are not protected are the ones who get owned. And what about threat intelligence? What if you could spot threats way before they escalate? You can identify, analyze, and prevent internal and

(14:28):
external risks. This is like dark web stuff all day. Data breaches, breach management, serious stuff. Well, of course, our listeners get a super awesome deal: up to 22% off NordLayer yearly plans, plus an additional 10% on top with the coupon code practically dash 10. Yes, that's

(14:49):
the word practical, then L-Y, dash 10. So practically dash 10. And the first step is to go to nordlayer.com/practicalai. Use the code practically dash 10 to get a bonus 10% off. Once again, that's nordlayer.com/practicalai.

Daniel (15:15):
Well, Patrick, I'm wondering, you gave a really good example there in terms of the healthcare use case, the distributed collaborators being these hospitals, the aggregator being the university. Certainly there are other details that are relevant in that, that I'm sure were a lot of difficult things

(15:39):
to work out and research. One of the things that I'm wondering, and this might be something that's on people's mind just in terms of the climate that we're in around AI and machine learning, is what are the types of models that are relevant to federated learning? It might be somewhat of a shock to people

(16:02):
just coming into the AI world that, hey, there are still a lot of non-GenAI models. Actually, the majority of AI models, quote unquote, or machine learning models out there are not GenAI models. It may come as a shock to them that there's still a lot of that going on. I assume based on what you said before that those types

(16:23):
of non-GenAI models are relevant to the federated learning procedure or framework. But could you give us a little bit of a sense of the kinds of models that are relevant, and maybe tie that into some of the, I guess, just the real-world constraints of managing one of these federated learning

(16:45):
experiments, in terms of the compute that's available or the network overhead or whatever that is, and what that kind of dictates in terms of the types of models that are currently feasible to be trained in this way.

Patrick (16:59):
Yeah, absolutely. So I would say most of the real-world deployments of federated learning have focused on non-GenAI models up to this point. So the example that I had was this 3D segmentation type of use case. There have been a lot of other deployments of these classification models. Really, where federated learning has focused, from the framework

(17:22):
support perspective, has been around neural networks. And a lot of the reason for that is not just because of all of the advances that have, of course, happened for neural nets over the past ten to fifteen years, but it's been because you have a shared weight representation for all of those models across each of the sites where they're going to be

(17:43):
distributed. And really what I mean by this, just as a comparison point, say support vector machines or random forests: you're going to have something that is going to be based fundamentally on the data distribution that you have locally at one of those sites. So with neural networks, and using that for federated learning, that allows us to have

(18:05):
much clearer methods for how those weights ultimately get combined for the purpose of aggregation, without knowing quite as much about the data distribution ahead of time. I will say that there are some methods for how you perform federated learning in these other types of scenarios. So federated XGBoost is something we recently added support for in OpenFL.

(18:26):
There are other types of methods out there that have actually performed pretty well. And getting back to the GenAI piece of this, that is, of course, a big area of interest for federated learning too. And we have a number of customers who have been asking about how they can incorporate these large foundation models, generative AI models, for the purpose of

(18:48):
federated learning and this training in a privacy-preserving way. And to get to your point, or the question around the size constraints that we run into, that is, of course, an issue for these large GenAI models. We're very lucky to have techniques like PEFT and quantization that can be applied so that you don't

(19:10):
necessarily need to be training on the entirety of 70 billion weights at a time and distributing those across the network, because as you scale the federation, there's, of course, a lot of network traffic that can result from that. So by shrinking that in any way that you can, we can still support those types of models, but it's still, I would say, we're

(19:32):
having to use these additional methods instead of just base training, because size and the time that it takes to actually train them is, of course, always a concern.
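As a rough picture of how PEFT keeps the network traffic down in this setting, the sketch below wraps a small stand-in model with LoRA adapters using the Hugging Face peft library, and collects only the trainable adapter tensors as the update a collaborator would send. The model name, adapter settings, and payload accounting are illustrative assumptions, not a recipe from OpenFL or from the conversation.

```python
# Parameter-efficient federated fine-tuning sketch: only LoRA adapter weights
# would travel between collaborator and aggregator, not the full base model.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")   # stand-in for a much larger foundation model
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["c_attn"])           # GPT-2's fused attention projection
model = get_peft_model(base, lora)
model.print_trainable_parameters()                     # typically well under 1% of the base model

# ... local training on the collaborator's private data would happen here ...

# Only the (small) adapter tensors are serialized and sent to the aggregator.
adapter_state = {name: p.detach().cpu().numpy()
                 for name, p in model.named_parameters() if p.requires_grad}
payload_mb = sum(v.nbytes for v in adapter_state.values()) / 1e6
print(f"update payload: {payload_mb:.1f} MB")
```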

Daniel (19:42):
Yeah. And, just for listeners that are maybe more or less familiar with certain terminology, that sort of PEFT, this is parameter-efficient methods, where maybe only some of the parameters of a model function are updated during the training process, which creates some efficiencies there.

(20:05):
Quantization being methods to limit the precision or the size of the total parameter set by kind of reducing the precision of those parameters. I'm wondering, we've kind of naturally got into it, Patrick, but you started talking about, of course, requests to add features and that sort of thing.

(20:28):
Obviously in your context, I think we're mostly talking about OpenFL. I'm wondering if you could just give us a little bit of an introduction. Now we've talked about federated learning more broadly, what it is, kind of some use cases, that sort of thing. Obviously there needs to be frameworks to support this process, and OpenFL being one of those, could you just give us a

(20:50):
little bit of an introduction to the project at a higher level?

Patrick (20:54):
Yeah, so OpenFL, Open Federated Learning is what that stands for. It's been around since about 2018, and it came out of this research collaboration that we had with the University of Pennsylvania. So what other federated learning frameworks have done is they've really started from research and then expanded into real-world and production deployment. We kind

(21:15):
of took this in the opposite direction. We had to deal with the real-world issues that come from deployment of this framework into hospitals, and the challenges that can really result from that. And when I say we, I mean this is a collaboration between my team at Intel, which is more focused on the productization side of how you take these technologies and then bring them

(21:38):
into products, the University of Pennsylvania, but then also Intel's Security and Privacy Research Lab. They're, of course, very focused on research as well, and have been thinking about security and privacy and confidential computing for quite a long time. So this was really a natural collaboration to bring

(21:58):
together research with the experts in this healthcare and brain tumor segmentation type of deployment, to really bring the right features into this framework that started off as largely a research project at Intel, but has since become a much larger framework that's focused on how you can actually

(22:20):
perform this across companies, or across very large types of deployments that involve academia, as well as just how you bring different parties together.

Daniel (22:34):
Yeah. And obviously it's called OpenFL. I'm assuming that people can find it somewhere in the open source community. And also I see there's kind of an association with the Linux Foundation, if I'm understanding correctly. Could you talk a little bit about those things, and just sort of the, I guess, the ecosystem where people can find things, but also a little

(22:57):
bit about kind of who is involved and some of how that's developed?

Patrick (23:01):
Yeah, absolutely. So OpenFL started first as an Intel closed source project, and then we open sourced it around 2020. We've since donated it to the Linux Foundation, the Data and AI subgroup of that. And the reason is that Open is in the name. We wanted this to be really a community-driven and owned

(23:22):
project.
And that's the way that we saw this gaining the most traction and success over time. So we didn't want Intel to be in the driver's seat, having complete control over what the direction of this was going to be. In order to be truly successful as an open source project, you need to be thinking about the community, addressing really those concerns, and letting them take the wheel and steer this in many cases.

(23:45):
So Intel still has a large representation on the development and roadmap for OpenFL, but we have a technical steering committee that's governed under the Linux Foundation. So I'm the chairman of that steering committee, but then we also have Flower Labs, who supports the Flower Federated Learning Framework and is also a participant on that

(24:06):
technical steering committee.
We have representatives from FATE, which is actually another competitor-collaborator of ours, Leidos, and then the University of Pennsylvania as well. Their faculty member has actually since moved over to Indiana University, but they still represent the original collaboration that we had. And they're longtime

(24:26):
collaborators of ours who continue to have a strong vision of where federated learning is most applicable for research purposes.

Daniel (24:34):
And I guess in terms of usage, sometimes that's a hard thing to gauge with an open source project, but could you talk a little bit about that? And maybe, you were just at the Flower Conference, you're engaging the community in other ways, I'm sure at other events and online. Could you maybe talk

(24:58):
a little bit about what you've seen over the past however many years in terms of actual real-world usage of federated learning and engagement in the OpenFL project, and kind of what that momentum has looked like, how you've seen that maybe shift in certain ways over time, and how you see that kind of

(25:20):
developing moving forward.

Patrick (25:23):
Yeah, absolutely. So I think that it's really picked up since about 2020. We had the world's largest healthcare federation at that time, and we published a paper in Nature Communications demonstrating the work that we had done. But it's really become evident that there's a lot of real-world federated learning that other frameworks are starting to

(25:45):
get into as well. So my involvement at the Flower Summit: my team at Intel and OpenFL, we've been collaborating with Flower Labs for the last three years or so. And we're jointly very interested in interoperability and standards for federated learning. So I think that one of the things that we both recognized early on is that

(26:08):
federated learning is pretty new compared to just deep learning as a study. And we've kind of seen that things are heading the same direction that they did with the early deep learning frameworks that were out there, where you have a proliferation of them at the very beginning, and then over time there's more

(26:29):
consolidation across those frameworks, as one ecosystem becomes more mature or they specialize in really different ways. So we've been working closely with Flower and other groups on how we can build this interoperability between our frameworks, and try to get to a point where we have a defined standard for some of those lower-level components, because

(26:49):
ultimately we're solving the same problems over and over again between our different implementations, and there's not really a need to do that. If you've done it once, and if you've done it the right way, then you should be able to leverage that core piece of functionality and then just import it into whatever library you want to.
That's really the open source ethos. It's building on top of

(27:11):
the shoulders of giants. So that's the direction that we're hoping to head. And so at the Flower Summit, we've gotten to the point now where we can actually run Flower workloads. And this is a competitor-collaborator of ours, but we can run their workloads on top of OpenFL infrastructure. And getting into the pieces where we specialize and where we do

(27:33):
have differentiation: so Flower has done a great job building a large federated learning community. They've done wonders, I think, for the scaling of federated learning and the visibility that's on it. And they have a very close research tie as well. So they're seeing, I think, the gamut of different things that people want to do for privacy-preserving AI.

(27:54):
OpenFL, because of our history in security and privacy, confidential computing, and how you really think deeply about preventing threats for federated learning and these distributed multi-party workloads, that's an area that we've been thinking through for quite a while too. And we have the benefit, being from Intel, of actually having

(28:15):
invented a lot of these technologies for confidential computing, like Software Guard Extensions. So you can run OpenFL entirely within these secure enclaves, which means that even local root users do not have visibility into what is actually happening in the application. And if you engage other services on top of that, like Intel Trust Authority, that

(28:36):
allows you to actually remotely verify that someone else is running the workload that they're supposed to. So part of the vision here, and why we're so excited to be working with Flower, is that now, as part of the Flower community, this very large community, you can run these workloads inside of these confidential compute environments on Intel hardware

(28:57):
using OpenFL. So there's kind of a chain of how all of these things flow, but that's one of the directions that we're really excited to be undertaking with the wider federated learning community that's out there.

Chris (29:09):
So Patrick, this has been really interesting for me. I'm learning a lot. And you got me thinking, I'm kind of starting to think about OpenFL in my own life, in my own world. I'm really kind of focused on agentic use cases, and out on the edge with kind of physical AI devices that are doing that. And

(29:33):
you really got me thinking about all the ways that we could apply federated learning in those environments. I'm kind of wondering, there's obviously a big wave of activity we're especially seeing, you know, in the last year or so. What is kind of the story around doing federated learning, you know, physically, not just

(29:57):
within, you know, different data centers and stuff like that where you have it, but edge devices where you're storing a ton of data in those devices, and you're running agentic, you know, operations, and you're wanting to try to apply federated learning to that environment. What's the thinking about where that's going, where it's at now, and where it might be going forward?

Patrick (30:19):
Yeah. So I mean, it's going to be a big area. And we're fully anticipating that this is something that we want to go out and support. So for agentic, you have the neural network as one of the components, and then you have the tools that are actually performing operations based on whatever information is coming from that neural network. So at a fundamental level, we can absolutely support these agentic use cases by training that

(30:43):
neural network and doing this in a privacy-preserving way. So I think one of the areas that's not necessarily that well studied yet, and I think there's more and more focus on this, is how LLMs can memorize data in a way that certain other neural networks cannot. And so that's really a hot research area. But

(31:06):
it depends on, I think, how you train these models and then ultimately how they're deployed. So if you're using privacy-enhancing technologies on top of just this architecture, where you're training at the edge already, where the data is, then you're going to get a lot more confidence that your information is not going to be somehow exposed wherever the model ultimately ends up going.

Daniel (31:28):
Yeah. And this would be, like, in terms of memorization, what you're talking about here would be like, hey, I'm training on this device, let's say it's just a bunch of people's clients, and there are communications on those clients that have personal information. In theory, an LLM could be

(31:53):
trained in a distributed way, but leak that data through the centrally aggregated model. Am I understanding that right?

Patrick (32:03):
That's exactly right. And we have customers come to us all the time and ask, how can we get assurance that my data is not leaking into the model? And the best thing that we have to deal with this, there are different types of technologies that are out there. You have differential privacy, which can apply noise in such a way that you're trying not to expose

(32:23):
anything fundamentally about your data when you share those model weights. You have other techniques like homomorphic encryption, where you're encrypting those models ahead of time before they're actually even sent for the purpose of aggregation. But really, none of them is completely foolproof. There's no free lunch, as we say. And then confidential computing, it has

(32:47):
the benefit that you can actually train in these completely constrained environments where not even the root user has access to this little protected, encrypted memory enclave. But that ultimately requires that you have hardware at the edge to go and be able to perform that type of thing. So that's really

(33:08):
where the challenge lies.
And there are other statistical measures of how you can estimate data leakage into the model. We have support in OpenFL for a tool called Privacy Meter that actually lets you train a shadow model based on the local training that you've done, and then get some kind of graph around what the percent risk is based on the local data distribution that you have and

(33:31):
that exact model topology that you've trained on. So there's, I think, increased visibility on how you can try to quantify that amount of data leakage. But there are some costs; in the case of some of these technologies, a cost in accuracy for the model overall. So it's really on a per-experiment, per-model, and

(33:52):
per-data-distribution basis that you have to tune these things. That's where there's a bit of work and recommendations that need to be made from people who have experience in this domain.
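For readers who want to see the differential-privacy idea in code, here is a hedged sketch of clipping a collaborator's update and adding Gaussian noise before it is shared. The clipping norm and noise multiplier are made-up values; a real deployment would calibrate them and track the cumulative privacy budget across rounds, which is exactly the per-experiment tuning Patrick describes.

```python
# Clip-and-noise sketch: bound the L2 norm of the whole update, then add
# Gaussian noise scaled to that bound before sending it to the aggregator.
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    rng = rng or np.random.default_rng()
    flat = np.concatenate([u.ravel() for u in update])
    scale = min(1.0, clip_norm / (np.linalg.norm(flat) + 1e-12))  # clip factor
    noisy = []
    for u in update:
        clipped = u * scale
        noise = rng.normal(0.0, noise_multiplier * clip_norm, size=u.shape)
        noisy.append(clipped + noise)
    return noisy
```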

Daniel (34:03):
And maybe this is sort of a strange question, so humor me on this one. While you were talking, I was kind of reflecting on the fact that maybe the landscape is shifting a little bit around privacy in general and AI, in the sense that,

(34:23):
for whatever reason, people seem to want to send a ton of their data to third-party AI providers now. And I think gradually people are becoming more sophisticated in that, and sort of understanding the implications around sending your data to third parties, in the sense of using third-party AI model providers from model

(34:47):
builders and not running that in their own infrastructure. But this has definitely opened up the topic of privacy to a much wider audience. Before, there was sort of this discussion around federated learning amongst data scientists, researchers, those that are trying to train models to be better and better. It

(35:10):
seems like now there's this wider discussion about privacy and AI providers, and a lot of people talking about this. And certainly we've seen people that we're engaging with, of course, to build out private AI systems of their own. But I'm wondering, from your perspective, you're kind of in the weeds or

(35:32):
in the trenches, I guess, is the best word, in terms of helping people with their actual privacy concerns. Have you seen the landscape or perception change in one way or another around kind of AI plus privacy, post the kind of ChatGPT era, if you

(35:53):
will?

Patrick (35:54):
Yeah, absolutely. So OpenFL, this is the open source project that my team directly supports, but there's another kind of division of where my responsibility lies, and that's building on top of OpenFL to really address a lot of these customer concerns. My team is actually building a service on top of OpenFL called Intel's Hyper Secure Federated AI that

(36:17):
makes it a lot easier for corporate customers to go and deploy secure federated learning. And so a lot of the people that we're talking to are really concerned about, I mean, they have these foundation models that perform really well on their local datasets, but they ultimately don't have access to the data that's being generated at the edge or by some of

(36:37):
their sub-customers that they're working with. They're not necessarily experts in federated learning ahead of time. And so we've heard from many different parties that if there was a service that could actually provide a lot of the infrastructure and recommendations for them ahead of time, to go and deploy this easily, then this is something that would make it just a lot easier for them to actually

(37:00):
perform a lot of these experiments and vet whether this is something that's going to work for them over the long term. So I talked about the use of confidential computing earlier and how that can be successful for this type of thing. That's an area that we've been trying to really specialize in and make easier for a lot of our customer base. So if you

(37:21):
have technologies like Intel SGX that are available across the extent of the parties that are participating in this federated learning experiment, that gives you some really nice properties. Not only can you remove these untrusted administrators from the threat boundary, but you can also verify that your model IP, so the model weights, but even the

(37:44):
model topology itself, is not something that is divulged to anyone that shouldn't have access to it.
So how do you protect your intellectual property? I mean, that being, of course, data, and that's really one of the main focuses of federated learning, not revealing that to prying eyes, but the model itself too. I think for a lot of our healthcare customers, they'll spend millions of

(38:06):
dollars going through FDA approval. And so having that divulged to someone represents a risk to all of the work that they've done prior to that point. So we've been hearing this from a number of customers for years, but I think there's, as you've mentioned, more visibility on it because of generative AI, and I think the doors that it unlocks for the

(38:31):
benefit of actually deploying these models in the real world.

Chris (38:34):
I'm curious, as I've learned a lot through this conversation. We've had previous federated learning conversations in the past with folks, and I think I came into it still kinda stuck a little bit on kinda distributed data being the driver of federated learning. And you mentioned earlier that, you know, it was that, but more

(38:55):
than that, it seems to me in this conversation that these concerns around privacy, which can take many different forms, you know, from protecting, you know, individual personal data to IP protection, to regulation, to whatever. Would it be fair to say that these might be the primary drivers of federated learning? Because it seems like that's

(39:18):
really where this conversation has gone over time, rather than what I was expecting, which was more just distributed, you know, and I brought up the edge thing a little while ago. I'm just wondering, do you think am I getting that? Am I on the right track in terms of getting what the drivers are these days?

Patrick (39:34):
Absolutely the right track. And when I talked earlier about the different participants and the architecture for OpenFL, where I mentioned the collaborators and the aggregator, that's really sufficient for a single experiment when everyone inherently trusts each other, or there's some central body. So the parallel here with the University of Pennsylvania and the Federated Tumor Segmentation

(39:57):
Initiative, which was this world's largest healthcare federation: everyone trusted the University of Pennsylvania that was ultimately deploying these workloads. As you scale federated learning, and you have people that you don't necessarily know that you're welcoming into the mix, you need to have some other way of establishing that trust. And so governance is really the piece that's missing from OpenFL, and

(40:20):
that's where we built on top of this with the service that we've established.
So how you can vet the models ahead of time, how you have a central platform of actually recording that different parties have agreed to the workload that is going to run on their infrastructure, and having this unmodifiable way of establishing

(40:41):
what the datasets are that you're going to be training on, who the different identities are that are actually participating in the experiment. Governance is a huge concern for a lot of the customers that we've been talking to. And if you want to have cross-competitive types of federations, where you might have two different pharma customers who have a lot of data they've generated internally, they have mutual benefit by working

(41:02):
together, training either one of their models on their competition's data, and they might have some kind of agreement that's set up for what ultimate model is generated, that they have a revenue-sharing agreement or that type of thing. Having a platform for being able to establish that type of collaboration in a competitive environment is really where we see federated learning going

(41:25):
over the long term. And we're trying to figure out a way to get there.

Daniel (41:30):
And yeah, you already were kind of going to maybe a good place to end our conversation here, which is really looking towards the future. You've been working on OpenFL and these other efforts for some time now and been engaged with the community. As you look forward, what's most

(41:51):
exciting for you in the coming years?

Patrick (41:53):
Yeah, what I think is really exciting is, I mean, the collaboration between the different parties that are out there. I think right now it is really motivating for me personally, because there's this spirit right now where everything is new and exciting for people who are deep into this field, and people want to figure out how to just push

(42:15):
everything forward. And I think generative AI has really been a catalyst for that, in terms of figuring out how we can get access to this siloed data that's out there, and how we can do it in a way that actually enables industry to take up these things. Because we don't want federated learning to

(42:35):
sit in the research world forever. We want to actually take this forward and make it one of the main methods of how you do machine learning at scale when you have these privacy concerns that are, of course, extremely common today. They're common for companies, they're common for individuals.
So opening up those silos is really one of the things that I think there's going to be a lot of benefit from. And

(42:59):
that benefit is going to come in the form of, we expect, much more accurate models over the long term, and much more capable models, because of just the increased access to data.

Daniel (43:10):
Awesome. Well, this is very exciting. I hope to have you back on the show very soon, next year, as we see some of that playing out. Appreciate your work and, you know, the team's work, the wider community's work on what you're doing. And, yeah, keep up the good work.

(43:32):
Thanks for taking the time.

Patrick (43:33):
Thank you for having me on the show, Daniel and Chris. Really appreciate it.

Jerod (43:43):
All right. That is our show for this week. If you haven't checked out our Changelog newsletter, head to changelog.com/news. There you'll find 29 reasons. Yes, 29 reasons why you should subscribe. I'll tell you reason number 17: you might actually start looking forward to Mondays. Sounds like somebody's got a case of the Mondays. 28

(44:07):
more reasons are waiting for you at changelog.com/news. Thanks again to our partners at fly.io, to Breakmaster Cylinder for the beats, and to you for listening. That is all for now, but we'll talk to you again next time.