Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Gianclaudio Malgieri (00:00):
We all, as a scientific and technological and policy community, should consider privacy-enhancing technologies bigger and shift from PETs to FETs, so fairness-enhancing technologies, which is not so difficult to reach. We just need to think a bit more critically and a bit more broadly.
(00:23):
And what are the real goals? The real goals are protecting the most impacted groups, the most marginalized and impacted groups in the digital environments.
Debra J Farber (00:37):
Hello, I am Debra J Farber. Welcome to The Shifting Privacy Left Podcast, where we talk about embedding privacy by design and default into the engineering function to prevent privacy harms to humans and to prevent dystopia. Each week, we'll bring you unique discussions with global privacy technologists and innovators working at the
(00:57):
bleeding edge of privacy research and emerging technologies, standards, business models and ecosystems.
Welcome everyone to the Shifting Privacy Left podcast. I'm your host and resident privacy guru, Debra J Farber. Today, I'm delighted to welcome my next guest, Gianclaudio Malgieri, Associate Professor of Law and Technology at Leiden
(01:21):
University; Co-Director of the Brussels Privacy Hub; editor of the Computer Law and Security Review; author of the book Vulnerability and Data Protection Law; and expert in privacy, data protection, AI regulation, law and technology, EU law and human rights. Today, we're going to be discussing his recently
(01:44):
co-authored paper on the fairness of privacy enhancing technologies. Welcome, Gianclaudio!
Gianclaudio Malgieri (01:50):
Thank you,
Debra, I'm very happy to be
here.
Debra J Farber (01:53):
Excellent.
Well, I know we're going to have a really lively conversation, but before we get into your paper, I'd love for you to tell us a little bit about some of your academic work with the Brussels Privacy Hub, and you know how you got into privacy to begin with.
Gianclaudio Malgieri (02:07):
Yeah, the Brussels Privacy Hub is a special think tank based at the Vrije Universiteit Brussel in Brussels, and the position of the hub in Brussels is really helpful for engaging with policy making, academia and all the countries in Europe that are very active in terms of academic
(02:28):
efforts for privacy technology, regulation and policy. But I'm also, as you said, working full-time in Leiden, which is the oldest Dutch university and has one of the biggest law faculties in the Netherlands, so I'm trying to exploit these links. But I'm also Italian, so I'm connecting with the
(02:50):
Mediterranean tradition on privacy and data protection, and also trying to build bridges with the US, because I'm part of the privacy law scholarship conferences in the States. The hub is trying to push on several different aspects through research, but also trying to push for important debates.
(03:10):
Just to give some examples of our activity, I would like to mention three main topics that we are exploring now. The first is impact assessments and fundamental rights, so how technologies can be considered from an impact assessment perspective. In particular, we looked at the Artificial Intelligence Act in the European Union, which was very important for the impact
(03:34):
assessment discussion. The Brussels Privacy Hub pioneered a letter signed by 160 university professors to push the legislators to add a solid fundamental rights impact assessment into the final text of the European Union law on AI, and we were successful. It's now there.
(03:55):
So this is just one of the three main things we do. A second example is vulnerability and data protection, which is also, as you said, the name of my book. We founded, within the hub, a group called Vulnera. It's a research network and dissemination platform where we
(04:15):
try to focus on vulnerabilities: different vulnerabilities in different situations and in different groups. And last but not least, part of our research and activity is about data governance and data transfer, which is due to the tradition of the Brussels Privacy Hub, which was founded by Paul De Hert and Christopher Kuner.
(04:36):
Christopher Kuner, in particular, was one of the great scholars on the topic of data transfer and data governance.
Debra J Farber (04:42):
Oh wow, I learned a lot. I had no idea about some of the things that had transpired that kind of led to the Brussels Privacy Hub, and it makes sense that you're located in Brussels to have that kind of effect on policy in the EU. You recently co-authored this article, and the title is The Unfair Side of Privacy Enhancing Technologies: Addressing the
(05:03):
Trade-offs Between PETs and Fairness. Maybe you could tell us what inspired you all to write on this topic, like fairness and PETs.
Gianclaudio Malgieri (05:13):
Yeah, sure, I think what inspired me and my two co-authors is mostly the narrative, and how to say, distorted narrative about privacy-enhancing technologies that big industrial lobbies are pushing for in the EU, US and in
(05:35):
general in the global discussion on technology regulation. We had a lot of emphasis on the importance and benefits of privacy enhancing technologies. In the last years we had important initiatives about privacy enhancing technologies and a lot of explanation on their
(05:56):
importance, a lot of marketing, a lot of advertisement on how great privacy enhancing technologies are, and we could agree to a certain extent. The problem is that the narrative is incomplete because, as maybe we will say later, fairness and privacy and data protection go much beyond anonymization and
(06:18):
pseudonymization. It's about power control, power management and also mitigation of power imbalance in the digital landscape. So it's not just about not identifying individuals. It's also about controlling and managing power imbalances with
(06:44):
big dominant platforms, and so for us, the main trigger was: privacy-enhancing technologies are important, but they are not the solution for the digital policy challenges. It's not the solution we are looking for.
Debra J Farber (07:01):
So is it that it's not part of the solution, or is it not sufficient and we need additional areas to fill those gaps?
Gianclaudio Malgieri (07:07):
Yeah, of course they can help, but there are several problems. They can help in general to reduce the amount of data and so also to comply with some of the important principles in data protection law, both in Europe and the US, for example purpose limitation and data minimization, of course.
(07:30):
But I would like to explore the two parts of your question: first, why they're not sufficient, and second, why they can also be somehow detrimental, at least in their policy impact. So for the first part, why they're not sufficient: as I tried to explain a few minutes ago, privacy and data protection
(07:53):
are about power control. I can manipulate people and I can nudge people and I can harm people online in their digital life, even if I cannot explicitly single them out, even if I cannot identify people. The problem is they're not sufficient because there's the
(08:13):
whole harm problem that is not entirely solved just by anonymization, pseudonymization, federated learning, synthetic data and so on, and the whole problem of just pushing on privacy-enhancing technologies is that we are losing and missing the main part: competition and power.
(08:38):
I would just like to explain this in a few sentences. Basically, what's happening with privacy enhancing technologies is that big companies with great computational capabilities, with huge amounts of training data sets, are the companies best placed to practice and to implement privacy enhancing
(08:59):
technologies. They will also have legal benefits from it, because if they can even anonymize their data processing, they might escape from most of the GDPR, so General Data Protection Regulation, duties. And it's a paradox that the biggest companies will be the companies that will not be accountable under the GDPR, because
(09:22):
they will be able to anonymize or pseudonymize, etc. At the same time, smaller companies that will not have the power, the computational power, the policy power, the money to develop these privacy-enhancing technologies, will be the ones that will be mostly challenged by GDPR rules, so by data
(09:42):
protection rules. In other terms, privacy enhancing technologies are not the solution because they will also create a distortion effect on the markets, where the less harmful actors, like small and medium enterprises, will be the ones that will still need to comply with the law, while the biggest players will probably be
(10:05):
partially exempted from the rules. And maybe just one final thing; we will explain it more. The mass use of privacy-enhancing technologies might also be detrimental to some of the main values and principles in data governance and data regulation, which is diversity and fairness
(10:26):
considered in a broad sense. For example, but we will explain it later, synthetic data or differential privacy tend not to consider minority groups, and this is problematic for diversity and bias detection.
Debra J Farber (10:43):
Just fascinating. I mean, you know, I've been such a champion of shifting left into the product and development life cycles, and ways to do data minimization include privacy enhancing technologies. But if you look at it as a monolith, as just one big thing that maybe takes organizations outside of being covered by regulations, then you kind of miss the forest for the
(11:04):
trees, in that maybe it can potentially be abused, or monopolistic power could be abusive, by using these technologies. So I'm really excited to dive in. If you don't mind, tell us how you and your team approached this topic in your paper, and then we'll dive into the
(11:29):
specifics.
Gianclaudio Malgieri (11:29):
Sure.
So in this paper we tried to address the topic of the unfair side of PETs from two perspectives: the legal one and the computer science one. From the legal perspective, we address mostly the concept of fairness in its evolution and development, starting from the law, so from the General Data Protection Regulation, from fair information practices, from consumer protection definitions
(11:52):
in Europe and beyond Europe. The two legal authors, so me and my great co-author Alessandra Calvi from the Vrije Universiteit Brussel, who was also the main driver behind the paper, we tried to analyze the concept of fairness and how fairness has been developed.
(12:12):
First we have fairness as diversity, so fairness as non-discrimination, which is the most accepted meaning that computer scientists seem to adopt when they mention fairness. But there's also a concept of fairness related to power imbalance and power control and imbalance mitigation, which is a
(12:36):
concept that has been growing a lot from consumer protection and now also data protection. I wrote a paper about the concept of fairness in the GDPR four years ago, and the conclusion from a linguistic analysis of fairness in many different systems of legislation, in many different legal frameworks, was
(12:57):
fairness as loyalty and fairness as equality of arms, so power control. In parallel, the technical co-author, Professor Dimitris Kotzinos from CY Cergy Paris University,
(13:18):
analyzed with us the different privacy enhancing technologies, looking at their limits also from the perspective of fairness that we tried to develop in legal terms. So it was kind of a dialogue between different disciplines, trying to understand first fairness and second, how PETs are not really fairness-friendly, let's say.
Debra J Farber (13:37):
Fascinating.
At first glance the concept of fairness seems kind of straightforward to most people, but your paper really highlights that the concept can mean different things to engineers versus sociologists, you know, with potential fairness problems that include, like you said, bias, discrimination, social injustices and market power imbalances. I know you talked a little bit about it already, but can you
(14:00):
unpack maybe each of those fairness problems and how they link to privacy?
Gianclaudio Malgieri (14:07):
The link with privacy is both in the law and in logical reasoning, as a consequence of the concept of privacy. So in the law we have fairness as one of the principles of data protection. I'm, for example, focusing mostly on European Union law, because we know that in the States, for example, we don't have a federal law on privacy and data protection.
(14:29):
But in the European Union, the fundamental rights to privacy and data protection are mentioned in Article 8 of the European Charter of Fundamental Rights, and that article refers to fairness. So there is a logical link between fairness and privacy that the legislators identified several years ago, because the
(14:49):
article I'm referring to in the Charter is from 2000, so 24 years old; the legislators already identified these links. And also the GDPR, the General Data Protection Regulation, has an explicit reference to fairness in the guiding principles of data protection.
(15:10):
declinations, differentinterpretations of fairness.
We have fairness as biasmitigation, fairness as fight
against discrimination, fairnessas equality against social
injustices, and so not justequality but equity and fairness
as market power, imbalances,mitigation.
I think all of theseinterpretations are correct and
(15:33):
they do not contradict with eachother.
They respond to the samechallenge, which is mitigating
harms that algorithms and datatechnologies can produce.
Fairness is kind of a safeguardagainst these harms and also,
(16:04):
fairness is the if you allow meto list legal concepts is mostly
an ethical concept.
It's mostly an ethical concept,sure, because fairness is not a
clearly definable concept thatlawyers can clearly define.
Indeed, yeah, as you said, youasked me to unpack it.
It's not easy to unpack it, butI can say that bias, for
example, and discrimination areinherent in data processing.
Because, because, of course,the effects of data processing
is that there might be anincomplete or non-diverse enough
(16:25):
data sets that can produce twounfair conclusions and unfair
automated decisions.
But what about social injustice?
Social injustice is aconsequence of this.
If I process data in a way thatis incomplete and doesn't take
into account minorities,marginalized groups, people at
margins, social and economicminorities, I will be processing
(16:51):
data and taking decisions thatwill be unfair, and we have a
lot of examples.
I am in the Netherlands now.
In the Netherlands, we had alot of scandals based on social
injustices based on inaccurateand unfair data processing for
public administration.
There was a scandal about childbenefits, but we don't have
(17:12):
time to address this now.
Just to say this is importantand the other part.
Just to conclude fairness, asmarket power imbalance
mitigation is also connected todata processing.
Why?
Because the big power imbalancethat we observe between
individuals and companies andbig techs in the digital
(17:33):
environment is based on the hugeamount of data that big techs
can process upon us.
I can just mention, verybriefly and simply, shoshana
Zuboff's work the Age ofSurveillance Capitalism.
Basically, what we observe nowis that capitalism is based on
(17:53):
data and surveillance andbehavioral surveillance.
Exactly, data protection is thetool to look at power imbalance
, because data is power.
Debra J Farber (18:03):
Again, so fascinating. In the United States, we talk about privacy, but we often don't talk about data protection as a whole, whereas in the EU, privacy is a piece of the data protection mandate, with privacy being an enshrined right. I think a lot of these big tech companies that you reference are run by people, and then have employees, who also are not
(18:26):
thinking in terms of data protection, thinking larger than "how do I make sure that this person has control over their own choices about how their data is used?" Right? It is really great to hear from you this reminder to think larger about societal impacts, the socio-technical understanding of fairness, and I especially wanted to also
(18:46):
mention that in the EU, the AI Act also has a requirement around fairness, which kind of leads me into the next question, where let's dive into some of the analysis of the paper. The first section was on PETs for machine learning and AI, and then, you know, how does that relate to fairness? So let's first talk about data obfuscation.
(19:07):
That would be anonymization, pseudonymization, synthetic data, differential privacy, each of which builds upon the concept of data alteration. How are they, as a group, relevant as solutions, privacy-enhancing solutions, for AI and machine learning needs? And then maybe we could go through them more specifically
(19:29):
in my next question.
Gianclaudio Malgieri (19:31):
Sure.
So I think you addressed the main point. Data obfuscation has been considered one of the most important privacy preserving practices for AI-driven technologies. You mentioned anonymization, pseudonymization, synthetic data
(19:53):
and differential privacy. They are different, but of course they react to the same challenge, which is reducing the identifiability of single users, single individuals, single data subjects in the digital environment.
(20:13):
So there is an overarching issue, which is that privacy harms are not just individual harms; they can be collective harms. Privacy harms, not just in Europe but also in wonderful scholarship in the States, have been identified as harms not just
(20:34):
to my private life, my personal life in my toilet or in my bedroom, but also to my work life, also to democracy and freedom of speech, as a connection to my informational freedom. So just to say, anonymizing, pseudonymizing, obfuscating data,
(20:57):
etc. is maybe not the solution to collective harms to privacy, because even if I cannot identify you, I can identify your group, or I can identify the best ways to target you or to limit your freedoms in connection with your digital
(21:20):
life. So even if I don't exactly know your data, your personal data, your identifiable data, I can still target you. This is something I think mostly relevant for this discussion about anonymization, synthetic data, etc.
Something else I wanted to say, and I already mentioned it
(21:41):
before, is that usually if we, for example, focus on synthetic data and differential privacy, which are very different practices: the first, synthetic data, is based on, as we can simplify, a reproduction of a data set, so it's not based on real individual data. But this synthetic data, as a lot of computer scientists have
(22:05):
already identified, tends to ignore minority groups, tends not to look at minorities and outliers, and this also holds for differential privacy. Differential privacy is something else. Differential privacy is looking at aggregated data and making analyses on the aggregation.
(22:25):
But the statistical aggregation, in order to protect privacy and to limit re-identification of single individuals in the aggregation, needs to delete outliers, needs not to consider the upper and lower outliers, so it cannot
(22:45):
consider different groups. It needs to look at the average. So this is the main problem, right? Data obfuscation tends to simplify all of humanity, or all the data sets, to an average person, and this doesn't help to mitigate biases, nor to represent society.
(23:07):
If we have to take a decision, even a democratic decision based on AI, and we cannot really know what the single groups and the different minorities and outliers in the group are, because we cannot identify them and we don't want to re-identify them, we might have problems of representation, leading mostly to collective harms of privacy and data protection.
(23:29):
I hope this answered your question. Of course it's not easy to answer in a few sentences.
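[Editor's note: the sketch below is illustrative and not from the episode. It shows, with hypothetical group sizes and a hypothetical epsilon, how differential privacy's noise can be negligible for a large majority group yet swamp the signal for a small minority group, which is the representation problem described above.]

```python
# Illustrative sketch: a differentially private count query using the Laplace
# mechanism. All numbers and group names are invented for the example.
import numpy as np

rng = np.random.default_rng(0)

def dp_count(true_count: int, epsilon: float) -> float:
    """Add Laplace noise calibrated to a counting query (sensitivity = 1)."""
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

epsilon = 0.1  # stronger privacy -> more noise (hypothetical setting)
majority, minority = 100_000, 12  # hypothetical group sizes in a dataset

print(f"majority: true={majority}, noisy={dp_count(majority, epsilon):.0f}")
print(f"minority: true={minority}, noisy={dp_count(minority, epsilon):.0f}")
# The relative error for the minority group can exceed 100%, so analyses built
# on the released statistics effectively lose sight of small groups.
```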
Debra J Farber (23:34):
Yes, no, that was really helpful. Let's go through some of those data obfuscation PETs, maybe briefly explain their intended benefit, maybe from a GDPR perspective, and then, if there's anything specific about each one of them that ties to fairness, that'd be helpful to understand the context around that. But if it's already the summation you just gave us, I
(23:55):
don't want you to repeat yourself, so just let me know. But let's start with anonymization.
Gianclaudio Malgieri (23:59):
Anonymization is, you know, a bit of an illusion. We know it's very hard to anonymize data if we still want to use the data, right? And then of course it depends on what the purpose of our data processing activity is. But in general in the GDPR, so in the European Union data protection law, it is very hard to reach the anonymization level.
(24:22):
There is a big discussion about what anonymization is, because the GDPR seems to take a risk-based approach, while the guidelines of the European Data Protection Board, which actually date back to the previous entity, the entity before the
(24:42):
Data Protection Board was founded, so the Article 29 Working Party opinion, these guidelines generally refer to anonymization as a zero-risk-of-identification approach. So basically, if there's even a minimum risk of identifying someone, it's not anonymous. Of course, it's impossible to reach that level and that
(25:03):
standard, right? Because in today's data processing environment it's very easy to identify someone based on some proxies, based on a lot of aggregated data that we can use to infer who a specific individual is. So we know there's a lot of scholarship on that.
(25:24):
Let's just say that anonymization is a theoretical concept, but not a practical one, if you agree.
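[Editor's note: a toy sketch, not from the episode, of why simply dropping direct identifiers rarely amounts to anonymization. The records and attributes are invented; the point is the linkage of quasi-identifiers with auxiliary data, the "proxies" mentioned above.]

```python
# Illustrative sketch: a linkage attack on a "de-identified" release.
import pandas as pd

# "Anonymized" release: names removed, quasi-identifiers kept for utility.
released = pd.DataFrame({
    "zip": ["2311", "2311", "1012"],
    "birth_year": [1984, 1991, 1984],
    "diagnosis": ["asthma", "diabetes", "asthma"],
})

# Public auxiliary data (e.g., a voter roll or a social-media profile).
auxiliary = pd.DataFrame({
    "name": ["Alice", "Bob"],
    "zip": ["2311", "1012"],
    "birth_year": [1984, 1984],
})

# Joining on the quasi-identifiers re-attaches names to "anonymous" records.
reidentified = released.merge(auxiliary, on=["zip", "birth_year"])
print(reidentified[["name", "diagnosis"]])
```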
Debra J Farber (25:32):
Yeah, no, in fact, it is kind of fascinating because it is one of the few techniques that's written into the GDPR and yet it is not that effective, because you could combine a bunch of data sets that can re-identify. So anonymization techniques can easily be broken these days, not all of them and not all of them easily, but it is not the panacea
(25:54):
that many in corporations thought it might be to take companies out of the regulation. What about pseudonymization? So things like tokenization, masking, generalization and other techniques.
Gianclaudio Malgieri (26:16):
Pseudonymization is much easier to reach, because pseudonymization doesn't mean that we cannot identify individuals anymore. Pseudonymization means that we protect data in a way that privacy attacks are less harmful. Why? Because the identifying information for a dataset is kept separate from the dataset itself.
(26:37):
At least this is the GDPR definition, so the European Union definition of pseudonymization. There is a legal difference and a legal implication if we have anonymization or pseudonymization. If we apply anonymization, which I said is very hard in practice, the GDPR, so the European data protection law, does not
(26:58):
apply at all, and also United States laws, like the state laws, for example Colorado, Washington, Virginia, the different laws that we have in the States, wouldn't apply, because anonymization doesn't allow us to identify people. For pseudonymization, the situation is more complex, because the GDPR applies.
(27:20):
So even if we pseudonymize data through tokenization or masking, etc., we should still comply with GDPR rules. So pseudonymization doesn't solve the compliance problem. But if pseudonymization is in place, the data controllers, so the companies that decide how and why to use data,
(27:43):
can prove that they protected the data, and this is helpful for daily compliance. So if the regulator wants to check on compliance, they can always say yes, I applied a good protection, which is pseudonymization. Of course, it depends on which kind of pseudonymization. Just to summarize: in case of
(28:06):
anonymization, we are out of the GDPR. In case of pseudonymization, we still need to apply the rules of the GDPR, but we have sorts of safeguards in place that will excuse us and will protect us from a regulator's perspective.
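[Editor's note: an illustrative sketch, not from the episode, of keyed tokenization as one common pseudonymization technique. The key, field names and record are hypothetical; the point is that the identifying link is held separately, so the data remains personal data under the GDPR.]

```python
# Illustrative sketch: deterministic tokenization of an identifier with a
# secret key that is stored separately from the dataset.
import hmac
import hashlib

SECRET_KEY = b"stored-separately-from-the-dataset"  # hypothetical key management

def pseudonymize(identifier: str) -> str:
    """Deterministic token, so records can still be linked for analytics."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()[:16]

record = {"email": "user@example.com", "purchase": 42.50}
safe_record = {"user_token": pseudonymize(record["email"]), "purchase": record["purchase"]}
print(safe_record)
# Whoever holds SECRET_KEY (or a token-to-identity table) can reverse the link,
# which is exactly why pseudonymized data stays inside the GDPR's scope.
```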
Debra J Farber (28:22):
And then what
about synthetic data?
Yeah, synthetic data.
Gianclaudio Malgieri (28:25):
Well, it really depends on the purposes of our data processing. We can say that synthetic data are a form of, let's say, data obfuscation that might be very useful if we want to train algorithms without using training data, sorry, without using personal
(28:46):
data, personally identifiable data. So synthetic data is a form, we can say, of data minimization that is very useful for, for example, reducing the legal risks and so the possible sanctions if we do data scraping.
(29:07):
So you know, most training systems for AI are based on scraping data from social media, from big databases. It's basically the download or the processing of huge amounts of publicly available data on Facebook, Instagram, Twitter,
(29:30):
Google, whatever. Synthetic data might be a solution to avoid the harms produced by scraping, but those are not harms to individuals; they are harms to business interests mostly, and also privacy harms, yeah. You know, it really depends on how we process: what is the purpose of this synthetic data?
(29:50):
I think there's no single definition of synthetic data from a legal perspective.
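[Editor's note: a small illustrative sketch, not from the episode, of how naive synthetic data generation can smooth away a minority group, as discussed earlier. The distributions, sizes and threshold are invented.]

```python
# Illustrative sketch: fit one global distribution to a skewed attribute and
# sample "synthetic" records from it; the minority mode all but disappears.
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical incomes: a large majority cluster and a small minority cluster.
majority = rng.normal(30_000, 4_000, size=9_800)
minority = rng.normal(90_000, 5_000, size=200)
real = np.concatenate([majority, minority])

# Naive generator: a single Gaussian fitted to the whole population.
synthetic = rng.normal(real.mean(), real.std(), size=10_000)

def share_above(data, threshold=70_000):
    return (data > threshold).mean()

print(f"minority share in real data:      {share_above(real):.3%}")
print(f"minority share in synthetic data: {share_above(synthetic):.3%}")
# The synthetic sample pulls everyone toward an "average person", which is the
# representation and bias-detection problem described above.
```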
Debra J Farber (29:54):
Yeah, that makes sense. It's a relatively newly designated, you know, privacy enhancing technology, so I don't think it made it into the regulation. And then the last one for that subheading would be differential privacy, and then, if you want to also link it back to fairness, that'd be helpful.
Yeah.
Gianclaudio Malgieri (30:09):
So, as I already said, differential privacy is a very problematic practice because, in a sense, it reduces a lot the risks of identification. So this is good in terms of the traditional view of privacy, right, the computer scientist view of privacy, privacy as
(30:31):
non-identification. But, as I said before, differential privacy is mostly based on aggregated analysis of data. The aggregation of data can be useful for companies because, for example, they don't need to identify individuals. Sometimes, if I just need to understand how effective my
(30:58):
marketing activity on social media was, I can just consider differential privacy aggregation. So, basically, I just analyze how my behavioral advertising was translated into some benefits or time spent online by my users. I don't really need to identify individuals for that.
(31:23):
The problem is that if differential privacy, as I already said, is considered an anonymization technique, it might exclude the full application of data protection rules, which has anti-competitive consequences in the digital market, in particular against smaller enterprises.
(31:43):
And, on the other hand, in order to reduce identifiability, differential privacy needs to cut the outliers. And so, as I was saying, differential privacy might be problematic for representation of minorities and marginalized groups. A disclaimer that I am trying to add, and I emphasize
(32:06):
now, is that all these technologies cannot be considered in silos. So we are speaking a bit transversally now, but it really depends on what the specific business application of these technologies is. So my statements might be very different if we consider one aspect or another, one application or another, one case study or another.
Debra J Farber (32:28):
That makes a lot of sense. No, definitely. And then the paper also goes into detail on encrypted data processing tools, as well as federated and distributed analytics. And you know, in the interest of time, and instead of going through each of those specifically, do you want to make any connections for the audience about those
(32:48):
privacy-enhancing technology categories and fairness and what you found in your research?
Gianclaudio Malgieri (32:52):
Sure, yeah, I mean, I think an important aspect of the paper, as you also suggest, is that we do not say that PETs should be avoided. There are some benefits in privacy-enhancing technologies. We just say that they should just be considered one of the possible safeguards in place, together with many others.
(33:14):
So for encrypted data, which is also considered, in legal terms, a form of enhanced pseudonymization, we suggest that privacy-enhancing technologies are a good safeguard. We just say that the whole fairness discussion, as I said before, in terms of bias detection, diversity,
(33:38):
representation, power mitigation, is not addressed by, for example, encryption.
Debra J Farber (33:45):
Awesome. Thank you for that. So we kind of just went through an exploration of specific groups of privacy enhancing technologies, but now I want to turn to some of the technical and regulatory solutions that address some of these PET shortcomings. Your team lists three main PET shortcomings in its research when it comes to PETs, and again, you've alluded to these, but
(34:08):
I'll restate them: bias discovery; harms to people belonging to protected groups; and individual autonomy and market imbalance. What technical and regulatory solutions do you propose to address each of these shortcomings? First, let's start with PETs and bias discovery.
Gianclaudio Malgieri (34:26):
We are not sure that we can really propose immediately applicable solutions. But of course, I think, as I said before, privacy-enhancing technologies should not be the sole safeguards in place. So for bias discovery, there's a lot that we can do. First of all, we shouldn't always look for automated solutions.
(34:50):
So I think this is important, also coming from a legal scholar like me, as a message: automation is not always the solution to automation problems. If some problems are inherent in automation, the solution might just be different, like social or business solutions, et cetera.
(35:13):
I will try to explain better. For bias discovery, for example, one of the most interesting ongoing discussions is the involvement of impacted groups in the assessment of a technology, in the assessment of harms and in the assessment of the impacts of technologies on fundamental rights. If we need to discover biases of AI, which now are also
(35:39):
problems, for example for generative AI, like hallucinations or misalignment, et cetera, we need impacted groups to stand up and to help the AI developers identify gaps and issues. Basically, what I'm saying here is that we should look at business models, not just the technologies.
(36:01):
We should look at how different business models address solutions and decisions, and how these decisions can be modified and improved, and how we can empower impacted groups. I don't think we will ever have an automated bias discovery solution, but of course there are very good bias discovery
(36:25):
solutions that might benefit from participatory approaches, from participation of impacted individuals in the impact assessment.
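[Editor's note: an illustrative sketch, not from the episode, of one simple bias-discovery check, a demographic parity gap, that could feed the kind of participatory review described above. The groups and decisions are hypothetical.]

```python
# Illustrative sketch: compare a model's positive-decision rate across groups.
from collections import defaultdict

decisions = [  # (group, model_said_yes) - hypothetical audit sample
    ("group_a", True), ("group_a", True), ("group_a", False), ("group_a", True),
    ("group_b", False), ("group_b", False), ("group_b", True), ("group_b", False),
]

totals, positives = defaultdict(int), defaultdict(int)
for group, approved in decisions:
    totals[group] += 1
    positives[group] += approved

rates = {g: positives[g] / totals[g] for g in totals}
gap = max(rates.values()) - min(rates.values())
print(rates, f"demographic parity gap = {gap:.2f}")
# A large gap does not prove discrimination by itself, but it flags where a
# participatory, qualitative review with impacted groups should dig deeper.
```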
Debra J Farber (36:36):
Fascinating.
What about people belonging to protected groups? You know, that is a shortcoming that was highlighted with PETs, that they don't appropriately address those marginalized groups or protected groups. Would you suggest a similar technical and regulatory solution as you just did with bias discovery, or is there something else?
Gianclaudio Malgieri (36:56):
Yeah, I mean, as I said before, the biggest problem about impacted groups is that they are underrepresented and they are the most impacted groups, so the groups that suffer the most adverse impacts from technology applications. So there's a problem here, which is a problem of democratic participation, but also a problem of decision-making and
(37:21):
fairness in practice. One of the solutions is indeed participation, multi-stakeholder participation. I'm just publishing now, I mean next month, a co-author and I will publish an article about stakeholder participation. The co-author is Margot Kaminski from Colorado Law School and the journal is the Yale Journal of Law and
(37:44):
Technology. We are trying to discuss how privacy governance, so data governance and AI governance, can be improved by multi-stakeholder participation, in particular for people belonging to protected groups. There's a problem, of course, and the problem is how to define these groups. Should we just rely on anti-discrimination laws defining
(38:07):
protected groups, or should we rely on something else? This is an ongoing discussion. We don't have time now to address this, but we can, of course, start from the most vocal and most visible groups impacted by technologies. I can give three or four examples: children, older people,
(38:27):
racialized communities, victims of gender-based violence, LGBTI+ communities, and I could go on. But we could start from these groups and look at how, together with privacy enhancing technologies, the diversity of these groups could be considered. So, just to be very practical, we apply privacy enhancing
(38:48):
technologies, for example, in a business model, but then we check the impact with impacted groups. So, basically, we put the privacy-enhancing technologies' effects into a bigger and broader multi-stakeholder decision-making where impacted groups' representatives can express their views.
Debra J Farber (39:08):
That's awesome. I really look forward to reading that paper when it comes out, in addition to the paper we're discussing today, which I will include in our show notes. I will also update the show notes to include a link to your future paper once you publish it. The last but not least area where there's a shortcoming would be individual autonomy and market imbalances.
(39:31):
Talk to us a little bit about what potential solutions to this shortcoming would be.
Gianclaudio Malgieri (39:37):
Yeah, of course we cannot discuss, as I said before, privacy-enhancing technologies in general. We should always look at how single privacy-enhancing technology practices are affecting some of the fairness components in practice. But what I might say is that market imbalance should be
(39:59):
regulated not just for privacy in the narrow sense, but we should consider a lot of different obligations that can reduce market dominance. I will give a simple example. In the European Union, two years ago, the Digital Markets Act was approved. The Digital Markets Act is an important power rebalancing tool,
(40:23):
imposing a lot of duties in terms of competition law and fair access to data, and also consent to data processing. So, just referring also to the individual autonomy that you mentioned and the DMA: the Digital Markets Act is an important tool that complements privacy and data protection.
(40:45):
Just to say, privacy-enhancing technologies are a great tool that should be complemented by specific rules in terms of market control. This is clear, for example, in reducing abusive practices that can happen when big techs manipulate individuals or
(41:07):
exploit dependencies, because this is another problem. I didn't mention that term so far, but dependency is the problem that we really want to address. We depend on social media, we depend on big techs, we depend on social giants, and this dependency is the real power imbalance problem.
(41:27):
So the states should take a position against these dependencies, either imposing rules and fundamental rights enforcement duties on big techs or prohibiting some abusive practices.
Debra J Farber (41:44):
It's a lot to think about. I'm not sure there's the political will to make it happen, but we'll see if we can get a federal law that embodies all of that. What were some of your team's conclusions at the end of writing this, and where might there be some areas where you might want to do some more research, or where more research is needed?
Gianclaudio Malgieri (42:02):
Sure, I think a lot of research is still needed. Just to give some examples, we couldn't go deeper on each single privacy-enhancing technology in practice, and also we should look at how, for example, generative AI is altering the discussion. So our paper didn't consider generative AI challenges, but of
(42:25):
course this is perhaps chapter two of our activity. How can privacy-enhancing technologies help or not help with hallucinations and misalignment of generative AI systems, where fairness is a big problem? Because we know that hallucination and misalignment can produce discrimination in
(42:50):
generative AI. For example, chatbots or image search engines can produce stereotypes, can induce harms. So of course, these are some of the areas that we need to investigate in the future, and it's just part of the problem.
Debra J Farber (43:08):
I think that really sums up a lot of what's needed. In fact, I'll be on the lookout for some working groups or standards or just more research coming out on the topic. One of the things I've been thinking about either doing myself, or am kind of surprised I haven't seen much of out on the market, is a listing of all of the privacy enhancing technologies based on different use cases, but also
(43:29):
based on what is the privacy guarantee that the organization wants to ensure by using a PET, and then working backwards to see which PET or set of PETs would get that job done. But this conversation has really made me think:
(43:49):
we need to think broader than just "can we do the thing? Can we achieve this end goal?" and instead broaden it to also include "are we being fair to the individual and to, like, society generally, the group of individuals?" So really a lot to think about. Thank you so much for your time today. Are there any words of wisdom that you'd like to leave the
(44:11):
audience with before we close today?
Gianclaudio Malgieri (44:13):
I think we all, as a scientific and technological and policy community, should consider privacy-enhancing technologies bigger and shift from PETs to FETs. So fairness-enhancing technologies, which is not so
(44:34):
difficult to reach; we just need to think a bit more critically and a bit more broadly. And what are the real goals? The real goals are protecting the most impacted groups, the most marginalized and impacted groups in the digital environments.
Debra J Farber (44:52):
What a great idea, and really elevating it beyond just privacy to meet fairness. So you'll meet a lot of goals there, right? Especially if you apply it to AI. Well, thank you so much, Gianclaudio. Thank you for joining us today on the Shifting Privacy Left podcast. Until next Tuesday, everyone, when we'll be back with engaging
(45:12):
content and another great guest. Thank you so much, bye-bye.
Thanks for joining us this week on Shifting Privacy Left. Make sure to visit our website, shiftingprivacyleft.com, where you can subscribe to updates so you'll never miss a show. While you're at it, if you found this episode valuable, go ahead and share it
(45:34):
with a friend. And if you're an engineer who cares passionately about privacy, check out Privado, the developer-friendly privacy platform and sponsor of this show. To learn more, go to privado.ai. Be sure to tune in next Tuesday for a new episode. Bye for now.