Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
[UPBEAT MUSIC]
(00:03):
Welcome to Data Nation.
I'm Munther Dahleh, and I'm the director of MIT's Institute
for Data, Systems, and Society.
Today on Data Nation, Liberty and Scott
are looking into the truth behind data privacy.
[UPBEAT MUSIC]
Your rights are being violated, and most people
(00:27):
don't even know it.
Your data is used by governments, corporations,
charities, and many others to manipulate, convince, sway,
and sell you in every way imaginable.
The newest, biggest, baddest guy on the block,
the most influential industry today, is data.
(00:47):
So it raises the question of what kind of toll
does that take on us as a society?
Data in itself isn't bad.
But sometimes, even the best intentions cause poor outcomes.
And in some cases, people can intentionally
use data to cause damage.
So racial discrimination takes place
not just in the real world, but online, too.
Facebook's advertising tools allowed advertisers
(01:09):
to discriminate based on race and exclude users
from seeing their ads.
Looking back at October 2016, we saw
that a group of ProPublica journalists
wanted to expose this, so they bought a housing-related ad
and intentionally excluded certain users from seeing
this ad, including Hispanics, Asian-Americans,
and African-Americans.
It was as simple as clicking a box.
So the journalists, all they had to do
(01:31):
was choose which races they wanted to see the ad,
showing a preference for certain groups and excluding others.
A very clear violation of the Fair Housing Act.
And Facebook pledged to fix it.
The data economy is a vast world of people buying and selling
data.
There's tons of moving parts, and a lot
of things that are completely unknown about it,
(01:52):
sort of in this dark, shady world.
And inevitably, that means a lot of people
are really worried about it.
I think the real question is, Should we be worried about it?
What exactly are the problems when it comes
to how data is being used?
If there's anyone who knows, it's Kevin Roose.
Kevin is an award-winning technology columnist for The New
York Times.
(02:13):
Before joining The Times, he was a writer at New York Magazine,
and a host and executive producer
of Real Future, a documentary TV series about technology
and innovation.
So when it comes to data, our data
is used differently by corporations, governments,
and individuals.
But how is it actually being used?
And is it a net positive when it's being used?
Do the positive cases outweigh the negative ones?
(02:35):
I think it's really important that consumers understand
how their data is being used and what is being collected.
And I also think it's important that on the regulatory side,
we have something like a national privacy bill
to prevent the worst exploitation.
I think you really have to break it down into constituent parts.
An algorithm that is designed to show toaster ads to people
(02:58):
is, I would argue, substantially different than one
that is being used to hijack people's attention,
to show them things that are likely to produce outrage
and mistrust.
The same machine learning techniques are used for both.
They're used to keep you glued to your TikTok feed,
they're also used to sell you things,
(03:20):
and so I think we really need to be specific about what problems
we're trying to solve, and then tailor our intended solutions
to those specific problems.
Scott and I were just talking about that ProPublica
investigation where they exposed how Facebook was
doing ad targeting with housing ads.
(03:41):
And those housing ads were targeted
at people based upon race and violated the Fair Housing Act.
Do you see this in everyday life in many other cases?
Have you seen this kind of very specifically targeted
advertising?
So if you're advertising, say, menstrual products,
(04:01):
you don't necessarily want to pay to show those to men.
You might want to just say, I'm going
to target this product at women between the ages of 18 and 35,
or something like that.
And Facebook will let you do that,
and it will let you do that not just for gender or age.
(04:23):
But until somewhat recently, it would
let you do that by ethnic group, ethnic affinity group.
So Facebook got into some hot water
when it turned out that people were, in fact, advertising jobs,
houses, and various other kinds of categories
for which federal regulation prohibits racial segmenting
(04:47):
using their advertising tools.
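Editor's note: to make the mechanics concrete, here is a minimal Python sketch of the kind of check an ad platform could apply to regulated ad categories. The function and field names are hypothetical illustrations, not Facebook's actual advertising API.

REGULATED_AD_TYPES = {"housing", "employment", "credit"}
PROTECTED_ATTRIBUTES = {"race", "ethnic_affinity", "religion", "national_origin"}

def disallowed_targeting(ad_type, targeting):
    """Return the targeting keys that are off-limits for this ad type."""
    if ad_type not in REGULATED_AD_TYPES:
        return []
    return [key for key in targeting if key in PROTECTED_ATTRIBUTES]

# A housing ad that excludes users by ethnic affinity, as in the
# ProPublica experiment, gets flagged before it can run.
print(disallowed_targeting(
    "housing",
    {"age_range": (18, 65), "ethnic_affinity": {"exclude": ["Hispanic"]}},
))  # ['ethnic_affinity']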
I'm pausing this interview to make a quick editor's
note for our listeners.
We recorded this interview with Kevin
before the Supreme Court overturned Roe v. Wade.
And this next question that I asked
Kevin is about how people's location
data can be used against them.
And I actually use the example of location data
(05:09):
being used to specifically target women seeking abortions.
At the time of the recording, this question
was an absolute, speculative worst-case scenario
that I couldn't even really imagine happening.
But now, it's the unfortunate reality
for many women, especially women living
in states where, after the overturn, trigger
(05:30):
laws automatically went into effect
and criminalized getting an abortion.
So as we continue this conversation on data privacy
and how data can be used against you,
keep in mind that this is now a very real
and present threat.
It seems like some of this stuff has been done unknowingly.
(05:50):
The algorithm targeting-- people have done this
without meaning to do anything bad,
including some of the bias that comes
into algorithms and the data.
But what about when things are intended to be bad?
So I remember there was a New York Times investigation,
and it looked at how your location data was not actually
private: they tracked someone and could figure out
(06:12):
who this woman was based upon her location data,
even when it was supposed to be completely anonymous.
So do you have the potential for something
to happen that's truly meant to be bad?
For example, trying to use cell phone
location data to figure out who goes to abortion clinics.
And then selling the data of who goes to abortion clinics,
(06:33):
and then anti-abortion protesters can go
stand outside Mary Smith's house because they know
she went to an abortion clinic.
Do you have that risk with this type of data?
Absolutely.
So I think that's why it's important
that even though the intentions of people in the tech
industry who are building these products
may be benign, if the outcomes are malign, if the outcomes are
(06:57):
bad, I believe that we still need to hold
them responsible for that.
I mean, just having good intentions doesn't matter
if you're building some surveillance dragnet that's
going to be used to arrest and prosecute women
seeking abortions.
And I think once you have some data showing that is happening
(07:17):
or might reasonably happen, then you have a moral imperative
to shut that down.
[MUSIC PLAYING]
So the problems with the data economy are definitely there.
It seems that we all have to live with the consequences.
Right.
And while most of the time we don't feel
burdened by being targeted by an algorithm,
(07:39):
there are times that it can become a psychological stressor
for some people.
There are many women who believe that Instagram has preyed
on their maternal instincts.
One example is a woman who, when her son was born last year,
posted a photo of him every day on Instagram.
And immediately following that, her Explore page
(08:00):
was populated with images of babies
with severe health conditions, cleft lip and palate, missing
limbs.
And she believes that Instagram intentionally
preyed on her vulnerability as a new parent.
Other people share the same feeling
that Instagram preys on your insecurities.
Nicole Gill, the co-founder of the advocacy group Accountable
(08:22):
Tech, believes that Instagram was
damaging to her mental health in the postpartum period,
constantly showing her posts of how to lose baby weight.
I think the real question is, How did they
actually get this data?
Is it bought?
Is it given out by Instagram?
Where did it come from in the first place?
And people claim their phones are listening to them.
(08:45):
Is that really happening?
And the person who knows the ins and outs of the data economy
is Dean Eckles.
Dean is a professor at MIT affiliated with both the Sloan
School and MIT IDSS, and was previously a scientist
at Facebook and Nokia.
I want to start with what I believe
is the first question on people's minds, and that is,
(09:06):
Are our phones, our technology really listening to us
to collect data and give targeted ads?
If I pick up my significant other's phone and say, puppy,
get a puppy, cute puppy, buy a puppy,
is he going to magically get some puppy ads?
Do our phones really have the capability of doing that?
(09:27):
I think just given how leaky Facebook has been recently,
I think if that was true, we would have heard about it
from internal information.
And then second, it's just technically difficult.
One of the things that's nice about a mobile phone
is we carry it around with us.
That means that it's battery-powered.
So if you start using the mic all the time,
your phone is going to die pretty quickly.
So I think for apps that a lot of people are using,
(09:50):
it's not really technically feasible.
Now, on the other hand, if you've
got Google Home, or Alexa, or something similar
that's just plugged into the wall
and it's potentially listening, there's
really no technical reason that that couldn't have
that kind of a process running.
I think a lot of people want to know how their data is actually
(10:12):
being acquired.
What data are algorithms taking from the everyday user?
And how is it being bought and sold?
Some of the way you frame that reflects how a lot of people
think about this: this idea that social media companies
are actually taking data.
I'd say one of the ways of describing
this is actually that everyone is giving them data for free.
(10:33):
And by everyone, I don't just mean users,
I actually mean other companies.
So part of the way that these advertising giants know things
about individual people is not that they're
trying to buy this data or collect this data
in some particular way.
It's that actually, advertisers are just constantly giving them
that data for free, and maybe in a really superficial way,
(10:54):
like you looked at some shoes on a retailer's website,
and then those shoes are following you around.
So that's a case where it's really
just that advertisers are constantly uploading data about people
and their behaviors online and offline
to these advertising platforms.
So a lot of what's happening is that the data is flowing
into the platforms for free.
(11:15):
Facebook isn't really buying your data or selling your data.
Everyone is giving it to them for free
in order to be able to potentially run
more targeted ads, ads that are targeted based on the data
that those advertisers have.
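Editor's note: as a rough illustration of that flow, here is a small Python sketch of the kind of browsing event an advertiser's site might send to an ad platform. The field names are invented for this example and do not reflect any platform's real tracking-pixel format.

import json
import time

def build_view_event(browser_id, product_id, price):
    """Package a product-page view the way an advertiser might report it."""
    event = {
        "event": "ViewContent",      # shopper looked at a product page
        "timestamp": int(time.time()),
        "browser_id": browser_id,    # identifier set by the platform's pixel
        "product_id": product_id,
        "value": price,
    }
    return json.dumps(event)

# The platform can later match browser_id to an account and show
# "those shoes" in that person's feed: data handed over for free.
print(build_view_event("cookie-abc123", "sku-shoe-42", 89.99))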
So I like to point this out a lot.
30 years ago, we had The White Pages,
where once a quarter, a magical book would
(11:35):
be dropped on your doorstep, and it's a list of people's names,
addresses, and phone numbers.
Today, the amount of data the industry has is astounding.
And it looks like, as time goes on, more data is available.
So it's more than just knowing someone's name and address
or that someone is college-educated or not.
We can get things like purchase habits, credit scores,
(11:56):
and they're accessible to basically the average person who
wants to spend a little bit of money to get that data.
So is this prevalence of data going
to be normal in the next 20 years?
Or will there be a backlash where
industry pulls back and anonymizes a lot of this data?
Yeah, I think that's a great question.
That puts it in the broader historical context.
And of course, just because something's
been happening for a while doesn't
(12:17):
mean that we shouldn't be worried
about some of the privacy implications of that.
So I think while there's reason for scrutiny
of these big advertising giants like Google, and Facebook,
and Amazon, a lot of the scrutiny
could also be more generally on this space of data brokers.
So there's a lot of ways that data is flowing around, often
involving companies whose names you haven't necessarily
(12:38):
heard.
This is happening a little bit more behind the scenes.
And so that's actually an area where
I'd like to see more of the regulatory scrutiny focused.
Maybe it's not as exciting for congresspeople
to be yelling at executives of these companies
that no one has heard of as opposed to yelling at a Google
exec.
(12:58):
But where I think a lot of the scrutiny
should be, from a regulatory perspective, is these cases where
there's not really that much of a direct relationship
between the consumer and whoever ended up with this data.
The data is flowing around in pretty opaque ways.
With all this data flowing around, a lot of people
are worried that their location data isn't anonymous and can
(13:20):
easily be tracked.
There was a New York Times article about that.
So the government is already using people's location data
as evidence against them in crimes.
But do you think that the government could take on this
"Big Brother," 1984, really scary type of approach in tracking
(13:41):
everybody?
Yeah, I think location data is definitely
an example of sensitive data, and data
where someone can say, oh, this is somehow aggregated,
or this is somehow anonymized, or this
is going to be linked based on this other identifier
that it's hard to link to a person.
We know that if data is high dimensional, that if there's
a lot of individual bits of data, then everyone is unique.
(14:04):
So it's like a fingerprint, you'd say.
Statisticians would say that high-dimensional space
is a lonely place.
If there are a lot of characteristics describing
an individual, then they're probably
not really near any other individuals in that space.
And it may be easy, with other side data, other data
that you have from another source,
to figure out who that individual is,
(14:25):
even if the data is in some way ostensibly anonymous.
So I think that is a concern.
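Editor's note: a tiny Python sketch, using made-up records, shows why re-identification gets easier as dimensions are added. Count how many people share each combination of attributes, and the share of unique records climbs toward everyone.

from collections import Counter

# Toy records, invented purely for illustration.
records = [
    {"zip": "02139", "age": 34, "sex": "F", "car": "EV"},
    {"zip": "02139", "age": 34, "sex": "F", "car": "SUV"},
    {"zip": "02139", "age": 29, "sex": "M", "car": "EV"},
    {"zip": "02142", "age": 34, "sex": "F", "car": "EV"},
]

def share_unique(rows, keys):
    """Fraction of rows whose combination of `keys` appears exactly once."""
    combos = Counter(tuple(row[k] for k in keys) for row in rows)
    return sum(1 for n in combos.values() if n == 1) / len(rows)

print(share_unique(records, ["zip"]))                       # 0.25
print(share_unique(records, ["zip", "age", "sex", "car"]))  # 1.0: everyone is unique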
And one of the really exciting areas
of theory moving into practice right now
is, What are some of the techniques
to guarantee that data that is released or transferred,
or statistics that are reported about data, don't
reveal too much about individuals,
(14:47):
that those are privacy-preserving?
So I'm thinking of techniques like differential privacy, which
is being used by the US Census Bureau
as well as Apple, and Google, and Facebook.
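Editor's note: for listeners curious what that looks like in code, here is a minimal Python sketch of the Laplace mechanism, a basic building block of differential privacy. The epsilon and count values are illustrative, not the settings any of those organizations actually use.

import random

def dp_count(true_count, epsilon, sensitivity=1.0):
    """Report a count with Laplace noise scaled to sensitivity/epsilon."""
    scale = sensitivity / epsilon
    # A Laplace draw is the difference of two exponential draws with mean `scale`.
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return true_count + noise

# e.g. a published count of people matching some query; the noisy answer
# changes run to run, masking any single person's presence or absence.
print(dp_count(true_count=128, epsilon=0.5))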
[MUSIC PLAYING]
So we know that data can be easily accessed, given away,
and bought.
And data collection isn't new.
But like Dean said, just because something can be done
(15:08):
doesn't mean it should be done.
So how much of our personal data is actually private?
What is the government allowed to access?
I mean, I'm sure there's lots of data floating around that
could be used in serious crimes.
But what's the balance here?
That's a good question because Dean
mentioned that it's likely our phones aren't listening to us.
Doesn't mean they can't, but they likely aren't.
(15:29):
But the Amazon Alexa listens and it records.
But I guess the question is, Is that always a bad thing?
In the case of the Victor Collins murder trial,
Alexa really could be an expert witness.
If you remember, the body of Victor Collins
was found floating in a hot tub at his friend James Andrew
(15:50):
Bates' home.
And Bates was later charged with his murder.
After an Echo device was found on the Bates property,
prosecutors in the murder trial requested
that Amazon provide all the recordings and the data
from the device.
But Amazon refused to comply.
And there's a similar story taking place
in the case of Sylvia Galva, a Florida woman who was murdered.
(16:13):
Adam Crespo, her boyfriend, was charged with the murder.
And Crespo's attorney believes that recordings
from their home Amazon Echo could have witnessed the crime
and could provide essential evidence to exonerate his client.
So it raises the question, can, and more importantly,
should data from these recording devices be used for good?
(16:35):
How do you balance the privacy of the individual
with the power of Alexa's data?
And Kevin Roose is very familiar with the privacy
side of the data economy.
We always blame the social media companies for everything.
But on the same side, we've all heard about cell phone data
in terms of catching somebody in a crime.
(16:55):
And it's actually the big tech companies
that are protecting us, or maybe they're not.
Maybe you're going to tell me they're not.
They're protecting us against the "Big Brother" government.
So is the only thing protecting us,
in some capacity, from this police state
really the big tech companies?
I don't know that they're the only thing protecting us.
Things like privacy laws could also protect us.
(17:16):
But I do think that the platforms thus far,
the big ones, have been relatively good at turning down
frivolous or unwarranted data requests from governments,
both in the US and abroad.
There are lots of countries abroad where
an authoritarian regime might want to get its hands on, say,
(17:38):
a list of users who had liked LGBT content
or something like that.
And so that's something that the platforms have,
to my understanding, been reasonably strong in turning
down those requests, in making governments get warrants,
and things like that before they are able to access user data.
(17:58):
I don't think they're doing that out of sheer altruism.
I think they also have a vested interest in not just handing
over all their data to whatever government
decides to ask for it.
But I do think that it has been largely up to the tech companies
to fight this surveillance, this quest for more control of data.
I mean, the example of that is Apple's long, protracted battle
(18:23):
with the FBI over access to iPhones in the cases of murder
suspects and things like that.
And they have actually fought fairly
hard over the years for their users' right to privacy.
Kevin, should people be doing anything
to protect their privacy and data?
And if so, what advice do you have for them?
(18:46):
I generally am not all that worried about the privacy of my,
say, individual transactions on the internet.
Like I don't think targeted ads are all that appealing.
But I also get a lot of untargeted ads,
and they're not so great either.
And I don't get creeped out when something's like,
oh, you bought a toaster, so maybe you want some toast.
(19:09):
That just makes sense to me.
I do think that most people have what's called privacy
through obscurity.
Like no one really cares about 99.99%
of the data that's out there.
But if you become interesting for some reason,
and if that's to a government, if you're a dissident,
or if you experience a moment of viral fame,
(19:32):
if you suddenly come under investigation for something,
there is just a wealth of data out there.
So I think most people should be conscious of their data
footprint and know, for example, the basics of how
online advertising works and maybe
take some steps to preserve your privacy.
I turn off all the app tracking stuff on my iPhone.
(19:54):
I use encrypted messaging services.
Those are reasonable steps.
But even so, I'm not under any illusion that I live--
if someone really wanted to find out what brand of toaster I own,
or maybe even what my phone number is,
it's not that hard to do.
So, Kevin, I've heard that you have
a really great story about privacy
and how private our data actually is.
(20:16):
I will say that a few years ago--
for a story--
I was curious about hacking and what my own data
trail was like out there.
And so I invited a group of world-class hackers to hack me.
I just said, spend the next two weeks hacking me
in any way you want.
Figure out whatever you want.
(20:37):
No limits, and then I'll do a story about it.
And so they went to town, and the stuff
that actually gave away my privacy was not what I thought.
I was sure that they were going to find some shadowy data broker
and find a file on the dark net that had my address in it.
But really, it was that I had, like, posted a photo of my dog
(20:58):
on social media.
And if you zoomed in really close on the dog photo,
you could see his tag.
And on that tag was my address.
[LAUGHTER]
I love that.
It was actually not the thing
I thought was risky that ended up
giving away my address.
And I think that's probably true for a lot of people.
Like your biggest exposure is not where you think it is.
(21:21):
[MUSIC PLAYING]
The data economy can be very useful.
I love it when Amazon reminds me to buy
dog food because otherwise, Henry wouldn't
have dog food that week.
But it has a dangerous and really insidious side,
and our government is not stepping in.
(21:42):
So even when current government regulations
hold tech companies accountable, it really
doesn't affect them all that much.
Google was fined $170 million for violating children's privacy
on YouTube.
They were accused of illegally collecting data
from children who watch child-related videos
and selling it to companies for targeted advertisements.
So $170 million sounds like a lot,
(22:03):
but it amounts to only two days of Google's post-tax earnings.
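Editor's note: a quick back-of-the-envelope check in Python, assuming roughly $34 billion in annual net income for Google's parent company around the time of the fine (an assumed figure used only for illustration), bears out that comparison.

fine = 170e6                      # the YouTube children's-privacy fine
annual_net_income = 34e9          # assumed rough annual net income at the time
daily_earnings = annual_net_income / 365
print(fine / daily_earnings)      # about 1.8 days of post-tax earnings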
It's not just the big tech companies
that we should be worried about.
It's really the sneaky little apps
where we don't know who's behind them or what they're
doing with our information.
I recently saw an ad for palm reading on Instagram.
Shows you what I'm looking at.
(22:24):
It instructed me to get a close-up picture of my entire hand,
including my fingers, so that I could
get this detailed, intricate palm reading for free.
But there's no such thing as a free palm reading.
I'm actually giving up my Instagram username,
which is my full name, and my fingerprints.
The terms and conditions of this app
(22:45):
stated that they could keep and use them in perpetuity.
So there are countless other examples of your personal data
being used for worrisome purposes.
But here's the crux of the matter:
an unimaginable number of laws and rules
have been put in place to protect your rights as a citizen
of the United States.
When it comes to protecting your data, there are virtually none.
So what is it we can do, Liberty?
(23:06):
I think it's clear from what Dean and Kevin said
that it's really time for our government
to start regulating it.
We as citizens need to urge lawmakers to make
laws to regulate big tech.
And I think in the meantime, you should just
take a scroll through your phone and delete some
of those sneaky little apps.
(23:27):
It might be time for me to delete that random vote
tracking app I downloaded.
[MUSIC PLAYING]
Thanks so much for listening to this episode of Data Nation.
This podcast is brought to you by MIT's Institute
for Data, Systems, and Society.
And if you want to learn more about what IDSS does,
please follow us at MIT IDSS on Twitter,
(23:49):
or visit our website at idss.mit.edu.
[MUSIC PLAYING]