Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
Welcome to the deep dive. We are jumping straight into
the digital arena today exploring the phenomenal, almost unbelievable growth
of social networking and well the constant sophisticated security battle
that's raging just below the surface.
Speaker 2 (00:15):
It's a battle that's absolutely necessary because of the sheer
scale we're talking about. I mean, by June twenty twenty,
research showed the Internet had ballooned to over four point
eight billion users.
Speaker 1 (00:25):
Four point eight billion. That's what, sixty two percent of
the entire global population, suddenly connected. Exactly.
Speaker 2 (00:31):
And if you think about social networking as the central
way modern humans communicate, you start to see why securing
it is maybe the highest priority challenge we have right now.
Speaker 1 (00:42):
Right, and that massive expansion, it didn't just happen smoothly,
and it definitely came with a security cost. So our
mission for you today is pretty clear. We're going to
distill the history of this explosion, pinpoint the biggest threats,
everything from psychological tolls to high tech cybercrime.
Speaker 2 (00:57):
And then walk through the really cutting edge technical countermeasures
people are deploying. It's this constant arms race.
Speaker 1 (01:03):
Yeah, we've got sources covering the psychology, the politics, the
really technical stuff. It should be a fascinating synthesis.
Speaker 2 (01:10):
It really is. So where do we start the origins?
Speaker 1 (01:13):
Let's do it, because if you think the story starts
with, I don't know, the like button, you're missing a
couple of decades. The foundations were actually laid way back
in nineteen ninety seven. You had six degrees dot com, right,
six degrees.
Speaker 2 (01:26):
They actually got up to three point five million users,
which was huge.
Speaker 1 (01:30):
Then, maybe more memorably for some folks, AOL Instant Messenger, AIM.
Speaker 2 (01:35):
Oh yeah, AIM. That was really the precursor, wasn't it?
It brought in things we take for granted now like
real time chat, persistent friend lists.
Speaker 1 (01:43):
That was the blueprint. Basically, those early features paved the
way for everything that came later. That mid two thousands explosion.
Speaker 2 (01:50):
Definitely, and two thousand and two is a really pivotal
year in that. What happened then? Well, you got Friendster,
which was one of those early kind of original networks,
maybe leaned a bit into dating. I remember Friendster.
But maybe even more impactful long term was LinkedIn, also
founded in two thousand and two.
Speaker 1 (02:05):
Oh. LinkedIn, Okay, that's different.
Speaker 2 (02:06):
Totally different. It's the perfect example of a niche platform
that just completely redefined a whole area of connection. It's
all about professional networking, and now over seven hundred million users.
It fundamentally changed modern recruitment. I mean, companies use it
constantly to find people, screen candidates.
Speaker 1 (02:25):
Yeah. Absolutely, it's indispensable for many jobs now. But okay,
scale brings problems. As these platforms got bigger and bigger,
that initial dark side started to show up.
Speaker 2 (02:37):
Right. At first, it was maybe just, you know, data
mining concerns, but then it quickly escalated to what, organized
phishing attempts, botnet attacks starting to use these platforms, malware spreading.
Speaker 1 (02:47):
Like wildfire, and the problems didn't stay purely technical, did they.
They spilled over into the social realm, sometimes in really
severe ways.
Speaker 2 (02:55):
Yeah. The sources talk about this concept called digital dramatization,
which sounds a bit academic, but it points to the
really serious, sometimes unintended consequences of broadcasting your life in real time.
Speaker 1 (03:06):
Well, what kind of consequences?
Speaker 2 (03:08):
Well, it covers things like cyberbullying, online vengeance, but tragically,
it also includes things like suicides, even murders being broadcast
live over platforms like Facebook Live.
Speaker 1 (03:19):
That's yeah, that's incredibly chilling, just a horrific side effect
of that instant connectivity. And speaking of negative consequences, let's
pivot slightly to the psychological toll, because the research on
why people use these networks is fascinating.
Speaker 2 (03:34):
It really is. There's a paradox, right? We call
Speaker 1 (03:35):
Them social networks, but most people aren't primarily using them
to socialize exactly.
Speaker 2 (03:41):
Studies show the majority use them mainly to consume information:
scrolling, reading, watching.
Speaker 1 (03:46):
And what does that passive consumption do to people?
Speaker 2 (03:49):
Well, that seems to be the key disconnect. Researchers found
it often leaves users feeling, I guess, unfulfilled and unsatisfied.
You're constantly bombarded with everyone else's highlight reels of perfect vacations,
the amazing relationships.
Speaker 1 (04:02):
Right, the curated perfection.
Speaker 2 (04:04):
And it inevitably triggers envy, sometimes depression, and definitely that fear
of missing out. FOMO.
Speaker 1 (04:10):
Yeah, FOMO is real. The sources even suggest that social
networking can act almost like a new drug.
Speaker 2 (04:17):
It triggers a similar dopamine response in the brain, kind
of like addiction, and the research points out this compulsive
use can lead people to disengage from developing real world skills.
Speaker 1 (04:27):
Which can contribute to bigger societal issues like unemployment, because
you're spending so much time consuming instead of I don't know,
learning or doing.
Speaker 2 (04:35):
That seems to be the argument. It's presented as a
kind of fundamental crisis of attention.
Speaker 1 (04:39):
Okay, so if that passive data consumption creates a personal crisis,
the collection of all that data creates a potential political one. Right,
we really have to talk about Cambridge Analytica here. That
feels like the moment data exploitation got undeniably political.
Speaker 2 (04:56):
Absolutely, a landmark case. It wasn't just a simple data theft,
which I think is a common misconception.
Speaker 1 (05:01):
So how did it work then? What was the mechanism?
Speaker 2 (05:03):
It actually started with academic research. Back in twenty thirteen,
researchers at Cambridge University showed you could predict detailed psychographic
profiles pretty accurately just by analyzing someone's social media activity,
like their likes.
Speaker 1 (05:16):
Okay, so the potential was known.
Speaker 2 (05:19):
Yes, and then a researcher named Aleksandr Kogan weaponized that knowledge.
He created one of those personality quizzes on Facebook.
Speaker 1 (05:27):
Yeah, remember those? Oh yeah, which Disney character are you,
that kind of thing?
Speaker 2 (05:30):
Pretty much. But the real trick, the sort of malicious
genius of it, wasn't just getting the data from the
people who took the quiz.
Speaker 1 (05:37):
Ah right, there was more to it.
Speaker 2 (05:40):
Way more. Taking the quiz gave Kogan access not only
to that user's personal data, but also the data of
all their friends on the platform.
Speaker 1 (05:48):
Without the friends even taking the quiz?
Speaker 2 (05:50):
Without them knowing or consenting at all. And Cambridge Analytica
used that massive pool of data yours and your friends
to build these incredibly detailed profiles. We're talking over five
thousand data points on something like two hundred and thirty
million US adults.
Speaker 1 (06:04):
Wow, all for political ad targeting and manipulation. Exactly.
Speaker 2 (06:08):
It just starkly revealed the immense power of this kind
of surveillance capitalism.
Speaker 1 (06:13):
Okay. So if we take that idea of granular surveillance,
detailed profiles, and scale it up to a national level,
the sources point towards what they call the ultimate social threat.
Speaker 2 (06:22):
You mean the Chinese government's concept of a citizen's score.
Speaker 1 (06:26):
Yeah, explain that a bit.
Speaker 2 (06:28):
Well, it involves using technology like facial recognition combined with
analysis of online behavior, social media activity, financial transactions, all
sorts of data. To do what? To create a constantly
updated score evaluating a citizen's behavior. Good behavior gets rewarded,
bad behavior gets punished. Punished how? It could affect anything
from loan applications to travel restrictions, even access to certain
(06:51):
jobs or schools. And the really frightening part, beyond just
the surveillance? What's that? It's that this score judgment follows you
for life. The sources argue it fundamentally hinders the human
ability to reinvent yourself or move past mistakes, as it
locks you in.
Speaker 1 (07:09):
That's dystopian, truly chilling on a societal level. Okay, let's
bring it back down to the individual user though, because
alongside these huge systemic things, we're still facing the everyday
cyber threats like malware.
Speaker 2 (07:20):
Oh yeah, malware's still rampant. Keyloggers snatching your passwords as
you type, info stealers grabbing files, banking malware trying to empty your accounts.
Speaker 1 (07:27):
And how does it usually get onto people's devices?
Speaker 2 (07:31):
Often through classic methods, malicious email attachments, maybe dodgy links,
sometimes bundled with pirated software you might download.
Speaker 1 (07:39):
And then there's phishing, the old reliable.
Speaker 2 (07:43):
Still incredibly effective. It's low tech social engineering, right? Attackers
pretending to be someone you trust: Microsoft, Amazon, Netflix, your bank.
Speaker 1 (07:51):
Trying to trick you into clicking a link and giving
up your login details or credit card number.
Speaker 2 (07:55):
Exactly, And it works because honestly, people are still really
bad with passwords.
Speaker 1 (08:01):
Don't say it.
Speaker 2 (08:02):
The research confirmed it. Even in twenty nineteen, after all
the major breaches we've heard about, the most common passwords
were still things like one, two, three, four, five, six
and password.
Speaker 1 (08:11):
We laugh, but it's also kind of sad, isn't it.
We know better, but convenience wins.
Speaker 2 (08:15):
It often does, which perfectly leads us into the technical
fight back because researchers know about these human weaknesses. Given
this huge challenge, how do you share massive social data
sets for research without exposing individuals? What are they building?
Speaker 1 (08:30):
Right? How do you get the value without the privacy violation?
That's the core problem for privacy-preserving analytics. Precisely.
Speaker 2 (08:36):
The constant fear is the identity disclosure attack, someone figuring
out who's who in supposedly anonymous data.
Speaker 1 (08:45):
So what was the first big technical defense.
Speaker 2 (08:48):
The foundational technique really was something called k-anonymity, introduced
way back in two thousand and two.
Speaker 1 (08:53):
And two, k anonymity. Okay, what's the principle.
Speaker 2 (08:56):
The basic idea is to make any single person's record
in the data set indistinguishable from at least k minus
one other records, usually through generalization, like replacing an exact
age with an age range, or suppression, just removing certain data points.
Speaker 1 (09:11):
So you blend into a small crowd of k people.
Speaker 2 (09:14):
That's the goal. Yeah, but it had flaws.
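To make that generalization and suppression idea concrete, here is a minimal sketch in Python, using a toy record set invented for illustration (age and zip code as quasi-identifiers, a medical condition as the sensitive attribute). It's a simplification, not the algorithm from the episode's sources.

```python
from collections import Counter

# Toy records: (age, zip_code, condition). Age and zip are quasi-identifiers;
# condition is the sensitive attribute. Purely illustrative data.
records = [
    (34, "90210", "flu"), (37, "90211", "flu"), (36, "90212", "asthma"),
    (52, "10001", "diabetes"), (55, "10002", "flu"), (58, "10003", "asthma"),
]

def generalize(age, zip_code):
    """Generalize quasi-identifiers: exact age -> decade range, zip -> 3-digit prefix."""
    decade = (age // 10) * 10
    return (f"{decade}-{decade + 9}", zip_code[:3] + "**")

def is_k_anonymous(rows, k):
    """True if every generalized quasi-identifier group holds at least k records."""
    groups = Counter(generalize(age, z) for age, z, _ in rows)
    return all(count >= k for count in groups.values())

print(is_k_anonymous(records, k=2))  # True: each generalized group has 3 records
print(is_k_anonymous(records, k=4))  # False: no group is that large
```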
Speaker 1 (09:17):
If it's been around since two thousand and two, yeah, I
guess it wasn't perfect. What went wrong?
Speaker 2 (09:20):
It was vulnerable to what's called a homogeneity attack. Imagine
your group of k people all look similar based on
the generalized data, like they live in the same zip code.
Now what if almost everyone in that group shares the
same sensitive attribute, say they all have a specific medical condition.
Even if one person's data on that condition is suppressed.
Speaker 1 (09:40):
You can pretty much guess their status because everyone else
in their anonymous group has it. Exactly.
Speaker 2 (09:45):
The lack of diversity within the group broke the anonymity.
It was also vulnerable to background knowledge attacks, where an
attacker uses external info to reidentify someone.
Speaker 1 (09:56):
So generalization wasn't enough if the group itself was too similar.
Speaker 2 (10:00):
Er right, so that led to the next step, L diversity.
Speaker 1 (10:02):
Okay, how does L diversity improve things?
Speaker 2 (10:05):
It adds another constraint. It says that within each of
those groups, the ones that look similar, there must be
at least l distinct, well-represented values for the sensitive attribute.
It forces diversity into the groups.
Speaker 1 (10:17):
Making it harder to infer anything specific about one individual.
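A correspondingly small sketch of the l-diversity check, again with toy groups invented for illustration: each quasi-identifier bucket must contain at least l distinct sensitive values.

```python
# Toy generalized groups: quasi-identifier bucket -> sensitive values of its members.
# Invented for illustration; in practice the buckets come from a k-anonymization step.
groups = {
    ("30-39", "902**"): ["flu", "flu", "asthma"],
    ("50-59", "100**"): ["diabetes", "flu", "asthma"],
}

def is_l_diverse(groups, l):
    """True if every group contains at least l distinct sensitive values."""
    return all(len(set(values)) >= l for values in groups.values())

print(is_l_diverse(groups, l=2))  # True: both groups have at least 2 distinct conditions
print(is_l_diverse(groups, l=3))  # False: the first group has only 2
```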
Speaker 2 (10:21):
Much harder. But the real cutting edge now, the sort
of gold standard people aim for, is differential privacy.
Speaker 1 (10:27):
Differential privacy. I've heard the term, sounds complex. What's the core idea?
Speaker 2 (10:31):
Instead of just generalizing or suppressing, differential privacy involves adding
carefully calculated mathematical noise to the data, or more accurately,
to the queries run on the data. Adding noise? Doesn't
that mess up the results? That's the clever part. The
noise is precisely calibrated. It's enough to make it mathematically impossible,
(10:52):
or at least very difficult, to tell if any single
individual's data was included in the data set.
Speaker 1 (10:57):
Okay, protecting the individual.
Speaker 2 (10:59):
But it's not enough to significantly change the overall aggregate
results or statistical trends needed for research. And crucially, it
allows organizations to actually quantify the level of privacy they're providing.
It gives a mathematical guarantee.
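As a rough illustration of that calibrated-noise idea, here is a minimal sketch of the Laplace mechanism applied to a counting query. The data, epsilon values, and query are invented for the example, and real deployments also track a privacy budget across many queries.

```python
import numpy as np

def dp_count(values, predicate, epsilon):
    """Differentially private count via the Laplace mechanism.

    A counting query has sensitivity 1 (adding or removing one person changes
    the count by at most 1), so Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy for this single query.
    """
    true_count = sum(1 for v in values if predicate(v))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

ages = [23, 37, 41, 52, 29, 61, 34, 45]                 # toy data set
print(dp_count(ages, lambda a: a >= 40, epsilon=0.5))   # noisy answer near 4, more noise
print(dp_count(ages, lambda a: a >= 40, epsilon=5.0))   # less noise, weaker privacy
```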
Speaker 1 (11:13):
Much more robust. Okay, let's shift to a really specific
challenge location data. Our phones are constantly broadcasting where we
are for location based services LBS.
Speaker 2 (11:22):
Like maps, ride sharing, local recommendations. Yeah, very common.
Speaker 1 (11:25):
How do you protect privacy there?
Speaker 2 (11:27):
One approach is location k anonymity. It's similar in spirit
to the original K anonymity.
Speaker 1 (11:32):
How does it work in practice?
Speaker 2 (11:33):
Instead of your phone sending your exact GPS coordinates directly
to the map service, you might use a middleman, a
trusted third party sometimes called a location Trusted service or LTS.
Speaker 1 (11:46):
And what does this LTS do?
Speaker 2 (11:48):
Your phone tells the LTS your location. The LTS then
finds an area called a cloaking zone that includes you
and at least k other users nearby. It then sends
that zone, not your specific point, to the map service provider.
Speaker 1 (12:02):
So the service knows someone in this block needs directions,
but not exactly who or where within the block. You're
hidden in a small geographic crowd.
Speaker 2 (12:09):
That's the idea, blurring your precise identity.
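Here is a hedged sketch of what such a cloaking step could look like: a hypothetical anonymizer grows a bounding box around the requesting user until it covers at least k people, then forwards only the box. The coordinates and step size are made up, and a real service would use spatial indexing and cap the zone size.

```python
def cloaking_zone(user_location, other_locations, k, step=0.001):
    """Grow a square box around the user until it contains at least k users in total.

    Returns (min_lat, min_lon, max_lat, max_lon), which is what the trusted
    middleman would forward instead of the exact coordinates.
    Assumes at least k users exist nearby.
    """
    lat, lon = user_location
    half = step
    while True:
        box = (lat - half, lon - half, lat + half, lon + half)
        inside = 1 + sum(  # count the requesting user plus neighbours in the box
            1 for la, lo in other_locations
            if box[0] <= la <= box[2] and box[1] <= lo <= box[3]
        )
        if inside >= k:
            return box
        half += step  # box too small: widen it and try again

# Toy coordinates (lat, lon); only the returned zone would leave the trusted service.
me = (37.7749, -122.4194)
others = [(37.7751, -122.4190), (37.7740, -122.4200), (37.7800, -122.4100)]
print(cloaking_zone(me, others, k=3))
```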
Speaker 1 (12:12):
Okay. Another threat vector: the network itself, especially wireless networks.
We hear about rogue access points, RAPs. What's the danger
there for, say, a social media user?
Speaker 2 (12:22):
Big danger, potentially. Imagine you're at a coffee shop or airport.
A RAP is basically an unauthorized Wi-Fi hotspot set up
by an attacker, often mimicking the legitimate network name, like
cafe guest Wi-Fi.
Speaker 1 (12:34):
The classic evil twin attack.
Speaker 2 (12:37):
You connect your phone or laptop to it and then
log into your social media, the.
Speaker 1 (12:40):
the attacker running the RAP can potentially intercept your username, password,
session cookies, basically take over your account.
Speaker 2 (12:48):
YEP, or redirect you to fake login pages, install malware.
It's a major vulnerability in public spaces.
Speaker 1 (12:56):
So how do places defend against these? How do you
even find them?
Speaker 2 (13:00):
There's a system architecture proposed in the research. It
involves having an administrative body, maybe the coffee shop owner
or IT staff, use a dedicated Wi-Fi scanner.
Speaker 1 (13:09):
What does the scanner do?
Speaker 2 (13:10):
It listens for the beacon frames that all Wi-Fi
access points constantly broadcast. These frames contain key info like
the AP's MAC address, its unique hardware ID, the network name or SSID,
security settings, and signal strength or RSSI.
Speaker 1 (13:25):
Okay, it gathers the data on all nearby APs. Then
what?
Speaker 2 (13:30):
Collected data is immediately compared against a preapproved database, a
whitelist of all the legitimate access points that should be
operating in that area.
Speaker 1 (13:36):
Ah. So if the scanner picks up an AP whose
details aren't on the whitelist...
Speaker 2 (13:39):
Bingo, that's flagged as a potential rogue access point
that needs investigation.
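As a rough sketch of that whitelist comparison step (the beacon-frame capture itself needs a wireless card in monitor mode, so the scan results here are hypothetical, hard-coded values):

```python
# Hypothetical scan results a beacon-frame scanner might report:
# (MAC/BSSID, SSID, security mode, RSSI in dBm). Values invented for illustration.
scanned_aps = [
    ("3c:22:fb:aa:10:01", "CafeGuestWiFi", "WPA2", -48),
    ("de:ad:be:ef:00:99", "CafeGuestWiFi", "OPEN", -40),  # same name, unknown hardware
    ("3c:22:fb:aa:10:02", "CafeStaff", "WPA2", -55),
]

# Whitelist of approved access points, keyed by MAC address.
approved = {
    "3c:22:fb:aa:10:01": {"ssid": "CafeGuestWiFi", "security": "WPA2"},
    "3c:22:fb:aa:10:02": {"ssid": "CafeStaff", "security": "WPA2"},
}

def find_rogue_aps(scanned, whitelist):
    """Flag any AP whose MAC is unknown, or whose SSID/security don't match the whitelist."""
    rogues = []
    for mac, ssid, security, rssi in scanned:
        expected = whitelist.get(mac)
        if expected is None or expected["ssid"] != ssid or expected["security"] != security:
            rogues.append((mac, ssid, rssi))
    return rogues

for mac, ssid, rssi in find_rogue_aps(scanned_aps, approved):
    print(f"Potential rogue AP: {ssid} ({mac}), signal {rssi} dBm - investigate")
```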
Speaker 1 (13:44):
Makes sense. Okay, shifting gears again into the really modern
stuff: AI and automation in security. Content moderation is a huge one.
Speaker 2 (13:53):
Absolutely massive. The scale is just impossible for humans alone. YouTube,
for instance, apparently sees something like five hundred hours of
video uploaded every single minute.
Speaker 1 (14:02):
Five hundred hours a minute? You can't possibly
have humans watch all that. No way.
Speaker 2 (14:07):
So automation, specifically machine learning, is essential. Facebook, for example,
uses ML pretty heavily to proactively find and remove harmful content.
Speaker 1 (14:17):
Like what kind of content.
Speaker 2 (14:18):
They reported, for instance, removing something like twenty six million
pieces of content related to global terrorist groups over a
period, and claimed that ninety nine percent of it was
removed proactively by their AI systems before any human even
flagged it.
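The platforms' actual models aren't public, but as a toy stand-in for the general approach, here is a minimal text classifier in scikit-learn that scores posts before anyone reports them. The training examples, labels, and threshold are all invented for the sketch.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy training data; real systems learn from millions of labeled examples.
posts = [
    "join our cause and attack the unbelievers",    # harmful
    "we will strike fear into the city tomorrow",   # harmful
    "had a great coffee with friends today",        # benign
    "check out my vacation photos from the beach",  # benign
]
labels = [1, 1, 0, 0]  # 1 = harmful, 0 = benign

# TF-IDF features plus logistic regression: a deliberately simple stand-in
# for the far larger models platforms actually run at scale.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(posts, labels)

def flag_proactively(post, threshold=0.5):
    """Queue the post for removal or review if the model's harm probability is high enough."""
    prob_harmful = model.predict_proba([post])[0][1]
    return prob_harmful >= threshold

print(flag_proactively("we plan to attack the city at dawn"))
```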
Speaker 1 (14:31):
Ninety nine percent. That sounds incredibly effective.
Speaker 2 (14:33):
It is, technologically speaking. But that remaining one percent, given
the volumes, can still represent a lot of harmful content
slipping through, and automation still really struggles in some areas.
Speaker 1 (14:42):
Where does it fall down?
Speaker 2 (14:44):
The big challenges are subjectivity and context. How do you
train an AI to definitively understand vague concepts like terrorism
or obscenity across different cultures and contexts?
Speaker 1 (14:56):
It's incredibly hard, right? Context is everything.
Speaker 2 (14:59):
Remember the controversy over the historical napalm girl photo
from the Vietnam War? It's a famous, important photo depicting
violence and nudity. Some platforms' automated systems initially flagged and
removed it, completely missing the vital historical and newsworthy context.
Speaker 1 (15:15):
Because the algorithm just saw nudity and violence. Pretty much.
Speaker 2 (15:19):
It lacked the human understanding of nuance.
Speaker 1 (15:21):
And you also have people actively trying to fool these systems.
Speaker 2 (15:24):
Right, adversaries. Yes, adversarial attacks are a constant problem. Sophisticated groups,
knowing their content might get flagged, actively try to modify
it to evade detection by the machine learning classifiers.
Speaker 1 (15:35):
How do they do that?
Speaker 2 (15:36):
For example, research showed ISIS affiliates learned to avoid certain
high risk keywords associated with terrorism. Instead, they started using
more neutral language, like just saying Islamic State group, which
apparently helped their accounts stay active longer before the automated
systems caught on.
Speaker 1 (15:54):
They're literally learning how the AI works and adapting their
tactics to bypass it. It's a constant cat and mouse game.
Speaker 2 (16:00):
It really is, which brings us to maybe the most intriguing,
almost sci fi end of this adversarial spectrum. Using covert channels.
Speaker 1 (16:10):
Covert channels, so this isn't about bypassing moderation, it's about
hiding communication completely.
Speaker 2 (16:15):
Exactly, making it invisible to defenders. One fascinating piece of
research explored using steganography, hiding data within other data, to
run a botnet's command and control structure.
Speaker 1 (16:25):
Using Twitter? Hiding botnet commands in tweets?
Speaker 2 (16:28):
How? They didn't hide it in the text of the tweet,
which might be detectable. Instead, they used the length of
the Twitter post itself as the secret channel.
Speaker 1 (16:36):
The number of characters? Yep.
Speaker 2 (16:38):
Back when Twitter had that one hundred and forty character limit,
the specific length of the tweet, say one hundred and
twelve characters versus one hundred and thirty one would correspond
to an encrypted command being sent from the botmaster to
the infected computers in the botnet.
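To see how little machinery a length channel needs, here is a hedged toy sketch: the command table, lengths, and filler text are invented, and the real research additionally encrypted the commands and spread them across many accounts.

```python
# Hypothetical mapping from hidden command to an exact tweet length
# (within the old 140-character limit). Invented values for illustration.
COMMAND_TO_LENGTH = {"ping": 112, "update": 124, "attack_stop": 131}
LENGTH_TO_COMMAND = {length: cmd for cmd, length in COMMAND_TO_LENGTH.items()}

FILLER = ("Loving the weather today, anyone else out enjoying it? "
          "Coffee first though, obviously. Thoughts welcome friends "
          "and followers, as always, honestly.")

def encode_command(command):
    """Botmaster side: produce an innocuous-looking tweet of the exact target length."""
    target = COMMAND_TO_LENGTH[command]
    return FILLER[:target].rstrip().ljust(target, ".")

def decode_command(tweet):
    """Bot side: the words are ignored; only the character count carries meaning."""
    return LENGTH_TO_COMMAND.get(len(tweet))

tweet = encode_command("ping")
print(len(tweet))             # 112
print(decode_command(tweet))  # ping
```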
Speaker 1 (16:52):
Okay, that's clever, but wait, if a single account just
started posting tweets with weirdly specific repeating lengths, wouldn't that
stick out like a sore thumb? Wouldn't monitoring systems flag that pattern?
Speaker 2 (17:03):
Good point. That's the next layer. They needed plausible cover
for the accounts sending these length-encoded messages. They couldn't
look like obvious bots. So what did they do? They
used another bit of AI, a Markov chain model trained
on a massive data set of real Twitter usernames. This
model learned the patterns of typical usernames and could then
generate new, completely artificial usernames that sounded convincingly human.
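A heavily simplified sketch of that last idea: a character-level Markov chain trained on a handful of example usernames. The training list here is invented and tiny; the actual research used a massive real-world data set and a more careful model.

```python
import random
from collections import defaultdict

# Invented seed usernames standing in for the large real-world training set.
training_names = ["sunny_kate88", "mike_runner", "jess.travels", "dave1987",
                  "laura_b", "tomcat_42", "nina.codes", "alexsmith01"]

def build_chain(names, order=2):
    """Character-level Markov chain: map each length-`order` prefix to observed next chars."""
    chain = defaultdict(list)
    for name in names:
        padded = "^" * order + name + "$"           # ^ marks the start, $ the end
        for i in range(len(padded) - order):
            chain[padded[i:i + order]].append(padded[i + order])
    return chain

def generate_username(chain, order=2, max_len=15):
    """Walk the chain from the start state, sampling one character at a time."""
    state, out = "^" * order, []
    while len(out) < max_len:
        nxt = random.choice(chain[state])
        if nxt == "$":                # hit an end-of-name marker
            break
        out.append(nxt)
        state = state[1:] + nxt
    return "".join(out)

chain = build_chain(training_names)
print([generate_username(chain) for _ in range(5)])
```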
Speaker 1 (17:27):
They generated fake, but real sounding usernames to post the
secret length tweets.
Speaker 2 (17:31):
Exactly, and to test if these generated usernames were actually plausible,
they ran an experiment using Amazon mechanical Turk, asking real
humans to rate the generated names alongside real ones.
Speaker 1 (17:42):
And what did the humans say?
Speaker 2 (17:43):
They rated the automatically generated, natural sounding user names as
highly plausible. They couldn't easily distinguish them from real accounts.
It showed they could effectively conceal not just the hidden
message channel, but also the identity of the accounts using it.
Speaker 1 (17:58):
Wow, So they built AI not just to carry out
the attack, but specifically to fool other AI detection systems
and even human intuition. That's quite something.
Speaker 2 (18:09):
It really shows the sophistication we're up against.
Speaker 1 (18:11):
Okay, we've covered a huge range here, from the psychological
pull of just passively scrolling all the way to botnets
hiding commands in tweet lengths using AI generated usernames. The
core tension just seems clearer than ever: this amazing convenience
of being hyper connected versus the constantly evolving, incredibly sophisticated threats.
Speaker 2 (18:31):
Absolutely, and it really brings it back to the importance
of individual vigilance. You know, you can't just rely on
the platforms or technology to protect you completely. For mobile
security specifically, the sources mentioned a useful acronym SRP.
Speaker 1 (18:45):
SRP Okay, break that down.
Speaker 2 (18:47):
S is for secure networks, being cautious about public Wi-Fi,
maybe using a VPN. R is for risk awareness, just
understanding the kinds of threats we've talked about, like phishing
and malware. And P is for protect personal information.
And this isn't just about passwords. Think about all those
fun quizzes or surveys you fill out online. What was
(19:08):
your first pet's name? What street did you grow up on?
Speaker 1 (19:11):
Ah? Classic security question answers.
Speaker 2 (19:13):
Exactly. Attackers can collect those seemingly harmless bits of info
you share publicly and potentially use them later to bypass
security questions if they manage to compromise part of your
login, like your password. Don't make it easy for them.
Speaker 1 (19:26):
That's a really practical point. Be mindful of all the
info you share. All right, as we wrap up, here's
one final, maybe provocative thought for you, the listener. Based
on the sources, even when you think you're being careful,
deleting files, using private browsing modes, that digital footprint is
rarely ever truly gone.
Speaker 2 (19:43):
That's a stark reality. Your network provider, your internet service
provider at home or your mobile carrier on your phone,
they maintain extensive logs, logs of the sites you visit,
the connections you make.
Speaker 1 (19:55):
And that data isn't necessarily private forever.
Speaker 2 (19:58):
No. In many jurisdictions like the US, that electronic evidence
can be legally requested, obtained, and even admitted into court proceedings.
Speaker 1 (20:06):
So the bottom line is, what you access online is
almost never truly private from everyone.
Speaker 2 (20:12):
Pretty much. And maybe that realization, more than any specific technical
defense we discussed, should be the biggest incentive for all
of us to be thoughtful and vigilant about how we
navigate this incredibly complex, hyperconnected world.
Speaker 1 (20:24):
A sobering but essential point to end on. This has
been a fantastic deep dive into the social networking security battleground.
We really hope this synthesized knowledge gives you some greater
insight as you navigate your own digital life. Thanks for
joining us.