
November 13, 2025 12 mins
In this lesson, you’ll learn about: The complete toolkit and techniques for analyzing network traffic using Connection Analysis, Statistical Analysis, and Event-Based (signature-focused) Analysis.
1. Data Analysis Toolkit
General-Purpose Tools
These are foundational command-line utilities used to search, filter, and reshape data:
  • grep → pattern searching
  • awk → field extraction and manipulation
  • cut → selecting specific columns
    Used together, they form powerful pipelines for rapid, custom analysis.
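The pipeline idea can also be sketched in Python, the scripting language these notes turn to next. This is a minimal illustration, not something shown in the episode: the log lines, their field layout, and the helper names `grep` and `cut` are all assumptions made up for the example.

```python
import re
from typing import Iterable, Iterator, Optional

# Hypothetical firewall-style log lines, invented for illustration only.
# Assumed layout: timestamp, action, source IP, destination IP, destination port.
SAMPLE_LOG = [
    "2025-11-13T10:00:01 ALLOW 10.0.0.5 93.184.216.34 443",
    "2025-11-13T10:00:02 DENY 10.0.0.7 203.0.113.9 3333",
    "2025-11-13T10:00:03 ALLOW 10.0.0.5 93.184.216.34 443",
    "2025-11-13T10:00:04 DENY 10.0.0.7 198.51.100.4 21",
]


def grep(pattern: str, lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that match the regular expression."""
    regex = re.compile(pattern)
    return (line for line in lines if regex.search(line))


def cut(lines: Iterable[str], field: int, sep: Optional[str] = None) -> Iterator[str]:
    """Yield one zero-based field per line, skipping lines that are too short."""
    for line in lines:
        parts = line.split(sep)
        if field < len(parts):
            yield parts[field]


# Roughly the shell pipeline: grep DENY log | awk '{print $3}'
denied_sources = list(cut(grep(r"\bDENY\b", SAMPLE_LOG), field=2))
print(denied_sources)
```

Each stage is a generator, so the stages chain the same way shell pipes do and can be swapped or reordered quickly.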
Scripting Languages
Python
  • Most important language for packet analysis.
  • Scapy allows:
    • Parsing PCAPs
    • Inspecting packet structure
    • Accessing fields (IP, ports)
    • Filtering traffic (e.g., HTTP GET requests)
    • Deobfuscating malware traffic
      • Example: Extracting useful strings from compressed Ghostrat C2 payloads.
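A minimal sketch of the parsing and filtering steps listed above, assuming Scapy is installed. The function names and the idea of returning (source IP, destination port) pairs are illustrative choices, not code from the episode.

```python
from typing import Iterator, Tuple

HTTP_GET_PREFIX = b"GET "


def is_http_get(payload: bytes) -> bool:
    """True if a raw TCP payload starts with the bytes of an HTTP GET request line."""
    return payload.startswith(HTTP_GET_PREFIX)


def http_get_sources(pcap_path: str) -> Iterator[Tuple[str, int]]:
    """Yield (source IP, destination port) for each packet carrying an HTTP GET."""
    # Imported inside the function so is_http_get() stays usable without Scapy.
    from scapy.all import IP, TCP, Raw, rdpcap

    for packet in rdpcap(pcap_path):
        # Only packets with IP, TCP, and a raw payload layer are of interest here.
        if packet.haslayer(IP) and packet.haslayer(TCP) and packet.haslayer(Raw):
            if is_http_get(bytes(packet[Raw].load)):
                yield packet[IP].src, packet[TCP].dport
```

Calling `http_get_sources()` with the path to a capture file walks the PCAP once and reads fields such as `packet[IP].src` directly from each parsed packet. Note that `rdpcap` loads the whole capture into memory, which may not suit very large files.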
R
  • Useful for statistical modeling and clustering of network data.
Specialized Tools
  • Netstat → enumerates active connections
  • Silk → large-scale flow analysis (CERT tool)
  • Yara → rule-based threat matching (binary/text patterns)
  • Snort → signature-based intrusion detection
2. The Three Core Data Analysis Techniques
A. Connection Analysis
Purpose: High-level visibility into which systems are connecting to which. Ideal for:
  • Detecting unauthorized servers or suspicious programs
  • Spotting lateral movement (e.g., odd SSH usage)
  • Identifying database misuse
  • Ensuring compliance across security zones
Primary Tool: Netstat
  • Shows all active connections + states
    (LISTENING, ESTABLISHED, TIME_WAIT, etc.)
Example Uses:
  • Spotting malware opening a hidden port
  • Identifying unauthorized remote access
  • Finding systems connecting to suspicious IPs
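As a rough sketch of connection analysis in code, the snippet below parses text in the style of Linux `netstat -ant` output, tallies connection states, and flags listening ports that are not on an allowlist. The sample output, the addresses, and the allowlist are invented for illustration; real netstat output varies by platform (Linux prints LISTEN where Windows prints LISTENING).

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Set


class Connection(NamedTuple):
    proto: str
    local: str
    remote: str
    state: str


# Hypothetical output in the style of Linux `netstat -ant`; addresses are invented.
SAMPLE_NETSTAT = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:22             10.0.0.9:51234          ESTABLISHED
tcp        0      0 10.0.0.5:44321          203.0.113.9:443         TIME_WAIT
"""


def parse_netstat(text: str) -> List[Connection]:
    """Parse TCP rows; the header and any malformed rows are skipped."""
    connections = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 6 and fields[0].startswith("tcp"):
            connections.append(Connection(fields[0], fields[3], fields[4], fields[5]))
    return connections


def unexpected_listeners(connections: Iterable[Connection], allowed_ports: Set[int]) -> List[int]:
    """Return listening ports that are not in the allowlist, sorted ascending."""
    listening = set()
    for conn in connections:
        if conn.state.startswith("LISTEN"):  # covers LISTEN and LISTENING
            listening.add(int(conn.local.rsplit(":", 1)[1]))
    return sorted(listening - allowed_ports)


conns = parse_netstat(SAMPLE_NETSTAT)
print(Counter(conn.state for conn in conns))
print(unexpected_listeners(conns, allowed_ports={22}))
```

In this made-up sample, port 8080 is the listener that would deserve a closer look, since only port 22 is on the allowlist.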
B. Statistical Analysis
A macro-level technique designed to spot deviations from normal behavior.
Techniques:
1. Clustering
Group similar traffic together to identify families or variants.
  • Demonstrated by clustering Ghostrat variants through similarities in their C2 protocol.
2. Stack Counting
Sort traffic by count of activity on:
  • Destination ports
  • Host connections
  • Packet types
Used to find anomalies:
  • Single visits to rare ports (2266, 3333)
  • Unexpected FTP traffic (port 21)
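Stack counting is simple enough to sketch with the standard library. In this illustration the observed port list is invented, and the `rare_ports` helper and its `max_count` threshold are hypothetical names chosen for the example.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rare_ports(dest_ports: Iterable[int], max_count: int = 1) -> List[Tuple[int, int]]:
    """Count hits per destination port and keep only the low-count bins."""
    counts = Counter(dest_ports)
    rare = [(port, hits) for port, hits in counts.items() if hits <= max_count]
    # Sort by hit count first, then by port number, so the rarest bins come first.
    return sorted(rare, key=lambda item: (item[1], item[0]))


# Invented traffic: lots of web and DNS, plus three single hits on unusual ports.
observed = [443] * 500 + [80] * 300 + [53] * 120 + [21, 2266, 3333]
print(rare_ports(observed))
```

The high-count bins (443, 80, 53) are deliberately dropped; the single hits on ports 21, 2266, and 3333 are what the analyst reviews.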
3. Wireshark Statistics
Using built-in metrics:
  • Packet lengths (large packets → possible exfiltration or malware downloads)
  • Endpoints
  • Protocol hierarchy
Specialized Tool: Silk
  • Designed for massive enterprise networks
  • Supports both command line & Python (Pysilk)
  • Ideal for flow-level analysis, anomaly detection, and trend discovery.
C. Event-Based Analysis (Signature Focused)
A micro-level technique used to identify known threats via rules and signatures.
1. Yara Signatures
  • Rules match known binary or text patterns.
  • Example uses:
    • Detecting Ghostrat via identifying strings like "lurk zero" or "v2010"
    • Multi-string matching to detect multi-stage malware
    • Matching malicious hostnames or indicators
Used for:
  • Malware classification
  • Reverse-engineering support
  • Deep content inspection
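A minimal sketch of an "any of these strings" rule, driven from Python through the yara-python package. The rule name and the exact byte spellings "LURK0" and "v2010" are assumptions based on the tags mentioned above; this is not the rule used in the episode.

```python
# Hypothetical rule text: the rule name and the exact string spellings are assumptions.
GHOSTRAT_RULE = r'''
rule ghostrat_tags_sketch
{
    strings:
        $lurk0 = "LURK0"
        $v2010 = "v2010"
    condition:
        any of them
}
'''


def match_payload(data: bytes) -> list:
    """Return the names of the rules that match the given bytes."""
    # Imported inside the function so the rule text can be read without yara-python.
    import yara

    rules = yara.compile(source=GHOSTRAT_RULE)
    return [match.rule for match in rules.match(data=data)]
```

The condition `any of them` is what lets one rule cover several variants; changing it to `all of them` would instead require every listed string to be present, which is closer to the multi-string matching described for multi-stage malware.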
2. Snort Rules
Snort provides concise detection logic for network traffic.
Rule Structure Includes:
  • Action (alert, log)
  • Protocol (TCP/UDP)
  • Source/destination + ports
  • Options (content matches, flags, byte tests)
Examples Provided:
  • Detecting Nmap Xmas scans (FIN + PUSH + URG flags)
  • Detecting SMTP credential leakage (plaintext “authentication succeeded” over port 25)
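To make the Xmas scan example concrete, here is a sketch that pairs a hypothetical Snort rule (held as a Python string) with the equivalent flag test in Python. The message text and SID are invented for illustration, and the Python check only approximates the rule: it looks at the six classic TCP flag bits and ignores the ECN bits.

```python
# Standard TCP flag bit values.
FIN, PSH, URG = 0x01, 0x08, 0x20
XMAS_MASK = FIN | PSH | URG

# Hypothetical local rule; the message text and SID are illustrative only.
XMAS_SNORT_RULE = (
    'alert tcp any any -> any any '
    '(msg:"Possible Xmas scan"; flags:FPU; sid:1000001; rev:1;)'
)


def is_xmas_scan(tcp_flags: int) -> bool:
    """True when, among the six classic flag bits, exactly FIN, PSH, and URG are set."""
    return (tcp_flags & 0x3F) == XMAS_MASK
```

With Scapy, the integer passed in could come from `int(packet[TCP].flags)`; a normal SYN (0x02) or a FIN+PSH+URG packet that also carries ACK would not match.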
Snort highlights:
  • Excellent for IDS/IPS
  • Simple to write and test
  • Widely used in enterprise SOCs
3. Practical Demonstrations
A. Scapy + Yara
Workflow shown:
  1. Use Scapy to load and parse PCAP
  2. Extract payloads
  3. Feed payloads to Yara
  4. Detect Ghostrat, multi-stage malware, or other known threats
This combination gives both:
  • PCAP-level filtering
  • Payload-level signature inspection
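The four workflow steps can be sketched as two small functions, assuming Scapy for extraction and a compiled yara-python rules object for matching. The function names are hypothetical, and the sketch scans each packet payload on its own, so a signature split across two packets would be missed without stream reassembly.

```python
from typing import Iterable, Iterator, List, Tuple


def extract_payloads(pcap_path: str) -> Iterator[bytes]:
    """Steps 1 and 2: load the PCAP with Scapy and yield each raw payload."""
    # Imported inside the function so scan_payloads() can be used without Scapy.
    from scapy.all import Raw, rdpcap

    for packet in rdpcap(pcap_path):
        if packet.haslayer(Raw):
            yield bytes(packet[Raw].load)


def scan_payloads(payloads: Iterable[bytes], rules) -> List[Tuple[int, str]]:
    """Steps 3 and 4: feed payloads to Yara and collect (payload index, rule name).

    `rules` is assumed to be a compiled yara-python rules object, or anything
    else that offers a compatible match(data=...) method.
    """
    hits = []
    for index, payload in enumerate(payloads):
        for match in rules.match(data=payload):
            hits.append((index, match.rule))
    return hits
```

A typical call would be `scan_payloads(extract_payloads(path), yara.compile(filepath=rule_file))`, where `path` and `rule_file` are placeholders for a capture file and a Yara rule file.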
B. Scapy + Snort
Two key demonstrations:
1. Automatic Snort Rule Generation
  • Tools like packet_to_snort.py generate

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
All right, so you've done it. You successfully tapped the network.

Speaker 2 (00:03):
You collected all the raw traffic, the PCAPs, the flows, the logs,
all of it. Exactly.

Speaker 1 (00:08):
And you might be thinking, phew, the hard part's over.
But honestly, that was just step one.

Speaker 2 (00:13):
Oh, absolutely. The real fight, the real work, is just
beginning right now.

Speaker 1 (00:17):
You've got to turn this sheer volume of data into
something you can actually use, into clear, actionable security intelligence.

Speaker 2 (00:25):
It's a mountain of information, and the tools out there,
they span a huge spectrum. You can go for these
fully automated commercial solutions.

Speaker 1 (00:32):
Expensive easy buttons.

Speaker 2 (00:35):
Easy buttons, yeah. They abstract away the complexity. Or you
can go the other way and build your own bespoke
analysis frameworks from scratch.

Speaker 1 (00:46):
And that's our mission for this deep dive. We need
to map out that toolkit, look at the key methodologies.
I think there are three, right? Connection, statistical, and
event-based analysis. Those are the big three, and we need to
match them with the right instruments, because you have to
know the right tool for the right job to get
answers fast. You really do. Okay, let's start with that
build-your-own philosophy, the real basics, the command-line utilities.

(01:10):
These things are often built right into the OS, but
they're deceptively powerful.

Speaker 2 (01:15):
They really are. We're talking about the classics, you know,
things like grep for just quick searching, the absolute essentials,
and then awk for complex pattern processing, and cut for
just making the output clean. Individually, they're simple, but you
combine them, that's where the magic is. The real insight
here is speed. By piping these together, you know, connecting

(01:37):
the output of one tool right into the input of
the next, you can create a really specialized filter or
analysis tool in minutes.

Speaker 1 (01:45):
It sounds like the ultimate security hack, you know, just
quickly scripting something for an incident that just happened. But
isn't that a bit brittle? I mean, if a log
format changes just a little, your whole hack breaks.

Speaker 2 (01:55):
That's the critical point. They are. They're inherently disposable and
very specific. They're for quick tactical wins, not long-term stability.
But their modularity means they are perfect for basic automation.
On a Linux system, you can link these little shell
scripts right into something like cron for scheduled runs, or
push the results to syslog for central reporting.

Speaker 1 (02:17):
So you can bootstrap a monitoring system really easily.

Speaker 2 (02:19):
Really easily, shockingly easily. Yeah.

Speaker 1 (02:21):
So if the command line tools are the tactical building blocks,
then scripting languages like Python and R, they're more like
the architecture.

Speaker 2 (02:31):
A great way to put it. They offer much more power,
but you know, with a bit more overhead.

Speaker 1 (02:34):
And in the security world, Python seems to have the edge.

Speaker 2 (02:38):
It has the clear edge, mainly because of packages built
specifically for this. The big one is Scapy. Yeah.
Scapy is purpose-built to read and parse PCAP data.
So it takes the dense, you know, chaotic binary structure
of a packet, the Ethernet layer, the IP, the TCP,
and it just translates it all into these clean, accessible fields.

Speaker 1 (02:58):
So instead of fighting with raw bytes, we just get
attributes we can call in the code. That's the key.

Speaker 2 (03:02):
The packet becomes an object, so you can write logic
tests directly on its content. You can, for instance, check
if the raw payload starts with the bytes for an HTTP
GET request.

Speaker 1 (03:13):
Well that's powerful.

Speaker 2 (03:14):
Or if you just need the source IP address, you
just call packet[IP].src. It's that simple, and that
efficiency is why Scapy is used for more than just analysis.
People use it to actively craft malicious packets for testing
their own defenses.

Speaker 1 (03:29):
That custom framework gives you so much control. So okay,
we have the data flowing. Where do we start the analysis?
You said connection analysis is the necessary first step. What
is this simple approach actually looking for?

Speaker 2 (03:42):
It's the low hanging fruit. It focuses only on the
existing connections between machines. The fundamental question is just who
is talking to whom and on what port.

Speaker 1 (03:50):
It sounds simple, but I bet that uncovers a ton
of risk during like an audit.

Speaker 2 (03:55):
It really does. It gives you three immediate security wins. First,
you can spot unauthorized servers, you know, a machine
that should only be a client suddenly listening on port
80, acting like a web server. That's a red flag. Second,
lateral movement. If an attacker gets a foothold and starts
moving inside your network, you'll see strange internal SSH traffic
between machines that have no business talking to each other.

(04:17):
And third, compliance. You can quickly verify that your
sensitive database ports are only talking to approved application servers
and nothing else.

Speaker 1 (04:27):
And the classic tool for this is netstat.

Speaker 2 (04:29):
Absolutely, netstat is everywhere. It's a command line tool, and
its real diagnostic power is showing you the state of
those connections. The state tells you everything.

Speaker 1 (04:39):
Okay, so give us the most important states for, say, incident response.

Speaker 2 (04:42):
You want to focus on three. LISTENING: that means
the port is open, it's just waiting for a connection.
It's a perfect snapshot of what services are running.

Speaker 1 (04:51):
Okay.

Speaker 2 (04:51):
Then there's ESTABLISHED, which is an active, live data flow,
and critically, TIME_WAIT.

Speaker 1 (04:56):
TIME_WAIT. What's that?

Speaker 2 (04:58):
That means the connection is closing, but the socket is
waiting around for a bit to make sure no last-minute
packets get lost. If you see tons of connections
stuck in weird states or a huge number in TIME_WAIT,
it can signal a big problem before an attacker
even gets data out.

Speaker 1 (05:12):
Connection analysis feels very concrete, but statistical analysis that's a
step back, more of a macro view. And this is
where it gets really interesting because now we're trying to
define what normal even looks like on a network.

Speaker 2 (05:24):
Right. Statistical analysis is all about defining that norm, that
baseline over time, and then just aggressively searching for deviations
from it, anomalies.

Speaker 1 (05:34):
The assumption being an attack will look different.

Speaker 2 (05:37):
Exactly. The core assumption is that any attack, malware, an insider
threat, whatever, it's going to cause a noticeable, measurable blip
that's different from the baseline. This is where machine learning
and AI really shine because they can constantly refine that
understanding of normal.

Speaker 1 (05:54):
Can this approach help with malware families that are constantly changing?

Speaker 2 (05:58):
It's excellent for that. Let's take the Ghostrat malware family.
A new variant might be tagged Ghost or LURK0
or version 2012.

Speaker 1 (06:05):
The names change, but the behavior...

Speaker 2 (06:09):
The command-and-control communication structure is almost identical. So statistical
classifiers can cluster these traffic samples together based on those
deep structural similarities. This lets you spot a brand new
zero-day Ghostrat variant just because its network traffic
looks too much like the bad stuff you already know about.

Speaker 1 (06:26):
That's for grouping threats, but what about just finding one
single bad connection hidden in millions of good ones.

Speaker 2 (06:34):
For that, you use a technique called stack counting. You
group all your traffic by some feature, let's say destination port,
and then you sort by count. Okay, now, when you
analyze it, you do something counterintuitive. You intentionally ignore the
high counts.

Speaker 1 (06:47):
You ignore port 80, 443.

Speaker 2 (06:49):
All of it, all the common stuff. You focus
intensely on the bins with very low counts.

Speaker 1 (06:53):
Because the common traffic is just noise. But one
single hit on a weird port, that's the signal.

Speaker 2 (07:00):
Precisely. I remember one case where this exact technique revealed a
single connection to port 21, FTP. That one anomaly
led us to an unauthorized internal file server. It had
been running for three months, totally invisible in all the
normal web traffic. A single connection on some unassigned port
like 2266, that's highly suspicious.

Speaker 1 (07:20):
To do this at enterprise scale, you must need a
specialized tool.

Speaker 2 (07:24):
You do, you need something like Silk. Silk operates at
the flow level, which is a higher-level summary than
individual packets.

Speaker 1 (07:30):
So with Silk you're trading off the ability to read
the actual content of the packet, but what
you gain is the speed you need to map millions
of connections. Is that the trade-off?

Speaker 2 (07:40):
That's exactly the trade off. You sacrifice detail for processing
speed and storage efficiency. Silk can filter, sort, and summarize
huge amounts of flow data very very quickly. It's essential
for that macro level view.

Speaker 1 (07:52):
Okay, so that's the opposite of our next method. In
contrast to statistical analysis hunting for the unknown, event-based
analysis is micro-level focused. It's signature driven. Here you have
to know exactly what you're looking for.

Speaker 2 (08:04):
That is the critical distinction. Yes, in this methodology, the
analyst has to define the specific signature of a known threat,
the exact pattern of a piece of ransomware or a
unique C2 channel.

Speaker 1 (08:17):
The advantage being it's super fast and effective if you
know the threat. That's how antivirus works, right?

Speaker 2 (08:24):
It's the operational model for most AV and IPS systems.

Speaker 1 (08:27):
But the flaw is obvious. If the threat is brand new,
a true zero day, your signature detection fails instantly. There's
nothing to match against.

Speaker 2 (08:37):
That is the fundamental drawback, and we rely on two
main tools here. The first is Yara signatures. Yara
is this very versatile rule-based language. It's used everywhere
in malware analysis, and it works just as well on
network traffic data as it does on files.

Speaker 1 (08:50):
And how does that rule structure help you, catch, say,
multiple variants of the same malware family.

Speaker 2 (08:55):
A Yara rule lets you define specific strings, text, hex values,
complex regular expressions, and then you combine them with boolean
logic, ANDs and ORs. So you can easily write a
rule that matches any Ghostrat variant by just saying
match the tag LURK0 or v2010.
Or you can catch multi-stage malware by creating a
rule that requires matching multiple malicious host names like margot

(09:18):
c dot JP and is standbull dot net.

Speaker 1 (09:21):
Very cool. And the second tool is Snort, the cornerstone
of network-specific signatures.

Speaker 2 (09:27):
Right. Snort rules are incredibly robust. The syntax is really
concise and action oriented. You just specify the action, alert,
log, block, the protocol, source and destination info, and then
you get into very detailed content options.

Speaker 1 (09:40):
Give me a really technical example, maybe one that only
uses the TCP packet flags.

Speaker 2 (09:44):
Sure, you can use Snort to detect stealthy port scans
like a Christmas scan. The scan uses a really unusual
combination of TCP flags to try and sneak past firewalls. Okay,
the Snort rule is simple. It just alerts anytime it
sees the simultaneous combination of the FIN, PUSH, and URG flags.
Some people call it the FPU combo. Another great
use is logging cleartext password leaks. You can just

(10:05):
look for the specific content string "authentication succeeded" being sent
over port 25, the unencrypted SMTP port.

Speaker 1 (10:11):
And this is where Scapy comes back in, right?
It completes the circle.

Speaker 2 (10:15):
Absolutely. Scapy is crucial for testing those Snort rules. If you
have a rule that seems like it's failing, sometimes you
can take a live PCAP, use Scapy to surgically
change things, like change a source IP or tweak a flag,
and then you replay that traffic.

Speaker 1 (10:30):
To make sure your rule is robust enough for the
real world.

Speaker 2 (10:33):
Exactly to make sure it holds up against all the
weird variations you'll actually see.

Speaker 1 (10:37):
This really brings the whole thing together. We've covered three
distinct but essential approaches. Connection analysis tells us who's talking
to whom.

Speaker 2 (10:45):
Right, the basics.

Speaker 1 (10:47):
Statistical analysis tells us what looks abnormal, what deviates from
the norm. And event based analysis tells us what looks
exactly like a known threat.

Speaker 2 (10:55):
And you need all three. Effective security intelligence comes from
using a combination of all of them all at the
same time.

Speaker 1 (11:02):
A holistic view is essential.

Speaker 2 (11:03):
It is. And connecting this to the bigger picture, you
always have to remember the strategic importance of scope. Don't
waste your time and storage running incredibly detailed packet-level
analysis with Scapy and Snort if simple flow data, which
Silk handles in seconds, gives you the answer you need.

Speaker 1 (11:21):
Choose the right tool for the job.

Speaker 2 (11:24):
Choosing the appropriate data fidelity and the right tool. That's
the single biggest thing that determines success in this field.
It saves you massive amounts of time.

Speaker 1 (11:33):
A vital point about efficiency. So here's something for you
to maybe chew on after this.

Speaker 2 (11:37):
A little thought experiment.

Speaker 1 (11:39):
Yeah, exactly. Let's say an attacker is using a totally
standard encrypted connection, port 443, just regular HTTPS,
for their command and control.

Speaker 2 (11:47):
Something you'd see a million times a day.

Speaker 1 (11:49):
Right. So, which of the three techniques we talked about today, connection,
statistical, or event-based, which one would be the least
effective against that? And more importantly, why?

Speaker 2 (11:59):
It really comes down to what encryption does, what it
fundamentally hides from you.

Speaker 1 (12:03):
So to think about