Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
Welcome to the deep dive. Today, we're really opening up
the engine bay of modern investigation. When you think about
high-stakes work, you know, cybersecurity, OSINT analysis, the
goal is always the same. You've got this ocean of raw,
noisy data, IP addresses, accounts, file hashes, whatever, and you
have to turn it into a clear picture, a structured
(00:22):
picture of connections.
Speaker 2 (00:23):
Yeah, and fast, exactly. And the human mind is great
at spotting patterns, it really is, but it just crumbles
under the sheer volume of data we're talking about now.
So the real challenge is speed and precision. How do
you find that one specific link, say, between an email
address and a piece of infrastructure, quickly enough for it
to actually matter?
Speaker 1 (00:43):
That is the perfect question. That's exactly what we're diving
into today. We're focusing on the absolute fundamentals that make
these systems work: the automated functions that people call transforms.
We need to get into what they are, how analysts
organize them, and maybe most importantly, the shortcuts they use
every day to keep things moving.
Speaker 2 (00:58):
A transform is, well, think of it like a specialized
little micro robot. It takes one piece of data, an
entity like a domain name, and it automatically runs a
query to find out what's connected to it. And the
real skill of an investigator isn't having a thousand of
these things, it's knowing which five to run at exactly
the right moment.
Speaker 1 (01:18):
Okay, let's unpack that engine then, starting with how you
actually run them. It seems obvious, but you have to
select something first, right? Yeah, you need to give it
an input.
Speaker 2 (01:25):
You do. The transform has to start somewhere, and this
is where the design is pretty clever. Whether you right-click
to get that context menu or you use a
dedicated panel in the application, the software does this crucial
filtering for you. It will only show you functions that
are relevant to the specific type of data you've selected.
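The filtering described here can be pictured with a minimal Python sketch. It assumes each transform declares a single input entity type; the registry, the transform names, and the `transforms_for` helper are hypothetical and only illustrate the idea, not how any particular tool implements it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transform:
    name: str
    input_type: str  # the entity type this transform accepts


# Hypothetical registry; the names are illustrative only.
REGISTRY = [
    Transform("domain_to_ip", "domain"),
    Transform("domain_to_mx_records", "domain"),
    Transform("ip_to_reverse_dns", "ip_address"),
    Transform("person_to_email", "person"),
]


def transforms_for(entity_type: str) -> list[Transform]:
    """Return only the transforms whose input type matches the selection."""
    return [t for t in REGISTRY if t.input_type == entity_type]


person_menu = [t.name for t in transforms_for("person")]
domain_menu = [t.name for t in transforms_for("domain")]
print(person_menu)
print(domain_menu)
```

With a person selected, the reverse DNS transform never appears in the menu, because its declared input type does not match.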
Speaker 1 (01:44):
That filtering is so important. It must cut down on
all the noise. Yeah. So if I have an entity
that's, say, a person, I'm not going to see options
for a...
Speaker 2 (01:52):
Reverse DNS lookup, for example. Yeah, exactly. But if
you select something more complex, like a domain entity, that
list of relevant functions can get really, really long, and
that can be overwhelming.
Speaker 1 (02:03):
So that's where the organization comes in.
Speaker 2 (02:05):
That's where it has to come in. To manage those
long lists, you group them into what we call transform sets.
That's your first layer of defense against clutter. And then
there's a second layer that comes in when you start
integrating external data sources, which are often called hub items.
Speaker 1 (02:20):
Oh okay, so if I install a third party threat
intelligence feed, that whole package is a hub item.
Speaker 2 (02:27):
That's it, and the menus will then sort first by
that hub item and then by the transform sets inside it.
It's a hierarchy. It lets you quickly find the tools
from your high quality, trusted sources versus, you know, the
more generic public ones.
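The two-level hierarchy (hub item first, then transform set) could be modeled roughly as below. This is a sketch under assumptions: the hub item names, set names, and transform names are invented for illustration, and alphabetical sorting is just one plausible ordering.

```python
from collections import defaultdict

# Hypothetical (hub_item, transform_set, transform_name) records.
INSTALLED = [
    ("Malware Intel", "File Hashing", "hash_to_sample"),
    ("Malware Intel", "Network Footprinting", "hash_to_c2_domain"),
    ("Public Sources", "DNS", "domain_to_ip"),
    ("Malware Intel", "File Hashing", "hash_to_family"),
]


def build_menu(records):
    """Group transforms by hub item first, then by transform set."""
    menu = defaultdict(lambda: defaultdict(list))
    for hub, tset, name in records:
        menu[hub][tset].append(name)
    # Convert to plain, sorted dicts so the menu order is predictable.
    return {
        hub: {tset: sorted(names) for tset, names in sorted(sets.items())}
        for hub, sets in sorted(menu.items())
    }


menu = build_menu(INSTALLED)
for hub, sets in menu.items():
    print(hub)
    for tset, names in sets.items():
        print("  ", tset, names)
```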
Speaker 1 (02:42):
That makes sense, so you can go straight to your
specialized malware package.
Speaker 2 (02:45):
For example, right. And in that malware hub item you
might have specific sets for, say, file hashing and network footprinting.
The analyst just knows instinctively where to go.
Speaker 1 (02:55):
When you're in those menus, navigating quickly is key. How
do you tell the difference between a folder and something
you can actually run?
Speaker 2 (03:02):
There are visual cues. A group or a set will
usually have a lighter background color, maybe a little plus
icon next to it that tells you to click to
drill down deeper. The actual runnable transform will look different,
usually a much darker, almost black background and a little
play icon on the right.
Speaker 1 (03:21):
So, you know, click here to execute.
Speaker 2 (03:24):
Click the name or click the icon and it runs. But this is
where we have to put up a big warning sign.
Sometimes you'll see an option to run all the transforms
in a set all at once. It might have a
double play icon. This bulk action is dangerous.
Speaker 1 (03:37):
Wait, really? I would think that's a huge time saver. Yeah,
but you're saying it's a risk. If I run that
on something with a lot of connections, like a big
company's email address, am I going to get rate limited
or just, I don't know, crash the graph?
Speaker 2 (03:50):
You will, precisely. It's the definition of high risk, high reward.
It can spit out hundreds, maybe thousands of new entities
in seconds. If you don't have really robust, high-limit
data sources set up, you'll hit those rate limits instantly.
And yes, you could overwhelm the system's ability to even
draw the graph. You only use that bulk function if
(04:11):
you are absolutely certain of the scope of what you're targeting.
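One way to picture a safer bulk run is a hard cap on how many new entities it may return. The sketch below is an assumption for illustration, not a feature the speakers describe: `run_set`, its `max_new_entities` parameter, and the two placeholder transforms are all hypothetical.

```python
def run_set(transforms, entity, max_new_entities=100):
    """Run each transform in turn, stopping once the result cap is reached."""
    results = []
    for transform in transforms:
        for found in transform(entity):
            if len(results) >= max_new_entities:
                return results, True  # True means the run was cut short
            results.append(found)
    return results, False


# Hypothetical transforms that fabricate placeholder results.
def many_mailboxes(entity):
    return [f"user{i}@{entity}" for i in range(500)]


def few_name_servers(entity):
    return [f"ns{i}.{entity}" for i in range(2)]


results, truncated = run_set(
    [few_name_servers, many_mailboxes], "example.com", max_new_entities=50
)
print(len(results), truncated)
```

The cap keeps a single click from flooding the graph; the flag tells the analyst that more results exist and the scope should be narrowed before retrying.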
Speaker 1 (04:14):
So a single wrong click could cost you hours in cleanup.
Speaker 2 (04:18):
Hours, easily. Strategic caution is everything there.
Speaker 1 (04:20):
Okay. So, assuming we're being careful, analysts can customize this
organization, right? Create their own sets?
Speaker 2 (04:25):
Absolutely, and they should. Customization is where the real
efficiency gains are. An analyst working on, say, network defense
could make a custom set called Initial Recon and just
drag their top five WHOIS lookups, IP checks, and so
on into one bucket. It turns three or four clicks into
just one.
Speaker 1 (04:43):
And there are a couple of system generated sets that
help out too, right? I think there's an All set.
Speaker 2 (04:48):
Yeah. The All set is a convenience thing. If you've
got a bunch of different groups, it just shows you
every single available transform for what you've selected, all in
one long list. It ignores the structure for a moment.
Speaker 1 (04:59):
And what about the Favorites set?
Speaker 2 (05:00):
Ah, Favorites is special. It's a hidden set that's
based on your own usage. If you star a transform
you use a lot, it goes in there. But here's
the catch that trips people up. The Favorites set only
shows up if you've starred transforms that are relevant to
the specific type of entity you have selected right now.
It's context aware.
Speaker 1 (05:18):
That's a subtle but important detail. And there's one more
shortcut right at the top, something called machines. It sounds intense.
Speaker 2 (05:27):
It is intense and powerful. You'll see it at the
very top of the menu, sometimes in red to make
it stand out. A machine is just a preconfigured workflow,
a sequence of multiple transforms that run in a specific order,
maybe with some logic built in. It's the highest level
of automation. One click can set off a thirty step investigation.
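A machine, in the sense described here, can be sketched as an ordered list of steps where each step's output feeds the next. The `run_machine` function and the two canned lookups are hypothetical stand-ins; real machines may also include filtering or branching logic that this sketch leaves out.

```python
def run_machine(steps, seed):
    """Feed each step's output entities into the next step, in order."""
    frontier = [seed]
    discovered = []
    for step in steps:
        next_frontier = []
        for entity in frontier:
            next_frontier.extend(step(entity))
        discovered.extend(next_frontier)
        frontier = next_frontier
    return discovered


# Hypothetical steps with canned answers standing in for real lookups.
def domain_to_ip(domain):
    return {"example.com": ["192.0.2.10"]}.get(domain, [])


def ip_to_netblock(ip):
    return {"192.0.2.10": ["192.0.2.0/24"]}.get(ip, [])


found = run_machine([domain_to_ip, ip_to_netblock], "example.com")
print(found)
```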
Speaker 1 (05:45):
Wow. Okay, so that's the heavy artillery. That brings us
perfectly from the transforms themselves to the other tools you
use to manage the workflow, the entity management shortcuts. If
transforms are the weapons, these are the tools that let
you move around the battlefield and clean things up.
Speaker 2 (06:02):
That's a great way to put it. If you look
at that right-click menu again, at the bottom there's
a little row of common functions. They're just shortcuts
to things you can find elsewhere, but they're the ones
you use constantly.
Speaker 1 (06:13):
Some are obvious, like cut, copy, delete, but even copy
has a trick up its sleeve. There's an option to
copy as GraphML. What's that about?
Speaker 2 (06:22):
Copying as GraphML, that's Graph Markup Language, is huge.
It means you're not just copying the little picture on
the screen. You're copying the entity, all of its data,
its properties, its links, the whole structure, to your clipboard.
You can then paste that into a completely different graph
or even another tool, and it maintains all that rich data.
(06:42):
It's about data portability.
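To show why GraphML carries more than a picture, here is a minimal sketch that serializes two entities, their properties, and one link using Python's standard `xml.etree.ElementTree`. The `to_graphml` helper and the property names are assumptions for illustration; the clipboard format a given tool produces may include extra tool-specific fields.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def to_graphml(nodes, edges):
    """Serialize nodes (id -> properties) and edges (source, target) to GraphML."""
    ET.register_namespace("", GRAPHML_NS)
    root = ET.Element(f"{{{GRAPHML_NS}}}graphml")
    # Declare one <key> per property name, which GraphML uses to type <data>.
    prop_names = sorted({name for props in nodes.values() for name in props})
    for name in prop_names:
        ET.SubElement(root, f"{{{GRAPHML_NS}}}key", {
            "id": name, "for": "node", "attr.name": name, "attr.type": "string",
        })
    graph = ET.SubElement(root, f"{{{GRAPHML_NS}}}graph", {"edgedefault": "directed"})
    for node_id, props in nodes.items():
        node = ET.SubElement(graph, f"{{{GRAPHML_NS}}}node", {"id": node_id})
        for name, value in props.items():
            data = ET.SubElement(node, f"{{{GRAPHML_NS}}}data", {"key": name})
            data.text = str(value)
    for source, target in edges:
        ET.SubElement(graph, f"{{{GRAPHML_NS}}}edge", {"source": source, "target": target})
    return ET.tostring(root, encoding="unicode")


# Hypothetical entities: a domain linked to an IP address.
nodes = {
    "n0": {"type": "domain", "value": "example.com"},
    "n1": {"type": "ip_address", "value": "192.0.2.10"},
}
xml_text = to_graphml(nodes, [("n0", "n1")])
print(xml_text)
```

Because the properties and the link travel inside the text, any tool that reads GraphML can rebuild the same structure on paste.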
Speaker 1 (06:44):
Okay, that's a pro move. What other quick actions are there?
Speaker 2 (06:46):
You have Type actions, which is basically a quick lookup.
It'll send the entity's value straight to Google or Wikipedia
for a quick check. And send to URL, which lets
you push the data to some external service you might have.
And one that's surprisingly useful: clear or refresh images. For
things like profile pictures? Exactly. Social media profile pictures, website favicons,
(07:10):
things that change. This tells the system to ignore its
local cache and go grab the latest version from the source.
Just remember that only works if you're not in a
stealth mode, because you're making a direct call out to
that server.
Speaker 1 (07:23):
Got it. Okay, now for the big ones, the real
productivity boosters. The first one you mentioned is copy to
new graph. This sounds essential.
Speaker 2 (07:31):
It is the single most important tool for keeping your
investigations sane. Imagine your main graph has thousands of entities.
It's complex. Now you want to experiment on just five
of them. If you run some aggressive, slow, or risky
transform right there on your main graph, you could clutter
everything or worse, trigger something that stalls your whole workflow.
Speaker 1 (07:50):
So the routine is you select your little cluster of interest,
you copy them to a new clean graph, a blank canvas,
and there you can run anything you want, no consequences,
no risk to the main investigation. That's a game changer.
Speaker 2 (08:06):
It is. It's an isolation chamber. You do your messy work there
and then the workflow loop closes. Once you find something
valuable in that side graph, you select it, copy it
and paste it back into your main investigation.
Speaker 1 (08:19):
And the system is smart about that. When you paste
it back in, it knows that the IP address you're
pasting is the same one you copied out.
Speaker 2 (08:25):
It does, and it triggers a merge function. It asks you, hey,
I see you already have this entity. Do you want
to merge them? But here's the crucial part. It also
asks you which entity's properties should win.
Speaker 1 (08:37):
Ah, So you have to decide if the new information
you just found is more important than the old information.
Speaker 2 (08:42):
You have to. You tell the system: yes, merge them,
and make the properties from this new entity the preferred ones.
That ensures your latest, most accurate data persists.
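The "which properties win" decision can be sketched as a dictionary merge with an explicit preference. The `merge_entities` function, its `prefer` parameter, and the sample properties are hypothetical; the point is only that conflicting values resolve in favor of the side the analyst chooses, while non-conflicting values from both sides survive.

```python
def merge_entities(existing, incoming, prefer="incoming"):
    """Merge two property dicts; on conflicts the preferred side wins."""
    if prefer not in ("existing", "incoming"):
        raise ValueError("prefer must be 'existing' or 'incoming'")
    loser, winner = (existing, incoming) if prefer == "incoming" else (incoming, existing)
    merged = dict(loser)
    merged.update(winner)  # the winner's values overwrite shared keys
    return merged


# Hypothetical properties for the same IP address entity.
existing = {"value": "192.0.2.10", "country": "unknown", "first_seen": "2023-01-01"}
incoming = {"value": "192.0.2.10", "country": "NL", "asn": "AS64500"}

merged = merge_entities(existing, incoming, prefer="incoming")
print(merged)
```

Choosing the wrong preference here would overwrite the fresh `country` value with the stale one, which is why the prompt matters.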
Speaker 1 (08:51):
Okay, next lifesaver: change type. You mentioned how important
the entity type is for filtering. What if the system
gets it wrong? Say something comes back as a DNS name,
but you, the expert, know it's really a website.
Speaker 2 (09:05):
It happens more than you'd think. And if it's
stuck as a DNS name, you might not have access
to crucial website-specific transforms like, uh, looking up external links.
Your investigation just hits a wall.
Speaker 1 (09:17):
So change type lets you manually reclassify it. But that
feels like it relies one hundred percent on the analyst's
own expertise. What if you get it wrong?
Speaker 2 (09:25):
That's the risk, but it's a necessary one. You should
only do it when you're sure, when your own external
research confirms the mismatch. When you use it correctly, though,
you instantly unlock a whole new set of tools for
that entity. It lets the human override the machine when
the machine gets it wrong.
Speaker 1 (09:41):
It's a powerful override. Then there's the manual merge function.
This is for cleanup.
Speaker 2 (09:47):
Pure cleanup. You have a website entity and a DNS name entity
on your graph, and you know they're the same machine.
You select them both, hit merge and they collapse into one.
It pulls all the links from both into the new
single entity. It's just essential for keeping your graph readable.
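The collapse described here, where two entities become one and the survivor inherits every link, could look roughly like this. The `merge_nodes` helper and the node labels are hypothetical; the sketch treats the graph as a plain list of (source, target) pairs.

```python
def merge_nodes(edges, keep, drop):
    """Rewire every edge touching `drop` onto `keep`, removing duplicates and self-loops."""
    merged = set()
    for source, target in edges:
        source = keep if source == drop else source
        target = keep if target == drop else target
        if source != target:
            merged.add((source, target))
    return sorted(merged)


# Hypothetical graph: a website entity and a DNS name entity for the same machine.
edges = [
    ("www.example.com [website]", "192.0.2.10"),
    ("www.example.com [dns]", "192.0.2.10"),
    ("www.example.com [dns]", "ns1.example.com"),
]

cleaned = merge_nodes(edges, keep="www.example.com [website]", drop="www.example.com [dns]")
print(cleaned)
```

The duplicate link to the IP address disappears and the name server link moves to the surviving entity, which is what keeps the graph readable.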
Speaker 1 (10:03):
And lastly, a simple one: attach.
Speaker 2 (10:05):
Just a quick shortcut. It lets you link files, notes, screenshots,
any kind of evidence directly to an entity. You can
even tell it to display an attached image right on
the graph instead of the normal icon, which is great
for visual evidence.
Speaker 1 (10:20):
So if we boil it down, the four big strategic
advantages here seem to be copy to new graph for
that risk free experimentation, change type for unlocking the right
tools when data is misclassified, merge for tidying up duplicates,
and attach for quick evidence logging.
Speaker 2 (10:37):
Master those four routines and you go from just being
someone who looks at data to someone who architects a workflow.
It's about being really precise with your data management so
you can be aggressive with your.
Speaker 1 (10:46):
Collection, which brings us to a final thought for you
to consider. Imagine you found a critical but sensitive piece
of data, maybe a financial transaction ID linked to an adversary.
You need to run a bunch of slow proprietary transforms
on it, but you absolutely cannot risk messing up your
primary investigation graph. If you use that copy paste routine
(11:07):
to test and then integrate the results, which two functions
are absolutely mandatory when you paste back? And why is
getting the property preference right so critical for the success
of that whole operation?
Speaker 2 (11:19):
Think about that whole cycle, the isolation, the experiment, and
then the seamless controlled return of that new data. How
do you bring it back home without making a mess
or losing what you already had. It's all about that
controlled integration.