Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
How do you take a powerful base tool, say for
programming or cybersecurity or even deep intelligence analysis, and turn
it from just a robust engine into an unstoppable force?
Speaker 2 (00:12):
It's a great question, and the answer isn't better base code. It's all about strategic data integration. Exactly. The software you start with, I mean, no matter how good it is, it's fundamentally limited by its own data. To really get an edge, you have to extend its reach. You need to pull in these specialized, dynamic data feeds from partners and the whole global community.
Speaker 1 (00:32):
And trying to manage all of that external knowledge needs some kind of central catalog, and that is our mission for this deep dive. We're going to explore this centralized platform where all these extensions live, what's called the Transform Hub. We're going to unpack what these integrations or hub items are, how you can evaluate their sometimes complex pricing,
(00:53):
unnecessarily complex sometimes, and then walk through the actual installation, and maybe most importantly, figure out how you actually use them once they're in your toolkit.
Speaker 2 (01:02):
This is really the ultimate shortcut to making your investigative
platform truly your own. We're going beyond just the how-to steps here and looking at the strategy behind picking
the right data for what you need.
Speaker 1 (01:14):
Okay, so let's unpack the anatomy of a hub item. When we say integration, it sounds like we're just getting a package of functions, but it's a lot more
than that.
Speaker 2 (01:24):
It's so much more. And that's because the integration is
designed to be, well, holistic. It knows that just having
a new function isn't enough. You need the context around it.
Speaker 1 (01:35):
So what else comes bundled in?
Speaker 2 (01:37):
When you install a hub item, you're not just adding transforms. You're getting new entities, which are the specific data types, right? Exactly, the relevant data types for that integration. Plus you get custom views, you get something called machines, which are basically pre-built automated workflows. And, critically, specialized icons.
Speaker 1 (01:53):
Okay, wait, why are icons so strategic? That just sounds
like a cosmetic detail.
Speaker 2 (01:58):
Oh, it's anything but, especially when you're in a really complex investigation. Imagine you're mapping out thousands of connections on a graph. Right, those specialized icons instantly tell you visually that this IP address didn't come from a standard lookup. It came from, say, the CipherTrace integration or from Shodan. It just cuts down on the ambiguity and speeds up your analysis.
Speaker 1 (02:20):
That makes a lot of sense. So let's talk about
finding the right tools in this catalog. If I've got
the Transform Hub open, how do I cut through what
could be hundreds of options?
Speaker 2 (02:30):
Efficiency starts with knowing your filters. The standard search box
is great, you know, for keywords. If you're looking into
financial fraud, you type bitcoin or crypto and it'll highlight
what you need. Sure. But the structured filters, that's your real strategic advantage, right?
Speaker 1 (02:46):
So you can filter by things like data categories: financial, threat intel, social media.
Speaker 2 (02:51):
You can filter by the pricing models, which is huge
and we'll get to that, or even by the teams
that might use it. You know, if you're an investigative journalist,
filtering for finance and open source intelligence is a super
fast way to narrow the field.
Speaker 1 (03:04):
And what about sorting? Don't forget sorting.
Speaker 2 (03:06):
Sorting by the update date is often my go-to, because you want the newest, freshest integrations on top. They've got the latest APIs, the latest functionality.
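For anyone who wants to see that filter-and-sort logic spelled out, here is a minimal Python sketch. The item names, category tags, pricing labels, and dates are illustrative placeholders, not the Hub's actual data or schema.

```python
from datetime import date

# Toy catalog of hub items; field names, tags, and dates are illustrative only.
hub_items = [
    {"name": "CipherTrace", "categories": {"financial", "threat intel"},
     "pricing": "BYOK", "updated": date(2023, 9, 1)},
    {"name": "Have I Been Pwned", "categories": {"open source intelligence"},
     "pricing": "BYOK", "updated": date(2023, 11, 15)},
    {"name": "ThreatMiner", "categories": {"threat intel"},
     "pricing": "Free", "updated": date(2023, 6, 20)},
]

def find_items(items, wanted_categories=None, pricing=None):
    """Filter by category tags and pricing model, then sort newest first."""
    results = []
    for item in items:
        if wanted_categories and not (item["categories"] & wanted_categories):
            continue
        if pricing and item["pricing"] != pricing:
            continue
        results.append(item)
    # Sorting by update date puts the freshest integrations on top.
    return sorted(results, key=lambda i: i["updated"], reverse=True)

# An investigative journalist narrowing the field:
for item in find_items(hub_items, wanted_categories={"financial", "open source intelligence"}):
    print(item["name"], item["pricing"], item["updated"])
```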
Speaker 1 (03:14):
Now below the name of each hub item you can
see who maintains it. Sometimes it's the big organization behind
the platform, but often it's a third party or community member.
Why is knowing the source so important?
Speaker 2 (03:27):
Because it tells you a lot about the tool's focus
and its agility. We see community developers out there creating
and maintaining some of the most popular integrations, like the ones for Have I Been Pwned or Host.io.
Speaker 1 (03:39):
And why do those thrive?
Speaker 2 (03:40):
Well, often they can just iterate faster. They can fill
these really niche investigative gaps that a major vendor might
just overlook.
Speaker 1 (03:47):
It shows you where the innovation is really happening. Okay,
so before we click that install button, we have to
talk about the cost. This is where an investigator has
to pause. Where do we find out what we're committing to?
Speaker 2 (03:57):
You have to look at the details page. Opening any hub item brings up a long description, its tags, pricing info, and, really important, the contact details for the data provider.
Speaker 1 (04:07):
And sometimes those details pages have visuals that tell the whole story. I'm thinking of something like the CipherTrace integration, where it shows you its risk scoring system for crypto transactions.
Speaker 2 (04:20):
Absolutely. One look and you know if it solves your problem.
But let's really dive into those five major pricing models.
They're shown as tags, but you have to remember they
often overlap and that can complicate things.
Speaker 1 (04:33):
Okay, let's start with the two most common ones.
Speaker 2 (04:35):
First up, Free. Simple enough, zero cost. Second, Bring Your Own Key, or BYOK. This means you have to go purchase an API key directly from the external data provider. The platform is just giving you the connector.
Speaker 1 (04:49):
And that BYOK model is everywhere. What's the strategic reason for that?
Why the extra step?
Speaker 2 (04:54):
It's usually about legal and liability management. The platform itself,
it doesn't want to be the billing intermediary. They don't
want to handle the specific legal agreements you need for
certain data.
Speaker 1 (05:04):
So they want the user to own that relationship directly?
Speaker 2 (05:06):
Precisely, they want you or your organization to have that
direct contractual relationship with the data source.
Speaker 1 (05:13):
That makes sense. Okay, what's the next tag?
Speaker 2 (05:15):
Trial. This lets you have temporary usage, but it's usually heavily rate limited. You might get, say, ten transforms an hour,
just enough to prove its value before you have to pay.
Speaker 1 (05:26):
And then there's Data Bundle.
Speaker 2 (05:27):
Right. If you see the Data Bundle tag, access is already included as part of your main subscription plan with the platform. This is zero friction, no external keys.
Speaker 1 (05:36):
Needed. But there's always a but.
Speaker 2 (05:38):
You often need to reach out to a sales rep just to confirm that it's active on your specific contract.
Speaker 1 (05:43):
Got it. And finally, the one that can cause the most confusion: Paid Connector.
Speaker 2 (05:49):
This is where you have two layers of cost. You
need to get a key from the external data provider,
and you have to pay a separate fee sometimes to
the platform provider just to use the integration code.
Speaker 1 (06:00):
Hold on, if I'm already paying the data provider for
the key and the platform is just hosting the code,
why the second fee? Isn't that double dipping? What value
am I getting for that?
Speaker 2 (06:11):
That's the critical question to ask.
What you're paying the platform provider for is their engineering effort.
You're paying for maintenance, for support and all the API
upkeep to make sure that connection never breaks.
Speaker 1 (06:22):
Okay, so the data.
Speaker 2 (06:23):
Provider maintains the data, the platform provider maintains the seamless connection.
It's really an insurance fee for integration stability.
Speaker 1 (06:32):
Understood. It's for reliability, not the data itself. A great example of this overlap is something like Farsight DNSDB, which is tagged with both BYOK and Trial.
Speaker 2 (06:43):
Exactly. You get that limited free trial, but for full access eventually you have to bring your own key, which you buy from Farsight. Knowing these tags upfront saves you from wasting time on an item you don't have the budget for.
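As a quick reference, here is a small sketch that summarizes those five pricing tags in terms of what you have to do before the data flows. It's a summary of the discussion above, with assumed flag names; it isn't anything the platform ships.

```python
# Summary of the five pricing tags discussed above: whether you need an
# external API key, whether there is a separate platform-side fee, and a note.
# The flag names are assumptions for this sketch, not platform terminology.
PRICING_MODELS = {
    "Free":           {"external_key": False, "platform_fee": False, "note": "zero cost, install and go"},
    "BYOK":           {"external_key": True,  "platform_fee": False, "note": "buy the key from the data provider"},
    "Trial":          {"external_key": False, "platform_fee": False, "note": "temporary and heavily rate limited"},
    "Data Bundle":    {"external_key": False, "platform_fee": False, "note": "included in your subscription; confirm with sales"},
    "Paid Connector": {"external_key": True,  "platform_fee": True,  "note": "provider key plus a fee for the connector itself"},
}

def what_do_i_need(tags):
    """Given a hub item's pricing tags, list the actions required for full access."""
    steps = set()
    for tag in tags:
        model = PRICING_MODELS[tag]
        if model["external_key"]:
            steps.add("purchase an API key from the external data provider")
        if model["platform_fee"]:
            steps.add("pay the platform provider for the connector itself")
    return sorted(steps) or ["install and start running transforms"]

# The Farsight DNSDB example, tagged both BYOK and Trial:
print(what_do_i_need({"BYOK", "Trial"}))
```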
Speaker 1 (06:53):
So let's walk through installing these, starting with the simplest case, a completely free item like ThreatMiner.
Speaker 2 (06:59):
For free or data bundle items, it's totally frictionless. You just hover, click install, and confirm. The platform grabs the resources, and you get a little summary saying what was added, maybe forty transforms, twenty new entities.
Speaker 1 (07:11):
Super simple. Okay, next scenario: an item that requires a key right away, like CipherTrace.
Speaker 2 (07:16):
If the tag is strictly BYOK, the platform will ask for the key immediately after you confirm, so before the installation actually starts. And that prompt is helpful because it
also includes the contact info you need to go get
that key.
Speaker 1 (07:29):
Now, for the trial scenario, like that Farsight one,
where we don't need a key immediately, but we know
we're going to hit.
Speaker 2 (07:35):
A limit. The installation finishes instantly, no key prompt. You start running transforms, say pivoting from a domain to look for historic DNS entries. The critical information here isn't on your graph, it's down in the output window.
Speaker 1 (07:49):
So I'm on a trial. How do I know when
I'm about to hit that limit?
Speaker 2 (07:52):
You have to look for the message that explains the quota.
You'll see a line confirming your usage, something like free transforms, one of twelve runs over the last hour. That message is your quota tracker. And when you run out, that's when the system shuts you down. You'll see a warning message saying you can't run any more, and it'll usually
have a clickable link to go inquire about buying the
(08:12):
full key. Now here's a crucial tip. If you buy
that key later, you do not need to reinstall anything.
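To make that quota message concrete, here is a minimal sliding-window sketch of how a twelve-runs-per-hour limit behaves. It's purely illustrative of the trial behavior described above, not how the platform actually enforces it.

```python
import time
from collections import deque

class HourlyQuota:
    """Minimal sliding-window quota: at most `limit` runs in any rolling hour.
    Purely illustrative of the trial behavior described above."""

    def __init__(self, limit=12, window_seconds=3600):
        self.limit = limit
        self.window = window_seconds
        self.runs = deque()  # timestamps of recent transform runs

    def try_run(self):
        now = time.time()
        # Drop runs that have fallen out of the rolling one-hour window.
        while self.runs and now - self.runs[0] > self.window:
            self.runs.popleft()
        if len(self.runs) >= self.limit:
            return False  # quota exhausted: time to inquire about a full key
        self.runs.append(now)
        return True

quota = HourlyQuota(limit=12)
for i in range(14):
    ok = quota.try_run()
    print(f"run {i + 1}: {'ok' if ok else 'blocked, quota used up for this hour'}")
```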
Speaker 1 (08:19):
So how do you input the new key without starting over?
Speaker 2 (08:23):
You just go back to the Transform Hub, find that item's details page, and click the settings button in the bottom left corner. That opens a window where you can just plug in the API key and boom, you're immediately
upgraded to full access. It avoids a whole reinitialization process.
Speaker 1 (08:37):
That is extremely helpful. Okay, so we've installed the new integration,
but then the classic analyst dilemma hits: I installed it, but what entity do I even start with? It's not always obvious.
Speaker 2 (08:49):
That's the roadblock, right, and it's why understanding the input-output logic is the real core skill here. Luckily, you
have two powerful resources to guide you.
Speaker 1 (08:58):
The first one being external documentation.
Speaker 2 (09:00):
Yes, always check the platform's website under their data sources section,
and then check the data integrator's own website. They'll have
white papers, blog posts, and the good ones will have
a direct link on their details page that shows you
exactly how to get started.
Speaker 1 (09:14):
And resource number two is inside the application itself, the
Transform Manager.
Speaker 2 (09:20):
The Transform Manager is the technical blueprint for everything on
your system. You can find it in the Transforms tab. Now, the key here is to use the Transform Servers tab.
Speaker 1 (09:29):
Why that tab specifically?
Speaker 2 (09:30):
Because when you expand the list under each server, it
shows you the transform name, its description, and most importantly,
the exact input entity type it requires. This is how
you reverse engineer your starting point.
Speaker 1 (09:43):
Okay, let's use a real world example. Say we're analyzing
a threat actor and all we have is a URL they posted. We want to pivot that URL into something very specific, like a Shodan Host Detail entity, to check for open ports.
Speaker 2 (09:55):
Right, and that Shodan entity isn't a standard starting point, so we work backwards. We open the Transform Manager and search for transforms related to Shodan Host Detail. We
find that the transforms that create that entity need an
IP address as input.
Speaker 1 (10:09):
Okay, so the URL is useless directly. We first need to get an IP address.
Speaker 2 (10:14):
Exactly. So now we search the Transform Manager again for any transform that outputs an IP address, and we find one, maybe called URL to IP or DNS lookup, which takes
a domain entity as its input.
Speaker 1 (10:24):
So we found the chain: URL to domain, domain to IP address, and IP address to Shodan Host Detail. So how do we get from the URL to the domain?
Speaker 2 (10:33):
We check the Transform Manager one more time. We search
for a transform that converts a generic URL entity into
a domain entity. Once we find that, we've identified the
complete path.
Speaker 1 (10:42):
So the practical takeaway for you listening is start with
a URL entity, run the transform to extract the domain
from it, then run the DNS lookup to get the
IP, and then you can run those powerful, specialized Shodan transforms. The Transform Manager revealed the whole chain.
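That reverse-engineering step can be captured as a tiny search over input and output entity types. The transform names and entity labels below are illustrative stand-ins for what the Transform Servers tab lists, not the integration's real identifiers.

```python
from collections import deque

# Each entry mirrors what the Transform Servers tab shows: a transform name,
# the entity type it takes as input, and the entity type it produces.
TRANSFORMS = [
    ("URL to Domain",             "URL",        "Domain"),
    ("DNS Lookup",                "Domain",     "IP Address"),
    ("IP to Shodan Host Detail",  "IP Address", "Shodan Host Detail"),
    ("Domain to Email Addresses", "Domain",     "Email Address"),
]

def plan_chain(start_entity, target_entity):
    """Breadth-first search over input/output entity types to find a transform chain."""
    queue = deque([(start_entity, [])])
    seen = {start_entity}
    while queue:
        entity, path = queue.popleft()
        if entity == target_entity:
            return path
        for name, input_type, output_type in TRANSFORMS:
            if input_type == entity and output_type not in seen:
                seen.add(output_type)
                queue.append((output_type, path + [name]))
    return None

print(plan_chain("URL", "Shodan Host Detail"))
# ['URL to Domain', 'DNS Lookup', 'IP to Shodan Host Detail']
```

The search just mirrors the manual process: keep pivoting to whatever entity types you can reach until you hit the one the specialized transform needs.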
Speaker 2 (10:58):
That's the whole insight. Understanding that input-output mechanism isn't just about finding one piece of data. It's the foundational knowledge for building your own automated workflows, what the platform calls machines. You're creating a self-driving investigative process.
Speaker 1 (11:12):
And that brings us to our wrap-up. This deep dive has shown that unlocking the full potential of your platform really depends on mastering these data integrations in the Transform Hub. We covered three key resources.
Speaker 2 (11:23):
First, use the external documentation, blogs and white papers, to understand the use case. Second, you have to scrutinize that hub item details page for the pricing tags and the description
to know what you're getting into.
Speaker 1 (11:35):
And third, the internal technical blueprint, the Transform Manager, specifically
that Transform Servers tab, which lets you reverse engineer the
required inputs and plan your entire workflow.
Speaker 2 (11:47):
So before we sign off, here's a quick review question
for you. If you install a hub item that's tagged with both BYOK and Trial, what two actions do you have to take to access the data, and how might
your usage be restricted?
Speaker 1 (11:59):
A final thing to think about. We focused a lot
today on the starting points, those initial input entities.
Speaker 2 (12:04):
We did, but now that you understand the input-output logic, the fact that entity A has to become entity B before it can feed into transform C, the next step is to use that knowledge to think about how you can construct your own complex, multi-stage automated workflows. Understanding those
entity dependencies, well, that's the real secret to scaling your investigations.
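To close the loop on that idea, here is a conceptual Python sketch of a multi-stage workflow built on the same input-output logic, with each stage consuming one entity type and emitting the next. It is not the platform's own machine format; the helper functions are hypothetical stand-ins for real transforms, and the DNS lookup needs network access.

```python
import socket

# Conceptual sketch of a multi-stage workflow ("machine") built on the same
# input/output logic. These functions are hypothetical stand-ins for transforms;
# this is illustrative Python, not the platform's own machine definition format.

def extract_domain(url):
    # Stand-in for a "URL to Domain" style transform.
    return url.split("//")[-1].split("/")[0]

def resolve_ip(domain):
    # Stand-in for a DNS lookup transform (requires network access to run).
    return socket.gethostbyname(domain)

def host_detail(ip):
    # Stand-in for a specialized host-detail transform; in practice this is
    # where the integration's API key would come into play.
    return {"ip": ip, "open_ports": "unknown without the real integration"}

PIPELINE = [extract_domain, resolve_ip, host_detail]

def run_machine(entity, stages=PIPELINE):
    """Feed the output of each stage into the next, like a simple machine."""
    for stage in stages:
        entity = stage(entity)
    return entity

print(run_machine("https://example.com/some/path"))
```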