
December 3, 2025 16 mins
In this lesson, you’ll learn about:
  • The core entities used in Maltego infrastructure investigations
  • How transforms connect Domains, DNS names, IPs, Netblocks, and ASNs
  • The methodology of Level 1, L2, L3, and XL infrastructure footprinting
  • Key transforms for pivoting forwards and backwards in infrastructure graphs
  • The difference between live DNS, passive DNS, and specialized DNS transforms
Summary of the Episode:
This episode provides a structured introduction to infrastructure investigations in Maltego, covering the foundational entities, essential transforms, and the systematic methods used for infrastructure footprinting. It explains how domains, DNS names, IP addresses, Netblocks, and Autonomous Systems interrelate, and how transforms allow analysts to map and attribute online infrastructure.

1. Foundational Entities & Core Concepts
Infrastructure investigations rely on a small set of critical entities:

Key Entities
  • Domain
    • Public-facing resource
    • Common starting point for discovering related DNS names
  • DNS Name (and variants like Website, NS, MX)
    • Represents a system that can resolve to an IP address
    • Often a gateway to other infrastructure
  • IPv4 Address
    • A central pivot point in investigations
    • Even on shared hosting, IPs remain strong identifiers
  • Netblock
    • A range of IP addresses
    • Useful for clustering infrastructure and linking disparate nodes
  • Autonomous System (AS / ASN)
    • Represents routing ownership over Netblocks
    • Useful for identifying ISPs or large organizations
Other Useful Entities
  • Email Address — often the strongest pivot in broader investigations
  • Port & Service — show server capabilities (SSH, RDP, HTTP, etc.)
  • Tracking Code — connects different websites to the same operator
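The tracking-code pivot can be illustrated with a minimal Python sketch. It assumes the page HTML has already been fetched; the regular expression, the `shared_tracking_codes` helper, and the sample snippets are hypothetical illustrations, not part of Maltego.

```python
import re
from collections import defaultdict

# Matches classic Universal Analytics IDs (UA-1234567-1) and GA4-style IDs (G-ABCDEF1234).
# This pattern is an assumption for illustration; real pages may embed other ID formats.
TRACKING_ID = re.compile(r"\b(UA-\d{4,10}-\d{1,4}|G-[A-Z0-9]{6,12})\b")

def shared_tracking_codes(pages):
    """Map each tracking ID to the sites containing it, keeping IDs seen on 2+ sites."""
    sites_by_id = defaultdict(set)
    for site, html in pages.items():
        for tracking_id in TRACKING_ID.findall(html):
            sites_by_id[tracking_id].add(site)
    return {tid: sites for tid, sites in sites_by_id.items() if len(sites) > 1}

# Hypothetical, hand-written HTML snippets (no network access involved).
pages = {
    "shop.example": "<script>ga('create', 'UA-1234567-1', 'auto');</script>",
    "blog.example": "<script>gtag('config', 'UA-1234567-1');</script>",
    "other.example": "<script>gtag('config', 'G-ABCDEF1234');</script>",
}
print(shared_tracking_codes(pages))
```

Only the ID shared by two sites is reported; an ID seen on a single site says nothing about common operators.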
2. Core Infrastructure Transforms
The episode divides standard Maltego infrastructure transforms into functional groups.

1. Domain → DNS Name
Methods used:
  • To Website (Quick Lookup) — checks common “www” A/AAAA records
  • To Website Using Domain (Bing) — broader search engine discovery
  • Passive DNS (Robtex) — historic DNS relationships
  • SPF Transform — extracts DNS names and IPs from email policies
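To make the SPF idea concrete, here is a rough Python sketch of what extracting names and IP ranges from an SPF record involves. It assumes the TXT record text is already in hand; `parse_spf` is a hypothetical helper that handles only the common mechanisms and ignores CIDR suffixes on `a` and `mx` and SPF macros.

```python
def parse_spf(record):
    """Split an SPF TXT record into the DNS names and IP ranges it authorizes."""
    names, networks = [], []
    for term in record.split()[1:]:          # skip the leading "v=spf1"
        term = term.lstrip("+-~?")           # drop the optional qualifier
        mechanism, _, value = term.partition(":")
        if mechanism in ("include", "a", "mx", "exists") and value:
            names.append(value)              # another DNS name to pivot on
        elif mechanism in ("ip4", "ip6") and value:
            networks.append(value)           # an authorized address or range
        elif mechanism.startswith("redirect="):
            names.append(mechanism.split("=", 1)[1])
    return names, networks

# Hypothetical record using documentation-only names and address ranges.
record = "v=spf1 ip4:192.0.2.0/24 include:_spf.mailvendor.example mx -all"
print(parse_spf(record))
```

Each `include` name usually points at a third-party sender, which is why SPF tends to expose vendor relationships.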
2. DNS Name → IP Address
  • To IP Address
    • Resolves any DNS name to its current IP
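Outside Maltego, the same forward lookup can be approximated with Python's standard library. This is a sketch rather than the transform's implementation; `to_ip_addresses` and its injectable `lookup` parameter are assumptions made so the function can be exercised without network access.

```python
import socket

def to_ip_addresses(dns_name, lookup=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses a DNS name currently resolves to."""
    try:
        results = lookup(dns_name, None)
    except socket.gaierror:
        return []                      # the name no longer resolves: a dead end
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in results})

# Live usage performs a real DNS query, so it is left commented out here:
# print(to_ip_addresses("www.example.com"))
```

An empty list marks a historic name that no longer resolves, which is itself useful information during footprinting.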
3. IP Address → Netblock / ASN
Transforms use:
  • Historic Passive DNS
  • Global routing data
  • WHOIS sources (ARIN, RIPE, APNIC, etc.)
Important transforms:
  • Using Natural Boundaries — creates typical /24 IP ranges
  • To AS Number — gets ASN from the Robtex database
  • To Company Owner — retrieves organization ownership & location
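The "natural boundaries" guess is simple arithmetic: mask each address down to its enclosing /24. A small Python sketch, using the standard `ipaddress` module and documentation-only addresses (the helper names are hypothetical):

```python
import ipaddress
from collections import defaultdict

def natural_boundary(ip, prefix=24):
    """Return the enclosing network for an IP, assuming a /24 'natural boundary'."""
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)

def cluster_by_netblock(ips, prefix=24):
    """Group IP addresses by the block they fall into."""
    blocks = defaultdict(list)
    for ip in ips:
        blocks[str(natural_boundary(ip, prefix))].append(ip)
    return dict(blocks)

ips = ["198.51.100.7", "198.51.100.200", "203.0.113.5"]
for block, members in cluster_by_netblock(ips).items():
    print(block, members)
# 198.51.100.0/24 ['198.51.100.7', '198.51.100.200']
# 203.0.113.0/24 ['203.0.113.5']
```

Blocks holding several addresses are candidates for the organization's own space; the /24 is only a guess until ownership is checked against the ASN.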
3. Footprinting Methodology
Infrastructure footprinting is a repeatable process across industries.

Level 1 Footprinting (L1)
Example shown using CIA.gov

Steps:
  1. Find all DNS names / Websites for the domain
  2. Resolve all DNS names → IP addresses
  3. Cluster IPs → Netblocks (often with natural boundaries)
  4. Run To AS Number on the Netblocks
  5. Extract ownership using To Company Owner
This reveals which Netblocks actually belong to the organization and allows deeper exploration (e.g., Wikipedia edits from those IPs).

Higher-Level Footprinting
L2 & L3 Machines
  • Add more depth
  • Use Reverse DNS (PTR lookups)
  • Provide prompts to filter MX/NS results
  • Reveal additional infrastructure through recursive pivots
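Reverse DNS over a whole netblock means one PTR query per address. The sketch below only builds the query names, so it runs offline; `ptr_names` is a hypothetical helper, and a live lookup could use `socket.gethostbyaddr` on each address instead.

```python
import ipaddress

def ptr_names(netblock):
    """Yield (ip, PTR query name) for every address in a netblock."""
    for ip in ipaddress.ip_network(netblock):
        yield str(ip), ip.reverse_pointer

pairs = list(ptr_names("192.0.2.0/24"))
print(len(pairs))   # 256
print(pairs[1])     # ('192.0.2.1', '1.2.0.192.in-addr.arpa')
```

Sweeping all 256 addresses is what lets the deeper machines find names that forward lookups never surfaced.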
XL Footprint
  • Uses a completely different strategy
  • Heavy focus on reverse DNS on name servers and SPF-derived IPs
  • Requires significant system resources
  • Most thorough automated footprint
4. Pivoting Techniques
Pivoting is how analysts move through an investigation graph.

Forward Pivot
Domain → DNS Name → IP Address → Netblock → ASN

Backward Pivot
IP Address → Historic DNS Names → Domains → Tracking Codes
Used to uncover:
  • Hidden assets
  • Legacy systems
  • Connected infrastructures
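Forward and backward pivots amount to walking the same edges in opposite directions. A toy Python sketch, assuming a hand-made list of (DNS name, IP) observations rather than real Maltego graph data:

```python
from collections import defaultdict

# Hypothetical observations: (DNS name, IP address) pairs from live or passive DNS.
observations = [
    ("www.example.com", "192.0.2.10"),
    ("mail.example.com", "192.0.2.25"),
    ("legacy.example.net", "192.0.2.10"),
]

forward = defaultdict(set)    # DNS name -> IPs (forward pivot)
backward = defaultdict(set)   # IP -> DNS names (backward pivot)
for name, ip in observations:
    forward[name].add(ip)
    backward[ip].add(name)

def co_hosted(name):
    """Pivot forward to the name's IPs, then backward to every other name on them."""
    related = set()
    for ip in forward[name]:
        related |= backward[ip]
    related.discard(name)
    return related

print(co_hosted("www.example.com"))  # {'legacy.example.net'}
```

The shared IP is what links the two otherwise unrelated names, which is the effect the backward pivot is meant to produce.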
5. DNS Transform Distinctions
Two commonly confused transforms:

To Website Mentioning Domain
  • Broad search for any website that references the domain
  • Good for OSINT, not for footprinting
To Website Using Domain
  • Returns websites that end with your d

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
Welcome back to the deep dive. Today, we are putting
on our digital detective hats. We're diving right into the
foundational layer of digital investigation, which is, how do you
systematically map the entire public facing infrastructure of a huge
organization, starting with just one name?

Speaker 2 (00:18):
That really is the core mission, isn't it? We're essentially
trying to give you a shortcut, a way to understand
the complex geography of the Internet, a kind of map.
Exactly. If you work in cybersecurity, network architecture, or even
corporate intelligence, you absolutely have to understand this idea of
an infrastructure footprint. Okay, so today we're going to unpack

(00:38):
the essential building blocks, the data entities, and the practical
tools, or what we call transforms, that you use to
create that complete map.

Speaker 1 (00:47):
And the goal here is, I assume, efficiency and just
being thorough. We start with one tiny thing like a
company's domain name, right, and we need a reliable system
to get from that to a whole picture of their owner,
their network ranges, every system they run. So before we
connect the dots, we have to define the dots themselves.
What are the, say, five fundamental pieces of digital information,

(01:10):
the core entities that hold this whole map together?

Speaker 2 (01:12):
Okay, so we have to start with naming conventions. The
first and most critical pair are the domain and the
DNS name.

Speaker 1 (01:19):
Okay. This distinction is crucial because I think in everyday
conversation we just use them interchangeably. So help us draw
a clear line here.

Speaker 2 (01:26):
Think of it this way. The domain is the conceptual thing.
It's the name your organization markets and registers, like examplecorp.com.
The brand. The brand, exactly. The DNS name
is the name of an actual functional system, a specific
endpoint on the network that you can assign an IP
address to.

Speaker 1 (01:45):
Ah, so the domain can be a DNS name, but
not all DNS names are domains.

Speaker 2 (01:50):
You've got it. You might have thousands of other DNS
names, like mail.examplecorp.com or ftp.examplecorp.com.

Speaker 1 (01:57):
So all the specific systems running behind that. That makes sense.
And you mentioned there are three specific types of DNS
names we really need to track.

Speaker 2 (02:04):
Yes, the standard website of course, but critically the NS records,
which are the name servers and the MX records for
mail exchange. Those records really define how the domain works
and where its traffic goes.

Speaker 1 (02:17):
And once we move from those human readable names to
actual network traffic, we hit the big one.

Speaker 2 (02:22):
We arrive at the IPv4 address. This is truly
the pivotal node in any investigation.

Speaker 1 (02:28):
Why is the IP address such an anchor point?

Speaker 2 (02:30):
Because it's the common ground. It is the central
piece of evidence in a digital investigation graph. You might
find multiple DNS names, different websites, different services that all
resolve to the very same IP address.

Speaker 1 (02:44):
Even in shared environments like Cloudflare?

Speaker 2 (02:47):
Even then, that IP address is still owned and managed
by a single entity, and that's recorded somewhere. So the
investigator's core job is finding that owner.

Speaker 1 (02:56):
So an IP can tell you if two people work
for the same company.

Speaker 2 (03:00):
Or if they just use the same ISP, or even just share the same VPN exit node.
That one single point can just explode your investigation in
all these different directions.

Speaker 1 (03:08):
Okay, so if the IP address is like the individual street address,
how do we efficiently group those together to see the
whole neighborhood?

Speaker 2 (03:14):
That's where we introduce the netblock. This represents a
continuous range of IP addresses, usually owned or leased by
a single large organization.

Speaker 1 (03:23):
And its value is really visual, isn't it?

Speaker 2 (03:25):
Oh, it's enormous. You might find fifty IP addresses that
seem totally unrelated. One's a website, one's a mail server,
another's some test environment.

Speaker 1 (03:33):
And they're scattered all over the place.

Speaker 2 (03:35):
Scattered everywhere. The netblock just snaps them all together. It
brings clarity by showing they all belong to one range,
one network owner.

Speaker 1 (03:42):
And finally that brings us to the entity that defines
how all this traffic flows across the globe.

Speaker 2 (03:47):
The autonomous system, or AS, entity. It's identified by
its unique autonomous system number, or ASN.

Speaker 1 (03:55):
This sounds pretty technical.

Speaker 2 (03:56):
It is, but you don't need to be a routing engineer.
It's all about how massive organizations, corporations, ISPs, configure
how traffic gets routed across the Internet backbone. The key
insight is this: looking up the ASN for an IP
address is often the most reliable way to get ownership info.

Speaker 1 (04:13):
More reliable than, say, a WHOIS registry entry?

Speaker 2 (04:17):
Much more because for the Internet to work, these routing
protocols require clean, definitive organizational boundaries. If your routing is messy,
your network just breaks.

Speaker 1 (04:29):
Ah, so the data has to be accurate for things to function.

Speaker 2 (04:32):
Exactly. That operational need for precision means the ownership data
tied to an ASN is usually far more accurate than
what you might find in some regional IP registry database,
which can be out of date or even have spelling errors.

Speaker 1 (04:45):
Excellent. So we have our five core building blocks. Let's
quickly touch on the secondary entities, the sort of digital
breadcrumbs that can lead to a breakthrough, starting with the
email address.

Speaker 2 (04:55):
The email address is still fundamental. It's often tied to
domain registration, sure, but it's also a key starting point
for any person of interest investigation. It links all this
corporate infrastructure back to an actual human.

Speaker 1 (05:07):
Even with throwaway accounts being so common?

Speaker 2 (05:09):
Even then, it's a connection.

Speaker 1 (05:11):
Then we have the more technical details, port and service.

Speaker 2 (05:13):
A port, you know, TCP or UDP, is just a number.
It tells you what a server can do. Is it
set up for remote desktop, SSH, a website?

Speaker 1 (05:22):
And the service?

Speaker 2 (05:22):
The service is more specific. It includes the port and
the header information that comes back. These are usually the
result of an investigation, not where you start.

Speaker 1 (05:30):
And why do we care so much about that specific
header info which you call the banner?

Speaker 2 (05:35):
Because the banner gives you critical details about their technology.

Speaker 1 (05:38):
Stack.

Speaker 2 (05:38):
It's the HTTP or FTP header that can tell you
the server type, content type, and crucially for cybersecurity, the
exact version number of the software: Apache 2.4.1,
for instance.

Speaker 1 (05:52):
And knowing the version lets you immediately check for known
vulnerabilities. Precisely. Okay, speaking of breadcrumbs, tracking codes seem
surprisingly important for linking things that don't seem connected.

Speaker 2 (06:02):
They are fantastic clues. I mean, imagine two websites. They
look completely different, different domains, hosted in different countries even, right?
But if they both share the exact same Google Analytics ID,
that is a very powerful indicator that they are managed
by the same organization. It connects footprints that a simple
DNS lookup would never find.

Speaker 1 (06:21):
Okay, we've defined our entities, our nouns. Let's talk about
the verbs that connect them, the transforms. If we start
with just a domain, how do we efficiently find all
the associated DNS names?

Speaker 2 (06:34):
We use two speeds, basically: quick lookups and thorough lookups.
The quick method is fast, but it's incomplete. It usually
just checks for the common www prefix.

Speaker 1 (06:44):
So for a real footprint you need the thorough approach.
You have to. And how do you get that level
of thoroughness?

Speaker 2 (06:49):
We leverage search engine APIs. You can run a transform
that uses a search engine to find every single website
that it knows about that ends with your target domain name.
It casts a huge net.

Speaker 1 (06:59):
Pulling out old microsites, test pages.

Speaker 2 (07:01):
All the stuff the organization probably forgot about.

Speaker 1 (07:04):
And then we get to the historic data, which you
mentioned is critical for investigators: passive DNS.

Speaker 2 (07:09):
Passive DNS is non-negotiable for finding infrastructure. It's not
a live query. It's a massive archive, a database like
Robtex, that records historical DNS lookups.

Speaker 1 (07:20):
So it's like a history book of the Internet.

Speaker 2 (07:22):
A very detailed one. When a transform queries this archive,
it can return DNS names that were associated with the
domain years ago. This is invaluable for finding forgotten or
decommissioned systems.

Speaker 1 (07:34):
And we can even filter it.

Speaker 2 (07:36):
Oh yeah, you can filter for interesting variants like names
that contain VPN or webmail.

Speaker 1 (07:41):
That feels like finding an old, dusty blueprint for a
building that was remodeled five years ago. What about technical
lookups that can reveal, say, vendor relationships?

Speaker 2 (07:51):
For that, we need two specific transforms. First, the Start
of Authority, or SOA, lookup. That gives you the primary
name server and usually an email address for whoever manages
the zone. And second, the Sender Policy Framework, or
SPF, lookup. This is huge. The SPF record lists all
the DNS names and IPs that are authorized to send
email for that organization.

Speaker 1 (08:12):
Which instantly reveals third party services they use.

Speaker 2 (08:15):
Instantly. You'll see if they use SendGrid for marketing or
Google Workspace for their internal mail. It broadens the
footprint immensely.

Speaker 1 (08:22):
So now we have this giant list of DNS names,
both current and historic. The next step feels pretty straightforward.

Speaker 2 (08:29):
It is. The transform is simply To IP Address. By definition,
every valid DNS name, website, MX record, NS record, can
be resolved to at least one current IP address. This
is the bridge from names to the raw network.

Speaker 1 (08:45):
And now we have a big scatterplot of IP addresses.
It's time to pivot and cluster them into those crucial
netblocks. How do we make that jump?

Speaker 2 (08:53):
Right. The whole investigation relies on the principle that big organizations
own big contiguous blocks of addresses. We have four main
ways to generate those netblocks. One method uses historic
DNS data. If an IP was used as a name
server in the past, that suggests it's part of a
stable, important block. Another uses routing information, how the IP

(09:13):
is advertised. That can be quick, but sometimes the blocks
are too big or too small.

Speaker 1 (09:17):
What's the most definitive method?

Speaker 2 (09:18):
Checking the WHOIS registries directly: ARIN in North America, RIPE
in Europe. If a corporation like Nike registered its IP
allocation correctly, the registry will give you a big, undeniable range.

Speaker 1 (09:30):
And the fourth method, the natural boundaries guess, sounds like
a practical starting point.

Speaker 2 (09:33):
It's highly effective. It's just a reasonable guess based on
how address space is usually allocated. The default is to
group by 256 addresses, where only
the last number changes: a /24 range. And for
most companies, that's a pretty strong starting point. A very
strong starting point. It clusters related systems really neatly, and
you can always validate the ownership later with the ASN.

Speaker 1 (09:54):
And once we have that validated netblock, we pivot straight
to the owner.

Speaker 2 (09:58):
Yep. We run To AS Number on the netblock and
then To Company Owner on that ASN. This is the
moment of confirmation.

Speaker 1 (10:05):
And that data is solid because, as you said, the
Internet's routing depends on it.

Speaker 2 (10:09):
It's one of the strongest proofs of ownership you can get.

Speaker 1 (10:12):
This brings us to a key distinction: understanding the difference
between current live data and historic data, the contrast between
reverse DNS and passive DNS.

Speaker 2 (10:21):
Right. So, forward DNS is name to IP. That's standard.
We use passive DNS for historic IP to name lookups.
It's based on observation, so we usually visualize those links
with dotted lines.

Speaker 1 (10:32):
As a reminder that the data might be old.

Speaker 2 (10:33):
Exactly. It might not be active anymore. And reverse
DNS, that uses PTR, or pointer, records. This is current
live information that the organization sets intentionally, often for things
like anti-spam. It's them telling the world these names
are currently associated with this IP.

Speaker 1 (10:52):
So because it's current and intentional, we link those with
solid lines.

Speaker 2 (10:56):
Solid lines, and this is a critical point. If you
query a single IP, you might get twenty historic names
from passive DNS, but only one or two solid, verifiable
current names from reverse DNS. The quality is totally different.

Speaker 1 (11:10):
Okay, let's put this all together. Let's walk through a
Level 1 footprint, the core manual process, and let's use
a hypothetical big target, something like cia.gov.

Speaker 2 (11:19):
Perfect. So we start with that domain entity. Step one:
we run discovery. We do the quick website check, the
thorough search engine lookup for everything ending in cia.gov,
and critically, the passive DNS lookup.

Speaker 1 (11:29):
And now we have a huge list of potential DNS names.

Speaker 2 (11:32):
Hundreds, maybe, both current and historic. Step two: resolution. We take
that entire list and resolve them all to their current
IP addresses. We see which old names are dead ends
and which current ones map to a fresh set of IPs.

Speaker 1 (11:44):
And step three is where the magic happens. We take
those newly resolved IPs, which are just scattered all over
our graph, and we run the To Netblock Using
Natural Boundaries transform.

Speaker 2 (11:54):
Yes, this is the discovery moment. Suddenly you see this
massive clustering effect. Some of those IPs will just snap
together into one or two defined blocks, like
198.81.129.x.

Speaker 1 (12:07):
That concentration immediately confirms their core internal network.

Speaker 2 (12:11):
It does. The few IPs that don't cluster, they're usually
third party services like CDNs or external email gateways.

Speaker 1 (12:18):
And step four is the final confirmation.

Speaker 2 (12:20):
We select that main central netblock, get the AS number,
and look up the company owner. We confirm the ASNSA
seven to forty six is directly associated with them. That
confirms not just the IPs, but who's responsible for routing
that network.

Speaker 1 (12:33):
And with that one process, we've defined the absolute verifiable
core of their network. Now we can use that to pivot.
Let's say we want to answer a totally different question,
like which Wikipedia pages have been edited from this network.

Speaker 2 (12:45):
This shows the value of doing the footprinting first. We
take that confirmed netblock, expand it to all 256
individual IP addresses it contains, and then
run a very specific transform on that whole list: to
Wikipedia page edits. By defining the perimeter first, we can
ask these hyper specific questions about behavior from inside that boundary.

Speaker 1 (13:06):
That is incredibly powerful, right? But for a really massive organization,
that manual Level 1 process sounds tedious. This is where
automation, these machines, come in.

Speaker 2 (13:17):
A machine is just an automated script that runs that
whole Level 1 process for you. If you run one on
a huge domain like linkedin.com, the graph is enormous,
thousands of entities.

Speaker 1 (13:26):
How does an analyst even begin to manage that sheer volume
of data?

Speaker 2 (13:30):
You need powerful analysis techniques. You use an organic graph
layout to help cluster things visually. You can size the
entities by how many links they have, so the most
important netblocks just pop out.

Speaker 1 (13:40):
Ah, and lots of filtering, I imagine.

Speaker 2 (13:42):
Heavy, heavy use of bookmarking and filtering. You
might bookmark all your confirmed netblocks in red, their
company names in purple, and then just delete everything that
isn't connected to those critical bookmarks. It turns chaos into
a focused investigation.

Speaker 1 (13:57):
And there are even higher-level footprints for when Level 1
isn't quite thorough enough.

Speaker 2 (14:02):
Yeah, we call them L2 and L3. They
increase the search breadth, but they also bring in more noise.
So to manage that, they often have interactive prompts.

Speaker 1 (14:11):
So the machine will pause and ask you a question.

Speaker 2 (14:13):
Exactly. An L2 machine might find twenty
historic MX records and pause to ask you, the investigator,
which of these are actually relevant before it continues.

Speaker 1 (14:23):
And what's the big difference with L3?

Speaker 2 (14:25):
L3 is all about advanced pivoting using that live
solid-line data we talked about. Once it finds the
core netblocks, the L3 machine performs a reverse
DNS lookup on every single IP address in those ranges.

Speaker 1 (14:37):
Wow.

Speaker 2 (14:37):
Why? Because the initial forward lookups only found the
public names. By checking every single IP backwards, from IP
to name, L3 can uncover non-public names, internal
staging servers, test environments, things only visible via their PTR records.

Speaker 1 (14:53):
And for the really huge multinational targets, there's the Footprint
XL tool.

Speaker 2 (14:58):
That is the nuclear option. It's incredibly time consuming. It
can take hours, and it needs a lot of computing power.
It's designed for exhaustive detail, focusing on name server and
SPF records to find the absolute broadest IP ranges possible.
And then it does those massive reverse lookups across everything.

Speaker 1 (15:17):
For when you need every obscure internal server that ever existed.

Speaker 2 (15:20):
That's what it's for.

Speaker 1 (15:21):
This has been really a complete practical tour of the
infrastructure landscape. We've defined the core entities, domain, IP, netblock,
ASN, and seen how transforms let us pivot
between them.

Speaker 2 (15:32):
You know, the biggest takeaway for any analyst should be this.
The investigation isn't just about collecting data. It's about understanding
the function of each of these pieces and mastering the
art of pivoting.

Speaker 1 (15:41):
That ability to move seamlessly from name to IP and
then backwards from IP to name or to historic data.

Speaker 2 (15:48):
That's the definition of success in this field.

Speaker 1 (15:50):
Absolutely. The graph only becomes powerful when you know how
to use that IP address as a shared node to
pivot in totally unexpected ways. So here's a final thought
for you to mull over. If you start an investigation
with an email address you found on an old, historic
passive DNS record, how could you use that context, pivoting

(16:11):
through IPs and netblocks, to see if that old domain
owner is secretly connected to a brand new, unlisted organization
you didn't even know you were looking for?

Speaker 2 (16:19):
Connecting the old footprint to the new. That is the
power of digital persistence.

Speaker 1 (16:23):
Indeed, thank you for joining us on this deep dive
into infrastructure footprinting. We will catch you next time.