
March 19, 2025 • 54 mins


The story of cloud networking rarely gets told from the perspective of those building it inside unicorn startups, but that's exactly what this episode delivers. Richard Olson, cloud networking expert at Canva, takes us behind the scenes of building network infrastructure for one of the world's fastest-growing SaaS platforms.

Richard's fascinating career journey began with literally throwing rocks with phone lines into trees during his military service, progressing through network operations centers and pre-sales engineering before landing at AWS and eventually Canva. His unique perspective bridges traditional networking expertise with cloud-native development approaches.

Unlike enterprises migrating from legacy environments, Canva started entirely in the cloud with minimal networking considerations. Richard explains how this trajectory created different challenges - starting with overlapping 10.0.0.0/16 addresses across development environments and evolving to hundreds of VPCs requiring sophisticated connectivity solutions. By mid-2022, these networking challenges had grown complex enough to warrant forming a dedicated cloud networking team, which Richard helped establish.

The conversation takes a deep technical turn exploring Kubernetes networking challenges that even experienced network engineers might not anticipate. Richard explains why "Kubernetes eats IP addresses for breakfast" in cloud environments, detailing the complex interaction between VPC CIDR allocations, prefix delegations, and worker node configurations that can quickly exhaust even large IP spaces. This pressure is finally creating compelling business cases for IPv6 adoption after decades of slow uptake.

Whether you're managing cloud infrastructure today or planning your organization's network strategy for tomorrow, this episode offers invaluable insights into the evolution and challenges of cloud networking at unicorn scale. Listen now to understand why companies are increasingly forming dedicated cloud networking teams and the unique skill sets they require.

Connect with Richard:
https://www.linkedin.com/in/richard-olson-au

Purchase Chris and Tim's new book on AWS Cloud Networking: https://www.amazon.com/Certified-Advanced-Networking-Certification-certification/dp/1835080839/

Check out the Fortnightly Cloud Networking News
https://docs.google.com/document/d/1fkBWCGwXDUX9OfZ9_MvSVup8tJJzJeqrauaE6VPT2b0/

Visit our website and subscribe: https://www.cables2clouds.com/
Follow us on BlueSky: https://bsky.app/profile/cables2clouds.com
Follow us on YouTube: https://www.youtube.com/@cables2clouds/
Follow us on TikTok: https://www.tiktok.com/@cables2clouds
Merch Store: https://store.cables2clouds.com/
Join the Discord Study group: https://artofneteng.com/iaatj

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
And it was the same at Canva.
So most enterprises have been fairly large for a long time.
They're fairly mature.
They've been through that progression from the 90s through the 2000s, and ITIL and various waterfall turning into agile methods, all that sort of enterprise-type stuff.

Speaker 2 (00:17):
ITIL is my trigger word, man.
Don't say that.

Speaker 1 (00:19):
I've triggered him.
Oh sorry. Change management, release management, that sort of stuff, you love it.
There you go.

Speaker 3 (00:24):
He's like the Australian candidate.

Speaker 2 (00:43):
Hello, hello and welcome back to another episode
of the Cables to Clouds podcast.
My name is Chris Miles, at BGP Main.
When you say your Blue Sky handle, do you say the dot-com piece?

Speaker 3 (00:55):
Should I not say that? I don't, usually.
I just use the part, because when they search for you, it's going to be at BGP Main.
They're not going to type in the whole thing, right?

Speaker 2 (01:01):
Yeah, fair enough. All right.
So, at BGP Main, on Blue Sky. And as you may have heard, you know, my co-host rudely started talking before I introduced him, which he's not supposed to do, but you know, we'll give him a pass today.
My co-host today is Tim McConaughey, as always, at Juan Golbez.

Speaker 3 (01:20):
Wait, no, it's not Juan Golbez.
No, it's not Wonko Biz.
No, at Carpe DMVPN, carpe dash DMVPN.
Dude, you're killing it.
He's a man of many personas.

Speaker 2 (01:27):
You know, just search for him.
You'll find him out there.
He's out there somewhere.
Today we have a very fun episode.
I'm very excited for this one.
So we're going to kind of go back to our roots on this one.
We're going to go back to a bit of storytelling.

(01:53):
You know we love hearing, you know, engineers' stories and things like that.
So I've come in contact with a great individual who's joining us today, Richard Olson, who is currently at Canva, and I think he has a very interesting background and a very interesting story to tell.
You know, we're a niche podcast.
We talk about cloud networking.
You know, it's about as niche as it gets, and I feel like Richard has had one of the most niche roles in that particular context.
So I think it'd be good to kind of talk to someone that's

(02:16):
dealing with it day in and day out.
So, yeah, welcome, Richard, thanks for joining us, and, yeah, glad to have you on the pod.

Speaker 1 (02:23):
Thanks for having me, Chris, and I have to say, you've called me niche in a very polite way.
That's the first time.

Speaker 2 (02:29):
Very special, man.

Speaker 3 (02:30):
It's like nerd, but like more polite.

Speaker 2 (02:32):
Yeah, there you go. But yeah, so let's just hop right in.
So, Richard, tell us a little bit about, you know, kind of your background and how you got to the point that you are now with Canva.

Speaker 1 (02:45):
Yeah, sure, thanks.
So, as you mentioned, I work at Canva, and I should probably just introduce the company and product for those that haven't heard of it.
It's a SaaS application, it's a design application, and I guess Canva's goal is to empower the world to design and make it accessible to everybody.
So it's a really cool, interesting story.
I'd encourage listeners to go look into it.

(03:07):
It's been covered in various places.
But another cool thing about the company is that the company's got a two-step plan.
Step one, become the most valuable company in the world, or one of the most valuable companies, and step two, do some good with that, do the most good that you can, and that's a really cool ethos which I'm on board with.
So, yeah, super happy to be where I am and to chat to you

(03:28):
guys today.
I guess from here, I've got a long story.
I've been doing this sort of thing, well, not cloud, but this sort of thing, for 20-ish years.
I actually started my career in IT literally throwing rocks into trees, which sounds really funny, but that was literally my job.
Uh, so I was in the military at the time, in the army, and, um,

(03:49):
there's no internet or mobile phones, or any phones for that matter, in the bush.
Uh, so you've got to make do with what you've got.
So I would take a phone line, a field phone line.
I would get some electrical tape, tape that around a rock, find a tree that looked about the right height, and throw the phone line through the tree, and run phone lines in the bush, 'cause that's a lot quicker than digging.
So that was my start, and from there, I guess, pivoted

(04:15):
into networking.
We did a little bit in my role in the military.
We did what was called a basic router course, and I think I learned on a Cisco 2500 series.
That's probably dating me a bit now.

Speaker 2 (04:27):
I think they were pretty old by the time I used them.
I learned on that as well. You learned on that as well?

Speaker 1 (04:30):
They were great boxes, yeah, they were. Really good, the octal cables for your out-of-band management.
And so, from there, I discharged from the military and, um, I briefly transitioned through the public sector, but I managed to land a job in a network operations center.

(04:51):
So I was one of those shift guys, in those rooms that you see in the movies, you know, with the TVs wall to wall. And, um, we had this fancy, uh, frosted glass as well, so we would bring customers into the room behind us and then unveil the NOC to the customers, which was this cool bit of, uh, theater there.

Speaker 2 (05:11):
That's crazy, to know that so many NOCs are the same, that they want to showcase it like that. Like, I didn't know that everyone was like that, but my first NOC that I worked in was the same way. They had something that we called the fishbowl, which was like that exact private view.

Speaker 2 (05:20):
You come in and you see the whole thing, and if we wanted to do something, you know, if someone was, you know, kind of a class clown, so to say, we'd put them in the back so they weren't viewable from the fishbowl perspective.
But yeah, that's so funny.

Speaker 1 (05:33):
Yeah, it's kind of weird being the center of where everyone can see, isn't it?
And I'll be honest, we didn't look at the big screens at the front that much.
I mean, they were probably too far away in a lot of cases to actually see anything, but it was cool having, you know, big network maps.

Speaker 2 (05:46):
That's what it was.

Speaker 1 (05:47):
Do you remember Cacti and the Weathermap plugin?
You'd build these cool maps, and it would show you, like, the throughput going between certain places with, like, you know, hot, cold colors.

Speaker 2 (05:56):
Yep, very familiar with Cacti.
I do remember it.

Speaker 1 (06:01):
I don't think PHP is cool anymore, is it?

Speaker 2 (06:02):
I'm not sure.

Speaker 1 (06:03):
No, probably not.
But yeah, from there, I guess I worked up through the ranks.
I started off on shift, that was pretty tough, doing night shifts, but moved more into, we'll say, like, a level three support, and then progressed through design and a bit of architecture.
Most of my customers back then were sort of customers that needed a higher degree of security.

(06:24):
So lots of air-gapped management networks, lots of crypto.
I was probably, I get the feeling, one of the few people that used GetVPN, or GroupVPN, if you're familiar with that one.

Speaker 3 (06:33):
I really liked it. But DMVPN, which one? Sorry.
GET was good, but you're right, I don't think it was widely accepted or deployed. Like, if you had it, it was great.
But DMVPN, you were about to say, was way more interesting, way more prevalent, yeah.

Speaker 1 (06:49):
Having that overlay network gave people some sense of security.
Your internal IP addressing schema is hidden from the untrusted underlay network, whatever that was, whereas with GetVPN, you had native routing failover, which I think was one of the key advantages.
You didn't have to wait for that IPsec renegotiation.
But, um, yeah, so from there, I guess, being air-gapped,

(07:11):
that created a lot of challenges, because I worked for the MSP side of a reasonably big telco, and they had a lot of tools to help them scale. Like, they were managing hundreds of thousands of devices, we're not talking, you know, a couple of hundred here and there, and not having access to those tools on the air-gapped network presented a whole bunch of problems.
And so, from my perspective, I had to replicate these tools to

(07:34):
some extent, and there was a particularly big customer that we had at that time, I think 800-odd sites, a couple of thousand devices, and I had this enormous spreadsheet that was being managed by hand by the design team.
Input validation was not a thing, this was Excel, so it was lots of rough data, and I had to turn that into literally thousands of devices' worth of what we call pre-configuration, or, you

(07:56):
know, the sort of bootstrap config, so that the tech in the field would get connectivity back to us to dump the actual end-state config on.
And so I actually started to learn how to code.
Back then I didn't have a coding background.
I went to university to do, basically, IT telecommunications, so there wasn't much of the comp sci programming side, and I hadn't done much development.
So I started out in Python.
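
A toy version of that spreadsheet-to-bootstrap-config workflow, sketched in Python in the spirit of what's described here; the CSV columns and the config template are invented for illustration, not the actual tooling:

    import csv
    import io

    # Stand-in for rows exported from the design team's spreadsheet
    sheet = io.StringIO(
        "hostname,mgmt_ip,site\n"
        "rtr-001,10.1.0.1,SITE-A\n"
        "rtr-002,10.2.0.1,SITE-B\n"
    )

    # Minimal bootstrap config template; real pre-configs would carry far more
    TEMPLATE = (
        "hostname {hostname}\n"
        "interface Loopback0\n"
        " ip address {mgmt_ip} 255.255.255.255\n"
        "! bootstrap for {site}\n"
    )

    for row in csv.DictReader(sheet):
        print(TEMPLATE.format(**row))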

(08:18):
Python was pretty cool back then.
I think this was circa 2010, and it was in that awkward phase of the Python 2 to Python 3 transition.
Yeah, I was going to say. Still cool, still cool.
Still cool.

Speaker 3 (08:28):
I like it.

Speaker 2 (08:30):
It's still real to me, damn it, you know.

Speaker 1 (08:32):
I like...

Speaker 3 (08:33):
Python still.

Speaker 1 (08:33):
It's a great language.
I mean, like everything, pros and cons.
I can work really fast in Python.
Maybe it's not the best at everything.
Yeah, true.
So, um, yeah, from there I sort of started to learn how to code, and that sort of set me, I mentioned the timeframe, 2010, that set me on this journey of, I guess, getting into network automation.
I actually say I got into network automation before

(08:54):
it was cool.
And I had this, um, opportunity to, uh, do a little bit of working travel through, uh, Europe, while my partner, now wife, had a brief secondment over in the UK, and I was going from cafe to cafe smashing lots of coffee.
But I had this opportunity where I had maybe a month of a clean slate, just to work on a cool project.

(09:14):
And this was around the time that Docker was starting to become cool, Ansible was starting to really gain some traction, and so I spent that month working out of cafes, just learning the various bits and bobs that eventually started to form the foundations of network engineering and the tools that we use.
So that was a really great opportunity that I had there.

Speaker 2 (09:34):
This is before Docker, so to say, fumbled the bag, as they say.

Speaker 3 (09:38):
Yeah, it was still the most popular container at the time.

Speaker 1 (09:42):
Yeah, I'm definitely not an expert in how to commercialize and make your startup a monetary success, but it was ubiquitous, wasn't it?
Everyone used it.
I think it was just hard for them to make a dime out of it, to monetize it.

Speaker 3 (09:56):
Yeah, ultimately.

Speaker 1 (09:56):
Yeah, I mean, we've seen parallels with things like, yeah, Terraform, for example. Terraform Cloud was the money-making engine.
Um, I don't know how many people use Terraform Cloud, but certainly Terraform is everywhere, right? Yep. How do you monetize that?
But yeah, from there, that was kind of my, we'll call that the peak of my project design phase.
I was definitely hands-on tools then, rolling out lots of pretty

(10:20):
large-scale networks.
But then, from there, I had a friend of mine who introduced me to a pre-sales manager at F5, and we hit it off, and next minute I'm working through this pre-sales phase of my life.
So I worked for a couple of US tech vendors there, started with F5, followed by Juniper, and it was at this time that the network

(10:43):
automation arc really started to mature.
SD-WAN was starting to become a real thing, and that was a large part of my role in that pre-sales space.
But I wanted to keep refining my code skills, and around 2015, 2016, I had this opportunity to start working with the cloud, particularly Amazon.
One of my customers wanted to run a number of proof of

(11:03):
concepts in different clouds, so I had the opportunity to connect, I think it was three clouds at the time, definitely Amazon and Azure, and I had this taste and I wanted to learn more.
So I actually was super fortunate, in around 2021, I think it was, to get offered a solutions architect role at AWS, and that was really cool.

(11:23):
So I definitely stepped away from my comfort zone, networking, and moved into the world of cloud, and I was a generalist at this stage, and I think everybody on the call knows how many products Amazon has.

Speaker 3 (11:38):
So that was intimidating.

Speaker 1 (11:39):
Yeah, and, you know, fair enough, Amazon had this expectation, I think they called it level 200.
You'd be level 200 across all of those core products, and, as you said so politely before, Chris, I'm kind of niche.
I knew the networking like the back of my hand.
But yeah, all the compute, the data, um, what's an S3 bucket? All those sorts of things, I mean, not quite to that level.

(12:07):
AI was just starting to become a thing now too.
So that was part of the role.
But one of the really cool things that happened when I was at Amazon was re:Invent 2021. To me, it felt like the year of networking, like, just a whole bunch of things started to happen.
IPv6 was getting taken more seriously.
Cloud WAN was in preview, I think November, December-ish that year, and became GA the following year.
PrivateLink started to gain a lot of maturity.

(12:28):
Just a whole bunch of really interesting features started to drop.
Around that time, I think, VPC Lattice may have been dropped.

Speaker 2 (12:35):
The following year as well, I think it was 2022.

Speaker 3 (12:38):
2022, yeah, was it?
Yeah, I think that's right.

Speaker 2 (12:40):
I think you're wrong, but yeah.

Speaker 1 (12:42):
To me, it felt like peak cloud networking at that time.
I'm like, yes, this is a thing, and this is still really, really interesting, because, I don't know, to me, just having some VPCs and peer links is not that exciting.

Speaker 3 (12:55):
It's just Fisher-Price, like, very, very basic networking.

Speaker 2 (12:57):
I think it's, I mean, that probably was. I don't know if that time will come back, but Tim and I just got back from re:Invent a few months ago, and there was not much networking content.

Speaker 3 (13:05):
I'll tell you that. They're back to developer and AI, and, well, it was AI everything, right, but there was definitely more developer focus.

Speaker 1 (13:11):
Well, AI is so hot right now.

Speaker 3 (13:12):
Yeah, the pendulum's definitely swung back the other way, for sure.

Speaker 1 (13:17):
Yeah, definitely, but it was around this time where I was kind of thinking to myself, I love being a generalist.
This exposure, broad exposure to, like, literally everything in the cloud, has been really, really beneficial for broadening my skill set.
But I don't want to lose touch with my roots.
I think I've got a really valuable, deep set of skills, and

(13:39):
now I can combine that with this new environment, this new world called the cloud.
And I was just starting to think about this, and I really enjoyed working at AWS.
But then, one day, in my email box I got two notifications, and two different companies simultaneously, I think within a day, were advertising for a cloud networking engineer, and I'm like, cloud networking, that sounds like what I want to do. Like, I was thinking about becoming a network specialist, to

(14:01):
go down that path at AWS.
But I'm like, oh, this is an opportunity to jump back into the project deep end, start delivering cool things, have that satisfaction of building something, as well as doing what I think I really want to do, cloud networking.
And I was fortunate enough to get an interview, an initial conversation with recruiters for each of those companies, and

(14:21):
during that interview I asked a very specific question.
Both of these companies were very developer-centric: are you looking for a developer who can do a little bit of networking, or are you looking for someone who's done a lot of networking who can do some development work?
And in the case of Canva, which was one of these two companies, they were looking for the latter.
They wanted someone who can do networking with some development

(14:42):
.
And, yeah, the rest is history.
So I've been at Canva now for about two and a half years, maybe a little bit over, and I started out, so, Canva, maybe we'll talk about this soon, but Canva didn't actually have a cloud networking team, and the day I started, with another coworker of mine, the cloud networking team was formed.
We joined someone who was kind of looking after networking at

(15:03):
the time, and we became a team of three, the cloud networking team, and I've spent some time as the coach of that team, which is our term for management, and now I'm in a technical leadership, sort of individual contributor, position, and I'm in that same space.
I'm predominantly looking at things like networking and, just as important, the intersection of networking with applications

(15:25):
and getting compute and that sort of thing.
So that's my story.

Speaker 2 (15:28):
Yeah, I think that's the thing, is, like, well, I found that so interesting. When we first met, Richard, I was kind of surprised at how, I guess I will say, mature the cloud networking team was at Canva. Because, I mean, like you said, you've been on the cloud networking team for two and a half years now.
I feel like most companies that I talk to are just now starting

(15:50):
to kind of spitball on whether or not they form a cloud networking team, right? It's becoming a thing, which is good, you know, good for all of us, good for, hopefully, this podcast, but we'll see.
We need to break out of being niche. Yeah, there you go.
We need to be run rate pretty soon.
Um, but yeah, that's, that's

(16:12):
a super, super interesting story, like, especially because, 'cause you started, the first position you had was at enterprise.

Speaker 1 (16:17):
Yeah, that's right, at a telco MSP, so looking after various enterprises.

Speaker 2 (16:21):
'Cause I, for one, I've never worked enterprise, and, like, that's the thing is, I feel like I've heard 10 times more horror stories about working in enterprise than somewhere else.
But maybe this kind of speaks to what you were talking about before, with, you know, kind of the company that Canva is and kind of the ethos that you're working in, which sounds pretty cool.

(16:43):
But one of the things that I thought was really cool is Canva is, you know, kind of operating, you know, by the industry term, as a tech unicorn, right? And they're kind of this born-in-the-cloud company, and I don't think you guys have ever had a physical data center across the globe, necessarily.
It's all been 100% cloud developed and cloud deployed, right?
So I mean, from that perspective, what kind of

(17:04):
problems do you think, like, born-in-the-cloud companies typically have to deal with, compared to, you know, traditional ones?
Like, do we all end up kind of suffering the same and bleeding the same blood, or is that trajectory a little different?

Speaker 1 (17:17):
It's a big question.
So it's a bit of both, that I've found.
So we do end up suffering some of the same things, uh, but some is definitely unique to being here.
So I guess, uh, yeah, you're absolutely right that we're 100% in the cloud, in the sense that we don't have any data center footprint or infrastructure.
I mean, you know, the office has some internet connectivity and

(17:37):
Wi-Fi and that sort of stuff, of course, but we're not like a classic enterprise where, you know, you might have your headquarters with two big routers and a whole bunch of campus switches, and, you know, maybe you'll have your dark fiber lit up to your data centers and this SD-WAN branch construct.
There's nothing like that there.
Everything is in the cloud.
So I think, yeah, as I said, ultimately we're going to have

(18:00):
the same problems, but we started in a very different place.
So I think everyone who's worked in networking, maybe unless you're a startup ISP or a startup, you know, network-technology-specific company, I think everyone that works in networking knows that people aren't going to need your specialist skills until you get to a certain scale, and it was the same at Canva.
So most enterprises have been fairly large for a long time.

(18:21):
They're fairly mature.
They've been through that progression from the 90s through the 2000s, and, you know, ITIL and various waterfall turning into agile methods, all that sort of enterprise-type stuff. ITIL is my trigger word, man, don't say that. I've triggered him, oh sorry.
Yeah, uh, change management, release management, that sort of stuff, you love it.

Speaker 3 (18:39):
There you go.
He's like the, uh, Australian candidate.

Speaker 1 (18:45):
Yeah, there's pros and cons of every approach, right? It certainly aimed to stop the cowboys, and I think it definitely stopped the cowboys.
But the way Canva started, it was a very, very small group of people who had this vision to build an application.
And these people weren't thinking about, yeah, what's my IP address schema going to look like and what's the fastest ASIC

(19:05):
that I can chuck into a switch in a rack.
These were not problems that they had.
Their problems were building this app, getting to MVP, building features, building the customer base, and that sort of thing.
So, in the early days as well, cloud, one of the massive benefits I think everyone can agree about cloud, is that it makes it so easy to start out with things.
So you build a VPC, you chuck in a couple of subnets with

(19:27):
availability zones. Like, the effort you used to have to go to in the 90s and early 2000s to achieve that was enormous.
And now we can just click a few things, or, better yet, use infrastructure as code.
And so, yeah, we started out, I think it was originally one VPC, which quickly grew to three for different development environments, and I guess maybe

(19:48):
you can feel where this is going.
Each of these VPCs was given the same prefix, 10.0.0.0/16.
So, personally, I've never come across that sort of a thing in enterprise, the prevalence of overlapping IP addresses.
But this is part of the growth and maturity that anyone would go through, I think.
So from my perspective, great, low-hanging fruit.
This is an easy challenge.
Let's fix it.
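
As an aside: the overlap problem described here is easy to check for mechanically. A minimal sketch using Python's standard ipaddress module; the environment names and CIDRs are made up for illustration, mirroring the all-identical prefixes just mentioned:

    import ipaddress
    from itertools import combinations

    # Hypothetical per-environment VPC CIDRs; early on these were all the same block
    vpcs = {
        "dev": "10.0.0.0/16",
        "staging": "10.0.0.0/16",
        "prod": "10.0.0.0/16",
    }

    nets = {name: ipaddress.ip_network(cidr) for name, cidr in vpcs.items()}
    for (a, na), (b, nb) in combinations(nets.items(), 2):
        if na.overlaps(nb):
            print(f"Overlap: {a} ({na}) conflicts with {b} ({nb})")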

(20:08):
But from there, I mean, we as a company were very fortunate.
The product was successful, customers started coming on board, and there was this period of really, really rapid growth.
And rapid growth means that the teams are getting bigger, the number of services that we have is getting bigger, and while a lot of the application-specific services lived in these VPCs,

(20:30):
other VPCs were created around them to form part of the process, whether that's part of our tooling, internal services, and that sort of thing.
And now we're getting maybe a handful of VPCs, let's say a dozen or two, and the problem then becomes, okay, how do these VPCs communicate?
And I think in the cloud, we talked about VPC peering just before, it's boring, but it works, right, and it's fast, and yeah, that's

(20:54):
one option. But another option you can take, you don't even need to think about networking.
We've got the internet, and it's trivial to connect the internet to VPCs, create public-facing load balancers, and establish some degree of connectivity between your VPCs, and you still don't need a networking team. Like, load balancing, HTTP, web services, TLS, these are things that developers are very, very

(21:16):
comfortable with, so still don't need a networking team.
But what happened then was we just kept growing, so we got more engineers, services started to become more specialized, the teams themselves were getting bigger, and when I joined, mid-2022, I think we had around about 100 VPCs, and it was really starting to get to this point where managing IP address

(21:39):
conflicts was getting challenging.
Connectivity between things, without having to build out all of these public-facing load balancers and services, and the cost associated with those, started to become really challenging, and so the cloud networking team was born, and I'm very happy that it was born.
It's been great.
Yeah, that's awesome.

Speaker 2 (21:56):
Yeah, so you got in, essentially, at the ground floor, I assume, on the cloud networking team.

Speaker 1 (22:00):
Yeah.
So it was, oh yeah, a couple of hundred VPCs, not really much connectivity between them, a handful of VPC peer links here and there.
Like, I didn't know what to expect when I was joining.
It could have been anything.
I had no idea how mature they were, and it was an absolutely fantastic opportunity.
Because here we are, as you say, ground floor, we've got to build a network.

(22:21):
Awesome, I know how to do this.
I've done this before.
Let's do it.
And so, you know, one of the first things we did was look at technologies like Transit Gateway, in AWS particularly.
Uh, that certainly makes it a lot easier to connect a whole bunch of VPCs than, you know, a full mesh of VPC peer links and that sort of thing, and we got going from there, and, you know, other things, egress controls rolled into this, DNS

(22:43):
has been rolled into this, and so we went through a pretty rapid period of a lot of big wins. Like, going from no connectivity to almost ubiquitous connectivity is a pretty cool thing.
And then, from there, a couple of other things happened as the business itself started to mature.
Let's look at this infrastructure stack.
What can we do to align this with best practices, so that we're not inhibiting growth in any way, making sure that we're

(23:07):
agile so the developers can do what they need to do? And things like the Amazon AWS multi-account best practices started to sink in.
So what are we doing to ensure that our services are separated according to these best practices?
IAM, it's not bad, but it certainly has a point where you want to move to a hard account boundary, where there's an

(23:28):
explicit lack of access between two different resources.
And so, from there, great, we've built a network, we've got a couple of VPCs, this is awesome. How are we going to cope with possibly thousands and thousands of AWS accounts?
You know, this could be as granular as one per microservice.
Is this, uh, you know, an account per Lambda?
How are we going to cope with this?
And so we went down this path of, uh, looking at shared VPCs,

(23:51):
and so, I think it was, was it re:Invent 2022? Maybe, I think it was.
Netflix was up on stage talking about, uh, shared VPCs and Amazon multi-account, how that intersected with networking. Yeah, right.
So, uh, we went down that path, and, uh, it's been pretty successful.
Like, certainly, there was a point in time where I'm just like, damn, I'm creating, like, dozens of VPCs a week.

(24:13):
This is not sustainable.
We need to do something about this.
And, uh, we very quickly went from 100 to quite a few hundred, and I'm sure Transit Gateway made some of that a lot easier, but it just didn't feel right, it had a smell to the architecture.
It's not something that I think we wanted to go down.
I mean, do we really want to have a database in one VPC and a

(24:33):
service accessing that database in another VPC?
There's a lot of administrative and cost boundaries that you've got to go through to do that.
Is it the most cost-effective solution? Probably not.
So, shared VPCs. They were a great solution to that.
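
For context, a shared VPC in AWS terms means sharing subnets across accounts with AWS Resource Access Manager. A hedged sketch of what that looks like with boto3; the share name, ARNs, and IDs are all placeholders, not Canva's setup:

    import boto3

    ram = boto3.client("ram")

    # Share a subnet from the network-owning account to an organizational unit
    ram.create_resource_share(
        name="prod-network-share",  # hypothetical share name
        resourceArns=[
            "arn:aws:ec2:us-east-1:111111111111:subnet/subnet-0123456789abcdef0",
        ],
        principals=[
            "arn:aws:organizations::111111111111:ou/o-example/ou-example",
        ],
    )

Workload accounts then launch resources into the shared subnets, while the owning account keeps control of the VPC itself.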

Speaker 3 (24:45):
I mean, shared VPC actually makes a lot of sense. We have lots of customers and lots of people that do it.
From an administration perspective, if nothing else, it makes it infinitely simpler to be able to share out, like, here's your subnet, here's, you know.
And then, do you find, with the shared VPC constructs, that security is more difficult to enforce than not shared? Like, if

(25:11):
you actually split the VPCs.

Speaker 1 (25:13):
Not so much.
I guess we're not going to one, we'll call it mono VPC, where every account gets access to that. We've got a couple of coarse degrees of separation.
At its most basic, we've got a production and a non-production network, and then, within those, certain things have a degree of affinity.
So things that are associated with a certain part of the

(25:34):
application will all live in the same VPC.
They were probably already living in the same VPC, just now it's a shared VPC running on Kubernetes instead of, say, a bare metal or an ECS task, right? Not bare metal. EC2 instance. Yeah, I mean, they have it, you can buy it, you can get it.
So, uh, that was, yeah. And I think that, actually, just from

(25:57):
a reduction in the number of, we call it KTLO, keeping the lights on, the reduction of KTLO was enormous.
We're no longer cutting dozens of VPCs every week.
Yeah, I believe it.
We cut a dozen and then we just share them. And so, I guess, maybe tying this all back to the original question, are we bleeding the same blood?
I guess where I was going with this story is, no, I don't think we are, because if you look at a traditional on-premises data

(26:17):
center, you don't have a VPC construct, right? Like, you've got subnets, and generally in your data center maybe you'll have some segmentation and that sort of thing, but generally in your data center you're going to have a bunch of subnets which can all communicate at very high speeds to a bunch of other subnets.
Like, a VPC is a very specific cloud construct, which was designed to hold your cloud resources in a kind of private

(26:38):
network, and later, networking between those VPCs became a thing.
So they kind of started at opposite ends, and the VPCs and the cloud networking constructs are slowly making their way more towards that more ubiquitous connectivity.
But I don't think that was a problem that a lot of people on-prem had. Like, how can I get from A to B? You already can.
You're plugged into the network, right? Right.

Speaker 3 (27:00):
I think, yeah, I think user to user, or user to app.
That's true. Like, just when I worked in enterprise, I remember when I got to the enterprise job I had, it was a big flat network. Like, every country, it was a global company, every country just had its own slash 16.
And there were offices, offices that had data centers in them, offices that were connected to data centers,

(27:20):
whatever it was.
But every site basically had, like, its own slash 16, and, like, so there was one where the office was sitting right next to the data center, like, in the same building, as in you could walk across the floor and get to the data center.
All one flat network.
So you got all these people that can just, like, attach to servers and stuff, and so, like, one of the things I did when I went in there was to go in there

(27:41):
and break all that up and segment it and install firewalls and create that segmentation.
It's very much the same thing.
So I don't think we're bleeding the same blood, but I do think, you know, depending on the maturity of the networks we're talking about, there's definitely some similarity between breaking, you know, when at some point you just got to break it up, get your segmentation in there, enforce your security boundaries.

Speaker 1 (27:59):
This doesn't mean you didn't have firewalls and that sort of thing, and segments.
It was just, um, you're starting from a world which has, like, literally hundreds of segments, and you've got to kind of coalesce them, whereas on-prem networking is typically the opposite of that.
You don't start with 100 segments that you need to, you know, do VRF leaking between. You probably got one segment and maybe a firewall.

(28:19):
If you're getting really big, you'll have a couple of segments.
Yeah, I don't know.
It's been a while since I've done enterprise networking.
VRFs, are they, like, how many does one have these days?
It used to be less than 10.

Speaker 2 (28:31):
More than you think.
I've seen it all.
I've seen some people with three or four.
I've seen some people with, like, 600.
And I'm like, how do you run a production network in that fashion?

Speaker 1 (28:39):
Yep, resume-driven development.

Speaker 2 (28:52):
Exactly. Yeah, so that's a nice segue.
So, you know, on the show we talk a lot about, I mean, if you're involved in cloud on a daily basis whatsoever, there's a very strong focus on migration, right? There's all this talk about whether or not we're in early innings of the cloud, or, you know, to use baseball terms, maybe we're at the seventh inning. You know, we should be at the seventh-inning stretch by now,

(29:14):
but we're not even close, apparently. You lost me at baseball. I'm sorry. Yeah, sorry, sorry, he's an American.
So, yeah, I don't know the equivalent NRL or potentially cricket term to use there.
But, you know, we'll stick with baseball for now.
But, you know, there's a strong focus on migration and moving from on-prem to the cloud, right? And we still hear

(29:35):
about that on a very regular basis.
But you guys didn't have to go through that particular problem, right? You're still dealing with migrations, but I'm assuming they're more internal, about migrating from service to service or from construct to construct, type thing.
But maybe tell us a little bit about how that's affected the kind of overall culture and impact on the infrastructure teams, compared to what you might see in a traditional

(29:56):
organization.

Speaker 1 (29:56):
Yeah, definitely.
So you're right, I don't think anyone in IT can escape migrations, right?
There's always something new which brings benefits of whatever sort, and let's jump on this new thing, it looks great.
So, don't have to deal with, oh, I don't even remember how many R's there are of cloud migrations.

Speaker 2 (30:14):
You know, refactor, reshift, I don't know. As we're having this conversation, they've probably added a couple more.

Speaker 1 (30:22):
Yeah, there's a lot of different, complicated ways.
So, no, we don't have to do those refactors, but absolutely migration.
So, I mean, some examples off the top of my head. I wasn't around in this era, so I may have it slightly wrong, but I believe we started off on EC2 instances, you know, autoscale groups behind ALBs and that sort of thing.
Those were changed to be containerized, and we

(30:42):
containerized the application and moved to ECS, and from there we've since moved to Kubernetes.
I think when we moved to ECS, Kubernetes was still pretty fresh and new, EKS maybe wasn't as mature as it is now.
So we went along this journey, and that's definitely still a thing.
Luckily, they're both containerized, so from that

(31:04):
perspective, there wasn't any refactoring required.
But certainly it changed a whole bunch of things, like tooling and the ecosystem, troubleshooting and all that sort of stuff.
But, I mean, maybe this isn't so different from other places, other companies whose primary business is making software, but for me, this is the first, we'll call it, software engineering company that I've worked for, and the focus and the

(31:27):
structure, a lot of the companies will call it, what was the term that you used before? How does it affect the infrastructure teams?
Everything is organized according to, I guess, platform teams, and this is the more modern take on the evolution of DevOps.
And there's so much material out there, on differences and different opinions on those, that I'm not going to wade into that.

(31:48):
But I guess I started off my journey at Canva in the cloud networking team, and, maybe this is just me, maybe this is networking people, but the network has always been a platform.
If you want to communicate from A to B, you need to use the network.
One doesn't simply have connectivity.
So I guess maybe my mental model of a platform team was always networking, but certainly other teams around me were

(32:09):
never based on platform teams.
And so I work in what we call the runtime platform subgroup, and runtime wasn't a term that I was familiar with until I joined Canva, and it basically means it's where the code is running.
So you think about the mentality of a software engineering business, there's a whole bunch of different phases and places
(32:30):
that code is written, is built,compiled Runtime is where it's
actually running, and so there'straditional enterprise silos of
.
You've got this massive team ofsystems administrators in a
silo, who don't talk to thenetworking team, who don't talk
to the firewall team.
All of those constructs arebroken down and we're all in
these platform teams who offerour services to others and

(32:51):
there's a real self-servicedrive here.
We want people to be able toconsume the network without me,
you know, getting the email andthe ITIL ticket to update a VLAN
description and that sort ofstuff.
That's out the window.
So, and this particularplatform team, the runtime
platform, is within the cloudgroup.

(33:11):
So I think that's probablywe're starting to get
similar-ish structure to a lotof places.
You have a cloud team, someplaces call them co-e's, which
maybe there's more of anarchitecture flat to that but
certainly we're all doers, we'reall engineers, and that's
probably a good segue to say.
One interesting part aboutworking here is that, unlike a
lot of enterprises, I'm one ofjust two networking people or

(33:34):
people with a networkingbackground in the business.
I would say that 90% of theengineers plus have more of a
development background, and sothat's a real unique culture
shift and perhaps explains a lotof the thinking around that
software engineering practiceand methodology.
So all of the things that wetalk about in terms of network
engineering we've got to put ourconfigs in Git Well, that's a

(33:56):
given right.
We've got to use CI pipelines,like which one?
I've got dozens that dodifferent things and work in
different ways.
So a lot of those challengesthat I think a lot of other
businesses may have, orenterprises moving to that
methodology, they're kind ofalready solved because of where
we came from, and that's beenreally interesting yeah, that's,
yeah, I guess that's.

Speaker 2 (34:13):
The thing is like software has been doing this
much longer than we have right,so it's like we're we're still
on the back foot, adopting thisum, even from we're even we're
even still debating on whattooling to use, whereas they've
had stuff in place for manyyears now.
So, yeah, definitely veryinteresting.

Speaker 1 (34:29):
One small anecdote to that may be spicy, but just
because it's different and theway that software people do it
doesn't mean it's justnecessarily better, like one
interesting introspect frombeing on the other side of the
fence.
Like, take your mind back toyou're configuring a firewall
and you need to add a new URL toa URL list.
Right, like?
Let's assume that all of theadministration, the ITIL, change

(34:50):
management, the triggers, chris, how much is sorted out?
You jump on your firewall, youadd a URL to whatever construct
in your firewall, click save orcommit whatever it is, and walk
away.
The whole thing might take 30seconds.
Fantastic.
Now you put this into a CIpipeline and that pipeline's not
necessarily optimized forfirewall rule change deployments
.
It can take 20, 30 minutes togo through all the various CI

(35:13):
steps.
Maybe you're building a testfirewall in this hermetic
environment so that nothingaffects it, and just for the
purpose of adding this onefirewall rule, right?
I'm not saying that theapproach is necessarily bad.
The CI is a fantastic tool andhas a lot of uses.
But, done wrong, there's alwaysgoing to be something you can
cut yourself on.

Speaker 3 (35:32):
Yeah, Like, should we spend a huge amount of time
automating something that we'regoing to do three times ever,
for example?
You know?

Speaker 1 (35:39):
Oh yeah, exactly, and the closer you get to the
physical world, the more thatbecomes apparent.
Right, Like going down theinfrastructure as code path.
Is it worth me writing someterraform to create a dedicated
direct connect on Amazon whenit's going to take, you know, an
LOA, four days of people doingcross connects and that sort of
stuff Like where's the value?

Speaker 3 (35:59):
Like what are we actually gaining from this?
I'm not saying, I didn't.

Speaker 1 (36:02):
I absolutely do have this in infrastructure as code,
for example, but just to pointout the example with the
differences of approach, yeah,that makes sense.

Speaker 2 (36:10):
So it's good that you're still dealing with those
legacy things like LOA, cfa,like we all do right?

Speaker 1 (36:18):
Not very much so in terms of looking at the same
plan.
Yeah, I have had to deal withLOAs and cross-connects in a
very, very limited basis,predominantly high-speed cloud
interconnect.
But that's not my day-to-day.
How often do you need to dothat?
Not very, but maybe on thatsubject actually I did touch on
it.
Infrastructure as code likeback to the networking

(36:40):
automation is still trying tocatch up.
When I came here, literallyeverything was in infrastructure
as code.
I mean, don't get me wrong,when you're tinkering with
something, you may need to dosome click ops here and there,
build some resources, try toremember to tear them down and
that sort of thing, so you getfamiliar with how the product
works and the constructs.
But once the rubber hits theroad, like everything is in
infrastructure as code.

(37:01):
And that's wildly different towhere I've come from in a lot of
places as well, where I meanI've seen spreadsheets that
generate router code.
I've seen, you know, all theway through to perhaps a more
mature, ansible playbook andsome Ginger scripts doing bits
and pieces here and there.
But to have everything,absolutely everything, in
infrastructure as code was quitean eye opener.

Speaker 2 (37:21):
Yeah, definitely Super interesting.
So let's actually use that as apoint to pivot here and talk a
little bit about Kubernetes.
So I know you've mentionedKubernetes up to this point
already and from talking to you,richard, I know that you're
maintaining one of the mostsizable deployments that I've
ever heard about running in thecloud.

(37:42):
So let's kind of open that up alittle bit.
Kind of, given your strongnetworking background, what is,
um, so to say, the good, the badand the ugly about maintaining
a Kubernetes environment of thatsize?

Speaker 1 (37:52):
Oh, the good, the bad and the ugly Um, I'll probably
start with.
I'm going to start with theugly Um.
Just in my case, in myexperience and other people,
your mileage may vary.
Kubernetes eats IP addressesfor breakfast from a networking
perspective, and what do I meanby that?
So, like super whirlwind tourof Kubernetes networking,

(38:12):
kubernetes has a concept calleda CNI which is used to implement
Kubernetes networking.
The CNI is responsible forhanding out IP addresses, making
sure that the pods in aKubernetes cluster can
communicate to each other, youknow, without NAT and that sort
of thing.
And there's a whole bunch ofCNIs out there.
But when you look at productslike EKS or GKE in Google Cloud,
they both come with a CNI whichhas some opinions, and their

(38:36):
CNIs both opt to use what I'llcall VPC native addressing.
So we're not talking overlaynetworks which one might find
on-prem.
You know there's no VXLAN,geneve, yguard, whatever overlay
technology that your particularCNI wants to use.
These are actually gettingassigned IP addresses which are
native in the VPC substrate, ifyou will.

(38:57):
So straight away we've gone awayfrom the ability to isolate
these.
You're in a VPC.
Those VPCs are on the network.
All of these IP addresses needto be unique.
We can't just slap 100 or 64slash 16 on them or whatever.
Now, in the case of Amazon,I'll focus on Amazon the biggest
IPv4 prefix that you can assignto a VPC is a slash 16.

(39:18):
Now, coming from on-prem network, and that's heaps, right, like
that's what you might assign toyour big sites, or maybe it's a
good chunk of your data center.
But 65,000 IP addresses noone's ever going to need that
much, right you do withKubernetes.
So let's break this down inAmazon.
So we're given our VPCS slash16.
We've got availability zones inAmazon and we need to create

(39:41):
subnets per availability zone.
That's the way Amazon works.
So across three availabilityzones we can get some slash 18
subnets.
And again they still soundpretty big like 16,000 odd IP
addresses.
Who's ever going to use thatmany?
Like?
Do you have 48,000 pods?
No, probably not, unless it'squite a decent sized cluster.
So I'm still not seeing theproblem here, richard.

(40:01):
But then the way Kubernetesactually operates and the way it
integrates with the network inthese cloud vendors is in
Kubernetes.
We want to do bin packing,which is we're trying to jam as
many active processes that useup as much of the CPU as
possible into a given workernode, right Like the days of
being on-prem and setting a CPUthreshold alert for 80%, to say

(40:22):
you know, this is bad, you mightneed to shift some stuff.
It's the opposite in the cloud.
You want to set the alert forwhen it drops below 80%, because
that's wasted capacity you'repaying for.
Yeah.
So we want to do this binpacking right.
So we want to put as many podsas possible onto a given worker
node, and the best way to dothat in AWS is using prefix
delegations.
Now, so in AWS you have an ENI.

(40:44):
Now, so in AWS you have an ENIElastic Network Interface.
On a worker node that ENI has Iguess I'll call them slots a
fixed number of slots dependingon the size of that node, for
how many secondary IP addressesthat you can have on there.
And it's usually pretty low,right, like 15-ish.
We'll say let's pick thatnumber and we want to have more
pods than 15.
In fact, in a lot of deploydeployments that I've seen,
there's a concept called a demonset which is a particular type

(41:05):
of thing is running on everynode.
Some deployments I've seen have10, 20 demon sets.
So that means straight away ournodes are using quite a few IP
addresses already.
Now, given this limitation onthe number of slots on an ENI,
aws came up with this conceptcalled prefix delegation, which,
in the case of IPv4, you canput a.
Rather than putting on a singleIP address, a slash 32, you can

(41:27):
put on a slash 28.
And that gives you slash 28, 16IP addresses per each of these
slots, and you can have the samenumber of prefix delegations as
you can secondary IP addresses.
So straight away, okay, cool,maybe we can get a hundred a
couple hundred.
Yeah, on a given node.
And then we're starting to getinto the territory of like how
do we size our nodes?
How many pods is it acceptableto go down when a given node has

(41:51):
a problem?
And we're changing the paradigmhere a little bit.
But let's say, for example,like we've got these slash 28
prefix delegations.
Now, when you look at thatagainst the context of your
slash 18, you've only got 10bits there.
So we've gone from 16,000 IPaddresses to 1,024 prefix
delegations.
Now if a prefix delegation getsassigned to a node, we're

(42:12):
straight away artificiallycapping the maximum number of
nodes to about a thousand, right, yeah?
So let's say and back to thebin packing I want to put
multiple prefix delegations ontoa given node, let's say two.
That means I've got a maximumof about 500 nodes in that
subnet.
And then, just to make mattersslightly more challenging, in
AWS you create an instance.

(42:33):
That instance in a subnet thatENI will be given a random IP
address.
It'll just be plucked fromsomewhere in the VPC.
Now when you start thinkingabout, I want to have all of
these contiguous slash 28s totake the most advantage of the
IP space that I've given thesubnet, and you start spraying
random IP addresses around theplace.
That makes it reallychallenging to find a slash 28,
right Like we've got to map tothese binary boundaries.

(42:55):
One node in the wrong placekills 16 IP addresses.
So AWS have this other productcalled a CIDR reservation and so
what you can do is you can putaside a chunk of space and I
believe in our case it's aboutthree quarters of the subnet.
We put that aside for prefixdelegation and the nodes get
created in the upper quarter ofthat given subnet and that gives

(43:15):
us a bit of breathing room.
The prefix delegations aren'tbeing trodden on by the nodes
and it gives us a bit ofcapacity in there in there.
But I guess, just to serve thepoint, we've gone from 16,000
pods You're never going to havethat in this subnet to okay,
we've got 500 nodes.
That's perfectly realistic tohave here.
And if you need more of thoseprefix delegations per node
you're going to need a biggersubnet.

(43:36):
And we've got VPCs which havelike multiple slash 16s slapped
onto them, and so when you thinkabout that in terms of the 10
slash 8 range range, there'sonly 256 slash 16s that you have
in that range and I mentioned100.64 uh before.
That's definitely on ourhorizon, but better.
Yeah, let's go to ipv6.
Hey, so for the first time inmy 20 years, there's a uh

(43:58):
compelling use case for for ipv6, and so that's really cool.
Now because, um, I think it wasthe last, uh most recent
reinvent I don't think therewere reinvent announcements but
certainly quite a lot of AWSservices are starting to get
more and more IPv6, private linkcapabilities and that sort of
stuff.
So I think the time for IPv6 isnigh, dare I say it.

Speaker 3 (44:19):
So, that being the case, then, richard, what do you
?
How does that impact?
And I imagine the answer isquite a lot.
How does that impact?
And I imagine the answer isquite a lot.
How does that impact yourKubernetes, integration with the
rest of the wider network,Because everybody knows, okay,
we've got Kubernetes andKubernetes can talk to
Kubernetes, and we've got ournodes, our pods, you know, like
Service Mesh, everything'sencased within Kubernetes.

(44:40):
But what about once we have totalk to something legacy,
something outside essentiallythe pod?
Are we doing six to four NAT?
What happens there?

Speaker 1 (44:51):
That's definitely something that I'm actively
thinking about, like how do weget, like, let's say, I make my
IP address space problems goaway?
We're all at IPv6.
We have IPv6 only podsFantastic, I believe.
With the VPC CNI that comeswith AWS they get 169.254
addresses, so they can still doNAT44 to get to IPv4 resources.
That might save a bit oftrouble.

(45:12):
Need to think about exactlywhat that's going to look like.
But to your point though,kubernetes is.
It's kind of a, it's a black box.
So previously, where we hadlet's pick on Terraform, we had
Terraform, and Terraform was ourinfrastructure of code tool and
it would go to the cloud and itwould create an instance or an
ECS task, whatever it is, andyou would be able to, within a

(45:34):
single Terraform ecosystem,reach out, create a security
group for that ECS task or EC2instance, do the same for an RDS
database, and you would be ableto say in that RDS database hey
, this security group thatrepresents my EC2 instance,
please allow that inbound on mySQL.

(45:56):
And because that's all part ofthe single ecosystem, it
actually becomes quite easy andtrivial to have a reasonable
amount of security, for examplein this case, and it's done in
one place, now into Kubernetes.
To your point, kubernetes doesa lot of this for us.
It creates these pods.
It creates, depending on whatcontrollers that you run in your
cluster.
It might create your loadbalances, and I mean, if we say

(46:19):
legacy, let's assume that you'restill running on RDS databases
which, for good reasons, yourpods now need to access your
database.
The challenge is now how do youidentify that pod to your
database where previously wewere able to reference these
security?

Speaker 3 (46:33):
groups.
You've got to open it up toomuch, right yeah?

Speaker 1 (46:36):
that pod is now represented by a security group
on the node, and that meansevery pod on that node,
depending on what you do withsecurity groups could
theoretically have access tothat RDS database.
You can't just.
And there's some things thatwork around.
This AWS has a I'm not sure theexact name AWS Security Group
Controller, which is kind ofsimilar to ECS, if I recall,

(46:58):
where it has the concept of abranch ENI which you can assign
a security group to if you usethis controller.
But if you don't use thiscontroller, maybe you're rolling
your own CNI, whatever it is.

Speaker 3 (47:07):
Yeah or Cilium or Calico or Flannel or one of
those.

Speaker 1 (47:12):
Yeah, calico, maybe a Cilium, certainly those and
this is where it's reallyinteresting watching Kubernetes
and that ecosystem develop morein the network space.
We've got things like networkpolicies now which can to some
extent, dictate what you can andcan't talk to.
And I'm seeing, I'm seeing amassive resurgence in the use of
things like FQDN rules.
Now, to me maybe people willdisagree with this, but to me an

(47:35):
FQDN rule was kind of the transition from the traditional firewall, which was your five-tuple rule for a very static internet, to a proper next-gen firewall, with SNI or HTTP layer-7 inspection, in a more dynamic world. But we're seeing a resurgence of that because it's reasonably easy to create.
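As one concrete example of an FQDN rule, here is roughly what a Cilium FQDN egress policy looks like, expressed as a Python dict you could serialize and apply. This is a sketch under the assumption that you run Cilium; the labels and hostname are invented, and the schema should be verified against your Cilium version.

```python
import json

# A CiliumNetworkPolicy using toFQDNs: name-based egress instead of a
# five-tuple rule pinned to addresses that churn.
fqdn_policy = {
    "apiVersion": "cilium.io/v2",
    "kind": "CiliumNetworkPolicy",
    "metadata": {"name": "allow-payments-api", "namespace": "payments"},
    "spec": {
        "endpointSelector": {"matchLabels": {"app": "checkout"}},
        "egress": [
            {
                # Allow DNS so the agent can observe lookups and learn the
                # IPs behind each name.
                "toEndpoints": [{"matchLabels": {
                    "k8s:io.kubernetes.pod.namespace": "kube-system",
                    "k8s-app": "kube-dns",
                }}],
                "toPorts": [{
                    "ports": [{"port": "53", "protocol": "ANY"}],
                    "rules": {"dns": [{"matchPattern": "*"}]},
                }],
            },
            {
                # The FQDN rule itself: egress keyed to a name, not an IP.
                "toFQDNs": [{"matchName": "api.example.com"}],
                "toPorts": [{"ports": [{"port": "443", "protocol": "TCP"}]}],
            },
        ],
    },
}

print(json.dumps(fqdn_policy, indent=2))  # pipe to: kubectl apply -f -
```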

Speaker 2 (47:57):
I love that we've pretty much had to refer to RDS as legacy at this point, which is a service that's... I don't think...

Speaker 3 (48:02):
Well, I actually was thinking, I forgot for a second that you guys were 100% in the cloud. But really I just meant outside, because service mesh has this challenge as well, right? Service mesh is extremely good at building app-layer security and connectivity between the pods with the sidecars, but what if you have to go egress? What if you have to go to something that's outside

(48:22):
of the service mesh? Right, it's still kind of clunky. Even with network policies it can be kind of clunky. So that's how I was looking at it.

Speaker 1 (48:28):
Yeah, like, you can't just, even if you're going to another cluster, unless you've got some sort of technology that bridges those clusters together. And you mentioned service mesh; there's a whole range of technologies out there, which is definitely a different podcast. But if you've got a separate cluster, how do you authenticate services between these two different clusters? Now we're getting into that authentication space.

Speaker 3 (48:47):
Yeah, like Istio with the control plane and doing mTLS and all that crazy stuff.

Speaker 1 (48:51):
Yeah, mTLS, like SPIFFE and SPIRE, is being talked about so much now. My first exposure to mTLS, I think it was in 2010: I was rolling out a Wi-Fi network that used, effectively, what's now being called mTLS, but back then it was EAP-TLS. It's cool again. mTLS is so hot right now.

Speaker 2 (49:08):
Never drawn that correlation. Yeah, I never thought about that, but yeah, I think EAP-TLS...

Speaker 3 (49:11):
mTLS. I never put the two together, like EAP-TLS and mTLS. But you're absolutely right.

Speaker 1 (49:14):
I mean, slightly different purposes perhaps, but ultimately it comes down to authentication. In the case of EAP-TLS, it was letting you onto the network. In the case of mTLS, it's letting you access, or not even letting you access my service necessarily, but it's at least identifying that you are who you say you are.
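To ground the comparison: here is a minimal mutual-TLS server sketch using only Python's standard library, where the server demands and verifies a client certificate, the same "prove who you are before you get in" move EAP-TLS makes at the network edge. The file paths are placeholders; in a SPIFFE/SPIRE setup these would be short-lived workload certificates rather than static files.

```python
import socket
import ssl

# Server side: require (and verify) a client certificate. This is the
# "mutual" in mTLS: both ends prove who they are before app data flows.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.load_cert_chain(certfile="server.pem", keyfile="server.key")  # placeholders
ctx.load_verify_locations(cafile="clients-ca.pem")  # CA that signs client certs
ctx.verify_mode = ssl.CERT_REQUIRED  # reject clients with no/invalid cert

with socket.create_server(("0.0.0.0", 8443)) as srv:
    conn, _ = srv.accept()
    with ctx.wrap_socket(conn, server_side=True) as tls:
        # The peer's identity is now cryptographically established;
        # authorization decisions can key off it.
        print(tls.getpeercert()["subject"])
```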

Speaker 2 (49:34):
I guess one last question, because you mentioned this concept of moving to everything being FQDN-based. I think about when I worked on the service provider side; like I said, I've never worked enterprise. It's odd, when you're on the networking team, how much you reference stuff by IP address, right? There are native things that tick in your mind whenever you see, like, oh, this 10-dot-whatever-whatever.

(49:55):
I know where that is, I know what that is, I probably even know what server it is and what's running on that server, et cetera. But for you, where you're in this environment where IPs are pretty much meaningless, how does that mess with your frame of reference as to what services you're talking about, what you're supporting, what the infrastructure is? Has that been a challenge, or has it been

(50:17):
actually better in the long run?

Speaker 1 (50:19):
Oh, that's a good question.
Actually, I haven't really thought about that, but to maybe extend upon that as well: previously, networking was a world of, to use the cloud analogy, pets instead of cattle. You wouldn't give things silly names, necessarily, but you had a very strong naming convention. You would be able to identify from a device name what location

(50:40):
that device is in, maybe what model and vendor it is, the function of that particular device. You would start to remember things. I still remember some of my customers' IP addressing plans: 10.32 is this big site in Sydney, for example. That's all gone. And where I was going with this is: enter the cloud, and, well, at least in the case of Amazon,

(51:01):
everything is given a unique identifier that I cannot remember. I think I remember the ID of one VPC, and it's a very old VPC, so it has the short identifier. How do you remember these resources? I honestly do not know what the solution for that is, but maybe one thing to consider is storing that information in

(51:24):
some sort of inventory which is used by engineers and has a degree of enrichment around it. Yeah, it could be a NetBox, a Nautobot. It could be, I mean, AWS IPAM is a tool these days; not quite the same league, but it certainly does IP address planning. Automating that, I think, is absolutely essential in 2025.
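A cheap first cut at that kind of enriched inventory, sketched with boto3: pull the opaque IDs from the cloud API and join them with the human-meaningful tags engineers actually remember. The tag keys here are invented conventions, not anything AWS defines.

```python
import boto3

ec2 = boto3.client("ec2")  # assumes credentials and region are configured

# Build a tiny "inventory with enrichment": opaque VPC IDs mapped to the
# human-friendly metadata engineers actually remember (name, env, owner).
inventory = {}
for vpc in ec2.describe_vpcs()["Vpcs"]:
    tags = {t["Key"]: t["Value"] for t in vpc.get("Tags", [])}
    inventory[vpc["VpcId"]] = {
        "cidr": vpc["CidrBlock"],
        "name": tags.get("Name", "<untagged>"),
        "environment": tags.get("environment", "unknown"),  # invented tag key
        "owner": tags.get("owner", "unknown"),              # invented tag key
    }

# Lookups can now go either direction: by ID, or by the name humans use.
for vpc_id, meta in sorted(inventory.items(), key=lambda kv: kv[1]["name"]):
    print(f'{meta["name"]:<30} {vpc_id}  {meta["cidr"]}')
```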

Speaker 3 (51:43):
What scares me about the whole DNS thing, and don't get me wrong, I'm not saying there's any other way to do it, especially when we're talking about ephemeral resources like Kubernetes, is that, like, how many times have we seen the meme? It couldn't be DNS...

Speaker 1 (51:57):
It's not DNS. It was DNS, right?

Speaker 3 (52:04):
And remember a few years ago when US East 1 went down and everything broke all across the world for Amazon, because that's where a lot of the DNS stuff was hosted? It's such a, I don't know, such a linchpin thing.

Speaker 1 (52:15):
The blast radius these days is very different to, I think, what it used to be. Everyone used to have their own data center, or at least be in a colo, and your blast radius would probably just be you. But these days, as you say... I remember, I think it was an S3 outage, maybe circa 2016. The impact to me was that the app that I was using to order my coffee as I was walking to the coffee shop didn't work.

(52:36):
I'm like, oh, that's weird. Oh well, I'll just do what a normal person does and order it at the coffee shop. But that was a huge issue.

Speaker 2 (52:41):
Yeah, oh man. All right, well, I think we are coming up on time here, so I think this has been a great conversation. It's funny how we took this from a level 100 talk to a level 400 talk very, very quick, but it's been super fun for me. So I appreciate you coming on, Richard, and talking about this, and maybe we'll have you on again later. Any closing comments or questions from you, Tim?

Speaker 1 (53:05):
No, it's been great. Thanks so much for inviting me. It's been lots of fun for me too.
All right. Yeah, Richard, where can people find you online? Anything you want to plug, or let people know how to find you?
Oh, I'm really boring when it comes to social media.

Speaker 3 (53:21):
I think I might have a Mastodon account somewhere. Oh man, that's a blast. Yeah, maybe. But by our terminology, Mastodon's legacy.

Speaker 2 (53:29):
At this point it's legacy, yeah. It's like more than two years old, right?

Speaker 1 (53:33):
Yeah, predominantly LinkedIn is where you'll find me. Awesome, cool, we'll get that in the show notes.

Speaker 2 (53:38):
Yeah, we'll throw that in there. All right, well, thanks again for listening. Hopefully this has been helpful, and if you want to leave us reviews, send us a comment, anything, please reach out to us on social media or cables2clouds at gmail.com. And with that, we'll take it away and we'll see you next week.
Hi everyone, it's Chris, and this has been the Cables to

(53:59):
Clouds podcast. Thanks for tuning in today. If you enjoyed our show, please subscribe to us in your favorite podcatcher, as well as subscribe and turn on notifications for our YouTube channel to be notified of all our new episodes. Follow us on socials at Cables2Clouds. You can also visit our website for all of the show notes at

(54:20):
cables2clouds.com.
Thanks again for listening, and see you next time.