Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:00):
Welcome to Fork Around and Find Out, the podcast about building, running and maintaining software and systems.
(00:19):
Hello and welcome to Fork Around and Find Out. I'm your host Justin Garrison and with me as always is Autumn Nash. How's it going Autumn?
I'm just really excited to see what Joe keeps coming up with.
I can't even think of a joke. I'm just trying to think of like we are co-locating our jokes with our smarts today.
Be funny now.
I know. It's like it's too much pressure. This is why I don't do stand-up.
Exactly.
(00:40):
This is why instead of stand-up I just have kids.
Don't lie.
You're interrupting my, like, intro.
You just do like random stand-up.
Yeah. It has to be ad hoc and today on the show we do not have an ad hoc guest.
I guess we have David Aronchick, CEO and founder of Expanso. Welcome to the show, David.
Thank you so much. Real pleasure.
(01:01):
David, we met, like, I feel like it was, like, a decade ago. It was, like, one of the first KubeCons.
Absolutely.
No, for sure.
And we just kind of been around. It's just infrastructure, Kubernetes, cloud. We've been doing all this.
Tell us about what your journey's been throughout the last decade of doing infrastructure stuff.
I mean, you know, when I was looking back... we've certainly exchanged messages on Twitter and Bluesky and so on all these days.
(01:29):
But you're exactly right. You were one of the first Kubernetes adopters.
So my background is I've been doing enterprise and consumer. I'm on my fourth startup.
I'll get to that in a second.
But the most recent arc of my career started in just before you and I met.
I worked at Chef where I was doing director of product management and leading a bunch of their work on, you know, as Docker and all the container stuff was first coming out.
(02:00):
Then I left there to go be the first non-founding PM for Kubernetes, which I did at Google for a bunch of years.
And I helped start the Cloud Native Computing Foundation and so on and so forth.
And that's where we met. You know, you were at Disney at the time.
You were one of the first big adopters of this from a more traditional enterprise.
(02:25):
And I know Disney is, like, super forward-looking, you know, because of folks like you, but it was really like, you know, no one really understood what this Docker thing was, what these containers were.
And, you know, how is this going to affect me?
But you were absolutely one of the first. Actually, one of the other media properties, not Disney's media properties...
(02:46):
Although who knows now, though, with all this consolidation. But another one of your media cohorts was right after you.
I always remember, because HBO, HBO Max, adopted Kubernetes really early as well.
And I remember watching or streaming Game of Thrones on my laptop.
(03:08):
And I was like, oh, my God, this is running on our stuff.
I'm very, very proud of it.
But um, yeah, you know, that's where you and I met.
And so I led, I started, GKE, the Google Kubernetes Engine.
And then I did that for a bunch of years.
I then moved from there into starting a machine learning platform called Kubeflow, which has been very popular.
(03:38):
And so I did that, then I left Google to go work at Microsoft to lead open source machine learning strategy for the Azure ML group and out of the Office of the CTO.
And I did that for a few years and now I'm on to my startup, which is, you know, crazy, like I honestly never thought I would go back.
(04:03):
Not because I don't like the startup game, but because, like, I had a perfectly reasonable job at, like, Big Tech. Like, kick your feet up and chill, dude.
Like, but no, I have a vision for the world and I'd like it to exist.
So I'm off doing the startup thing again.
Is there ever really a kick-your-feet-up-and-chill moment in tech?
(04:24):
You know, I keep hearing about this, like it exists.
At what point do you get there?
Some people are wired that way.
I wish I was I truly wish I was.
I didn't hear the word wired.
I heard whiter, like my skin color, and I was like, well, yeah, that also probably plays into it too.
(04:47):
Where I've worked with quite a few people that have kicked their feet up and usually they look like me.
Yeah, yeah.
And Autumn takes the biggest drink of her coffee.
She's talking idiot.
Take me back a little bit to the GKE creation.
Was that always the intention of Kubernetes?
Like you open sourced it and it felt like it was meant as just like a pure open source play and then just the popularity was there immediately.
(05:14):
So it's long enough ago that I think I can say all this stuff without pissing too many people off.
But no, the story here goes back to 2003.
Okay.
And the story is that Google came out and released the Hadoop paper, the MapReduce paper, in 2003.
(05:35):
And Yahoo came along and very helpfully read this wonderful paper, this groundbreaking paper, and said, wow, this sounds really cool.
Let's go launch something around it.
And they created Hadoop.
Right.
And Google was like, oh, this is good.
You know, we're glad people are out there and Google is a very academic place.
So they like really don't take any like ownership over that until it gets to Google Cloud.
(05:57):
And, like, at the time that they launched Google Cloud, they had to create an HDFS compatibility layer for Hadoop.
And what that meant was you had something that Google invented, re-implemented by someone else, implemented on this compatibility layer that ultimately went through another layer that ultimately was still running on MapReduce.
(06:20):
Right.
And they're like, why the hell did this happen?
Like we could have just done the thing.
Right.
So that's going to be angle one.
They were like, hey, look, if we're going to release something into the world,
let's actually release the thing itself.
Okay.
So that's category one.
And category two is they saw AWS and they saw it growing.
(06:41):
And they're like, holy shit, you know, this, this... oh, sorry, I don't know if this is a no-swearing podcast.
No, we're good.
You're good.
We don't, we don't believe in it anymore.
I curse like a sailor.
So you'll have to.
So do I. Justin's very good at not cursing.
I don't have it in me.
I just don't have it in me.
I almost feel like swearing is like, like I look for slightly spicy people because I appreciate their honesty.
(07:05):
Justin's spicy in other ways.
I was going to say that.
Like I'm so excited for you to be here because I love being in between a spicy interviewee
and Justin.
Cause it's.
You're in between there.
All right.
Yeah.
This podcast has taken a very interesting turn.
Let me just say that.
Within five minutes of like meeting you, I was like, David's going to be so much better.
(07:32):
Like, I thought this was Spank Around and Find Out.
And this is not a BDSM podcast.
Right.
Oh, thank God.
Tim is in here.
Oh shit.
My things went all blurred here.
Okay.
Um, so then, uh, AWS comes along and they're like killing it.
Right.
(07:53):
And we all look at that.
We're like, Hey, but wait, we have a cloud that we're trying to get going here.
Um, like we think that the right element here is not a VM.
Right.
We think the right element is a container and look at Docker.
They're doing great.
Right.
So let's take.
Docker's,
you know, extension and wisdom, which, by the way, again, another thing that Google launched.
(08:18):
Again, no one is saying that Docker wasn't an enormous part in arguably the reason that
containers are successful.
Um, but a lot of it was based on, you know, kernel changes that came in, you know, in
2004 or 2005, right?
Like there's an enormous amount of stuff there.
And so they're like, hey, look, Docker's killing it.
Let's help Docker extend even further.
(08:39):
And let's help people see that, you know, VMs are not the right thing.
It's just not.
And so, you know, again, I was not here during this time.
Craig McLuckie and Joe Beda and Brendan Burns and Tim Hockin and Brian Grant and whatever,
they were like all working on a bunch of stuff internally to Google where they're like, we
think there's a new orchestration paradigm that people should be adopting here.
(09:04):
Um, they were going to build it internally to Google in a project called Omega.
And everyone should go read Brian Grant's blog history of this.
It's so good and it's so real.
It is, like, transparently told by a nice human, which is amazing, that he's
like a humble, nice human after all he's done. He's so smart.
(09:26):
He was guest two or three on this show.
So yeah, he's so good.
When you talk to him, you're just like, dude, you're so smart.
Like he is just so intelligent.
So, uh, Brendan is a good friend.
And when I first get to Google, he tells me this thing, which is amazing.
He says your goal inside Google is not to be the smartest person on day one or day two
(09:50):
or day 400, right?
Your goal is the following.
Like you should go and you should come up with a smart idea.
We hired you because you have a smart idea.
Okay.
You should go and you should try and figure out where that idea is because I guarantee
somebody internally has already thought about it.
Right.
And there will be a window between you thinking of this idea and the paper, and that window
(10:12):
will start off at like four years.
Like the idea was four years ago that somebody looked at this and they decided this was a
bad idea or they implemented it or whatever.
And then you should.
Okay.
You get smarter about it.
You read the paper and then you come back and then you do that again.
You're going to come up with another great idea and it'll be two years.
You're like, what?
Two years and then you'll do it again.
(10:33):
It'll be like nine months.
Then you do it again.
It'll be like three months and then you do it again.
You won't find a paper, and then you're like: that.
That is the thing you should go and implement, and so on and so forth.
So Brendan says this, and I still carry this wisdom with me.
I think it's so interesting, especially in the real world where you can go out and you
can research it and you can figure out why things worked and didn't work and so on and
so forth.
Brian is interesting because he's the other half of the coin.
(10:55):
Like, he just has canonical knowledge of everything.
And so whenever I'm trying to come up with a new feature for our platform, or, you know, hey, why didn't people do this?
I go and talk to Brian, or another guy, Eric Brewer.
He's, he's also a really wonderful human.
You should have him on if you haven't already.
And the two of them together, you're just kind of like, oh, you know, what's, what about
(11:20):
this idea?
Oh yeah, we looked at that.
And this is the problem in distributed systems, and consensus did that in this year, and
you're running into this.
And eventually you'll get to a point where they're like, yeah, that's actually not a
bad idea.
And you're like, ah, I'm going to go with it.
I feel like having Brian as a friend has got to be like some sort of life hack because
to be able to bounce ideas off of someone like that, like God.
(11:41):
I mean, I say that I am, I try and collect smart friends like they're fucking Pokemon.
That is that.
Okay.
Like that is like the top tier.
Like if all of the rest of the world is questionable at the moment, having smart friends and good
friends.
100%.
I mean, just having someone who can be brutally honest with you is, like, so important.
(12:02):
I might tease Justin on the internet, but like having good friends, like top tier.
Like if you want to know how to improve your life.
I don't know if that was including or excluding me.
That's kind of... it goes both ways.
Duh.
Like having Justin, but also I have good friends, in between your questionable moments, like the
fact that you don't drink coffee.
But like, we won't go into that today.
(12:25):
But it's like, I can't believe you don't drink coffee.
How's that even possible?
Thank you.
Like, like you work in tech and you have children.
What is wrong with you?
He, he does have a Dr. Pepper obsession.
I love Dr. Pepper.
Last night it was at an event and somebody had like, so it's, it's tech week here in
Seattle and it's been phenomenal.
You live in Seattle and I've never met you, David.
(12:47):
What are you doing this afternoon?
There's like three more events.
What?
It's tech week.
Yeah.
It's tech week, man.
But I was going to say, I was at an event yesterday afternoon.
All week I've been drinking Diet Coke.
Don't get me wrong.
I love Diet Coke.
But like at the same time, like I was at an event and they had Diet Dr. Pepper.
I'm like, oh, you, how do I get in business with you?
Diet Dr. Pepper is amazing.
(13:08):
I love it.
I love how you said, how do I get in business with you?
I'm making this happen.
I totally understand that.
Do you know how hard it is to find Justin a Dr. Pepper?
It's not that hard.
So I continue to annoy him at conferences.
I don't understand how everyone is not drinking Diet Dr. Pepper.
It's so much better.
I went to three different stores at SCALE so I could be like, here's a Dr.
(13:31):
Pepper and a Rice Krispie so I can keep stealing your chargers.
The only reason I would go to Texas was to get good Dr. Pepper.
They have the original OG sugar.
Do they have a different Dr. Pepper?
It was invented in Texas and they have real sugar Dr. Pepper.
And so it started to percolate out some other places and one store near me sells it.
And so I go there sometimes.
It's fine.
(13:52):
I said it's fine, but it didn't sound fine at all.
Let me, let me finish up the story so we can get out to other interesting things.
There's too much ADHD here, David.
No shit.
I'm like ADHD, like on ADHD.
So anyhow, so Brian Grant and Brendan and whatever, they come up with these things
(14:14):
and Brendan, you know, literally.
Why are you not in business with Brian?
Because he's got his own thing.
Yeah, he's got his own thing.
Yeah.
I love what he's doing, by the way.
Yeah.
Like the configuration management, and he's with another wonderful friend of mine that
I bounce ideas off all the time, Alexis.
Alexis.
Yeah.
ConfigHub.
ConfigHub.
(14:35):
Yeah.
Huge, huge.
I'm going to fly on the wall while you guys are having like technical discussions.
Like, can I just sit in the background?
I mean, we never have it.
We never have a technical discussion.
You get in the room and you're like arguing about like how, you know, whatever, blah,
blah, blah, like bad mouth, blah, blah, blah.
And like, oh, you see what these idiots are doing.
I mean, I feel like your group chat is fire.
(14:59):
More group chats.
They're the best.
I mean, our group chat is hilarious.
I don't know.
I don't know.
It's an interesting question.
Finish your story.
Anyhow, so Brendan gets it running on his laptop in Java, like a total skunkworks.
(15:21):
That was kind of a fork.
It wasn't a fork, but it was kind of like a conceptual fork of the thing they were doing
internally to Google.
And then it starts to catch fire and somehow it breaks through, like, because Google was
really internally opposed to Kubernetes.
Not that they were, there was just a lot of motion around like what the hell is going
(15:44):
on and, you know, what kind of team do we spin up?
And then, you know, Craig McLuckie and, like I said, Brendan and Brian and all these people
ended up forcing it through, like, I think releasing it to the world forcibly,
and then, you know, it just kind of cascaded forward from there.
And so I joined in January of 2015.
And Craig was like, Hey, look, I need someone to take over Kubernetes management for me
(16:09):
because I'm going to go off and work on three other things.
I mean, there's another genius for you.
And so he, he proceeds to go and do that.
And I, like, launch GKE.
And so they're like, Well, all right, we're going to have this open source thing.
We've got to, you know, get this project going.
(16:30):
It was already, like... it had already been written.
There were some early versions and so on and so forth.
But, you know, I started leading it, and, you know, it was the three core pillars:
GKE, compute under Navneet, Paul Nash, who was off running compute, and...
(16:55):
totally blanking on his name.
I feel terrible, but, like, he was the lead for App Engine.
This was, this was 10 years ago.
We're not old.
I know, but I can't remember his name.
I feel really terrible because he was great.
Crazy, like in 10 years, how much things have changed?
Oh, absolutely.
Absolutely.
So anyhow, so that was it.
And it was just like, let's help people adopt containers.
And, and for better or worse, it's not that we're opposed to AWS.
(17:18):
It's just, we don't want people building on VMs.
That was it.
And we think the world is better if everyone isn't completely married to a
VM, because a VM is so heavyweight. Even the lightest-weight VM has to have an
entire kernel to care about your serial port and your Ethernet driver.
And I mean, it's just like, it's insane.
Like, let's give people what they want: an isolated environment that allows
(17:42):
you to execute against things.
And that was the whole idea of the container.
And then obviously letting people do a whole bunch of those at the same time was
really powerful.
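To make that concrete for listeners: the "isolated environment" David describes is exactly what a container image packages up. A minimal sketch, assuming a hypothetical single-file Python app (the base image tag and filenames here are illustrative, not from the episode):

```dockerfile
# Minimal container sketch: the app brings its own userspace,
# but shares the host kernel, so there is no serial port or
# Ethernet driver to emulate the way a full VM would require.
FROM python:3.12-slim
WORKDIR /app
COPY app.py .                 # hypothetical application file
CMD ["python", "app.py"]
```

Built with `docker build -t myapp .` and started with `docker run --rm myapp`, dozens of these can share one machine, which is the "whole bunch of those at the same time" point above.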
And even just to like paint the scene of people that weren't in technology or
weren't doing infrastructure around this time, right?
Like Docker was kind of launched in 2013; the first DockerCon was 2014.
(18:04):
So this is still super early.
ECS came out from AWS, which was like basically just like a big Docker engine
in 2014.
So this is within six months of all these other things.
Google already had App Engine, which was already kind of this, like,
sort of, you know, you-didn't-have-to-care-that-it-was-a-container sort of
(18:25):
environment, where it's like, hey, you just bring us your application that looks
like this, we'll run it for you.
No VM, no OS management, all of that stuff is going to work.
And then launching this new, very configurable kind of complex looking
container engine into the world had to have contention because I know like all
the internal Google stuff around Borg is like, well, you can't just ship Borg to
other people.
(18:46):
How do you wrap that to make it easier just like Hadoop?
It must have been a political struggle, I think,
even more than a technical struggle, to be able to push that through.
No, I mean, look, you know, we, again, we all forget, right?
But Google was not successful in open source at that point, right?
They were very successful publishing papers.
But to that point, they had Android, which they bought and they had Angular.
(19:13):
But other than that...
They had Go.
That's true. They did have Go. I take that back.
I take that back.
That's the only other thing.
But Go wasn't, Go wasn't as popular as it is now.
It was like, you know what I mean?
Yeah.
Yeah.
Yeah.
Go.
I don't think people know how old Go is because Go got so popular in the last
few years.
And then when they, like, and because so much of Kubernetes and certain
(19:35):
infrastructure is built on it.
Now it's like, I won't say it's like Java, but it's like, you can't avoid
Go in a lot of ways.
So much of the infrastructure tooling was Ruby before that.
Right.
Because Ruby on Rails exploded.
That makes me so mad.
And then there was.
I have like flashbacks.
Chef and Puppet were like, that was like, if you were doing infrastructure,
you were doing config management and you had to know Ruby to be able to write
(19:58):
Chef and Puppet.
Oh, and now so many things at AWS make sense.
Yeah.
Yeah.
Absolutely.
I was like, why would you do this?
Yeah.
Absolutely.
Well, now it's all TypeScript at AWS.
Yeah.
So I mean, like, again, it's, it was like, and so.
I have a question.
Oh, sorry, please.
What do you think is harder: politics trying to get things done internally at,
(20:21):
like, a mega corporation, or open source?
Because I feel like they're two very different, like...
They're...
You bring up a really interesting question.
I, I, you know, they just are very different.
The nice part about internal politics is there are at least defined motivations.
It's very rare that someone's just absolute chaos.
Right.
(20:42):
Every now and then I'm like, you play D&D, huh?
Cause you are just a chaos goblin for no reason.
Like you just walk in and you're just like, for no reason.
But yeah.
So that's very rare.
And I'm like, I don't know.
I don't know.
Yeah.
(21:03):
So that's very rare internally.
At least you can say, okay, well that person has this job and this VP asked them to do
this.
So like that's a thing.
I don't agree with that thing, but at least you can like unpack what, what they're doing.
I think knowing your audience is important.
100%.
No.
100%.
And those politics almost always come down to this:
I'm responsible for a thing and you are risking that thing.
(21:27):
So I'm going to like be a dick to you.
Right.
And so you got to figure out because I love the way that you like, you're like, can we
be friends?
You're like, so this person has motivation.
So they're going to be a dick to you.
Like I felt that in my soul.
But in open source, all you have is ego, right?
(21:50):
And so ego can be way more irrational.
It's, but sometimes it's like really like, you got to love the purist, right?
Like, you know, like we, I feel like we all like really care about open source, but every
now you get someone and you're like, do you go outside?
Like,
I don't, you know, it's funny.
(22:13):
So let me give you an example.
When I was leading Kubernetes, there was one of the first
hearty debates we had in public.
I was excited about what you were going to say.
He said hearty, and the way you perked up,
I was like, this is going to be good.
There was a, I can't remember it because we just introduced job sets.
(22:37):
So I can't remember what the name was.
Oh, maybe it's StatefulSet.
But like, I think that's what we were calling it.
We're still calling it that.
But like at the time, 2015, there was a thing called pet set.
Right.
And
Who names these things?
What is a pet set?
So this is the old name for StatefulSets.
You bring up the right point.
(22:58):
Right.
So there was a thing called pet set.
And that's because there was a whole idea.
Oh, the whole pets versus cattle thing.
Okay.
Like, because you're going to put a bullet in cattle, but you're not going
to put a bullet in a pet, right?
And so the whole idea was like to keep it around.
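For context, the pet set idea survives today as the StatefulSet API: each pod keeps a stable name and its own storage instead of being interchangeable "cattle." A minimal sketch of a manifest (the `web` names, labels, and nginx image are illustrative, not from the episode):

```yaml
# StatefulSet sketch: pods come back as web-0, web-1, ...,
# each reattaching to its own persistent volume.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                # hypothetical name
spec:
  serviceName: web         # headless Service giving pods stable DNS
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.25   # hypothetical image
  volumeClaimTemplates:       # one PersistentVolumeClaim per pod
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

That stable identity plus per-pod storage is the "keep it around" behavior being described, whatever the feature ended up being named.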
And this person submitted a bug to say, Hey, you should change the name of pets.
(23:20):
And they, they get going to this big, long explanation, like, look, you know, animal
welfare and this, that and the other and so on and so forth.
And I'm not dismissing.
Like what their feelings were, but like, that's, that's your deal, dude.
That's not our deal.
Like I, I, I get you want that, but we don't.
(23:41):
It like, that's not going to help the project.
The fact that this is the motivation.
Now that said, the name is terrible.
Exactly what you said on it.
What the fuck is that?
What is that?
What does pet set even mean?
What are you referencing?
Why are you saying this?
And so this is why having a lot of really smart friends that have been in the
field for a long time is good.
(24:02):
But every now and then a name comes out and I'm like, this is how we know you
don't talk to anybody else besides white guys that have been in tech for forever.
Who else knows what this means?
And, and here you go.
You can go search for it.
It was, I can tell by Justin's face.
He's already searching for it.
Please.
Please.
It's, it's 27430.
(24:24):
And this was, and for the time, small community.
This was a, I don't know, 50 comment thread here.
Like it was pretty, like I was a lot of annoying stuff that I love that there
was a 50 comment thread here.
There was a lot of annoying stuff that I love that there was 50.
This is just, this is the epitome of open.
(24:46):
Okay.
So when you get to the conversion, right?
The like conversion zone of like open source.
Oh no, sorry.
120, 150, 150 total comments on this.
Shut up.
I love this.
Like insane, insane.
This is like those people who are so into open source that they refuse to use any
proprietary software.
So they refuse to use Google maps or anything.
(25:08):
And you're just like, bro.
But like, okay.
How do you, okay.
I think that Google was, I won't say one of the first, but you guys, your time at
Google, I imagine that you really learned that conversion zone of proprietary
software and corporate and open source.
And I feel like we're in this kind of, I don't know.
(25:31):
I don't know if I'd say a transition, but weird time.
So like, what do you think about the politics of trying to both balance
corporation politics, but also like interacting with like open source, you
know, because it's very, it's, it makes it even more complicated when you're both
(25:52):
arguing for something internally, but then you have to go to the politics of
open source, right?
Which I think a lot of open source is corporations right now, but when you're
actually the person that's inside doing the arguing with that company, you have
to really know what that company's goals and business and leadership principles
and then fighting it in open source is a whole different battle.
(26:16):
And I feel like it's almost like sometimes if people only work in proprietary
software and they only work in open source, don't realize what it's like to
kind of, so can you speak on the whole, like how you do that and like your
experience and kind of making those worlds happen?
I think you, you touched on it earlier, right?
(26:38):
It's like, how to understand their motivations.
You know, corporate, people who are like in that open source or in the
corporation are both humans and corporate employees, right?
And so you've got to figure out how to balance all of that mess and the
(26:59):
extent to which you can help them achieve their corporate goals, but still
enable them to be humans.
I mean, that's, that's the sweet spot you're like looking to do.
I think trying to teach open source to people that are used to corporate
America, you know, and corporate internals and then trying to like show
(27:20):
the business value of open source.
It's like both hard, but like my favorite thing because there's so much
value in open source and trying to get people to invest, but also understand
the mindset of people that work in open source and what open source customers
want because it's very different than proprietary like software, right?
(27:42):
So trying to teach companies that are into proprietary software.
Like I think Google is really interesting because they, they do, they're
very academic in the papers and kind of where that flows.
But if you look at it, like, Kubernetes is one of the most thought-out,
well-built open source projects, which I think is going to really
change how that's done.
(28:03):
Like, if you look at how, like, the Linux Foundation and Kubernetes work,
you can tell that the Rust Foundation is really taking that into account when
building it.
So I feel like it's going to like change the future of how those like, so
what's it like kind of seeing that from like the start and how corporate and
open source kind of start at this marriage and how we're still trying to
navigate that.
(28:24):
I mean, I think the thing is like it's the extent to which you become part of
the critical chain of a corporate supply chain is where you start to do it.
Right.
So like, you know, to use a terrible example, you know, analogy here, right?
It's like energy.
(28:45):
Okay.
You know, corporations don't care about energy.
They don't care about electricity.
Most of them don't, right?
They're just like, all right, I plug in my laptop and it works and that allows me
to, you know, produce some Excel files.
If you ask them to, like, give a shit about, you know, what the
transformer up the street is, or what the, you know, greenness of your
(29:06):
power coming in is, they're just like, hey, that sounds great, but
unless we're very forward-looking, we're just not going to care.
Now, if you can align it to whatever their business goals are, then you're
going to be off to the races.
And so if you're like, Hey, you know what?
It turns out that putting a 10 kilowatt battery on every retail outlet means
(29:30):
that, you know, they get less brownouts or something like that.
And that improves the ability to sell and great, they're going to do that.
And now it happens to forward your other goals of being green and resilient and,
and, you know, getting rid of fossil fuels and whatever. But, like, you
don't sell it that way.
(29:51):
You sell it as, as part of this other thing that they care about.
Open source is the same way, right?
Like, you know, you're not, it is very unlikely that you're going to be able to
walk into someone and say, Hey, you know what, you should do this because this
is a social good and you need to support the Linux kernel.
You need to support, you know, nginx or whatever, because of X.
(30:12):
What you need to say is like, Hey, do you know that like 84% of our stack runs on
this open source project and we have zero upstream developers on it?
Like that seems like a supply chain risk.
We are not going to go and build this better than they do.
Uh, so let's put someone on maintaining it or, or allocate some dollars or whatever
(30:35):
it is, because that's a supply chain component.
And again, that's just one example.
There are many other reasons.
Like I'm so passionate about like trying to explain that to corporate America
because I feel like it's the only way that we're going to move forward with this
kind of, like, this new change in open source, rather than trying to license
everything and trying to make people pay for it in a different way.
(30:57):
I don't think we're going to get that forward movement that they think they
want.
But I think really showing like, uh, corporate, like companies, like
70% of infrastructure I think is built on open source.
And I think really showing people the value of like, Hey, contribute these
things upstream, learn to get into the politics of contributing
(31:19):
and being a part of the community.
And like it for one, it's like everybody is doing more with less right now.
Right.
So the more that you're contributing upstream, everybody's on the same page.
It's easier to maintain software together.
There's so much actual business value.
Absolutely.
And I just feel so passionately about trying to get people on that page
(31:41):
because I think that we can, for one, we can get people paid to be
maintainers, right?
Absolutely.
But we can all, like people are going to be doing this development anyways.
And instead of taking from open source and internally developing it,
contributing it back to open sources, not only going to make you a better
steward of that community, but also why maintain it solo and like
(32:04):
siloed when you can maintain it.
Like, look at how Log4j was, like, so, you know...
I mean, I would say also maintaining things solo is easier than trying to
get the... like, you slow down sometimes in what you're doing,
but, like, trying to get it... it depends on how big the project is.
Yeah, absolutely.
There's a tipping point because like at some point, like if you're going to
(32:26):
like legacy software, like Java, Linux, all these places.
Thank you for saying Java's legacy.
Why are you always trying to hurt my soul?
You said it.
I just said how good you were friends.
Oh yeah, she is.
Terrible.
Terrible.
Come on.
Join the 21st century.
I'm still IT.
In my defense, I haven't gotten to write Java in God knows how long.
(32:47):
So apparently I'm a Python-and-everything-else head, and decrepit C,
and all the open source stuff.
Yeah.
But like, you know what I mean?
Like if you think about it, okay:
in this world as engineers in 2025, everybody is so under headcount.
We haven't had headcount for years, right?
(33:09):
We are doing more with less. To get the extreme knowledge that you need for
some of these open source projects,
think about how big Kubernetes is.
Think about how big Linux is, how big Java is. To get that type of
specialized knowledge,
you're not going to just hire it, unless you're buying a
maintainer for a million dollars, and you'd have to have multiple of them,
(33:33):
right?
So if you can get, if you are putting that money towards getting developers to
learn how to like become parts of those ecosystems, you now are like a force
multiplying your developers because you're maintaining it and you're
contributing to this ecosystem.
So if you have four corporations that are huge and they have the smartest
(33:54):
minds and they're all now adding to this open source project, really, that's a
force multiplier, because when you have a horrible bug like Log4j or something,
if you have four smart, huge, like not huge, but smart teams, right, that are
working on the same problem, that's now more secure than it would have been if
you siloed that.
Yeah.
You know what I mean?
(34:15):
That's like the whole reason for the CNCF as a foundation, right?
So that these big corporations can work together in a neutral place because...
But we have to teach people that because a lot of corporate America just thinks
of like, I'm going to go pull this repo down, pretend like I didn't pull it down.
And then, you know, and it's like, bro, it's better for your business.
It is more business value for you to contribute these things upstream.
(34:39):
I know you're going to have a little bit of politics, but hire someone who knows
how to do that.
They are in the market right now and they could use a job, you know what I mean?
And do the work because it really is a force multiplier when you look at it.
You know what I mean?
So David, we're going to skip over everything you did at Microsoft because it doesn't work.
And we're just going to jump right into it.
I want to know all the things.
Tell us all the cool things.
I will talk for as long as you'd like.
(35:01):
I'll come back for part two or whatever.
I'm not that interesting, but I love hearing my own voice because I'm narcissistic or
I don't know.
I'm not going to pretend that at any of these places I haven't
gotten into the politics.
All right.
Kubeflow.
There's a lot of drama there.
Oh, you are like my favorite type of human because not only do you have all the intellect,
(35:24):
but you actually have a personality, right?
Because like, bro, you all know that sometimes you get the engineers and you're just like,
oh, good Lord.
Pulling any kind of personality and socialness out of them is just like, it's so hard.
That is incredibly kind.
I don't know.
I don't know if I deserve that, but thank you.
(35:45):
So I will skip over all that stuff.
What else would you like to?
So I'm kind of curious.
What point did you decide that data co-located with compute was a problem to solve?
And what are you doing that's new?
Out of all the interesting questions you could ask David, like all the tea, look at his eyes.
Like they're just, it's hiding behind the eyes and it wants to come out.
(36:10):
And the funny part is like, it's funny that we started with Kubernetes because that was it.
Right?
It's one of these things where you're like, when you hear something for like 10 years
and you're like, oh yeah, I actually heard this problem a million years ago.
So a little bit more history for you.
When we were, when I was at Google, like one of the first features that I launched, I wanted
(36:32):
to launch, one of the first PRDs I wrote when I was at Google on the Kubernetes team, was a
product called Ubernetes.
And it was the idea of how do you federate Kubernetes clusters together?
Right?
How do you have an API server?
I'm sad this didn't work out because the name is cool.
I mean, it was like, it was genius.
It was a genius name, Ubernetes, right?
But the problem was that these things don't work together, right?
(36:57):
Kubernetes is incredible.
It's not going anywhere, but it is built for a world in which nodes have continual connectivity
between themselves and the API server.
And if that connectivity goes away for even a small amount of time, Kubernetes is really unhappy.
And, and so then you had someone come along like Brian from Chick-fil-A who has that amazing
(37:20):
2017 talk about how they have a Kubernetes cluster in every single Chick-fil-A.
And they still have that today, right?
It's incredible.
Doesn't Walmart have that too?
Yeah.
Almost everybody does, and a bunch of other people too.
And it's all kind of insane, right?
Like you're kind of like, hey, you know, there's something weird about this, right?
Why isn't there a platform that sits over the top?
(37:41):
And, you know, when I was thinking about it, it really was around, you know, data, right?
And it's the, it's the idea that like data is the challenge here.
I like to say there are three things that will never change, right?
One, data growing, not really in dispute, data will continue to grow.
But, but the key is that it will grow everywhere, right?
Not in a central data center in Iowa or Oregon or Tokyo, but, you know,
(38:06):
cross-zone, cross-region, cross-cloud, on-prem, edge, IoT, blah, blah, blah, blah.
Like data is coming from all those places and Kubernetes is not in those places,
nor are any of these other giant data warehouses.
They're all sitting inside of your massive data center as they should.
But somehow that data is there and you got to figure out how to get it in.
And it can't just be a log shipper.
Cause guess what?
(38:27):
Look at what happened with Log4j.
Exactly what you're saying earlier.
You ship raw stuff from the edge.
It goes to your central data warehouse.
Bam.
Security vulnerability.
Guaranteed.
And we haven't even gotten into the other things, right?
So that's number one. Number two: speed of light, not getting any faster.
Right?
Just it is what it is.
In 10,000 years, it will still be 49 millisecond ping time between Boston and LA.
(38:52):
It just will.
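As a back-of-envelope check on that number (the distance and fiber speed below are rough assumptions, not measured values):

```python
# Physics floor on cross-country latency: light travels at roughly 2/3 of c
# inside optical fiber, so a Boston-to-LA round trip can never beat this.
C = 299_792_458          # speed of light in vacuum, m/s
FIBER_FACTOR = 0.67      # approximate slowdown inside glass fiber
distance_m = 4_200_000   # rough Boston -> LA distance, meters

one_way_s = distance_m / (C * FIBER_FACTOR)
rtt_ms = 2 * one_way_s * 1000
print(f"theoretical minimum ping: {rtt_ms:.0f} ms")
```

Real routes are longer than the straight line and add switching delay, which is how you get from this floor of roughly 42 ms up to the observed ~49 ms. No hardware upgrade moves the floor.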
And so if you want to do anything that is faster than that, you're going to need a system
that like can take the action remotely.
But on top of that, like networking is just not keeping up.
And it's not because they aren't out there busting their ass.
It's cause data is growing even faster.
And then the CAP theorem gets you every time.
Sorry, go ahead.
The CAP theorem gets you every time.
(39:14):
CAP theorem gets you every time.
Thank you very much.
So then the third is around security and regulations and those are growing, right?
So GDPR and HIPAA and things like that.
You, you tend to put yourself at risk the moment you move a bit off wherever you generated it.
Right?
Now you, you're a wonderful segue because that's exactly it.
Every major platform today is built around the C and the A of the CAP theorem, right?
Consistency, or consensus, whatever you want to say, and availability.
Consistency and consensus, whatever you want to say and availability.
Right.
That's amazing.
Something should be built around the other half of it.
Availability and support for network partitioning.
And that's because of all those things I just said.
And, and, you know, when you go and look at the Chick-fil-A example or the Home Depot or
the millions of other folks out there who have these multiple deployments, retail outlets,
(40:00):
manufacturing, security, et cetera, et cetera, this is the problem.
Right?
Because that data is over there.
I want to do things declaratively.
I want to take action over my data before I move it, but I still want it to move.
So how do I do that when the network could go away for a minute, an hour, a day, because
(40:21):
I'm going to put a backhoe through something, who knows what. I want those
systems to keep working.
But when they reconnect, I want someone to like eventually bring this to consistency.
And that's what we provided.
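The keep-working-while-partitioned, reconcile-on-reconnect idea can be sketched in a few lines. This is a toy illustration of the general store-and-forward pattern, not Expanso's actual API; all the names here are made up:

```python
import collections

class StoreAndForward:
    """Toy sketch: keep accepting records while the uplink is down,
    then drain oldest-first once connectivity returns, so the backend
    can eventually be brought to consistency."""

    def __init__(self, send):
        self.send = send                    # callable(record) -> bool, True = delivered
        self.backlog = collections.deque()  # records not yet acknowledged upstream

    def emit(self, record):
        self.backlog.append(record)
        self.flush()

    def flush(self):
        # Drain in order; stop at the first failure and keep buffering locally.
        while self.backlog:
            if not self.send(self.backlog[0]):
                return  # still partitioned
            self.backlog.popleft()
```

The local node never blocks on the network: records pile up in the backlog during an outage and replay in order when `send` starts succeeding again.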
I also think that's even more value because for one, the more you spread your data, the
more you're spreading your attack surface.
You know what I mean?
Absolutely.
(40:42):
And then I think that networking and security are both things that developers aren't always
educated on and they're very like in depth areas, right?
So the more that you can do in those areas to set them up for success, the better because
and now that we're going to have more agents just, you know.
Doing whatever agents do?
Yeah.
(41:03):
Well, the thing is, we are struggling to teach developers security,
right?
So now if they don't understand all of the security, and then they are giving permissions
to agents that they already don't understand, it's just a recipe for, the more that
you're scaling this, yeah, you're scaling it, but you're scaling disaster in some ways.
You know what I mean?
(41:24):
So it's like, I think that this is going to, that has like so much value just on so many.
I mean, I'm sorry, please.
David, what you just described is basically just the, all the benefits of edge computing,
right?
Yeah.
Like we can, we can, how do we get more compute at the edge?
How do we make it more powerful?
How do we make it easier to manage from a central place?
(41:45):
So what is it that you're doing that's different than, I don't know what to call it, traditional
edge computing? Because the ideas behind edge computing are just: put compute
closer to where the data's being created, shave milliseconds, have better storage, whatever.
Yeah.
So, you know, we don't provide the edge compute.
That is my, what do you call it here, by the way, my visual aid, my Raspberry Pi, right?
(42:07):
That we run great on.
Well, I don't know why you're coming apart here.
That's weird.
It's a Raspberry Pi.
Yeah, exactly.
But no, you're exactly right.
And what we provide is the distributed consensus layer.
And what that means is that like, turns out that that thing that you put on the edge is
(42:28):
wonderful, but how do I know what is there?
How do I know it stays running?
How do I change the configuration?
How do I do all this in a network and disconnected friendly way?
That is the challenge of distributed computing.
That is 40 years of academic research.
And what we give you is a Kubernetes-like, container-orchestrator-like experience, but one that
(42:51):
is resilient to eventual consistency.
And so if something happened on our side while you were disconnected, or if something happened
on that side while you were disconnected, you know, we reconcile that. One
of the big things that we keep seeing is what we call intelligent data pipelines, right?
Where you can much more intelligently do some, not all, some, of your processing before you move
(43:15):
it.
So a trivial example I talk about all the time is a lot of times factory owners, for example,
will have all these sensors all over the place and they're great.
But the sensors usually come straight from a warehouse in Shenzhen and get installed immediately,
right?
And they have no information, you know, no GPS, no schema.
(43:36):
A lot of times they'll just output like a raw text string, and you jam that
into a backend, right?
Well, geez, the moment you do that, you now have to reverse engineer all of the information
that you lost along the way, right?
What kind of machine was it running on?
What was the firmware?
What was the disk?
You know, where in the room was it?
(43:57):
You know, so on and so forth.
And so if you took something, right, you took one of these Raspberry Pis, you stuck it inside
that factory and you said, Hey, you know what, before you send this raw into the back end,
send it to this local machine, local being within the region, and attach some
metadata to it and do some initial data model enforcement and do some schematization.
(44:19):
So change it from a flat text string into JSON or, you know, structured logging or whatever
and take your pick, but still go to the backend.
All you're doing is moving work earlier. Like I say, you know, you may have a 17-step ETL pipeline
and all your enterprise customers are like, Yeah, right.
Add a zero buddy, right?
But you take those first, I don't know, four or five steps of your pipeline,
(44:42):
data and model enforcement, schematization, adding metadata, adding provenance, adding location,
filtering, aggregation, just do some of those things before you move it.
And magically, all those things downstream become better, faster, smarter.
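A minimal sketch of that kind of edge-side preprocessing step. The field names, plausibility range, and node metadata here are all hypothetical, just to show the shape of "flat text string in, schematized JSON with provenance out":

```python
import json
import time

# Hypothetical metadata this edge node knows about itself; in the example
# above, this is exactly what the raw sensor string is missing.
NODE_METADATA = {"site": "factory-7", "line": "b3", "firmware": "1.4.2"}

def preprocess(raw_line: str) -> str:
    """Turn a flat 'sensor_id reading' text string into schematized JSON
    with provenance attached, before it ever leaves the site."""
    sensor_id, reading = raw_line.split()
    record = {
        "sensor_id": sensor_id,
        "temp_c": float(reading),    # basic data-model enforcement
        "ingested_at": time.time(),  # provenance: when we saw it
        **NODE_METADATA,             # provenance: where it came from
    }
    # Cheap filtering at the source: reject physically implausible readings.
    if not -40.0 <= record["temp_c"] <= 200.0:
        raise ValueError(f"implausible reading: {record['temp_c']}")
    return json.dumps(record)
```

Everything downstream still runs, it just receives structured, annotated records instead of raw strings it has to reverse engineer.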
You can multi-home stuff.
So a lot of times, for example, you know, you might have all these sensors pushing
(45:05):
into the back end, no matter how fast your pipeline is, it might take, you know, five,
10 minutes to ultimately go all the way through the pipeline, very, very common, not because
the pipeline isn't like busting its ass, but because it needs to aggregate from all these
sources before it does anything.
Imagine that you have a factory floor where four different sensors are simultaneously saying,
(45:27):
Hey, you know what, we're above our temperature threshold, right?
Um, do you want to wait 10 minutes to know that?
Wouldn't it be nice if you could trigger an event from that location?
We can do that for you.
And again, it's just by taking some of this and moving that out there and saying, Hey,
you know what, we're still going to send all the raw data back, but simultaneously, we're
(45:48):
also going to, you know, trigger PagerDuty or whatever, take your pick, FireHydrant,
you know, any kind of other endpoint, SQS, we're going to trigger that from this
location.
We can help you do that too.
Again, we're not like stopping the rest of it.
We're not even, you know, we can save you money if you want, we can reduce data if you
want, we can do other things, but even just putting us there helps you be intelligent
(46:10):
in a more distributed way.
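The "several sensors over threshold, alert now instead of in ten minutes" example might look something like this sliding-window check running on the local node. The threshold, window, and the injected `alert` hook are hypothetical, purely illustrative:

```python
import time
from collections import deque

THRESHOLD_C = 80.0   # hypothetical "too hot" line
MIN_SENSORS = 4      # alert when this many distinct sensors exceed it...
WINDOW_S = 60.0      # ...within this sliding window

_recent = deque()    # (timestamp, sensor_id) of over-threshold readings

def observe(sensor_id, temp_c, alert, now=None):
    """Every raw reading still flows upstream as usual; this only adds a
    local, immediate check so nobody waits on the central pipeline."""
    now = time.time() if now is None else now
    if temp_c > THRESHOLD_C:
        _recent.append((now, sensor_id))
    while _recent and now - _recent[0][0] > WINDOW_S:
        _recent.popleft()              # age out stale readings
    hot = {sid for _, sid in _recent}
    if len(hot) >= MIN_SENSORS:
        alert(sorted(hot))             # e.g. a webhook to your pager or queue
```

The alert fires from the factory floor in milliseconds, while the unchanged raw stream still aggregates in the backend on its own schedule.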
And to your point earlier about what edge compute, uh, doesn't do here, it's not that
edge compute isn't critical to this.
We can't operate without some form of edge compute.
It's really about the orchestration of that job.
So again, let's say you're like, Hey, you know what, I want to change this from being
(46:32):
four, you know, sensors going bad to five sensors going bad.
Imagine what that's involved today.
Right.
How do you actually push that down?
How do you know what version of this job is running?
How do you know what the last time you saw this error is, whatever it might be.
Um, all of that is hard to get down to these places.
And we give you a clean API that is resilient to these networks that gives you, you know,
(46:56):
full and rich intelligent pipelines at the edge, and helps you push that stuff through.
Uh, and by the way, when people talk about the next trillion devices that are out there,
all of which are doing inference, either way, we do that too.
Right.
Like, because at the end of the day, you know, inference is just remote data being processed
and we help you do that as well.
(47:19):
I think this is going to be really cool too, because now with AI,
we want to get data from everything.
It's the promise of the future, but to be able to analyze all that, people don't understand
distributed systems.
Like, IoT is one aspect,
but then you have the retail aspect, where you don't want to charge people's cards more
(47:42):
than once.
So you have to have that eventual consistency and really worry about how you're doing it. So, like,
I made an order from Happy Lemon the other day, and I was on my way to a tattoo
appointment and trying to speed this up, because I'm always late,
but I ended up getting there and I'm like, why isn't my order ready?
So I go in and buy it, and then all of a sudden my order comes through, and I think it was
(48:05):
because they had a connection issue, and I'm just sitting there like, something's going
on in your back end, you have an eventually consistent database. But I'm sitting
there in Happy Lemon, trying to figure out what's wrong with their back end.
This is, this is why we're friends, because I often do the same thing.
I was diagnosing their problem.
Is this a backup job?
(48:26):
Is this, do you need more workers?
Is this a Kafka something?
Yes, but see, the thing is, people don't realize that
when you're streaming that amount of data, like David's talking about, you now get this
bottleneck when all of it starts to come back in. It's so bad.
So the fact that you're doing, like this is another, it's almost like the concept of breaking
(48:50):
things into different pieces, just kind of compartmentalizing it.
What's the word I'm looking for?
Like, wouldn't you break up things, like when people want to do service-oriented,
so you're not a monolith, but you're kind of in different services?
Yes.
Not, but what's the other way to do it?
Not microservices.
So a service oriented architecture.
Sort of, but like basically you're keeping it so that way when one area breaks, it doesn't
(49:15):
completely break everything else.
But if you're processing this, for one, people are bad at processing
and schemas and all of that stuff anyway. But if you're doing some of the work in these
individual places, then when you get backed up and you send it all to the same pipeline,
you're now not creating the same stress and bottleneck on your pipelines.
And because we're going to get more and more and more data and people are going to want
(49:38):
to like do all these crazy things with it, like that's great, but it's going to cause
more and more stress.
Like, at this point, Jenkins is not made for the amount of crazy
stuff that we do with it, cause think about it.
It wasn't made for anything.
It wasn't, but it was originally made for Java a million years ago.
And now people are trying to use it in this new modern way or just pipeline services that
(50:01):
were not made for this amount of heavy data streaming that we're doing, you know, so
we're...
David, what do you think the next bottleneck is?
Right?
Cause I do think that data is the obvious one, and connectivity too, especially if you're looking
at edge, right?
You're like, Oh, there's some capacity limitation in an edge environment, whether that's compute,
whether that's data.
That's kind of like what we were talking about with LA too, though, about how
(50:23):
hardware is moving faster.
Like, I don't know if the different parts of computing are in sync with how fast some
are moving, you know. It's interesting to see.
Yeah.
Which one do you think is going to outpace the other thing?
Cause like you said, the speed of light is never going to get faster, but
the pipes are getting a little bigger.
But is that a, do we just need better compute to compress that?
(50:44):
No, again, my personal opinion is, first, you know, it's not about
the pipes.
It's about the latency.
Right.
And that will never change.
No amount of compute will ever improve latency, cause the speed of light can't get any faster.
They'll lie to you and say, well, if you just do
this, and I'm just like, no, that's not how that works.
(51:05):
But that said, the fact is, one of the
things that I talk to a lot of people about is that the compute is also unused.
Like, a lot of this stuff, again, this Raspberry Pi can do a ridiculous
amount of throughput, like ridiculous, far more than people would think.
(51:27):
And you're like, well, shit, you know, I already have it out there.
It's already doing this other stuff.
I might as well.
And what I would say is, I will contest your point.
Like, yeah, pipes are getting better, but the data is getting bigger even faster.
That's all I'm saying.
And so we're going to be making just useless
data, because I think people really believe data is the most valuable,
(51:50):
you know, commodity, but also because we have all of these sensors and we have all of this
AI trying to make all of this data.
I think we're going to have so much compute
power with all these data centers. I don't think people
realize how much infrastructure and data are growing.
(52:11):
Yeah.
Totally agree.
Totally agree.
So all I want to say, though, Justin, is it
will be a challenge in a year, two years, five years' time to have anything, even this
Raspberry Pi, not have acceleration on it, right?
Like just at current power, right?
(52:33):
It will still have acceleration, and maybe that's because of the system-on-a-chip or whatever
it might be.
But then you're like, it's not that I don't want to use my Blackwell plus plus, you know,
whatever as a central thing, but why don't I have it work on the more interesting problems
and have whatever a GPU that's sitting on this thing do some of the work?
Like, I gave a demo earlier this week that showed onboarding from 20 nodes,
(53:01):
on three different clouds, using Expanso, to BigQuery,
for example, right?
As a backend.
And I was able to push 27 million rows in under 40 seconds, right?
From all of these things.
And that's not because, you know, there was something magical happening here.
It was because I was adding together the aggregate of all that bandwidth at the same time.
(53:27):
And there's just, there's no way to like make that any faster.
Like no amount of network will ever achieve what you can do with the same network multiplied
times, you know, the number of nodes I have, right?
It's just the way that works.
So like I would argue, yeah.
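For context, the arithmetic on those demo numbers, taking the 20 nodes, 27 million rows, and 40 seconds at face value:

```python
# Back-of-envelope on the demo figures quoted above.
rows, seconds, nodes = 27_000_000, 40, 20

aggregate = rows / seconds     # fan-in rate the backend absorbed
per_node = aggregate / nodes   # what each edge node had to sustain

print(f"{aggregate:,.0f} rows/s aggregate")
print(f"{per_node:,.0f} rows/s per node")
```

Each node only has to sustain a modest ~34,000 rows per second, but the sum arriving at the backend is a 675,000-rows-per-second firehose that no single pipe delivers economically.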
I was just like, that's exactly the age old problem we've had in infrastructure where
(53:49):
you can vertically scale one machine and say, I need, I need this big Oracle database on
that one machine and it needs a terabyte of memory, or you can go with, give me five different
racks and I'm going to spread that amount of memory across them.
And there is overhead for the coordination, but we've found that's just better performance
in general for resiliency and for all these other things.
(54:11):
So being able to spread out that and aggregate it is obviously like we can't make single
machines big enough.
We can't make single pipes big enough that are economical, right?
Cause, like, Japan just broke a record.
They're doing like a, like you could download all of Netflix in under a minute.
Right.
It's like, yeah, it's like a petabit of throughput, but I can't buy that,
and no one's going to buy that one.
(54:32):
They're like, actually, I'm just going to go spend it on 10 gig networks everywhere
rather than one giant pipe.
I think this is interesting.
I think it's interesting if you think about this conversation we're having with
LA, about how much hardware has advanced, right?
And now everybody really wants to run their own stuff.
But I think that we were almost, like, optimizing the different hardware, because
(54:55):
for a long time everyone was using cloud.
So they were building it for these cloud companies and then for AI making this really advanced
hardware that people weren't playing with and experimenting as much because they were using
it in the cloud.
And now I think people are getting almost, like, reacquainted with hardware and what
hardware can do.
It's really interesting to see how people are going to push the limits with this hardware
(55:17):
because it's so optimized for AI and cloud and all these different places that it was
being used in.
And now developers and startups are getting that actual hardware back in their hands.
Yeah.
And I think it's going to be interesting.
Like you said, like what Raspberry Pi can do, you know what I mean?
Yeah.
Absolutely.
Absolutely.
(55:38):
And there's a threshold of when that hardware advancement becomes just universally available.
That's what I'm saying because there's so much of it.
It's so cheap that it's economical for me just to, like, redevelop something so that
it uses that.
I remember at Disney animation at one point we were switching out hardware because we switched
out racks and racks of servers because the new chip had this.
I forget what the math was, but it was some function that we do a lot in rendering on
(56:01):
the CPU.
And we're like, actually, we will just render this movie like 10% faster by swapping out
all these CPUs.
And it's going to cost us millions of dollars to swap out all these CPUs, but we're going
to get the movie done in time versus delaying it.
And that's absolutely worth it or whatever.
It's just like, yeah, no, we're going to do that part in hardware and no longer do it
in software.
I think we're going to see a lot of that.
(56:22):
Like if you look at how Apple is processing AI inside of the iPhones just because that
way it's technically safer and it's more secure and they could promise more.
But Siri's still the dumbest one.
I don't know.
But still, if someone told you 10 years ago
that you were going to be able to run an AI model and have it processed in a chip
(56:44):
that's in your pocket, that's still amazing.
I don't care what you say.
I mean, there's, oh, sorry, please.
But just, you know what I mean?
Like hardware is changing so much.
And there are whole developers that went half a career, or young developers,
maybe mid-level developers now, that have never gotten to play with that kind of hardware.
And we're making it at such a fast speed and they're making these chips for AI that so much
(57:07):
hardware is going to now, like, either be overproduced at some point
or they're going to get rid of all the old hardware and now it's going to get so cheap.
I keep saying, I'm so excited for three to four years from now with all of these GPUs
hit the secondhand market.
Yeah.
Everyone has super powerful, like, NVIDIA chips, like actually I can run so much
(57:28):
stuff.
Like the, just the way that the tech market's been really weird and all of that.
Like I just wonder what cool advancements are going to come out of that, you know, moment.
Yeah.
You know, it's, it's, it's fascinating.
So there's an amazing story.
I remember way back when, the Plenty of Fish guy.
(57:49):
Does anyone remember Plenty of Fish? It was a dating site, and it was like a competitor
to OkCupid and all these various things out there.
And it was really, really popular and very, very funny.
Like at the time this is way before containers and things like that.
He had built the entire thing to be a massive vertical machine.
(58:13):
I think I was at Microsoft the first time, and I think he
had the largest, like, single-instance thing that we knew about. It was seriously
like 128 CPUs or something like that.
It was like just an absurd thing.
And everyone's like, why don't you build this out?
And he's like, cause I don't need to and it's easier.
And then you go to, like, Monzo's thing, that graph of microservices that
(58:42):
everyone got up in arms about a few years ago.
Oh my God, why are you doing all these microservices?
Like, and, and they're not wrong.
You know, you don't get complexity for free. That's a cost.
But the cost may be worth the benefit.
Right.
And so, exactly like you were just saying, Autumn, we're going to have all this
compute out there, but it needs
(59:06):
to be sublinear cost in scaling: every additional machine should not cost the same.
It should be a very, very small increment, and a benefit.
And, and you know, I'm trying to participate in that by like offering this platform that
helps with that, but like Kubernetes certainly did that.
And, and, you know, Docker with its portability certainly did that and all these kind of things.
Right.
There's just a bunch of different ways to go out and tackle this thing.
(59:29):
And so what I would say is that, you know, you're exactly right.
We need to enable this, but we need to enable it smartly because it is really, really, really
easy to go and get a massive machine, two terabyte machine and just stick everything
on there.
It's a piece of cake: a single machine.
I know where to log in.
(59:50):
I can manage the entire thing with SSH and call it a day.
That's not particularly efficient, but it's also not terrible.
But also just a lot of attack surface, exactly.
Exactly.
They're going to have so much fun with you.
But what I like to say is, the reason I built this company is that there are these immutable
truths.
(01:00:11):
Like the fact is, no matter how great that machine is, my entire business does not exist
co-located with that machine. Like, I have real physical things in the world, whatever
they may be: users that are accessing my website from all over the world, or retail outlets,
or hospitals or factories or cars, like that stuff is happening too.
(01:00:35):
And, and so then people are like, well, don't worry.
I'll just take all that stuff and I'll build a digital twin and I'll just mimic all that
stuff.
And I'm like, oh, no, see, that's not it.
But like, you want to be redundant, but that's not all of it; just making an exact copy
is not always it.
100%.
I think that we're, I think we're going to be into a weird space though, because we're
(01:00:57):
removing so much abstraction, like cloud was an abstraction from hardware, but AI is
like a super abstraction.
And we're not only forcing engineers to use it, but we're going to grow a whole like
generation of developers on AI.
So you've got people that are either experimenting with hardware and they're in the like thick
(01:01:18):
of it, or they are even more abstracted than the whole generation of developers that we
just had that came into the cloud.
So how will we kind of like educate people on how to use those things, because like you
said, it's really easy, especially with the money that people are throwing at certain
like things where they're just going to buy this huge hardware and put everything, you
(01:01:39):
know what I mean?
Because it's going to be simple and it's going to be less permissions to give AI.
And like, how do we even educate people to do that?
Absolutely.
Absolutely right.
Yeah.
There's just, I mean, you just put your finger on it, what do you call it?
And the key is, again, you touched on it.
The education will be giving people a full picture of the world, right?
(01:02:02):
Like, you know, when we were first trying to get people to adopt the cloud, there were
so many times people like, Oh, I don't know.
You know, I got these machines and so on and so forth.
And we're like, you know, like they would be like, why would I go out and pay for something
when I already have the assets in house?
And, and the conversation was like, well, are you really capturing what you have and
what you're doing?
Like, do you want to go and reboot the machine at three o'clock in the morning?
(01:02:26):
Do you want to migrate, you know, the OS, the hypervisor,
et cetera, et cetera, when you have to?
Again, no one's saying that's not an answer.
But when you're doing this, you need to think about the entire scope of the problem and
capture all the costs.
Because if it's just this, that's not going to be enough.
(01:02:47):
And so that's very much what it is.
I think that's like the thing we get these trends and everybody wants to do it.
And then they never like, yeah, you can be in the cloud, but then you have to think about
how expensive the cloud is and the fact that you're like abstracted.
But then you get on-prem and then you have to figure out, do you have a DBA to run all
this stuff?
And like, you have to be able to be good at kind of figuring out your future.
(01:03:10):
But okay.
So when you're talking about Kubernetes and Docker, it made me think of Corey Quinn's
SCALE talk, where he compared Docker to, was it, like, an F-14 or something?
I didn't see it.
But he basically like was comparing Docker and Kubernetes and basically said it was like
the worst, but it was so funny.
It was basically he was just really explaining Kubernetes and Docker and the differences.
(01:03:34):
But what do you think the next, like, what is the next big, because everything in tech
is like, databases are databases.
Compute is compute.
What's the next big, I guess, like, revolution in compute and Kubernetes and Docker?
Obviously, I personally have this general opinion, right?
(01:03:58):
Obviously, I think edge and distributed things is going to be enormous.
And I very much hope to be a part of that.
Because again, I love building on things that can never change, right?
All those things I said earlier will never change.
Data will grow, the speed of light won't get faster, regulations will be out there, all that kind
of stuff.
(01:04:19):
I mean, we're so old that we like the old constant things.
Does that mean that we're old people now?
No, it's because like you just recognize it now.
Yeah.
Well, I mean, it's one of these things where it's like, you know what, the secret sauce
behind Manhattan, like why it gets such tall buildings, is because
they know where the bedrock is, right?
And so they're able to drill down to things that will never change for better or worse.
(01:04:42):
And so like, I think that's critically important to understand the things that will never change
and then figure out what will happen inside that is what will be next.
So my take on it is, and again, I say this as a highly biased person, it will
be, how do I act like a cloud, but in a thing that respects those immutable truths and matches
(01:05:03):
where the data and the compute and the things actually are growing.
And so when you see Jensen stand on stage and talk about the next, you know, trillion devices
where you talk about, you know, me being able to have instant response and instant memory
on my phone or whatever it might be, that's not everything going back to a central API.
(01:05:24):
That's, that's those things out there having smarts at a level that, you know,
feels integrated.
And again, it's where it gets to that sublinear scaling, because, like, I'm telling you, the Gemini
and the Anthropic people, they're like, they don't want to be out there, like, you know,
managing why, you know, something isn't working at some factory in, whatever, you know, the
(01:05:48):
Philippines, they want to have like a very easy way for someone out there to deploy a
model and run it and have it be reliable and debug it and, you know, have metadata around
it, which is the other thing that I think is super lost and something that we support
actively.
But it's, it's, I think that everything is a graph, right?
(01:06:12):
And that, you know, all these things workflows and transformations and so on are super under
invested in by us as a community.
And it really is it distills down to the simplest possible thing, which is here's this artifact,
this binary thing that I want you to run.
(01:06:33):
And as you do that, I want to record, in a way that is programmatic for the computer
to understand, what went into that thing, what the thing did, and then what the output was,
right?
And simply having a structured way to approach that will change so much. It will change CI/CD,
(01:06:54):
it will change execution, it will change all these various things.
And there are many, like, larger efforts around stuff like this. OpenTelemetry is a perfect
example, right?
Where you start to think about things as traces and so on.
But I do think that when you hear the word reproducibility crisis, or you see someone
at two o'clock in the morning trying to figure out what the fuck is going on and why things
(01:07:16):
are hard to debug.
It's almost always that problem.
I don't know what went into this thing.
I don't know how it ran.
And I don't know how it came out in a deterministic way.
And if you don't have that, we will continue to like try and build these like incredibly
hacky scripts to parse stack traces to figure out what the fuck was going on.
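The structured record David describes can be sketched in a few lines. This is a purely hypothetical illustration, not Expanso's or any particular tool's API (the function names and JSON fields here are invented): run an artifact, and capture what went in, what it did, and what came out as machine-readable data instead of a stack trace to parse later.

```python
import hashlib
import json
import subprocess
import sys
import time
from pathlib import Path

def sha256(path):
    """Content-hash a file so inputs and outputs are identified by bytes, not names."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def run_with_provenance(cmd, inputs, outputs, record_path="provenance.json"):
    """Run a command and record, as structured data, what went in,
    what happened during execution, and what came out."""
    record = {"cmd": list(cmd), "inputs": {p: sha256(p) for p in inputs}}
    start = time.time()
    proc = subprocess.run(cmd, capture_output=True, text=True)
    record["exit_code"] = proc.returncode
    record["duration_s"] = round(time.time() - start, 3)
    record["stdout"] = proc.stdout
    record["outputs"] = {p: sha256(p) for p in outputs if Path(p).exists()}
    Path(record_path).write_text(json.dumps(record, indent=2))
    return record

# Hypothetical usage: copy a file and capture the whole run as one record.
Path("in.txt").write_text("hello\n")
rec = run_with_provenance(
    [sys.executable, "-c", "open('out.txt', 'w').write(open('in.txt').read())"],
    inputs=["in.txt"],
    outputs=["out.txt"],
)
```

At two in the morning, the debugging question then starts from the record (which hashes went in, what exit code came back) rather than from a log grep.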
(01:07:41):
Do you think AI is going to contribute to that problem?
No, I think it'll be much worse because, because at its core, AI is not deterministic.
And so I mean, like contribute to making more of it.
Like, so, okay, technically think about it.
If AI starts vibe coding a bunch of these things.
Oh, and we're going to be like, you know, we're already increasing the amount like of agents
(01:08:03):
and different things that are coming back to us.
When you don't know what the expected output should be.
Absolutely.
It's really hard to diagnose a problem.
So I think that not only are you onto things as just development in general, but like that
is going to almost be like multiplied by the new way of development.
Yeah, I totally misunderstood what you're saying.
Yes, exactly.
(01:08:24):
It's the, it's the fact that those models are not deterministic that are so brutal.
And, and whoever breaks through the determinism around AI.
I mean, you can do it.
You can get close with things like rappers and things like that, but it's not there.
And I think the thing is, is it's hard to be deterministic as a developer.
(01:08:46):
That's a human because there's so many ways to build things, right?
Like, and there's so many ways to, like, argue about which way is right. Half the time you're like,
is it because you did this before?
Is it because you like this method?
You know what I mean?
So then the fact that humans can do it.
Yeah, there's another incredibly smart friend of mine who's, who's right up here.
Who's saying exactly what you're saying.
(01:09:07):
And, and, you know, the new hotness around ML, so his name's Hamel.
He has courses on this and things like that.
It's all about evals, right?
That is such a brutally important and totally missed thing by a lot of the people adopting ML and AI right now,
which is like, how do I programmatically verify that this model does what I said it should do, right?
(01:09:32):
Like, unless you have that, like, do not even begin to go down the ML path.
Because my God, you know, like, unless you put a human in the loop, which is fine,
you're never going to be able to, like, train or build your model in a sensible way.
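A minimal sketch of the eval idea David is pointing at, assuming nothing beyond a model being any callable from prompt to answer (the harness, its names, and the stand-in model are invented for illustration):

```python
def run_evals(model_fn, cases, threshold=0.9):
    """Score a model against known-good checks and gate on the pass rate.

    model_fn: any callable prompt -> answer (could wrap any model API).
    cases: (prompt, check) pairs, where check is a predicate on the answer.
    """
    passed = sum(1 for prompt, check in cases if check(model_fn(prompt)))
    rate = passed / len(cases)
    return {"passed": passed, "total": len(cases),
            "pass_rate": rate, "ok": rate >= threshold}

# Stand-in "model": uppercasing text, so the eval loop itself is easy to follow.
cases = [
    ("hello", lambda a: a == "HELLO"),
    ("abc",   lambda a: a.isupper()),
    ("42",    lambda a: a == "42"),  # digits pass through upper() unchanged
]
result = run_evals(str.upper, cases)
```

The point is the shape: before shipping a model change, run the same checks programmatically and refuse to proceed if `ok` is false, instead of eyeballing outputs.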
I think I'm trying to figure that out.
Like, how do you use AI to be faster at things, but also the fact that you have to then go verify.
(01:09:54):
Is it faster?
You know, like, I'm trying to figure out how do you use it to learn and to get better at things
without just losing the abstraction and the knowledge that you need to gain?
I mean, you know, that piece that came out, I think they got it pretty wrong about the, like,
oh, you know, coders are slower when they use the ML and so on.
(01:10:16):
I think that missed it.
Like, because it didn't really represent the way that people do this, right?
Like, what they'll do is they'll, like, you know, vibe code something,
and then they'll like try and compile it, or then they'll lint it, and then they'll actually run it,
and then they'll Google like something, you know, and see whether or not this was a good approach.
And then they'll go back to vibe code some more.
Right.
(01:10:37):
So it's like, it's not this like vibe code only or hand code only.
It really is a mix and match.
And, and right now the only way to solve what you're describing is with that human in the loop where they look at the thing.
They become the evaluator.
Do you think we're going to break Stack Overflow though?
Because this is what I was conspiring and thinking about last night.
Okay, like, I mean Stack Overflow, I love it to death.
(01:11:00):
I listened to it.
I listened to that podcast from day one.
But like the website, like we so many developers kind of depend on the fact that we're all hitting these issues, right?
And someone hit the problem before you, and then we wrote it down somewhere and it's like the,
it's the notebook that we all go and look in and we're like, hey, did you have this error?
I'm old enough that that notebook has moved a couple of times on the internet.
(01:11:22):
Like if you're, if you're of a certain vintage, you remember.
But it's the fact that no one's going to write it down.
David, do you remember Experts Exchange?
Of course.
Yeah.
Experts Exchange was the expert sex change.
Yeah, exactly.
But the models are getting it though, because we're like, you know what I mean?
Like, like if nobody asks questions to like other humans anymore and we're only asking it to AI,
do we then break the loop of knowledge?
(01:11:45):
You know what I mean?
I mean, what's going to happen is I already know what's going to happen, right?
One of these models right now, it's probably going to be Claude, right,
will get the majority of the questions, and it will be able to do a little mini loop and say, like,
oh, you know, this person did this in Python and then they ask it again and then it did it again.
(01:12:06):
And they'll be able to tie those together and say like, okay, this is what actually was happening.
And then their model will magically just become smarter.
Now, if they were like a social good, they would release those datasets to the world and make it easy,
but that's not going to be it either.
So it's funny.
I gave an ML talk in the winter.
And it was an experiment because I took the exact same talk I gave when I was launching Kubeflow in 2018,
(01:12:32):
I want to say, and I gave it again with no changes, not a single change to the slides.
Everything was the exact same, which is hilarious, right?
It was all about security vulnerabilities through ML and like what you're going to face here and like how you defend your model
and how you do this and how you do that.
And one of them was around distillation of models, right?
(01:12:54):
And so what I said is true: Claude or whatever, you know, frontier model will walk away with, like, the best coding model today.
And someone else, the number two or number three on the list will use number one as a verifier as it's going through its testing.
Okay.
(01:13:15):
And it doesn't need to be a lot.
It's like a thousand queries or 2000 queries and they will get so much smarter.
They will.
It will not be defendable.
Whether ethically or unethically, that will leak out.
And then the second model will be good, right?
And then the third model and now you will have this consensus around these models and that will lift all the boats.
(01:13:38):
And so, you know, I would love if, if the, you know, whoever becomes number one model, like just releases data sets so that we can all grow together.
But, you know, what will not happen is that will never stay secret.
David, hold on.
David.
Yes.
David, we lost you.
(01:13:59):
Oh.
You just, you just broke up for the last like two minutes.
I didn't hear it.
No.
Did you hear it on him?
Okay.
What's the last thing?
I dropped it too.
Yeah.
I just got a notice saying it's there.
Can you hear me?
Still not.
Yeah.
You're back.
Okay.
Where do you want me to go?
Go, go back to, to what will happen once the second model starts training on the first model.
(01:14:24):
Yeah.
Okay.
The, the second model will come along and they will begin to train using some of the wisdom from the first model.
They'll use it as a tool just as a verifier.
And as for as much as the first model wants to block it, it will not be possible to block because it doesn't require very much.
Like you're talking about literally thousands of total queries over a few days and you can get a very accurate representation of the underlying model.
(01:14:52):
And then the second model will be good and the third model will be good and the open source one will be good.
And now everyone's boats are lifted and then you're going to do this again and again and again and again.
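As a toy sketch of the loop David describes (purely illustrative; a real student model would fine-tune on the transcript and generalize, rather than memorize it):

```python
import random

def distill(teacher_fn, query_pool, n_queries=2000, seed=0):
    """Toy distillation: query the teacher a couple thousand times, then build
    a 'student' from the transcript. Memorizing answers stands in for training,
    just to keep the loop visible."""
    rng = random.Random(seed)
    transcript = {}
    for _ in range(n_queries):
        q = rng.choice(query_pool)      # indistinguishable from normal usage
        transcript[q] = teacher_fn(q)   # the teacher just sees ordinary queries
    def student(q):
        return transcript.get(q)        # None for anything it never saw
    return student

# Toy "frontier model": it reverses strings. A few thousand queries over a
# 50-item pool is enough for the student to cover essentially all of it.
pool = [f"word{i}" for i in range(50)]
student = distill(lambda s: s[::-1], pool)
```

That's why it's so hard to block: from the teacher's side, the distillation traffic looks like any other few thousand API calls.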
And so what does that mean for developers and for the tribal knowledge that we all share on the internet so we can all, you know?
Excellent question.
I think that, you know, I love Stack Overflow, but I think it's going to go away or whatever.
(01:15:18):
I don't know.
It will migrate to a new community, right?
Because I think people will ask their first few questions of this and then want to talk to a human.
But over time, the chat, whatever the chat interface with your code will become so good, you're like, maybe I don't need to talk to a human.
(01:15:39):
I don't know.
I don't know.
But does that mean that the humans no longer hold that knowledge and they don't need us?
I don't.
I think we'll have the same depth of coding.
No, I mean, you know, to some degree, like, I haven't, you know, I took compilers years ago as a class and built a compiler.
(01:16:03):
I haven't done anything with a compiler.
I'd like to know how compilers work.
And I think it helps me when I am coding, you know, which is rare, like be a better coder.
But it becomes an abstraction layer that I just, I don't think about, you know, 99% of my day.
Maybe a lot of these concepts reach that point.
(01:16:26):
I don't know.
I don't know.
I mean, like, half a dozen standard questions you ask when you're interviewing at Big Tech.
You know, would you pick a stack or a heap or whatever, like, you know, people are going to be like, why are you even asking that question?
I don't know why they ask it half the time.
It's, you know, Joel Spolsky from Stack Overflow and Joel on Software talked about this, whatever, 20 years ago. He said that he would ask questions about, you know, basic HTML concepts, right, as part of his thing.
(01:17:00):
And he said that, like, it was just, it was just a filter.
That really is it.
It's not to say whether or not you know that, but like the amount that people would like exaggerate, I'll put it politely on their resume and not even be able to answer the most basic thing.
That's, that's really all you're like looking for from this question.
(01:17:21):
But, you know, I'm with you.
Like people are like, oh, you know, I have a, I want you to develop a queue that can handle throttling and so on and so forth.
And you're like, all right, really?
Like, I get it.
Nobody's going to actually build that.
I mean, you will, you will.
But the first thing you're going to go do is Google how to do it.
That's what I'm saying.
(01:17:43):
Yeah.
Like no one does that off of like knowledge.
You're going to go look it up and then you're going to compare three different things and optimize it and.
David, this has been fun.
Where should people find you on the internet?
So I'm a big Bluesky guy.
I certainly continue to spew there.
Please come try.
(01:18:04):
Try our platform.
I would love to hear from you.
If you're a data, if you're touch data, right?
If you spend any meaningful amount on data, I want to hear how you, we can make our distributed data pipelines better.
Expanso.io.
I post all my talks at my website, davidaronchick.com.
And I just love talking to people.
(01:18:25):
I, you know, I am the first person to say I like to be the dumbest person in the room, which is very easy because, you know, when you're this dumb, everyone's smart.
But, like, I, you know, I just want to get smarter about, like, you and your business and, I don't know, your opinions.
I got in an argument last night with a guy who, like, rewrote basically all of Excel in raw JavaScript, like, from scratch.
(01:18:48):
And I was like, what the hell are you doing, man?
He was like, well, this isn't this because he wanted it.
He's just a JavaScript guy.
And I was like, my God, how do you even do that?
Not TypeScript, not CoffeeScript, raw JavaScript.
That's a quote.
I know.
That is, that is, but anyhow, I loved it.
It was amazing conversation.
We will, we have you in the Bluesky starter pack for our guests.
(01:19:12):
I need to convert that to a list at some point.
So people can find you there.
I think you're iron yuppie on Bluesky.
I am iron yuppie.
That is me.
So yeah.
And we'll, we'll definitely, we'll figure out a time we can have you on for a second round to hear the history, some of your antics at Kubeflow and Microsoft.
I have so many questions.
I just want to know the tea.
Like the next whole episode has to be the tea.
I will talk about this until I'm blue in the face.
(01:19:34):
I like, I say this as someone like, I just don't believe in like speaking ill of people.
So like, don't, don't tune in if you think I'm going to like badmouth someone.
Like it's just the navigations of these things happening is just so fascinating.
I feel so lucky that I could be there.
All right.
Thank you so much.
And thank you everyone for listening.
We will talk to you again soon.
(01:20:02):
Thank you for listening to this episode of fork around and find out.
If you like this show, please consider sharing it with a friend, a coworker, a family member, or even an enemy.
However, we get the word out about this show helps it to become sustainable for the long term.
If you want to sponsor this show, please go to fafo.fm slash sponsor and reach out to us there about what you're interested in sponsoring and how we can help.
(01:20:26):
We hope your system stay available and your pagers stay quiet.
We'll see you again next time.