
August 4, 2025 46 mins

SNIA Chair, J Metz, introduces SNIA Storage.AI™, an open standards project for efficient data services related to AI workloads. Storage.AI will focus on industry-standard, non-proprietary, and neutral approaches to solving AI-related data problems to optimize the performance, efficiency, and cost-effectiveness of AI workloads. Hear about the industry leaders who have combined forces to solve AI-related data challenges.
Learn more at snia.ai

About SNIA:

SNIA is an industry organization that develops global standards and delivers vendor-neutral education on technologies related to data. In these interviews, SNIA experts on data cover a wide range of topics on both established and emerging technologies.


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:04):
All right, everybody, welcome to the SNIA Experts on Data podcast. We've got an exciting day today, because it's launch day. It's a big announcement, and one that I think is both surprising and not surprising. We've seen a lot of buzz around the subject, and really what is most important is why we need to go back to the foundation and

(00:27):
really revisit this from the bottom up and really think about what is the net effect. So we're going to jump right to the fun part, and I'm very happy to be joined here by J Metz. Of course, for folks that are brand new: my name is Eric Wright, I'm the co-host of the SNIA Experts on Data podcast. So, Jay, let's jump in and let's talk about Storage.AI,

(00:50):
which is super cool. It is. Tell me what it is. When in doubt, we always have to start with the "so what." So what exactly is Storage.AI?

Speaker 2 (01:00):
Storage.AI. I love this conversation. One of the greatest things about what I do, and this is the fun part of my job, is that I get the chance to talk about the really new, cool things that we're doing. One of the things that has been an interesting development over the last couple of years, as anybody knows, is the fun, fancy

(01:22):
stuff we can do with AI. You know, making AI videos, making AI music, all that kind of stuff is fun on the user side. On the back end, where you have to create the videos and make people happy, it's a little less fun. The underlying architecture is not quite so straightforward.

(01:43):
As a matter of fact, one of the biggest problems we have in the AI world is getting information into the places where it needs to be. So what is Storage.AI? It's a way of doing that. It's a way of getting all that data into the processors and getting it into the right places at the right time for the

(02:06):
right moment. And so what we're looking to do is have an open ecosystem where all the different companies that are working on these problems can come together, talk it all over, and have a solution that everybody can sign on board with, and effectively make life more efficient and more useful, quite frankly.

(02:27):
But ultimately, what we're looking to do is create a nice, ubiquitous solution for getting data to where it needs to be at the right time for the AI workload. And that is not as easy as it sounds, and it certainly wasn't as easy as just saying it out loud.

Speaker 1 (02:47):
Yeah, it's always interesting, because we see there's a ton of early innovation. It drips off our tongues now; we refer to so many different companies and brands that, if you'd talked about them five years ago, no one on the street would even know who they are. But now it's really kind of ubiquitous and part of what we do every day. But it is funny. I think the industry got there way before

(03:10):
a lot of the vendors and the platforms, hardware, software, cloud, et cetera, that are going to be running this stuff. And looking at how SNIA has always been so important in really getting, I'll say, a cooperative way of doing it.

(03:31):
It's cooperative, but it's also competition. It's an interesting thing, because these are folks that are industry competitors, but we come together as a community, because we all move faster when we move towards and with a standard. So, you know, looking at AI in general, how have you found

(03:55):
where it is in the world versus, you know, where the standards bodies are coming to the fore now?

Speaker 2 (04:02):
Yeah, let's do that. Let's take a step back and realize what this stuff really is, right? Because the problem is that if you don't already know the problem, you're not going to be able to understand why this is a big deal. Okay, and realistically, no one individual can have a good scope of understanding of all the different things that are going

(04:23):
on. And when we talk about industry standards, or the organizations or the companies that are putting all this stuff together, it's all very nice fluff. But the reality is that there are problems, and one of the biggest problems is that the understanding of the problem is a problem in and of itself. So, if you talk about an AI workload, for those people who

(04:46):
don't really understand the AI workload, they don't know that it's not actually one giant monolithic workload, right? It's not one thing. If you're talking about your laptop, it's not Microsoft Word, right? It's not Mail, it's not Outlook, it's not your web browser. That's an application; that's a workload.

(05:08):
When we talk about the workload for AI, it's considerably more complex, and it actually winds up being a series of workloads; it's not this one big thing. Whether you're talking about training or you're talking about inference, it's broken up into multiple different workloads. And the problem is that getting the data where you need it to be means that sometimes you have to do it in

(05:29):
certain ways, and sometimes it's in different formats, and sometimes it's in different structures and different locations, and it's complicated. And then you zoom in even further and you start to realize: well, dang, man, that's a hard thing to get from point A to point B without going through X, Y and Z, right? Oh, I'm sorry, you're Canadian: X, Y and Zed. So the issue here is that when you break it down and say I need

(05:54):
to get the data from here to here, because it then has to go from there to there, and there to there, and there to there, before you could ever start doing the work. And that's the thing that people don't normally get. There's no one way to do that. There are a thousand ways of doing it, and they're all unique. Oftentimes they're proprietary, they're all difficult, and it's getting to be overburdensome, right?

(06:18):
So when we talk about this stuff, what we're trying to say is: how do we line up the dots? How do we make sure that we're not doing all these detours for every single thing that has to go along the line? When we start talking about AI, they'll talk about GPUs until the cows come home, right? They'll talk about networks. I talk about networks, right? We talk about all these things that are component parts of the

(06:39):
architecture as if it's this nice compressed box, and all you have to do is make the box really nice and efficient. It doesn't work that way. That's not what winds up happening. So let me just give you an example. I need to have a GPU process the data, but the problem is that the GPU doesn't talk the same language as the data that's

(07:00):
stored on a drive; let's just say it's on an NVMe drive. I have to convert that data into a method or a structure that the GPU can use. Ah, but it's not just that: the GPU doesn't have a direct connection into that data. It's got to go through a CPU, and that's assuming that you've got it locally inside of a server. What if it's not locally inside of a server? You've got to go through a CPU, then out through a network

(07:22):
interface card, then out over the network into another network interface card, and then another storage unit, oftentimes object or file or something along those lines. So you've got a control plane and a data plane just to start the communication in the first place.
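To make those hops concrete, here is a toy sketch of the path being described; every number in it is an invented placeholder for illustration, not a measurement from the episode.

```python
# Toy model of the bounce path described above: a GPU that cannot read
# NVMe data directly must go through CPU memory and, for remote data,
# a NIC on each side of the network. Latencies are assumed values.

BOUNCE_PATH = [
    ("NVMe drive -> CPU memory", 80),        # microseconds (assumed)
    ("CPU reformats data for the GPU", 40),
    ("CPU memory -> local NIC", 10),
    ("network transit", 20),
    ("remote NIC -> storage service", 10),
    ("storage service -> GPU memory", 60),
]

DIRECT_PATH = [
    ("storage -> GPU memory (hypothetical direct placement)", 90),
]

def total_us(path):
    """Sum the per-hop latencies, in microseconds."""
    return sum(cost for _, cost in path)

print(f"bounce path: {total_us(BOUNCE_PATH)} us across {len(BOUNCE_PATH)} hops")
print(f"direct path: {total_us(DIRECT_PATH)} us across {len(DIRECT_PATH)} hop")
```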
And, depending upon where you are in the process, you could be doing a lot of reading, you could be doing a lot of writing; it

(07:44):
could be sequential, it could be random; it's not all the same all the time. And so you have these multiple personality disorders going on inside of a server, with these different processors and the networks and the storage. And orchestrating this stuff is non-trivial, right? Excuse me. So what we're trying to do here is saying: look, how do we get

(08:07):
this to work properly in the first place? That's number one. Number two: how do we catalog all this stuff to make sure we understand exactly where the inefficiencies are? That's also non-trivial, because I could do some of the processing in a CPU to make sure that I get everything where it's going, and I can move some of that into a DPU on the network side. So now I've got to split that processing just to get through the work.

(08:29):
I could offload things into other accelerators for memory movement, and so on and so forth. And that becomes really problematic, because as you start to increase in scale, you have other problems you have to deal with. Power: big problem, right? They're talking about creating nuclear power plants just to run AI.

(08:50):
When you're talking about that much power to be able to run a workload, a single workload, you have to make sure that it is as efficient as you possibly can get, because that's a lot of power, right? One watt of power per CPU up to a million nodes is a big deal. That's a million watts of power, right?
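That arithmetic is worth writing out. A one-line sketch; the electricity price is an assumed figure purely for scale, not from the episode:

```python
# One watt per node, multiplied across a million-node cluster.
watts_per_node = 1
nodes = 1_000_000

total_watts = watts_per_node * nodes            # 1,000,000 W = 1 MW
kwh_per_year = total_watts / 1000 * 24 * 365    # ~8.76 million kWh

# $0.10/kWh is an assumed price purely for illustration.
print(f"{total_watts / 1e6:.0f} MW continuous draw")
print(f"~${kwh_per_year * 0.10:,.0f} per year at an assumed $0.10/kWh")
```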
But what happens when we... yeah, exactly. What happens

(09:11):
if we need to, you know, figure out what happens at scale when it comes to errors, right? So one of the things that's really noticeable is that the bigger you go, the more inevitable it is to have failures. That's why we do checkpointing, right? There are so many errors in a normal system at that scale that

(09:31):
people will have to checkpoint on a very frequent basis. Well, if you're checkpointing those very expensive processors, which can run up to like $50,000 apiece, and you're talking about thousands, tens of thousands, and even hundreds of thousands of these things, that's an expensive proposition if they're not actually being used, right? So if they're waiting for all of this movement to happen

(09:52):
before they can actually get started, that's an inefficiency. That's a waste of effort, it's a waste of time, it's a waste of energy, and it increases the likelihood of error before you actually get to a completed workflow.
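As a rough sketch of that waste: the $50,000-per-GPU figure is from the conversation, while the fleet size, idle fraction, and depreciation window below are assumptions invented for illustration.

```python
# Back-of-the-envelope cost of accelerators idling while checkpoints
# and data movement happen around them.
gpu_price_usd = 50_000        # per the episode
fleet_size = 10_000           # assumed
idle_fraction = 0.20          # assumed: 20% of wall-clock time stalled
depreciation_years = 3        # assumed amortization window

hourly_fleet_cost = gpu_price_usd * fleet_size / (depreciation_years * 365 * 24)
idle_burn_per_hour = hourly_fleet_cost * idle_fraction

print(f"fleet capital: ${gpu_price_usd * fleet_size:,}")
print(f"idle burn: ~${idle_burn_per_hour:,.0f} for every hour of operation")
```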
So these are the problems that we're trying to solve, and there's no one way to do it. Everybody's got an idea of the way that it should be done, but not a way of making sure that everybody's playing by the same rules, so that you can take a single plug and plug it in and

(10:15):
know that it's going to work. What we're doing at SNIA is trying to accomplish the really Herculean task of saying: you know what, it's very important that we all get together and have a good conversation about what's actually required and how we need to solve these problems. Because sometimes I need to move memory, sometimes I need to have access to the data from the GPU or a TPU, sometimes I need to have good communication

(10:38):
over the network, sometimes I need to be able to have direct access from one memory location to another, and so on and so forth. And you have to have a lingua franca to be able to have the conversation. That's what we're looking to do, and that's where we're tapping into the enthusiasm of a lot of different companies right now. Wow, that's a long way of saying what the problem is.

Speaker 1 (10:59):
Yeah, well, that's exactly why this is so important, because just even in there, we could unpack ten podcasts out of that section of challenges. And we will. Well, that's it; that's the beauty part. You know, thinking about it: would you ever have imagined that we're just going to assume that someone else is handling

(11:20):
that next thing? Like, oh, we've got data movers that are already happening at the controller layer, where they've got CPU and OS optimizations. You're like: yes, but that's for a particular type of workload that doesn't act the way this software does, right? And well, that's it: now the fundamental part is no longer fundamental.

(11:40):
We don't have assumptions that this is going to go on a single architecture. We have multiple chip architectures: GPUs, TPUs, DPUs. That's a lot of PUs. And let me tell you, PU, when it comes to trying to guess which one you're going to be using. If you're writing the application, it has to make

(12:02):
assumptions that this stuff will be where it needs to be when it needs to be there, or it's going to be punishingly slow. And remember the latency thresholds: the latency cost that we're paying is so much larger. Because before, it'd be like: ah, you've got a web transaction.

(12:23):
You're literally doing, like, CRUD operations. Let's face it, 90% of applications were CRUD operations. We're doing very simple, you know, reads and writes and updates to databases. Yeah, but now we've got combinations of structured data and unstructured data. We've got RAGs, so that we've got combinations of the existing

(12:44):
model plus additional external structured and unstructured data. It's going to live in a thousand different places. It's going to be plugged in by APIs, and we're going to have MCP servers all over the place.
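For readers who want the shape of that in code, here is a minimal sketch of the retrieval-augmented (RAG) pattern being described. The documents and the word-overlap scorer are stand-ins, not any particular product's API:

```python
# Minimal RAG shape: pull external structured/unstructured data into
# the prompt before the model runs. Real systems use vector databases,
# APIs, and MCP servers as the plumbing; everything here is a stand-in.

DOCS = {
    "doc1": "Storage.AI is a SNIA project on data services for AI workloads.",
    "doc2": "Checkpointing protects long training runs against failures.",
}

def score(query: str, text: str) -> int:
    # Toy relevance: count shared lowercase words.
    return len(set(query.lower().split()) & set(text.lower().split()))

def retrieve(query: str, k: int = 1) -> list:
    ranked = sorted(DOCS.values(), key=lambda t: score(query, t), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is Storage.AI?"))
```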
Great, now we know where everything is, but how do we get it to where we need it, in a way that's the most efficient and effective to perform this task at the lowest cost, at the

(13:07):
highest efficiency? And, yeah, it probably would look like an intractable problem, or at least a stack of intractable problems. This is the intracta-stack. There's a lot to sort out here. Sounds like a new startup.

Speaker 2 (13:22):
Well, I mean, I think one of the things is that the traditional architectures of compute, network and storage are what we've always been working with. That's what we've been playing with. We've been trying out different topologies; we're trying out different ways of handling the network, handling different data structures.

(13:42):
Key-value is a new possibility that people are trying to approach, but the difficulty is: how do you get this to work in existing traditional environments? Key-value and POSIX don't go well together.
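That mismatch shows up right at the interface level. A hedged sketch, using generic interfaces rather than any specific product:

```python
# POSIX thinks in named files, byte offsets and lengths; key-value
# thinks in opaque keys mapping to whole values. Bridging them means
# inventing offsets for data that has none, or names for data that is
# keyed some other way.
import os

def posix_read(path: str, offset: int, length: int) -> bytes:
    """POSIX-style access: open a path, read `length` bytes at `offset`."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset)  # POSIX systems only
    finally:
        os.close(fd)

class ToyKVStore:
    """Key-value-style access: one key in, one whole value out.
    No offsets, no directories, no append semantics."""

    def __init__(self):
        self._data = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes) -> bytes:
        return self._data[key]

kv = ToyKVStore()
kv.put(b"embedding:42", b"\x01\x02\x03")
print(kv.get(b"embedding:42"))
```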
Now, at the same time, if you take a step back and you think

(14:03):
about it for a second, we have an opportunity here to really rethink the problem, right? Because we're talking about a completely different type of cluster, a different type of networking, a different type of layout of the way that the problem has to be resolved. You've got your data, you've got your compute, right, and

(14:27):
you've got the transit ways of getting there. Up until recently, we've not had a lot of choices in the granularity between any of these things. You either had compute here, you had storage there, you had network there, and that's what you had to work with. But now I've got compute and processing and I can put it anywhere. I can put it on the network, I can put it right next to the

(14:47):
data. I can split it out into multiple different areas. I can have multiple cores or compute units solving different problems simultaneously. There's a parallelism issue that is not inherent in the traditional mode of compute, network and storage, right? So if I want to say: I have this workload, and sometimes it's

(15:11):
better if I do the processing on the data location itself, so that I don't have any IO at all, right? Well, that's not good for the entirety of an AI workload, but it is very good for some parts of it. Right, I may want to do all my pre-processing and pre-training modeling using that compute near the data, right? Or I might want to have additional processing on the

(15:33):
network point, right, on the network endpoint. That's a possibility. So if I open up my imagination for a second, I can say: well, what if I put the processing where I need it, as opposed to putting the data where I need it? Because now I'm just pushing all that data around all the time; you've got bottlenecks. But what if I put processors where they're supposed to be, closest to the data?

(15:54):
Well, that means that I don't have to worry about the time it takes to move data from one place to another. And that means that I don't have to worry about the idleness. And it means I can reduce my worry about the errors and how they're going to affect my workload. Now I'm becoming much more efficient, because I'm rethinking the nature of the problem. But what do I have to do to get that?

(16:15):
I have to have better data movement. I have to have better data locality. I have to have better processing near the data. Am I going to be doing it in high-bandwidth memory, or am I going to be doing it in normal memory? Am I doing it with a shared pool? Because now I've got clusters of GPUs acting like one big one, right, that have to be able to communicate across, you know, some sort of transport network. And where do I actually put the

(16:37):
tools where I need them?
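One way to see the trade being described is as a simple break-even between shipping the data to the processor and placing processing next to the data. Every number below is invented for illustration:

```python
# Toy break-even: "move the data" vs "move the compute".

def ship_data_seconds(data_gb: float, link_gbps: float, compute_s: float) -> float:
    # Transfer time over the wire (ignoring protocol overhead), then the
    # job runs at full speed on the big processor.
    return data_gb * 8 / link_gbps + compute_s

def near_data_seconds(deploy_s: float, slowdown: float, compute_s: float) -> float:
    # Stand up processing near the data, then run on (assumed) weaker
    # hardware; slowdown >= 1.
    return deploy_s + compute_s * slowdown

data_gb, link_gbps, compute_s = 500.0, 100.0, 120.0   # all assumed

print(f"ship data : {ship_data_seconds(data_gb, link_gbps, compute_s):.0f} s")
print(f"near data : {near_data_seconds(10.0, 1.5, compute_s):.0f} s")
# With these made-up numbers shipping wins; shrink the link or grow the
# dataset and the answer flips, which is exactly why placement has to
# be a per-workload decision.
```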
There's no one way to do that right now, and I'm not saying that the way that Storage.AI is supposed to address that problem is going to be the one way to do it. It's not the one ring to rule them all. But what it is, is a way for people to actually get together and have a conversation about at least some way where they can line up their shots,

(16:58):
and that's what Storage.AI is supposed to be able to do. It is a thematic approach to solving a workload problem from the data perspective, right, from the lifecycle of data: where does the data need to be, and when? And let me rephrase that: where does the data need to be, not where do you want it to be?

(17:18):
Right? Because if I can say the data needs to be here, I can move a processor there now, whereas before the gravity was going the other way around. Yeah, right. And so I think, ultimately, that's one paradigm. It's not the only paradigm, right? There will still be a lot of opportunity for, you know,

(17:39):
more traditional approaches, because the workloads are built for that way, and we don't want to ask the software guys to rewrite their software, right? So that's a big thing, right? We want the workloads to continue to work as intended.

Speaker 1 (17:52):
Right, right, we don't want to... It's sort of this classic thing. Like, we saw this with observability. People were like: oh, easy! And you had a team of developers going: no, no, we've done what we were supposed to do. The sprint said to do this user story. It didn't include observability data placement.

(18:15):
Like, it's just like we make a set of assumptions that all of you all are sorting this problem out. We've got SREs; you know, I assume that kid has the knowledge. Meanwhile, that kid is just praying that everything stays up. Like, we've programmatically solved a lot of these things with automation, but none of them are really latency-critical.

(18:36):
Like, there's always this idea that we're going to send it out there. We'll fire a few containers out, so they may take a little while to spark up. No big deal; we'll handle it in the queuing, we'll handle it in the caching, just so we don't drop the session. We maintain state. Okay, cool: lots of assumptions that

(18:56):
don't care about sub-millisecond latency. The stuff we used to think about, like synchronous data connections, you know, sub-10 milliseconds, you know, metropolitan area networks: these are baby problems compared to what AI introduces as the frequency of change of where

(19:21):
data needs to be; as you said, not where we want it to be, but where it needs to be to be optimal. For all this other stuff, like access to CPU and GPU: they

(19:43):
assume uninterrupted access to everything, because interruption could mean: process stopped, start over. You know, and when you're training a model that takes days to do, you can't just... yeah, you can't just be like: oops, you know, rebooted a couple of systems on Thursday night by

(20:03):
accident, so we've got to start over. You're like: no, no, that's not an option.

Speaker 2 (20:07):
Yeah, no, I mean, you're spot on. The real question about this is that when you get into what people have normally considered to be black boxes, right, the opacity of not being able to see what's going on inside your compute cluster... You know, if you're a compute person or a software person, you

(20:31):
don't typically think about how the processor does what it does. Most of the time, you don't care; you just sort of want it faster and better and bigger. That's what you want, right? However, the people who really care about this want to eke out every single possible amount of performance they can, because the nature of the game is different, right?

(20:53):
So now we've got all of these different things working in parallel, and the bottlenecks are in the relationships between devices. Let me give you an example. I have a CPU that, let's just say, generously has 200 cores. That's a good, beefy processor; that's a good, big CPU. I've got a GPU that has 15,000 cores.

(21:16):
Now, that 200-core CPU and that 15,000-core GPU are not going to have a one-to-one relationship between the communication between those cores. But if I've got a GPU that needs to access that data, and I have to go through the CPU to be able to do it, I have to pin, at some level, the relationship between what the

(21:37):
cores are going to be asking for and what the cores can provide.
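That mismatch is easy to put numbers on, using the core counts from his example:

```python
# With the example counts above, every CPU core fronts dozens of GPU
# cores; the host side has to keep deep queues of outstanding requests
# pinned, or the GPU starves.
cpu_cores = 200
gpu_cores = 15_000

fanout = gpu_cores / cpu_cores
print(f"each CPU core services ~{fanout:.0f} GPU cores")  # ~75
```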
The transport is going to do the best it can, right? But if I need to be able to have parallel ideas of how to communicate with the storage itself, the data itself, do I want to have an ever-decreasing level of that valve that has to

(22:00):
be open to be able to suck in the data, right? And then, of course, you have to go into the kernel and you have to come back out. You know, we have all kinds of magic that we do inside of the operating system to be able to handle, you know, the IO part of it. And so what we're looking to do is solve those problems. Right, how do I handle a non-uniform way?

(22:21):
We talk about NUMA all the time, but what's a non-uniform way of accessing data, right, that is not part of the same chip cluster? Right, and that's a non-trivial problem, and you need to have the cooperation of processors and data and all the vendors of the above-listed ones.

(22:44):
Right, and what happens if you don't have this locally? You've got to go over a network to do it. Right, how do you do that? Well, that's why, you know, we're going to be working with the partners for SNIA, right? Ultra Ethernet is a really good example of this. Right? How do you do it, you know? If you're going to do file-based storage or

(23:04):
object-based storage over RDMA, that implies you're going to be able to do it over Ultra Ethernet too. Yeah, right. So that's not SNIA's job. SNIA does the file and object part, but the RDMA part, the UE part, that's a networking group, right? So it's very important that SNIA work with these other organizations to build, you know, an end-to-end

(23:28):
way of solving these problems. And fortunately, SNIA and Ultra Ethernet have a very good working relationship. So it's not just where does the data need to be, but where does the work need to be. And so that's why having an organization like SNIA, with its long tail of, you know, reputation and its, you know, established work streams, is a really good place to handle the storage and the
(23:50):
streams, is a really good placeto handle the storage and the
data services, and then work with groups like UEC and OCP, and DMTF for management, and NVM Express for the protocols, and so on and so forth. Right tool for the job, right organization for the job. And so that's why we're doing what we're doing, because if we

(24:12):
succeed at what we're setting out to accomplish, everybody wins. It's the rising tide raising all boats, right? Now, obviously, some of the stuff is going to take a long time. Some of the stuff is already there. Right, many of the projects we're already working on have been established inside of SNIA for a long time, but we haven't looked at them from the perspective of being part of a

(24:33):
bigger workload. Right, that's the new part. Creating an entire community of vendors and people and partners and academics and organizations to solve a workload-specific problem: that's new. That's a new thing for us, and so we're very excited about that in particular.

Speaker 1 (24:53):
Well, and this is really funny: as we started off, we framed this idea that this is brand new, it's an announcement, it's a big thing. What it really is, is the culmination of decades of collaboration, effort and standards-driven innovation. And the reason why it's so important that it

(25:17):
lives in this SNIA world, in between Ultra Ethernet, CXL, and all these different folks that are going to be involved, is just because, you know, we've got SDXI, we've got CXL, we've got what's around data storage. We've got data security. You know, you talked about the idea: when do we... you know, where do we put the data? Okay, cool. When do we move it? Okay.

(25:38):
Second problem: you know, where do we move it to? Okay. Another: how do we secure the data, or protect the data, or provide resiliency and multi-path? Like, you know, like you said, we've had multi-CPU for, you know, decades

(25:58):
at this point, yet it took us till not too long ago to really use all those CPUs and cores, right? How many applications have you seen written in enterprises where it's like: CPU one, 98%. You know, core two, three, four, five, six, seven, eight, all the way up to 16: 0%. Congratulations, you've just underused the hardware in the

(26:22):
worst way possible.
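A minimal sketch of the difference between that one-core pattern and actually spreading the work, assuming nothing beyond the Python standard library:

```python
# Single-threaded Python pins one core while the rest sit at 0%;
# a process pool spreads the same CPU-bound work across all of them.
from multiprocessing import Pool
import os

def crunch(n: int) -> int:
    # Deliberately CPU-bound busywork.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    jobs = [200_000] * 16

    # The "CPU one: 98%" pattern: everything in sequence on one core.
    serial = [crunch(n) for n in jobs]

    # Same work, every core engaged.
    with Pool(processes=os.cpu_count()) as pool:
        parallel = pool.map(crunch, jobs)

    assert serial == parallel
```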
And now, when we think of what we saw with stuff like TPM, when stuff was moving, what we believed to be securely, between memory, cache and CPU, it was like: oh, wait a second, somewhere in the middle that could get hijacked. And so how do we make sure that we can actually protect it and

(26:44):
understand the path of, you know, the path of transit: securing the path, securing at rest, securing... like, it's many, many things. And security: why did we always say, why do you have to call it DevSecOps, not just DevOps? Because when we called it DevOps, no one invited the security team. We had to be overt.

(27:05):
And again, it's one of those things that we just assume. We've made this assumption that someone's taking care of security. Well, now we've got all of these multi-purpose groups in a single working community who then also can go: hey, you know, we're also working on this. Right? It's no longer your storage, your CPU, your GPU and

(27:29):
your security. Now it's like: your storage, and we're working with security and network. Everybody's got to collaborate together, and the only way you can do that is to have, like I said, a lingua franca. You have to have some kind of a Little Orphan Annie decoder ring. For those older folks like myself: drink your Ovaltine, yeah. So it's a great time to see this, and while it may seem like

(27:56):
out of the blue to many, this actually has just been lying in wait, and now it's just becoming visible: the work that's happening, the collaboration that has been occurring, and now that can occur at a much faster rate in public.

Speaker 2 (28:11):
Yeah, it's like the actor who's taken 15 years to be an overnight success.

Speaker 1 (28:15):
Right, exactly. And then you see them... that's such a perfect example. Because then you see that person. You're like: remember when Mr Robot came around? And the guy from Mr Robot, they're like: man, he's in like 10 different movies. So like, yeah, because, you know, when Mr Robot was written and produced, it was four years ago, so they knew it was going to be good. So then he's got an agent who's been like: ah, yeah. So this is

(28:36):
what's been going on while AI is this brand new thing. You're like: well, AI has an agent (not that kind of agent), and it's been out there shopping around. So AI now has a little bit of everything, just like the guy that sells cranberries. Whoever's the cranberry salesman, that person has been getting cranberries in every drink known to person-

(28:57):
kind. Hope somebody's on commission for that. ...efficient, secure, and it's not negatively affecting the users and the overall ecosystem.

(29:18):
We're really creating a protection scenario as well, because we've got power, we've got data protection, we've got security, storage, retention, all this stuff: what data stays, what state it can go to. There are so many questions being re-asked that we think we just solved in enterprise computing, but everything

(29:39):
changed.

Speaker 2 (29:40):
Well, you also put your finger on something, maybe inadvertently, but I think it's very important to zero back in on. You said, you know, people think: well, somebody should do this, right? Somebody should solve this particular problem. And why isn't somebody doing that? Why isn't, you know, anyone thinking about this?

(30:00):
Why isn't anybody working on these problems? And the short answer is: because sometimes the problems are very big and very complex, and no one somebody can do it. We are the somebody for this, right? The data problem is a very real problem, and somebody should do it. And it's true, and we are that somebody.

(30:20):
We are the ones who are going to be working on a very complex and very ambitious project to make sure that the data does not get overlooked or become an afterthought, or, you know, something along those lines. And so SNIA is the somebody who's looking at that problem,

(30:41):
and so that's why we're looking to make sure of that. You know, I've been personally reaching out to as many companies as possible. I've been, you know... the hyperscalers, the storage vendors, the processor vendors. I've been on all kinds of calls with people, and I hope, if they do see this and listen to this, they realize I've been consistent in my message here.

(31:02):
But I honestly do think that we have a moral and ethical responsibility to do things correctly, not just from the content perspective, but also from the end user's perspective. Right? Yeah, and I think that when we start to try to take shortcuts in getting these things out the door so that

(31:27):
various companies can make the money that allows them to operate (nothing wrong with that, I'm a big capitalist), sometimes we forget that we build in problems, we build in cul-de-sacs of technologies for the future. Now, I don't think that when you get to a certain stage, that's a good thing. I don't think that when you're talking about a million inputs,

(31:49):
when you're talking about building nuclear power plants to solve one AI workload problem, that it is sufficient to say: well, oops, we just kind of backed ourselves into a corner, right? And so the waste and the inefficiency become very, very real. It's on us as architects, as technologists, to at least attempt to address

(32:13):
that problem, so that in the future we don't have to completely undo all of this work, and so that all of the money that's spent, all the effort that's put in, and all the man-hours that have been put into this aren't wasted. Right, I certainly want all the work that I do to actually mean something in the future. I think a lot of people do as well. But even more importantly than that, you know, I do think that

(32:35):
when we come up with a solution, it's not just about the hype cycle. It's really about, you know, solving the problem. And so, you know, from a personal perspective, it's very important. It's very important to me that we actually do something that has an import.

Speaker 1 (32:53):
Yeah, because long after the memes, the standards will remain. And, you know, while we, culturally, are latching onto whatever's current, active and most front of mind, and the local problem that we have to solve... I think Jim Keller said it great. He says: we, as developers, we, you know, we solve extremely

(33:17):
hard problems so that we can then introduce the next really hard problem. Like, that's the goal. The goal is to solve something so well that it feels solved, but then you find a new problem. And this is it. Now we understand how these systems work.

(33:38):
LLMs are everywhere, we've got all this innovation going on, but now this is the next hard set of problems to solve. And the fact that we're solving them as a community... it's just a chance that, when you're in a room of people that are like-minded, with like goals, as parent and supporting companies

(33:58):
, all these member organizations all win when we're in the room going: hey, so I just figured something out. They solved a problem that we've been looking at for a long time. Like, that is the value of innovation as a group, and standards-driven innovation. Because sometimes standards come

(34:19):
, because we find them, and sometimes we create the standard and then we develop into it. I'd say that AI has effectively started... You know, we saw MCP just show up out of nowhere, you know, on a Thursday. MCP was defined by Monday. It was just like: dude, are you even running? Like, if you're not running an MCP server, what are you even doing

(34:39):
? You're like: that didn't take long before we realized... what's wrong with you, man? Like, geez, really. And then you think, like: do I really throw myself at this? Because in two weeks, are people going to be like: oh, MCP was so last month? You're like: we've got MCPI now. But so some things get discovered, and we're

(35:00):
building in public, which is kind of cool, but then people are losing sight of what's actually being built. Did you ever see... oh, I'm sorry, I interrupted, please continue. Oh no, no, it's like... but you really said it. Like, all the outcomes are already being solved. We know what we're solving, and we're solving the problems that make

(35:22):
those outcomes faster, secure, reachable, extensible.

Speaker 2 (35:27):
Yeah, yeah. When we were talking, you reminded me of an old clip of Steve Jobs talking about the laser printer. Did you ever see that clip?

Speaker 1 (35:39):
This one sounds familiar now, but run it through me.

Speaker 2 (35:43):
So he was doing a town hall, as he often did. This was back in the mid-nineties, back when OpenDoc was a big thing, and one of his engineers got up and challenged him.

Speaker 1 (36:00):
I think I know this one. He goes... oh, here it comes. It's like one of the famous ones. Yes, yes.

Speaker 2 (36:11):
And he said: okay, look, why are we not following the industry standard, OpenDoc? I did not expect the answer that Steve gave. Steve was like: one of the things about being in charge is that you're going to get really good, difficult questions like this, and when we start to make the decisions, you have to make

(36:33):
very difficult decisions.
You know there are, there aregoing to be really good
technologies and they're goingto solve a lot of different
problems.
The problem and he's he wastalking specifically from
Apple's perspective of selling aproduct, and I'm thinking more
in terms of selling an idea, butthe principle is the same.
So he said look, you know when,when we started working on the

(36:56):
laser printer, the Apple laserprinter which, for those people
who may not be aware, the the,the Apple laser printer, was the
first affordable printer forbusinesses that didn't cost
hundreds of thousands of dollars, that that was laser oriented.
And he said we've got reallygood technology in there, we've
got PostScript in there, we'vegot a lot of hardware stuff and

(37:18):
they got chips in there Allreally good technology.
Was there better ones?
Possibly right, but when yousaw the output of the laser
printer, he said this I can sell.
I can sell this.
I can't sell OpenDock.
I can't sell PostScript.
What I can sell this I can'tsell OpenDoc, I can't sell
PostScript.
What I can sell is this this iswhat's important.
I believe storageai issomething similar.

(37:41):
Right? I can go through each and every one of the technical working groups that we've got, and I can say till the cows come home that this particular solution is great, is going to make you do everything you want it to do, plus make you lunch, right? The reality of it is that if it's divorced from the workload, or if it's divorced from solving a problem, I can't sell the

(38:03):
PostScript. I can sell the laser printer. I can sell, conceptually, the idea of Storage.AI. I can sell the fact that these things are working together to bring a result to people who have a problem. That goes beyond whether or not this bit moves to that register, right? And I can say consistently, repeatedly over

(38:25):
time, reliably, that if you do this, you will get the result that you want. And I think that's the thing that people want more than anything else. They don't want to have to reinvent the wheel every single time I come up with a processor, every single time I come up with a new memory format, every single time I come up with a new NAND format or new networking protocol. Right? I want to make sure that the thematic approach, just

(38:50):
like that laser printer, is going to solve the problem again and again and again. And that is what we're bringing to the table here. I'm not just bringing SDXI, I'm not just bringing computational storage; I'm bringing in the conceptual analogy to the laser printer in Storage.AI. To me, I'm okay with that.

(39:12):
To me, that's a win, 100%.

Speaker 1 (39:16):
Yeah, and this really is... You know, you couldn't do it without being able to tap the existing active working groups. It is such a natural extension to what already existed. So this is... it couldn't have landed in a better community as far as I'm concerned, and it also couldn't have come at a more, you know,

(39:41):
important time. Every day is a great day to be in technology, and an incredibly challenging day to be in it as well. So, yeah, and, you know, all of this stuff that we were trying to solve with software-defined storage. Then it became... you know, we griped over the terms and all

(40:03):
these things, and then we've got network storage and network transports, and all of a sudden we've got, you know, containers that are running on disk controllers that are actively doing compute near data. And that's been around for decades. It's just that it's never been done consistently at scale,

(40:23):
knowing that we have the potential to reach the outcome. Like you said, if you went to the drawing board and said: if we design a laser jet, it's going to be able to do this amazing thing, here's a mock-up of what it is... Just go to all these different tools, and you can go there today. Just do a Google search. Didn't mean to mention a brand name, but just do a generic

(40:44):
search and say: AI, generation X, whatever tool it's going to be, and it's there. It's working today, because they've tried to individually solve a really strong problem. But how are they solving it? Cash flow. We've got to solve... So what's the best place for folks to get more information about what just came out, how to get involved with Storage.AI as an initiative, and

(41:26):
how to dig in and get their happy little AI-loving feet wet on this fantastic working group?

Speaker 2 (41:34):
What a visual you just put in my head! I'm going to let the AI filmmakers get away with that one. So, obviously, snia.org is the best way to go for all things SNIA. We are going to have a lot more information over the next

(41:59):
coming weeks and coming months, especially as we go through and out of conference season in the fall of 2025. We also have snia.ai as a new landing point for where the Storage.AI work is going to be. Over time, you'll see that fill out with more and more information. Obviously, the SNIA Developer Conference is going to be a

(42:23):
really good resource, and if you can't make it live, then over time the videos will be available on YouTube. So there's a lot of material that you can find from SNIA and the snia.org family of sites. And, of course, there'll be more posting on LinkedIn and X, formerly Twitter. I still call it Twitter as well, without question.

Speaker 1 (42:46):
And I think the fact that people that are in the organizations are obviously aware of the collaboration capability... but I love that more and more smaller companies and startups have a chance to participate. It's not onerous or incredibly costly to be involved. Everybody wins, at least to go in and tap the

(43:07):
community for questions, to understand what's being done. Because they really are just an amazing group of fantastic humans, like yourself, who are spending their hard time and hard yards to do that early innovation and discovery and research, so that we know, when we go live, that we're trying to do

(43:28):
it towards that optimized goal. And security folks: get in, because this is one of the best times to be in there.

Speaker 2 (43:39):
The security folks, the energy efficiency folks, the power folks: all have a very strong play in this.

Speaker 1 (43:46):
Yeah, and this is not a Bill Hicks-style thing, like: oh, let's throw some cost savings on top of it, let's put some sustainability on there. You're like: no, we don't want the marketing people to get ahold of this. This is like: we're building sustainability in reality. I mean, as a marketer, I can safely say that. But no marketers were hurt in the context of this conversation.

(44:08):
But that's it. You know, people need to get in. Every aspect is being discussed, and these are the people doing the work. This is not... God bless my fine friends who are pundits... these are not the pundits who are having fireside chats on every AI conference stage about the ethics of AI for the

(44:29):
438,000th time and saying nothing, right? Yeah, coming to the same conclusion, which is: we should really talk about this more. Wow, good luck with that. So, you know, while everybody's talking, stuff's getting done, and you've got to be at SDC to make sure that you're there doing it

(44:50):
with them. So we'll have links, of course, in the show notes. And, of course, don't forget, there are other amazing conversations like this one, with Jay and other amazing humans who are part of the SNIA organization and other contributing and partner orgs. We've met with folks from all sorts of organizations in the past. So, SNIA Experts on Data, J Metz, thank you so much for

(45:11):
taking the time. And if people do want to find you, what's the best way to do that?

Speaker 2 (45:15):
Probably through LinkedIn. My name, J Metz; my Twitter handle is Dr J Metz, and those are probably the two. Or... I haven't been posting recently, but my website, jmetz.com, also talks about some of these things, although, like I said, I haven't done much recently. But those are the best ways to reach me. The LinkedIn is a

(45:39):
definite must-follow.

Speaker 1 (45:40):
I love your newsletters, and your takes on stuff are really good to see. You know, sometimes, you know, appropriately biting but honest discussions about some really interesting challenges that we're seeing in the industry. So with that, Dr J Metz, thank you so much. And folks, of course, check it out; make sure you get to

(46:02):
snia.org, and now snia.ai. Now with 100% more AI! Sorry, that is the marketer in me. All right, folks, have a good one, and we'll see you all on the next podcast.