Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:00):
All right, thank you everybody for joining. This is a great chance to change the way that we're doing the SNIA Experts on Data podcast, because this is a preview to something fantastic. If you haven't seen it, it's available on demand, but if you haven't had a chance to attend yet, you definitely ought to sign up. It's all about what we can do
(00:23):
in the world with RDMA.
And you're saying to yourself, what do I need to know about RDMA? Well, I'll tell you: the goal of this webinar is to tell you everything you need to know about RDMA, and you don't have to be afraid to ask. This is a chance for us to really dive into, number one, what is RDMA doing, what has it done, and also what's next, because
(00:46):
we think about the amount of use cases that are coming up. It's a fantastic chance to be able to take a look at the different ways that technologies work and, especially, how new use cases are arriving. So thank you for joining. My name is Eric Wright. I'm the co-host of the SNIA Experts on Data podcast and also the co-founder (a lot of co's) of GTM Delta, and I'm joined by an amazing group of folks who are going to be on this really,
(01:08):
really cool webinar, so I'm going to get you to do a quick intro. So, Eric, if you want to get us started?

Speaker 2 (01:15):
Sure, I'm Eric Smith. I'm a distinguished engineer working for Dell Technologies.
Speaker 1 (01:18):
Excellent. And Rohan?
Speaker 3 (01:21):
Hello, I'm Rohan Mehta. I'm working as a Senior Software Engineer at Microsoft.
Speaker 1 (01:26):
And Michal, certainly last but not least. If you want to introduce yourself as well, then we're going to jump right in.
Speaker 4 (01:33):
I'm Michal Calderon,
Distinguished Engineer at
Marvell.
Speaker 1 (01:37):
Fantastic. Now there's a ton we want to cover. I would love to, but I also don't want to give away too much of the good stuff that we're going to see in the webinar. But let's maybe get started. Eric, what do you see as the opportunity? What's your goal? Because I know you're going to be driving the conversation. So what do you see as the reason why people really should attend this?
Speaker 2 (01:59):
Yeah, so thanks for asking, Eric, and thanks for having us on.
So, my interest: I'm going to be moderating the session, and my interest in the topic really goes back a couple of years. AI has been sort of driving the need for RDMA, and so, as a person who's involved with networks and fabrics, especially
(02:20):
as they relate to AI, getting to know how RDMA works has been very important. It's critically important to my field, and about a little over a year ago, I started really trying to dig into the details of it, especially looking for training information, and I was frustrated that there wasn't a basic overview of RDMA.
(02:44):
What I found was, even if I went onto a website like the OpenFabrics Alliance, for example, they had training information there, and it went really deep, right into the verbs, right into the stack, all the way down to as deep as you wanted to go, and it was just a little too much
(03:04):
to get started, so I struggled a little bit.
So what I'm hoping we can do with this webinar is to give people a framework on which to build. And so, yeah, that's why I'm here, and I think we've hit the mark.
Speaker 1 (03:18):
Clearly, you've certainly got some amazing folks contributing to this discussion. So, Michal, based on that, what's your view, you know, sort of a nutshell view of RDMA, and what do you look to bring to the session from your own, you know, previous experience?

Speaker 4 (03:40):
Yeah, so I was exposed to RDMA about 10 years ago and really feel connected to the technology.
So, like, in a nutshell, right, RDMA stands for Remote Direct Memory Access. It's a technology that enables direct memory-to-memory data transfers between computers without involving the CPU, the cache, and the OS, making it super efficient. You know, having high throughput, low latency and, of
(04:01):
course, low host CPU usage. And, you know, a lot of people know just the buzzword RDMA, but they don't really know how RDMA works, how the RDMA NIC actually writes or reads from memory without involving the CPU, and so on.
And so this is, like, the reason why I wanted also to join this webinar and
(04:23):
show people, you know, how this actually works and what makes RDMA super ideal for different applications and use cases, which we'll also go into in the webinar.

Speaker 1 (04:46):
But sometimes people just say, I kind of know what that means, so I'm not going to ask.
So I do love that you're going to be able to dive in, and it'll be interactive, so folks can participate and figure it out, you know. And especially getting involved in the SNIA community,
(05:07):
this is such a fantastic group of folks and all the contributing companies that helped to make it happen. Rohan, based on your background and experience, what's your goal and what do you want to bring to the session?
Speaker 3 (05:17):
Yeah, so we do. Like Michal said, we do want to give an overview of what RDMA is. When I joined Microsoft, I actually started working on RDMA right away, and so for me that was like being thrown in the deep end, and I know what it feels like to ramp up on a topic that, you know, is completely new, while also
(05:40):
contributing to, you know, something that's, you know, production level, and, you know, deploying with actual customers on it and actually using RDMA out in the real world.
So we want to simplify this topic as much as possible, in a way that somebody who doesn't know what RDMA is can also join
(06:00):
this webinar and learn about it, and somebody who already knows RDMA but needs to know more details of how it actually operates at various layers of the networking stack. If I want to write an application that leverages RDMA, what steps should I follow? How should I go about writing such an application? Even such a person can join this webinar and learn low-level
(06:25):
details like that.
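[Editor's note: the application-writing steps Rohan alludes to follow a fairly standard sequence in the verbs API (libibverbs). A rough pseudocode sketch only — the real calls take many more parameters, and the webinar covers the details:]

```
// Pseudocode outline of a typical RDMA verbs application.
// Call names are from libibverbs; error handling and most
// parameters are omitted for brevity.

dev_list = ibv_get_device_list()          // enumerate RDMA-capable NICs
ctx      = ibv_open_device(dev_list[0])   // open a device context
pd       = ibv_alloc_pd(ctx)              // allocate a protection domain
mr       = ibv_reg_mr(pd, buf, len, ...)  // register (pin) memory; yields lkey/rkey
cq       = ibv_create_cq(ctx, depth, ...) // create a completion queue
qp       = ibv_create_qp(pd, ...)         // create a queue pair (send/recv queues)

// Exchange QP number, rkey, and buffer address with the peer
// out of band (e.g., over TCP, or via the RDMA connection manager).

ibv_modify_qp(qp, INIT)                   // transition the QP:
ibv_modify_qp(qp, RTR)                    //   INIT -> Ready to Receive
ibv_modify_qp(qp, RTS)                    //   -> Ready to Send

ibv_post_send(qp, wr)                     // e.g., IBV_WR_RDMA_WRITE / RDMA_READ / SEND
ibv_poll_cq(cq, ...)                      // reap completions; the NIC moved the data
```

[The key point the panelists make is visible here: once memory is registered and the queue pair is connected, transfers are posted to the NIC and complete without the host CPU copying bytes.]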
Speaker 1 (06:28):
That is one of the things that I really struggle with a lot of times with webinars: we never get a chance to go deep enough, because we often don't have the experts on the call, or it's tough to explore in the time given. So I've been able to get a preview of what you've got ahead, and for folks that are watching this even after the event, it's
(06:48):
a must-attend, just because of the depth you can reach, and also the fact that this is a community of people that we can continue to connect with after the event and keep asking questions. It's fantastic as an opportunity.
Now, why now? This is always the question. We've had RDMA in our world for a while.
(07:10):
It's been a couple of years, perhaps. So, Michal, based on that, what is the importance of this RDMA discussion today?
Speaker 4 (07:18):
Right. So, like you said, yeah, RDMA has been here for a while, since the 1990s, right, but it's gaining traction again today more than ever, you know, due to the challenges that are imposed mainly by the AI and ML training workloads, which are getting, you know, much, much larger and have really crazy demands on the network.
So, yeah, it's definitely always been significant, you
(07:41):
know, due to its ability to enhance data transfers, and its low latency and high throughput and so on. But I mean now, with AI and ML and, you know, these workloads, really the computation is so complex and the GPUs are so expensive that you don't want any idle time on the GPU. You have to enhance the network, right?
(08:02):
You don't want the network to be the bottleneck, and RDMA is the closest, right, to providing that, I mean, compared to other traditional, you know, networking protocols. So I mean, this is the main reason why we feel it's back and it should be discussed again, helping professionals stay
(08:22):
informed on the technological advancements and the benefits that this brings.
Speaker 1 (08:58):
Team RDMA. I love it. We often joke CPU, storage, network, all those core artifacts, and this is truly the chance to revisit, because everything changed once the workload changed. And if we don't adjust how we leverage the capabilities or, especially, explore what's coming in, you
(09:19):
know, future growth and other innovations that could happen, this is really cool. Some will say it's RDMA, or RDMAI. So who's ideal to attend this? Eric, you know the audience well, given that you are the audience. So who's a person that's really going to get value from sitting in on this session?
Speaker 2 (09:38):
Yeah, really anybody who's interested in RDMA. What I like about what Rohan and Michal have done is the breadth of the information. It starts at a very high-level, introductory, sort of you-are-here point of view, and then it gets all the way down into providing wire traces.
(09:59):
So, you know, actual packets as they show up in Wireshark, and Michal will stitch those two together and show you basically a conceptual model of how RDMA endpoints communicate with one another and what that looks like on the wire. So it's really anybody who's interested in RDMA.
Speaker 1 (10:21):
And I'll close it up, just because I know we don't want to overshare the goodness that's in the full session. So, Rohan, based on your experience in looking at these AI and ML use cases, because I know it's not just dripping off the tongues of marketers and engineers alike, but we are
(10:42):
genuinely seeing use cases come forward, where do you see the importance of what this session is going to bring to folks that are starting to dabble, or even well underway with an AI/ML, you know, adoption and transformation?
Speaker 3 (10:59):
Yeah. So the key problem, or the key bottleneck, in AI/ML model training today is the movement of data, like the data transfer itself, and that is where RDMA comes into play with all its benefits.
There's a lot of data already existing in multiple storage
(11:19):
clusters around the world. We need to move it as fast as possible from there onto a GPU. Maybe the GPU is sitting somewhere else, so maybe across the network, then onto the GPU. So in both scenarios, RDMA comes into play, playing
(11:39):
a crucial role in making this data transfer as fast as possible, as efficient as possible, bringing all those benefits of bypassing the kernel, bypassing the operating system, and reducing the CPU utilization on the host where the data is actually going to land. And then, you know, this further facilitates, you know,
(12:00):
the speed-up of the process of training as well as inference of all the AI models.
Speaker 1 (12:06):
Well, this is going to be amazing. So thank you all for giving a quick preview, and I'm looking forward to everybody giving feedback when they watch the full session, because it's such a fantastic deep dive, and it is just that, right? We're touching the tip of the iceberg. The deep dive that you're going into in the full session is going to be amazing for folks that really want to see where
(12:30):
it's happening. This isn't just marketing. This isn't buzzwords. This is a real opportunity to optimize and build fantastic things. So it's everything you wanted to know about RDMA but were too proud to ask. And I always like to say, make sure you check out everything else at snia.org. We've got lots of other amazing sessions like this, and other podcasts and, of course, all the webinars that are coming
(12:52):
together.
2025 is going to be the year of distributed systems getting better because of distributed knowledge, and this is a great venue to do so. So thank you to Eric, to Rohan, and to Michal for all of this. We're looking forward to the session, and we'll see you all on the webinar.
Speaker 4 (13:11):
Thanks, Eric. Thank you.