
March 30, 2025 43 mins
Summary
In this episode of the Data Engineering Podcast, Roman Gershman, CTO and founder of DragonflyDB, explores the development and impact of high-speed in-memory databases. Roman shares his experience creating a more efficient alternative to Redis, focusing on performance gains, scalability, and cost efficiency in high-throughput, low-latency scenarios. He explains how DragonflyDB solves operational complexities for users and delves into its technical aspects, including maintaining compatibility with Redis while innovating on memory efficiency. Roman discusses the importance of cost efficiency and operational simplicity in driving adoption and shares insights on the broader ecosystem of in-memory data stores, future directions like SSD tiering and vector search capabilities, and the lessons learned from building a new database engine.


Announcements
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
  • Your host is Tobias Macey and today I'm interviewing Roman Gershman about building a high-speed in-memory database and the impact of the performance gains on data applications
Interview
  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what DragonflyDB is and the story behind it?
  • What is the core problem/use case that is solved by making a "faster Redis"?
  • The other major player in the high performance key/value database space is Aerospike. What are the heuristics that an engineer should use to determine whether to use that vs. Dragonfly/Redis?
  • Common use cases for Redis involve application caches and queueing (e.g. Celery/RQ). What are some of the other applications that you have seen Redis/Dragonfly used for, particularly in data engineering use cases?
  • There is a piece of tribal wisdom that it takes 10 years for a database to iron out all of the kinks. At the same time, there have been substantial investments in commoditizing the underlying components of database engines. Can you describe how you approached the implementation of DragonflyDB to arrive at a functional and reliable implementation?
  • What are the architectural elements that contribute to the performance and scalability benefits of Dragonfly?
    • How have the design and goals of the system changed since you first started working on it?
  • For teams who migrate from Redis to Dragonfly, beyond the cost savings what are some of the ways that it changes the ways that they think about their overall system design?
  • What are the most interesting, innovative, or unexpected ways that you have seen Dragonfly used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on DragonflyDB?
  • When is DragonflyDB the wrong choice?
  • What do you have planned for the future of DragonflyDB?
Contact Info
Parting Question
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
(00:11):
Hello, and welcome to the Data Engineering Podcast, the show about modern data management.
Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale.
Datafold's AI-powered Migration Agent changes all that.
Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches.

(00:35):
And they're so confident in their solution, they'll actually guarantee your timeline in writing.
Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
Your host is Tobias Macey, and today I'm interviewing Roman Gershman about building a high-speed in-memory database and the impact of the performance gains on data applications. So, Roman, can you start by introducing yourself?

(01:00):
Nice meeting you, Tobias. I am Roman, the CTO of DragonflyDB and the original author of the Dragonfly project, hosted on GitHub.
And do you remember how you first got started working in data?
Yeah. So I joined Google in 2007, in Haifa, Israel, and

(01:22):
my first data-oriented project was building Google Suggest, which is the service everyone uses when they type their search queries in the search box and get all those suggestions. Believe it or not, it hadn't been launched before that. Nowadays, we are all used to it.

(01:43):
But back then, it was a 20% project of someone at Google, and we took it on ourselves to productionize it. We launched it on YouTube, and the goal was to launch it on Google.com. And there, I had my fair share of

(02:03):
challenges of how you scale technology to Google scale, basically. You need to support high-throughput scenarios with very low latency. And, of course, it was an in-memory domain with lots of preprocessing pipelines, etcetera. And that was my

(02:24):
initial experience with scaling infrastructure.
And in terms of the Dragonfly
project, can you give a bit of an overview about what it is and how it started and why you decided that it was worth putting the time and energy into
it? Yeah. Sure.
So after Google, I

(02:45):
moved to working at a startup called Ubimo. Actually, my manager and the CTO of that startup was Oded, who is now my cofounder at DragonflyDB. And back then, when we left Google, we kinda didn't know anything about infrastructure outside of the Google campus. And

(03:05):
we started learning about all the building blocks and pieces that exist in the open source community. And quickly we discovered Redis and started using it, and it was incredibly useful in our infrastructure stack. But, unfortunately, it was very painful to manage and scale. So we naively thought that we would be able to take snapshots,

(03:29):
and we couldn't do it, just because our use case was high write throughput. It just went out of memory. And we tried to scale vertically, because everyone on the Internet said that it's the fastest data store that exists,

(03:50):
very scalable. And then, to my surprise, I discovered that it is not possible to scale it vertically. But we kept using it, and I kinda had this thought that maybe it is possible to make it better, but I kept that thought to myself. And then, at some point, this startup was sold

(04:13):
to a company, and I moved on to another job. And I saw that there was an opportunity on the AWS ElastiCache team in Israel. I just wrote an email to the manager there, and I wrote to him,

(04:34):
hey, I'm super excited about Redis, and I think I have some ideas for how to improve it. I would be really happy to join the team. And it worked: I joined the ElastiCache team and became an engineer there. And there, I had the chance and the privilege to learn about

(04:58):
lots of use cases of how people use Redis in various ways. And really, Redis is super useful. It's a Swiss army knife of various data structures, and I saw how people try to use it. And I saw that the challenges we experienced at our startup

(05:20):
were very much similar to how others tried to scale Redis and failed, with the same pain points. And I just felt even more strongly that it was possible to improve the technology behind this incredible product. And I tried advocating for this

(05:41):
in the company, but it didn't succeed. And at some point, we just decided to go our separate ways. I mean, I still have a very friendly relationship with Madelyn, who is now the lead developer of Valkey,

(06:03):
and I enjoyed working with the team very much. But I decided, okay, why not try to do something better myself?
And you mentioned Valkey, which is currently the more open alternative to Redis since they changed their licensing. And the other major player that I'm aware of in the key-value database

(06:27):
space is Aerospike, which also focuses on high throughput and high speed. And I'm wondering what you see as the core problem that is solved, or the capabilities that are unlocked, by having a, quote, unquote, faster Redis.
Yeah. Sure. So, first of all, full disclosure: I've never run Aerospike.

(06:48):
Of course, I've heard about it. I think Aerospike, at least as I perceive it, is more in the persistent domain. There are lots of key-value stores, and I think it is mostly

(07:09):
useful for enterprise,
very huge scale workloads
that have maybe
a lower throughput
to
workload size ratio.
So maybe,
it can be, like, dozens of terabytes of data, but relatively

(07:29):
low throughput
use case.
With Redis
or Valkyrie
or Dragonfly,
usually people use it for its
very low latency characteristics,
sub millisecond latencies,
even, for p 99,
and
the throughput there can be,
relatively high.

(07:49):
In addition, as I said before, Redis and in-memory data stores in general have lots of different data structures, which are used by a huge number of frameworks. So Redis is not just Redis. It's an entire ecosystem with all these libraries: Laravel, Sidekiq,

(08:11):
BullMQ, Celery. All those libraries are kinda the front end of Redis for many developers, and they consume Redis through those frameworks. So I think this is why this ecosystem is extremely useful:

(08:33):
because of its APIs and the libraries that were built upon them. And also, again, high-throughput scenarios, job queues, caching scenarios: all this became ubiquitous with Redis. With Aerospike,

(08:54):
I believe it's more narrow use case of key value store
and also kinda flavor of persistence
around it.
You mentioned some of the common application use cases that I'm familiar with in the Redis ecosystem, that being Celery, application caching, and queuing. And I also know that in

(09:15):
recent memory, one of the capabilities that was added was Redis Streams, to compete with the Kafka ecosystem. And I'm wondering, broadly, what are some of the ways that you see Redis used beyond some of those common, well-known patterns, particularly in the context of data pipelines and data engineering workflows?

(09:37):
Yeah. So, of course, I can only speak from my experience, and I believe the usefulness of Redis APIs decreases with the API generation number. So the most common ones are the first ones, like basic sets and gets,
(09:59):
lists, etcetera. And then the recent APIs — yeah, sure, people are using things like Streams, for example, but that's just a minor share of Redis users. To your question, I wouldn't consider Redis only as a cache or

(10:23):
a job-queuing engine, even though it started like that. I see lots of use cases for using Redis or Dragonfly or Valkey as a non-cache, volatile data store. It's not a database.

(10:43):
It doesn't have transactions that can be rolled back, but it's a data store that can be the single source of truth for serving data, usually for non-business-critical use cases. So, for example, we wouldn't want our bank storing its transactions

(11:07):
in something like Redis, but it's totally fine to use it as a feature store, for example. And there are lots of use cases for it as a feature store. In case your infrastructure crashes, you can still refill the data store from your cold data.

(11:30):
Maybe it's painful and you lose availability, but it's not the end of the world. I consider these non-cache use cases because, usually, this data store is configured without an eviction policy. The classical cache use case has an eviction policy enabled.

(11:54):
But once it's disabled, I consider it a data store use case. And there are lots of use cases like this. For example, gaming companies can store scoreboards per team or per player. There are lots of use cases for data engineering applications, geo APIs, etcetera.
Another interesting aspect of the work that you've undertaken with Dragonfly
is the tribal wisdom that has grown up over the years that it takes ten years for a new database engine to really settle in and grow to maturity and sand off all the rough edges.

(12:37):
Obviously, the past five to ten years have seen massive growth in the number and variety of databases being introduced, and many of them are already in production contexts. I know a portion of that acceleration is due to the investment in various components that are used to compose a database

(12:58):
out of those different pieces, particularly in the disaggregated big data stack, but also in terms of, for example, the Seastar framework that the folks behind ScyllaDB helped introduce for taking advantage of modern hardware capabilities and parallelism.
Wondering if you can talk to some of the ways that you approached the implementation of Dragonfly DB and the evaluation

(13:22):
and selection of some of those underlying pieces to help accelerate your work so that you didn't have to build all the way from the storage engine, paging system, etcetera, all the way up through to the user interface to be able to get to where you are?
Yeah.
So the answer is kinda embarrassing.

(13:43):
I had to invent or reimplement lots of things from scratch.
But let's start maybe from the end. One huge assumption that I made when I designed Dragonfly is that we were not going to change the protocol

(14:05):
or break compatibility.
It was really important for me to make it a drop-in replacement for Redis. So, basically, I didn't want to come up with, I don't know, an HTTP protocol, or change the semantics of the commands. And it was quite a challenge, I must say,

(14:26):
because Redis wasn't designed for a multi-threaded engine. So I had to adapt the Dragonfly technology to around 250 commands that were not designed for multi-threaded scenarios — for using multiple

(14:49):
CPUs.
It was quite a challenge. But at least in terms of the product design, if we're talking about reusing components, I felt that the most important decision was that people who use Laravel, Celery, and BullMQ, or who run Lua scripts with Redis, would still be able to run all those components,

(15:14):
and that was a deliberate design choice.
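To make the drop-in point concrete: Redis clients and frameworks all speak the RESP wire format, so keeping that protocol unchanged is what lets them run against a different server unmodified. A minimal, illustrative sketch of parsing one RESP command — not Dragonfly's actual parser — might look like this:

```python
# Illustrative sketch: parse one RESP array of bulk strings, the shape
# every Redis command takes on the wire (e.g. SET foo bar arrives as
# b"*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\nbar\r\n").

def parse_resp_command(buf: bytes):
    lines = buf.split(b"\r\n")
    if not lines[0].startswith(b"*"):
        raise ValueError("expected a RESP array")
    nargs = int(lines[0][1:])        # "*3" -> 3 arguments follow
    args, i = [], 1
    for _ in range(nargs):
        if not lines[i].startswith(b"$"):
            raise ValueError("expected a bulk string")
        length = int(lines[i][1:])   # "$3" -> next line is 3 bytes
        arg = lines[i + 1]
        if len(arg) != length:
            raise ValueError("bulk string length mismatch")
        args.append(arg.decode())
        i += 2
    return args
```

Any server that accepts this framing and honors the command semantics looks like Redis to the client libraries built on top of it.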
Now, if we're talking about the implementation of Dragonfly: for me, it started as a challenge to myself, basically. I didn't start out thinking about opening a startup.

(15:35):
Basically, I left the ElastiCache team and stayed at home during COVID. It wasn't a very nice period of my life. I had to write a lot of lines of code at night, because my twin daughters had just been born and I had to juggle everything,

(15:57):
and I was not working. But basically, I started with a very simple challenge, or milestone: let's implement a toy backend that can only handle SET and GET commands. I called it midi-redis.

(16:25):
And, again, it was more of a learning experience for me: can we really do it? I wanted to learn about shared-nothing architecture and how to do it. Also, a new Linux API called io_uring had just been released, and I was excited about it.

(16:46):
And I took everything as a learning opportunity. I spent more or less two or three weeks just coding, and it worked. I wrote a very simple backend, and, by the way, it's still available on GitHub — it's called midi-redis, under my username.

(17:09):
It reached around 4,000,000 TPS on a single machine on AWS, and I was really excited about it. I thought, okay, I have something. Then I collaborated with my cofounder, Oded, and we decided to push it forward. And my next milestone was basically

(17:31):
around innovation in hash tables. What kind of hash table can I use for Dragonfly? And how can I improve the major things that bothered me when I was using Redis? That was, again, the single-threaded nature of Redis and

(17:54):
its lack of resilience when doing snapshotting. Basically, I really hated its snapshotting algorithm, which was based on the fork system call. I felt that it was very unreliable. So that was my next challenge, and I solved it. Then I continued with blocking commands. And around April —

(18:18):
so I started around November 2021, and around April I felt, okay, we have something workable. I succeeded in implementing my first blocking command, and I thought, okay, maybe we have something interesting here. What could be interesting

(18:39):
to other members of the Redis community? So that's kinda the short version of how I started. And, sorry — yeah, I had to reimplement most of the building blocks myself. Maybe I could have used Seastar for some of the work, but

(19:00):
I just felt that I need
this,
hands on experience to really understand what I'm doing.
And
I'm not saying that it's the right way to build stuff.
For sure, it's not the fastest one,
but that's something that worked for me.
In terms of the overall

(19:22):
objective of Dragonfly,
you mentioned running into limitations
working with Redis as far as vertical scalability.
You've mentioned some of the limitations in terms of performance, the throughput capabilities that you've unlocked with your earlier experimentation and what you've now built with Dragonfly. I'm curious if you can talk to what were
your overall objectives in building this system and some of the ways that the

(19:47):
design and implementation
as well as the overarching goals of the project have changed from your first phases of experimentation
to where you are now where you're actually building a business around this core technology?
Great question. I don't think I have the full answer even today. So, basically, when I started, I think

(20:09):
the Dunning-Kruger effect also kicked in. I thought I had a solution, right? I did this experiment, came up with a very efficient data store that could answer SET, GET, and maybe other commands very fast, and I thought, okay, I cracked this piece of data structure — sorry, this

(20:32):
piece of infrastructure. But then I discovered how complicated the whole ecosystem is, how much effort it takes to support Lua scripting properly, how much effort it takes to support pipelining properly — all those things that I hadn't thought about.

(20:52):
And my first naive thought was that just by improving the performance of those basic operations, I would be able to win over other developers right away. And then I saw, again, all those

(21:14):
fragmented use cases that cover the entire ecosystem of Redis, with all those frameworks. And we just had to optimize memory usage and efficiency for all those use cases.
And, yeah, the strategic

(21:34):
goal was cost efficiency. The high-level umbrella of what we are trying to achieve is cost efficiency. Beyond that, if we add a new command or a new API to, let's say, Dragonfly, it won't move the needle.
(21:56):
And there is a natural inertia in the market, right, because of the frameworks. They need to pick up a command in order for it to get used, but right now they are largely dependent on Redis. So I felt that just by disrupting the cost factor of the current use cases, we would be able to win over,

(22:18):
you know, the market. And we're still working on it. So we started with multi-threading. Then we continued with memory efficiency. We implemented sorted sets. We greatly improved the snapshotting algorithm. And we started seeing how people

(22:38):
started switching over to Dragonfly because of these advantages. But there is a long road ahead of us. Really huge — say, enterprise-scale — use cases require more sophisticated features like cross-region replication,

(22:59):
and maybe better autoscaling support. So basically, now we are going into control-plane territory, where we have the basic building block — this fairly efficient backend — and now we need to build upon it those sophisticated use cases to support

(23:20):
enterprise customers, and that's what we do nowadays.
As you have seen teams migrating from Redis or Valkey onto Dragonfly — obviously, there are the cost and efficiency gains, and you've mentioned the scalability benefits. But beyond those,

(23:40):
I guess, financial motivations, what are some of the ways that you've seen it change how teams think about how and where to apply that Redis, key-value, and queue-based functionality in their overall architecture, or the ways they're thinking about the role of that

(24:02):
memory-efficient storage in their overall system design?
Yeah.
So, actually, our first early adopters were not necessarily people trying to save on costs. These were teams that were self-hosting

(24:24):
Redis clusters with, let's say, up to dozens of shards, and it was an operational nightmare for them. And they switched to a single-node Dragonfly. So for them, it was the operational pain that got solved, and

(24:46):
they didn't necessarily become our customers. Those were mostly community users. And another case — again, it's around operational complexity, I guess — we saw people

(25:07):
dividing their infrastructure, their clusters, into separate entities just because of different throughput needs. For some high-throughput use cases they needed read replicas, for example. And for others,

(25:28):
they could use a single-master cluster. It was a totally artificial division, just because their original cluster couldn't cope with the load, so they had to separate and optimize like this. And again, with Dragonfly, with its vertical scale, they could just unify

(25:51):
everything. And not only did it simplify their infrastructure and reduce its complexity, it also reduced the hardware footprint, because now they could average out their workload, and the traffic actually became less volatile

(26:11):
just because they unified their infrastructure pieces. It's a similar use case, I guess. And another, I would say, anecdotal example: we had a customer who is very excited about Dragonfly. One thing

(26:33):
they told us is that they accidentally stumbled upon a Dragonfly-specific API called CL.THROTTLE, which originally came from a module that someone implemented for Redis, and we implemented it as core Dragonfly functionality. And before that, they used a

(26:55):
Golang library for that. It was kinda complicated and very inefficient. And now they have this simple, built-in API call that they can use. And they're very happy about it, because it scales very well with Dragonfly.
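For context, CL.THROTTLE comes from the redis-cell module and is based on the generic cell rate algorithm (GCRA). A minimal pure-Python sketch of that idea — hypothetical class and method names, for illustration only, not the real command's API —:

```python
# Sketch of GCRA rate limiting, the idea behind CL.THROTTLE.
# Each key tracks a "theoretical arrival time" (TAT); a request is
# allowed if it doesn't push the TAT too far ahead of the clock.
import time

class GcraLimiter:
    def __init__(self, rate_per_sec, burst):
        self.emission_interval = 1.0 / rate_per_sec   # ideal gap between requests
        self.burst_tolerance = self.emission_interval * burst
        self.tat = {}                                 # theoretical arrival time per key

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        tat = max(self.tat.get(key, now), now)
        if tat - now > self.burst_tolerance:
            return False                              # over the limit: reject
        self.tat[key] = tat + self.emission_interval  # book this request
        return True
```

The appeal of having this in the server is that the check-and-update happens atomically in one round trip, instead of being reimplemented (and raced) in every application.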
So, basically, there is no, you know, one single answer to your question,

(27:20):
but we hear all the time about different advantages of using Dragonfly. Sorry, maybe this is too long, but I'm super excited every time I hear about people using Dragonfly in different ways. It's just another

(27:41):
morale boost. So a few months ago, we heard about the Mastodon folks adopting Dragonfly. It started with them opening a bug in the Dragonfly repo, so it wasn't a smooth sail for them. But, basically, once we fixed this bug, they were super happy with Dragonfly because,

(28:02):
again, it allowed them to reduce their hardware footprint, because Dragonfly is super memory-efficient. If I remember correctly, instead of using 20 gigabytes of RAM, they could store the same workload with six or seven

(28:23):
gigabytes with Dragonfly.
Those are kinda the use cases. I guess the last one still falls under the cost-efficiency umbrella. And the scaling factor accounts for, I would say, around 40% of the use cases: people who can't

(28:45):
scale their workload with Redis Cluster. So it's not even about cost efficiency. It's about scaling horizontally and still seeing their shards overheating. And that's, I would say, at least based on my experience,

(29:06):
not the majority of use cases — the majority is still cost efficiency — but still a significant amount of use cases come because of that as well.
Another interesting aspect of what you're doing is that it is a very memory-intensive system. It's very focused on speed and efficiency. But with memory being the key

(29:30):
resource that's required, obviously, correctness around memory usage is very important. And I'm wondering what your thoughts are: if you were to restart it today, would you still go with C++, or do you think it would be useful to at least use Rust for portions of it? I'm just wondering what your analysis of the language choice has been as you've continued to build and evolve the system.

(29:53):
We had a huge amount of bugs around memory semantics and multithreading. We had a huge amount of bugs in other areas where no language would have helped. But I would answer your question like this: if I were starting today, I probably would still use C++, just because it's

(30:17):
the tool that I know best. If I had started twenty years ago, being twenty years younger, and Rust had existed back then, I would probably have started with Rust. I just didn't want to waste my time on learning — and it's not just the new language, it's the entire ecosystem of libraries

(30:43):
that I didn't want to spend time learning. But I totally get the advantages of using Rust. I'm not against Rust. I actually used it at AWS; I learned it there a bit, and I enjoyed my short time with Rust back then.
And in your work of

(31:04):
embarking on this experimental project leading to building a new database engine and building a business around it, what are some of the most interesting or unexpected or challenging lessons that you learned on that journey?
Mostly about our assumptions of how people use memory stores. It's basically fifty shades of gray.

(31:26):
And I kinda knew about it when I was working on the ElastiCache team. But then, when we launched the project in the community, only then did I actually start to understand the complexity

(31:47):
of the ecosystem, and our assumptions about who would be our first early adopters. Our hope was that it would be cloud users, but it was the opposite. And then, with cloud users, I think Dragonfly

(32:08):
is best for large-scale workloads. But first, you know, we went through the whole journey of onboarding small users, building the data store to suit them, optimizing even for the smallest use cases, and then slowly

(32:28):
going toward bigger and bigger customers.
It was maybe a naive expectation on our side that it would be reversed, just because the market is already there. Our thinking was that because we were building a drop-in replacement that's fully compatible, it would be easier to onboard bigger

(32:53):
customers. And it didn't happen easily.
But besides that — I don't know, just random requests from commercial users that were unrelated to the data-plane requirements and to the technology of Dragonfly itself.

(33:14):
All the mechanics of the cloud system, of the automated service — all those features that we needed to implement before commercial users would start working with us. That was also kinda unexpected. Luckily, we have the best cloud

(33:37):
team and engineering team, and we could solve all those challenges very quickly.
To the point
of Redis compatibility,
obviously, that gave you a very focused target to aim for in terms of the implementation that helped with
the adoption curve as far as people not having to reimplement any of their tech stack, their libraries. They could use the existing set of

(34:04):
technology that they were using; they just swap out one component of it. Now that you have that in place, I'm curious what your thoughts are as far as extensions to that interface that would be useful, or additional features beyond the bounds of Redis that are in consideration for adding to or extending the capabilities and use cases of Dragonfly.

(34:28):
I could actually ask the same question back to you. Let's say you developed a new SQL database. Do you think there is an API command that would be disruptive in this market, or something that, let's say,

(34:49):
would quickly move MariaDB or MySQL users to your database if you created it?
So, I don't know what you think about it, but my thinking is that there is no magical command that would produce any quick wins in terms of

(35:13):
adoption just because we have it and other technologies do not. And we are still on our path to disrupting, I would say, the core attributes of the in-memory data store. And what I mean by that is that our next goal

(35:35):
is to make Dragonfly a fusion store: basically, something that can use local SSD — very fast NVMe drives — and provide the same look and feel as an in-memory store, but reduce the dependence on memory. Like you mentioned

(35:56):
memory before: it's a very important component. Usually, it's also the biggest cost contributor when people use in-memory stores, and that's something that prevents them from moving huge workloads into the in-memory space. They would love to benefit from low latency and high throughput,

(36:19):
but they can't due to high memory costs. And memory costs are not going down as fast as, you know, SSD costs. So our — I wouldn't say near-term, but maybe mid-term — milestone is to introduce

(36:40):
SSD tiering that would be able to benefit from SSDs and offload some chunks of data from Dragonfly, and by that reduce the total cost of ownership by a factor

(37:00):
of 10 — or let's say five; it would still be a huge win.
For people who are
using Redis or evaluating
use cases that are adjacent
to Redis in that ecosystem,
what are the cases where Dragonfly DB is the wrong choice?

(37:21):
I think — and this is about in-memory stores in general, not just Dragonfly — some companies marketed Redis, for example, as a database. As an engineer, it makes me hurt inside.

(37:41):
Redis is not a database. Dragonfly is not a database. And some maybe naive folks think that an in-memory store can be used as a database. I think everything that involves durability and, you know, strong consistency guarantees

(38:02):
of, transactional
guarantees
of,
all operations.
For those use cases, you can't use in memory store.
There is an inherent trade-off there: in order for a memory store to be fast, it

(38:23):
never records its actions, so it can't roll back transactions in case they fail. And I think it's a great trade-off for use cases that do not require transactional guarantees. And there is a huge market for those use cases. But people should be aware of these trade-offs. What else?

(38:44):
Besides that, I think anything that requires high throughput and sub-millisecond latency must use an in-memory data store. And it's kind of an unfortunate outcome. Like, people think, okay, local SSD, for example, let's build something that uses SSD.

(39:05):
The thing with SSD is that it has low latency, actually. It's really great in terms of latency, but it is limited in terms of the IOPS you can perform compared to memory. Like, several orders of magnitude fewer operations per second. And

(39:26):
I don't think it's going to be solved in the near future unless something like persistent memory appears again. Like, Intel tried this, and it didn't work out. But without it, high-throughput use cases won't run anywhere else in a cost-efficient manner.

(39:48):
That's kind of my general advice. Like, use an in-memory data store for super-high-throughput use cases, and do not use it if you require transactional semantics and good durability guarantees.
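The "orders of magnitude" gap Roman points at can be made concrete with rough numbers. The figures below are illustrative assumptions, not benchmarks of any particular drive or store:

```python
# Illustrative throughput figures (assumptions, not measurements).
MEM_OPS_PER_SEC = 50_000_000  # a fast in-memory store on a single machine
NVME_RANDOM_IOPS = 500_000    # random-read IOPS of a good local NVMe drive

# The gap is roughly two orders of magnitude.
gap = MEM_OPS_PER_SEC / NVME_RANDOM_IOPS

# To serve, say, 5M ops/s purely from SSD you would need to spread the
# load across many drives, eroding SSD's cost advantage for hot workloads.
target_ops = 5_000_000
drives_needed = -(-target_ops // NVME_RANDOM_IOPS)  # ceiling division
```

This is why SSD works well as a *tier* for cold data behind a RAM hot set, but not as a wholesale replacement for memory on super-high-throughput workloads.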
Are there any other aspects of the work that you're doing on Dragonfly,
the overall ecosystem

(40:08):
around Redis, the use cases for memory stores that we didn't discuss yet that you'd like to cover before we close out the show?
Yeah. Sure.
We also follow the general trend of, you know, the AI revolution, and we added support for vector search. It's still

(40:30):
very much naive, I would say, but it exists. And for people who already use an in-memory data store and need attribute search together with vector search in a single query, which is extremely useful, we can provide a very good alternative to other solutions.

(40:52):
So, basically, there is this debate of whether vector search databases that are narrowly focused on solving this problem will survive in the long term. And the kind of general opinion is that, I mean, there aren't enough reasons to run a dedicated

(41:17):
database only for vector search. And, as I said, in-memory data stores are very flexible in their use cases, and usually people already run them for other needs. So here is the chance of using

(41:38):
something like Dragonfly for classical use cases and also for vector search and document search, like JSON, etcetera.
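The "attribute search together with vector search in a single query" pattern can be sketched in plain Python. The toy documents, fields, and cosine ranking below illustrate the concept only; they are not Dragonfly's actual query API:

```python
import math

# Toy corpus: each document has a filterable attribute plus an embedding.
docs = [
    {"id": 1, "category": "shoes", "vec": (0.1, 0.9)},
    {"id": 2, "category": "shoes", "vec": (0.8, 0.2)},
    {"id": 3, "category": "hats",  "vec": (0.1, 0.95)},
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def knn_with_filter(query_vec, category, k=2):
    # Attribute filter and vector ranking combined in one pass --
    # the "single query" capability discussed above.
    matches = [d for d in docs if d["category"] == category]
    matches.sort(key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return matches[:k]

top = knn_with_filter((0.0, 1.0), "shoes")
```

In a RediSearch-compatible store this fusion is expressed as one `FT.SEARCH` query combining a tag filter with a KNN clause, rather than prefiltering in one system and ranking vectors in another.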
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you and your team are doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.

(42:03):
Interesting question. Actually, I won't lie, I do not have enough insight to answer this, just because I am on the other side of the mirror. Basically, I learn about

(42:29):
data store and database needs from my users, from my customers. Surprisingly, I do not use lots of databases myself, ironically, maybe. So I can't answer your question, unfortunately.
Fair enough.
Alright. Well, for anybody who wants to try out Dragonfly, I'll add links in the show notes. I appreciate you taking the time today to join me and share the work that you've done, your journey to

(42:55):
building this system, and all of the effort that you're putting into improving the scalability and cost efficiency of these memory store use cases. So thank you again for that, and I hope you enjoy the rest of your day.
Thank you, Tobias, and have a good day.

(43:18):
Thank you for listening, and don't forget to check out our other shows.
Podcast.__init__
covers the Python language, its community, and the innovative ways it is being used, and the AI Engineering Podcast is your guide to the fast moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email host at data engineering podcast dot com with your story. Just to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.