Short Stuff: DNA Data Storage

Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:04):
Hey, and welcome to Short Stuff. I'm Josh, and
there's Chuck, and Jerry's here too, and so's Dave in
spirit, and we're coming at you from the future of
right now.

Speaker 2 (00:14):
This is one of those where it's so interesting, so cool,
so mind blowing and so promising, and then you get
to the very end and then you're like, oh.

Speaker 1 (00:25):
To me, that just meant just give it a little
more time.

Speaker 2 (00:28):
No, and a lot of times that is the case,
and probably will be in this case. But it was
such an "oh." And you'll see in about, you know,
twelve-ish minutes what we're talking about.

Speaker 1 (00:36):
So essentially, what we're talking about first is data. We've
got a lot of data. Like anytime somebody says something,
thinks something, writes something down, somebody comes up with a
new recipe or a new patent or whatever, that gets encoded.
It's data that gets saved. We don't really throw stuff
away anymore. And so we're kind of awash in data.
And if you want to take that data, you want

(00:57):
to save it, you want to preserve it. Let's say
it's really, like, you're the Library of Congress. Sure. Get this, man,
I did not realize this: what you do is you
take that data and you transfer it onto the same kind
of magnetic reels that those old room-sized computer mainframes
used to read and write data. Yeah, you put it

(01:17):
on tape. Yeah, exactly. Well, I didn't realize that, but
it's just the proven go-to means of long-term,
it's called archival, storage of the kind of data that
you don't really need to access anytime soon. It's called
low-touch data. You're just putting it literally in cold storage.

Speaker 2 (01:38):
Yeah, I mean, it's been around for a long time,
very dependable, very durable, very reliable. It doesn't cost a
lot of money. It can hold a ton of data.
One tape can hold between one million and fifteen million
gigabytes or one to fifteen petabytes. That's a lot of stuff.

Speaker 1 (01:58):
Really.

Speaker 2 (02:00):
The problem is, and you know it's all relative, but
they're kind of big, but not big big. They're three
inches by three inches and you're like, Chuck, that's not
very big at all, But that is big when you
talk about you know, potentially billions of these things and
having to store them in a place that is, like
you said, cold storage. So it's the cost of building

(02:23):
these cold storage buildings that is the issue. When it
comes to this three by three inch thing.

Speaker 1 (02:29):
That and then also you know they've been around for
three quarters of a century, so we know they last
that long if you keep them in cold storage, but
we don't know exactly how long they will last, so
there's also a question of that. So that combined with
so cost questions about how long it will last, and
then also just the enormous amounts of information we're adding

(02:52):
every year are making people look for other ways to
encapsulate data, to encode data in ways that are cheaper,
that are smaller, that require less money to keep cold.
And what they've come up with, Chuck, for anybody who
has looked at the title of this episode, they won't
be very surprised, but DNA. That's right.

Speaker 2 (03:13):
I know it's early, but we got to take a
break right there.

Speaker 1 (03:15):
Right, agreed. All right, we'll be right back.

Speaker 2 (03:38):
All right. So you dropped a pretty big truth bomb
on everyone. I'm sure there are people that for sixty seconds
were like, "What? Storing data on DNA? Dude, that's in
my body. Like, what are you talking about, putting data
in my body?"

Speaker 1 (03:54):
You got that straight?

Speaker 2 (03:56):
You don't have that straight. But here's a pretty
good one, as far as how much this stuff can hold.
This is pretty staggering stuff. And this is from a
couple of dudes from the Los Alamos National Lab, and
I think you got it from Scientific American. Here's how
much DNA can hold. Seventy four million million bytes of information,

(04:18):
which is basically the Library of Congress.

Speaker 1 (04:20):
That's a lot.

Speaker 2 (04:21):
That's a lot. You can put that, if you were
putting it on DNA, into the size of something as
big as a poppy seed, six thousand times over. Right.
Said another way, if you split that seed in half,
you could store all of the data on Facebook.

Speaker 1 (04:39):
Yeah, and then by twenty twenty five, the size of
the data that humanity's generated will reach an estimated
thirty three zettabytes, so three point three followed by
twenty two zeros of bytes of information, a lot of bytes.
If you can transcribe that all to DNA, you could

(05:01):
fit the whole thing into a ping pong ball. Yeah,
not a three by three plastic cartridge, multiple times over.
A single ping pong ball could hold all of the
world's data. And you can make multiple ping pong balls
as backups too.

Speaker 2 (05:17):
Yeah, and you don't need to. And it's pretty easy
to duplicate them, apparently, and you don't need to keep
them in the fridge, even though you could put it
in an egg carton, sure, and be set. You don't
even have to. It's going to last a long time
not being in cold storage, and probably even longer in
cold storage.

Speaker 1 (05:34):
You could give a ping pong ball to every living
human to keep in their fridge and like it would
have no problem whatsoever. Be like here, you keep this
cold for one hundred and fifty years, and only.

Speaker 2 (05:45):
Half of them would eat that ping pong ball thinking
it was an egg.

Speaker 1 (05:49):
Yeah. Yeah, so you'd still be left with all of
those backups.

Speaker 2 (05:53):
Here's where it gets super interesting though, because you know,
as most people listening probably are, like I said, as
I was reading this, I was like, okay, that's a
cool idea, but like, how in the world does this work?
And it turns out that it's not that mind-blowing
or difficult. I'm not saying I could go out and
do it, but it makes a lot of sense to
wrap your head around. Because DNA, as we all know,

(06:18):
is composed of four nucleotides, or at least combinations of guanine, thymine, adenine,
and cytosine. Just remember GTAC and Gattaca. Yeah. Ooh, ironically,
all this digital data, though, that's out there

(06:39):
in the world is encoded, as everyone knows, in ones and zeros.
So it's, you know, it sounds like, you know,
it can be any combination of ways. But when
you really break it down, you can either
just have zero zero, zero one, one zero, or one
one as far as those combinations go. And that's four things.
And there are those four nucleotides. So if you just,
like, say, hey, each one of these nucleotides is going

(07:02):
to be assigned to a different number, then that's all you need.
There's the key to your map.

Speaker 1 (07:06):
Yeah, so say adenine stands for zero zero, and guanine
stands for one one, and so forth, each one
stands for one of those pairs of possible combinations. Then
you can take any string of binary data, zeros and
ones, and turn it into genetic code based on those nucleotides.
So, like, you go from a

(07:29):
string of ones and zeros to a string of A's, T's, G's,
and C's. That's it. The thing is, you're
not just turning ones and zeros into letters. You're actually transcribing
the ones and zeros from binary code into physical genetic material.
You're actually putting a base of adenine right there.

(07:49):
You're putting a base of thymine next to it, like,
depending on how the code reads with the ones and
the zeros and what order they're in. You're actually physically
creating genetic material, DNA. But rather than encoding the information
for building a living thing, you're encoding the information for

(08:10):
the entire catalog of Stuff You Should Know. And honestly,
isn't that the first thing we should preserve in DNA? Sure. Good.
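The two-bits-per-nucleotide idea described above can be sketched in a few lines of Python. This is only an illustration, not how any lab actually does it: the episode gives adenine as zero zero and guanine as one one, while the assignments for thymine and cytosine, along with the function names `encode` and `decode`, are assumptions made up for this sketch.

```python
# Hypothetical two-bits-per-nucleotide table. Only A = 00 and G = 11 come from
# the episode; the T and C assignments are assumptions for illustration.
BITS_TO_BASE = {"00": "A", "01": "T", "10": "C", "11": "G"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of A/T/C/G letters, two bits per letter."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse encode(): letters back to bits, then bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"SYSK")
    print(strand)          # four letters per byte
    print(decode(strand))  # round-trips back to the original bytes
```

Each byte becomes four letters, so the letter string is only the plan for a strand. The physical step the hosts describe, actually synthesizing those bases, is outside what code like this can show.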

Speaker 2 (08:19):
After the movies of Gene Wilder.

Speaker 1 (08:22):
How about at the same time as the movies of
Gene Wilder, can we just agree to that?

Speaker 2 (08:26):
How dare you?

Speaker 1 (08:28):
Hey? I think highly of us and Gene Wilder. Uh.

Speaker 2 (08:33):
I don't know why he's been on my mind lately,
but he has been.

Speaker 1 (08:36):
He's been shaking it for you in your head.

Speaker 2 (08:38):
He's been shaking it for me. So this all sounds great.
And like I mentioned at the very beginning, this is
one of those things where, like, holy cow, this is
the future, this is it, and then the "oh" at
the end, and the "oh" is that it's really expensive
to do this. Like, we can do this, we figured
out how to do this, it's possible, we have the
tech to do this, but here's a tape named

(09:02):
LTO-9. It's a magnetic storage tape. You can
get it for eight bucks and you can get one
petabyte of storage. That would cost you about a trillion
dollars to do for DNA.

Speaker 1 (09:14):
Yeah, there was a guy who was interviewed in Ars
Technica named Hyunjun Park. He's the CEO of
a data storage company called Catalog, and he even estimated, he said,
let's say it costs you three cents to print a
single nucleotide. Yes, that's cheap, but for each base pair, right,
now you're up to six cents. And then now you're
translating gigabytes, you're entering millions of dollars. So if it

(09:34):
costs millions of dollars to translate a gigabyte, it costs
trillions of dollars to do a petabyte. And the other
problem of it, too, Chuck, is that it's really really slow, right.
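The cost reasoning above can be checked with some back-of-the-envelope arithmetic. The sketch below assumes two bits per nucleotide (as in the mapping discussed earlier), decimal gigabytes and petabytes, and the hypothetical three cents per nucleotide from the interview. The function names are made up for illustration. Under those assumptions it comes out to roughly 120 million dollars per gigabyte and roughly 120 trillion dollars per petabyte, the same millions-then-trillions pattern described here.

```python
BITS_PER_BYTE = 8
BITS_PER_NUCLEOTIDE = 2          # one of 00, 01, 10, 11 per base (assumption)
COST_PER_NUCLEOTIDE_USD = 0.03   # hypothetical price quoted in the episode
GIGABYTE = 10**9                 # decimal units, an assumption
PETABYTE = 10**15


def nucleotides_needed(num_bytes: int) -> int:
    """How many bases it takes to hold num_bytes at two bits per base."""
    return num_bytes * BITS_PER_BYTE // BITS_PER_NUCLEOTIDE


def synthesis_cost_usd(num_bytes: int,
                       cost_per_nucleotide: float = COST_PER_NUCLEOTIDE_USD) -> float:
    """Total cost of printing every base, ignoring any overhead or redundancy."""
    return nucleotides_needed(num_bytes) * cost_per_nucleotide


if __name__ == "__main__":
    print(f"1 GB: ${synthesis_cost_usd(GIGABYTE):,.0f}")
    print(f"1 PB: ${synthesis_cost_usd(PETABYTE):,.0f}")
```

Doubling the price to six cents, as in the base pair remark, doubles both totals. It does not change the order of magnitude.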

Speaker 2 (09:46):
It's super slow. So this is a clear case of
one of those things like you mentioned, which is like
just wait, because like with any technology, it's going to
get quicker, it's going to get cheaper. I don't know
if this is like one hundred years into the future,
but at this point the cost is
just so outrageous that no government is going to
fund something like this.

Speaker 1 (10:06):
I mean, a trillion dollars for one petabyte of information,
you're not going to sell that very easily.
And then, yeah, like I was saying, the speed. If
you're transferring information from one of those magnetic storage tapes,
you're transferring it at about a gigabyte per second, typically. If
it takes even, like, a second to print a single nucleotide,

(10:29):
which is still very fast, but we're thinking on
human-level fast. We need to think on, like, how
many ones and zeros are in the average gigabyte of code.
Now you're talking about decades to transfer a petabyte disk's
worth of information using DNA technology. Yeah. So, yes, it's

(10:50):
very slow right now, it's very expensive right now. But
I don't think we're one hundred years off, Chuck, because
we're able to do this now relatively cheaply because the
Human Genome Project came along. That was twenty years ago.
Think about how far we've come.
And this is like the hardest chunk, the first twenty years.
I think it's just going to get faster and easier.

(11:10):
I don't think we're going to be waiting one hundred
years to see DNA data storage.
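The speed comparison above can be roughed out the same way. This sketch takes the hypothetical one second per nucleotide and the roughly one gigabyte per second tape rate from the episode, and again assumes two bits per nucleotide and decimal units. The `parallel_writers` parameter is an assumption added for illustration, since the episode does not say how many strands would be written at once. With a single writer the result is far longer than decades, so any shorter estimate depends on writing many strands in parallel.

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY
GIGABYTE = 10**9                 # decimal units, an assumption
PETABYTE = 10**15


def dna_write_time_years(num_bytes: int,
                         seconds_per_nucleotide: float = 1.0,
                         parallel_writers: int = 1) -> float:
    """Years to synthesize num_bytes at two bits per nucleotide."""
    nucleotides = num_bytes * 8 / 2
    seconds = nucleotides * seconds_per_nucleotide / parallel_writers
    return seconds / SECONDS_PER_YEAR


def tape_transfer_time_days(num_bytes: int,
                            bytes_per_second: float = 1e9) -> float:
    """Days to move num_bytes at the tape rate mentioned in the episode."""
    return num_bytes / bytes_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"DNA, 1 GB, one writer: {dna_write_time_years(GIGABYTE):.1f} years")
    print(f"DNA, 1 PB, a million writers: "
          f"{dna_write_time_years(PETABYTE, parallel_writers=10**6):.1f} years")
    print(f"Tape, 1 PB: {tape_transfer_time_days(PETABYTE):.1f} days")
```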

Speaker 2 (11:14):
Does that mean that Stuff You Should Know is in
the hardest chunk when you're fifteen?

Speaker 1 (11:19):
I think so. Yeah, it feels like it. Okay, I'm kidding. Well,
I'm teasing Chuck, right, Yeah, just teasing, which means, of course,
Short Stuff is out.

Speaker 2 (11:34):
Stuff You Should Know is a production of iHeartRadio. For
more podcasts from iHeartRadio, visit the iHeartRadio app, Apple Podcasts,
or

Speaker 1 (11:41):
Wherever you listen to your favorite shows.
