Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:04):
Welcome to TechStuff, a production from iHeartRadio. Hey there,
and welcome to TechStuff. I'm your host, Jonathan Strickland.
I'm an executive producer with iHeart Podcasts, and how the
tech are you? So I'm getting ready to go on vacation,
which means we've got some classic episodes lined up for you. Actually,
(00:26):
these aren't that classic. These came out last year and
today I thought I would bring a short one for you.
This one was published originally on June seventh, twenty twenty three.
It's a fun little episode. It is titled "What Was
the First MP3?" This is like one of those
(00:46):
pub-trivia-style TechStuff topics. I hope you enjoy.
It's time for a TechStuff Tidbits. I'm going to
answer the question: what was the first MP3? Well,
here's the too long didn't answer. It was Tom's Diner
by Suzanne Vega. It's a song I personally do not like.
(01:07):
It's not to say it's a bad song. Just because
I don't like something doesn't mean it's bad. I just
mean I personally do not find this song at all appealing.
But it was, in fact, the first MP3. Now,
if you don't know Tom's Diner, it features Vega giving
a little slice-of-life moment from the perspective of
a man sitting in a diner who feels kind of
(01:29):
distanced from the world around him. In case you need
a reminder, here's the first verse of the song. I
am sitting in the morning at the diner on the corner.
I am waiting at the counter for the man to
pour the coffee, and he fills it only halfway and
before I even argue, he is looking out the window
(01:51):
at somebody coming in. Now that song doesn't work for me.
I get that it got really popular, especially after someone
did an unauthorized remix of it, which is the version
most people know. But it turned out to be an
absolutely perfect song to test the MP3 compression algorithm.
(02:13):
To understand why, we need to learn about the purpose
of the MP3 compression algorithm in the first place.
So in this case, the compression we're talking about
relates to file size. Here's an interesting side note: there's
a different kind of audio compression. That refers to the
reduction of dynamic range in a recording, and by that
(02:35):
I mean reducing the volume distance between the loudest and
the softest parts of a recording. That can actually play
a part in file compression as well, but we're
going to set it aside. Just put a pin in that,
take a look at it later on. But with file
compression generally, the whole goal is to find ways to
(02:58):
pack information into smaller file sizes. That makes those files
easier to manage. That's important if you are dealing with
a limited amount of storage, or maybe you want to
send the file from one machine to another and you've
got limited bandwidth so you need smaller file sizes, or
else the process is going to take way too long.
(03:20):
But how do you do it? Well, one approach to
file compression is to take a real good look at
the file you're trying to compress, and you ask the question,
is all the information that is inside this file necessary?
Or could I get rid of some of that information
and still have a usable file on the other side
(03:43):
of it? With music, that means figuring out which bits
of data you can drop without it having a noticeable
effect on the audio quality. Ideally the compressed file would
be indistinguishable from the original raw audio, but since you're tossing
out information, that's not necessarily a guarantee. This is what
(04:05):
makes the MP3 a lossy file format. MP3
is just one example of a lossy file format.
There are others, and the word lossy means just
exactly what you think. It means that some information is
tossed aside or lost in the process of compressing the
(04:25):
file to a smaller size. The folks who worked on
the MP3 format had to figure out which information
was most likely to have little to no impact on
audio quality within an audio file. To do that, they
had to take into account human psychology and the limitations
of human hearing. So psychoacoustics played a big part in
(04:49):
determining the MP3 compression algorithm. So, for example,
let's think of the range of human hearing
in terms of frequencies for a second. Your typical
human is able to hear frequencies as low as twenty
hertz and as high as twenty thousand hertz, or twenty
(05:11):
kilohertz. Hertz in this case refers to one oscillation per
second, or one vibration per second. So twenty hertz means
that something is effectively vibrating twenty times per second. So
if you had a string that, when you plucked it,
would vibrate twenty times per second, that string is vibrating
(05:31):
at twenty hertz. That would be a very, very low note.
The higher the frequency, the higher the pitch, and as
we age, we tend to lose the ability to hear
some of those higher pitches, which is why you would
hear about some convenience stores experimenting with playing very
high-pitched noises to discourage young punks who wanted to loiter
(05:52):
in the joint. So human hearing has limitations, and in
theory you can eliminate sounds that would fall outside of
those limitations. If a sound file contains frequencies that are
at twenty-one kilohertz, but your typical person can't
hear anything above twenty kilohertz, well, at least theoretically,
(06:14):
you can just toss that information and it won't change anything.
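
As a rough illustration of that idea (not how an actual MP3 encoder is built), the sketch below mixes a one kilohertz tone with a quieter twenty-one kilohertz tone, zeroes every frequency component above twenty kilohertz, and checks that what is left matches the audible tone. The sample rate, tone levels, and function names are all assumptions chosen for the example, and the transform is a deliberately naive one so the code needs only the standard library.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform (slow, but dependency-free)."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def drop_above(samples, sample_rate, cutoff_hz):
    """Zero every frequency bin above cutoff_hz, then rebuild the signal."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bin k sits at k * sample_rate / n; bins past n/2 mirror the lower ones.
        freq = min(k, n - k) * sample_rate / n
        if freq > cutoff_hz:
            spectrum[k] = 0
    return idft(spectrum)

SAMPLE_RATE = 48_000   # assumed sample rate, high enough to represent 21 kHz
N = 480                # 10 ms of audio, so bins are exactly 100 Hz apart

audible = [math.sin(2 * math.pi * 1_000 * t / SAMPLE_RATE) for t in range(N)]
inaudible = [0.5 * math.sin(2 * math.pi * 21_000 * t / SAMPLE_RATE) for t in range(N)]
mixed = [a + b for a, b in zip(audible, inaudible)]

filtered = drop_above(mixed, SAMPLE_RATE, 20_000)
max_error = max(abs(f - a) for f, a in zip(filtered, audible))
print(f"largest difference from the audible tone: {max_error:.2e}")
```

The printed difference should be tiny, which is the point: the discarded tone carried data, but removing it leaves the audible part of the signal essentially untouched.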
If a sound file contains a sound but no one
has the capacity to hear it, does a tree fall
in the forest? Might be getting a little lost in
the woods here. Anyway, that frequency example, that's just one
example of a sound that humans would have trouble hearing.
(06:35):
So another is when we hear a very soft sound
that immediately follows a very loud sound, we don't actually
perceive the soft one. The loud sound we hear eclipses
the soft sound, and it turns out we can't hear
the soft one at all. So again, if we can't
hear that soft sound that played immediately after a loud one,
(06:58):
why would you keep it? You know, you might as
well just get rid of the information. You can't hear
it anyway. Just get rid of it, save the space.
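
A toy version of that rule might look like the sketch below. It is only meant to show the shape of the decision: the forty-decibel threshold, the frame levels, and the function name are invented for illustration, and the psychoacoustic model in a real encoder is far more involved than a single comparison between neighboring frames.

```python
def mask_after_loud(frame_levels_db, drop_db=40.0):
    """Return one keep/drop flag per frame.

    A frame is dropped when it is at least drop_db quieter than the
    frame immediately before it. drop_db is an illustrative value,
    not a figure taken from any real codec.
    """
    keep = [True]  # the first frame has nothing before it to mask it
    for previous, current in zip(frame_levels_db, frame_levels_db[1:]):
        keep.append(previous - current < drop_db)
    return keep

# Hypothetical loudness of five consecutive short frames, in decibels.
levels = [60.0, 90.0, 35.0, 58.0, 55.0]
flags = mask_after_loud(levels)
print(flags)  # [True, True, False, True, True]
```

Only the third frame is flagged for removal, because it is the quiet moment that lands right after the loud one.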
This psychoacoustic approach to sound would lead the developers of
the MP3 format to create a strategy regarding what
information to keep and what information to ditch. On top
of that, the algorithm had sort of a sliding scale,
(07:22):
So maybe you want to keep as much information as possible,
so you select that when you create the MP3,
so you're losing less information in the process. You're still
compressing the file, but not to the extent that you
could if you chose. Maybe the most important thing to
you is that you reduce the file size as much
as you can, so you crank the compression up. Now,
(07:44):
obviously the harder you go, the more likely you're going
to lose information that will make a noticeable difference in
the playback of the audio file, and you would say, oh,
the quality here is not as good as I thought
it would be. This is where Tom's Diner comes in.
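
To put rough numbers on the sliding scale described just before this point, the arithmetic below compares uncompressed CD-quality audio with a few common MP3 bitrate settings. The four-minute track length is an assumption picked for the example, and the helper name is hypothetical; the formula itself is just bitrate times duration, divided by eight to convert bits to bytes.

```python
def file_size_mb(bitrate_kbps, seconds):
    """Size in megabytes (10**6 bytes) of a constant-bitrate stream."""
    return bitrate_kbps * 1000 * seconds / 8 / 1_000_000

# CD audio: 44,100 samples per second, 16 bits per sample, 2 channels.
CD_KBPS = 44_100 * 16 * 2 / 1000   # 1411.2 kilobits per second

duration = 4 * 60  # an assumed four-minute song, in seconds

for rate in (CD_KBPS, 320, 128, 64):
    print(f"{rate:>7.1f} kbps -> {file_size_mb(rate, duration):6.2f} MB")
```

Cranking the compression up corresponds to picking a lower bitrate: the file shrinks in direct proportion, and the encoder has to throw away more information to fit.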
Karlheinz Brandenburg, who was one of the leads on
(08:05):
creating the MP3 format, used Tom's Diner to listen
back to compressed files and determine how the compression was
affecting the audio quality. So it was a great track
to use because the actual qualities of the recording itself
were such that it was easy to detect if something
(08:28):
was not quite right. The original recording of Tom's Diner
is not the one that has the catchy beat and
the horns in it. It's a very simple a cappella
recording of Suzanne Vega singing her tale of looking at
the world from a male perspective through a sense of
distance and detachment. Brandenburg would use that track while tweaking
(08:49):
the algorithm, trying to find the thin line between an
effective data compression technique and a minimal impact on sound quality,
and for her contributions to the effort, although she made
them unknowingly, Brandenburg would name Suzanne Vega the mother
of the MP3. Interestingly, Ryan Maguire decided to take
a sort of negative image of the compressed Tom's Diner.
(09:12):
He identified sounds that were deleted in the process of
creating a lossy version of Tom's Diner, and then he
created a new recording that contained only the bits that
had been cut from the file. And it's almost like
listening to the ghost of a song. In fact, I
think they called the project the Ghost of the MP3.
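
The general idea of keeping only what was thrown away can be sketched as a subtraction: take the original signal, take the lossy version, and keep the difference. The example below is an assumption-heavy stand-in, not a description of how that project was actually made. It uses coarse rounding of sample values in place of a real MP3 encoder, and every name and number in it is made up for illustration.

```python
import math

def coarse_quantize(samples, levels=8):
    """Stand-in for lossy compression: snap each sample to a coarse grid."""
    step = 2.0 / levels
    return [round(s / step) * step for s in samples]

def residual(original, lossy):
    """The 'ghost': whatever the lossy version failed to keep."""
    return [o - l for o, l in zip(original, lossy)]

# A short 440 Hz tone at an assumed 8,000 samples per second.
original = [math.sin(2 * math.pi * 440 * t / 8_000) for t in range(80)]
lossy = coarse_quantize(original)
ghost = residual(original, lossy)

# Adding the ghost back to the lossy version recovers the original.
rebuilt = [l + g for l, g in zip(lossy, ghost)]
print(max(abs(r - o) for r, o in zip(rebuilt, original)))
```

Played on its own, a residual like this contains none of the material the lossy version kept, which is why it sounds so unlike the song it came from.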
(09:33):
It's pretty creepy stuff. It would not be out of
place in a horror movie. The fact that lossy files,
by definition lose information in the process of data compression
meant that audiophiles dismissed the MP3 format as
inherently inferior to others, at least as far as listening
experiences go, and there are arguments that some of the
lost information, while potentially being imperceptible within the song itself,
(09:57):
helped shape the overall sound and tone of the piece. So
though you can't directly hear the stuff that's being cut,
that stuff actually influences how you perceive other things, so
you still change the experience of hearing the finished audio.
But the MP3 format created the opportunity to store
(10:17):
and transfer audio files without having to deal with massive
raw audio formats, and back in the day that was
not a trivial thing. And so that is the answer
to the question. Tom's Diner: the first MP3. Hope
you're all well, and I'll talk to you again really soon.
(10:42):
TechStuff is an iHeartRadio production. For more podcasts from iHeartRadio,
visit the iHeartRadio app, Apple Podcasts, or wherever you listen
to your favorite shows.