Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:04):
Get in touch with technology with TechStuff from
HowStuffWorks.com. Hey there, and welcome to TechStuff.
I'm your host, Jonathan Strickland. I'm an executive producer at
iHeartRadio and HowStuffWorks, and I love all
things tech, and recently, as I was looking over tech news,
(00:25):
I saw that the venerable Transmission Control Protocol, aka
TCP, is getting ready to take its bow
upon the release of the next version of Hypertext Transfer
Protocol, or HTTP. In other words, it
will no longer be part of how HTTP works.
So wait, what does all that mean? That's a whole
(00:47):
lot of initialisms, and why does it matter? Where did
TCP come from, anyway? Well, most of the time we
group TCP together with Internet Protocol, or IP, so
it's pretty common to hear people talk about the TCP/IP
protocol, but that name is misleading, as generally
(01:07):
what is meant by that is a suite of protocols,
not just those two. Though, to be fair, when Robert
Kahn and Vint Cerf were first working on the transport rules
of the Internet, they lumped it all together in one
protocol called TCP. So just a reminder, let's
start on the very basic definitions here. A protocol is
(01:28):
essentially a set of rules or directions. So it's the
parameters that we create so that computers know what to
do when we tell them to do stuff. And a
good protocol should be functional and consistent. You should be
able to get the same result every time you follow
those rules if you give the same inputs. Now, while
there are several protocols in the TCP/IP suite,
(01:51):
the TCP and IP ones are particularly important.
No big surprise, since those are the ones we use whenever
we refer to these protocols. So a quick reminder about
how computers send data over the Internet. Computers do not
send enormous files all in one go, because that would
be difficult to scale as the network of computers got
(02:15):
larger and the file sizes got bigger as well, and
it would mean that if something were to go wrong
during the transmission of a file, you would at best
end up with a corrupt file and you'd have to
start all over again. At worst, you wouldn't end up
with anything at all. So either way, you would have
to figure out how to start up the process again. So,
to facilitate sending this information across the network so that
(02:38):
it's not unmanageable, this protocol suite divvies up the file
into smaller packets of data. Each packet has information associated
with it that identifies where it is coming from, where
it is headed on the Internet, and how the information
contained within the packet fits in with all the other
(02:58):
packets of information for that same file, so that the
computer on the other end of the communication channel can
get all those packets and then put it all back
together so that you get the file that was sent
by the first computer. TCP defines how applications can create
channels of communication across a network. It also manages how
(03:21):
a message is assembled into those smaller packets before they
are then transmitted over the Internet and then reassembled at
the destination address, and it makes sure that the recipient
computer has actually received each packet in sequence to verify
that the entire file has made it across. So it's
kind of an error checking mechanism, or a way of
(03:44):
ensuring that the information computer A is sending to computer
B gets to where it's going. Without these rules, you
would never really be sure, if you sent something from
computer A, whether computer B got it. This is a
set of rules that tells computer B to say, hey,
by the way, once you get all these, let computer
A know so that everyone knows that the transmission is complete.
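To make the splitting and reassembly described above concrete, here is a minimal Python sketch. It is an illustration only, not how TCP is actually implemented: the `Packet` fields, the helper names, and the addresses `computer-a` and `computer-b` are all made up for this example, and real TCP numbers bytes rather than whole packets.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str        # hypothetical sender address
    dst: str        # hypothetical receiver address
    seq: int        # where this chunk fits within the whole message
    total: int      # how many packets make up the whole message
    payload: bytes


def packetize(message: bytes, src: str, dst: str, size: int = 8) -> list[Packet]:
    """Split a message into fixed-size chunks, each labeled with a sequence number."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [Packet(src, dst, seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(packets: list[Packet]) -> tuple[bytes | None, list[int]]:
    """Return (message, missing); message stays None while any packet is missing."""
    if not packets:
        return None, []
    by_seq = {p.seq: p.payload for p in packets}   # duplicates collapse here
    total = packets[0].total
    missing = [seq for seq in range(total) if seq not in by_seq]
    if missing:
        return None, missing
    return b"".join(by_seq[seq] for seq in range(total)), []


message = b"Computers do not send enormous files all in one go."
packets = packetize(message, src="computer-a", dst="computer-b")
random.Random(7).shuffle(packets)       # the network may deliver out of order

# Everything arrived: computer B can rebuild the file and tell computer A so.
rebuilt, missing = reassemble(packets)

# One packet never arrived: computer B knows exactly which one to ask for.
lost_one, still_missing = reassemble([p for p in packets if p.seq != 2])
```

Because every packet carries its own sequence number, the arrival order does not matter, and a gap in the numbering tells the receiver which piece is missing.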
(04:09):
IP, by the way, defines how to address and
route each packet to make sure it reaches the right destination.
It's technically on a layer lower than the TCP protocol. Now,
way back in the early nineteen seventies, you had a
team working on a project called ARPANET. This is
(04:30):
going to go back to DARPA, which I covered in
a series of episodes recently, so this kind of ties
in with that more than a little bit. ARPANET
was an early computer network, and in a way it
was a precursor to the Internet. Remember, the Internet is
a network of networks. ARPANET was a network, period,
(04:52):
and while the team was working on this, they realized
that the protocols they had been using for ARPANET
were functional but not scalable. If you were to go
beyond just one network, if you wanted to connect two
networks together, you really needed a different solution. And as
this network would get bigger, the situation would become untenable
(05:13):
and so some of the team got to work designing
new sets of rules for networked communication that could keep
things running smoothly even as the network would get bigger
and bigger, something that was truly scalable. One of the
people working on this was a guy named Robert Kahn,
one of the fathers of the Internet. You often hear
about him and his buddy Vint Cerf, who together would
(05:36):
create TCP. So Robert Kahn comes on over to DARPA
and he's part of the IPTO department.
That's the department that's in charge of creating networks and
that sort of thing. He specifically wanted to replace an
earlier set of rules, an earlier protocol called the Network
Control Program, or NCP. And the reason for
(05:58):
that gets a little technical, but I figure we can
go a bit further than just "it doesn't scale well,"
because that doesn't really tell you much. So for just
a second, let's talk about what Kahn was envisioning back
in the early nineteen seventies. He wanted a protocol that
was going to do certain things. He felt that
computer scientists had connected these distant computers together using the
(06:19):
telephone system, and that created the first wide area network, but
that was not going to be sustainable on a broader scale.
He knew that the key component of the network technology
at ARPANET was the Interface Message Processor, or
IMP, and an IMP is kind of like
a router. It was a packet switching node that would
(06:41):
serve as a connection between different computers on ARPANET.
In addition to IMPs, the team on ARPANET was
working on a host-to-host protocol, which would become
the Network Control Program, and developers began to create applications
to run on those networks, like email. And Kahn would
even demonstrate a twenty-node network in nineteen seventy two,
(07:05):
so a network consisting of twenty computers. Kahn was also
working on technology for a packet radio network that would
actually use radio waves to send data back and forth
across different computers, and that was going to use packet
switching specifically because radio is tricky stuff. If a signal
(07:26):
were lost or jammed, then the information that was being
sent across the network would be lost and you'd end
up with miscommunications and failures, so you had to have
a way to deal with this. Originally, Kahn had intended
to develop a protocol specifically for radio packet networks to
have the sort of error correction mechanism in there, a
(07:47):
way of guaranteeing that the information from one system would
get to another, and then this network would be able
to interface with other networks like ARPANET, using the already
established NCP as a transport layer. But there was
a big problem. NCP could only address networks
(08:08):
and machines down to the IMP level. NCP would
rely upon ARPANET itself for end-to-end reliability,
so it worked just fine if you were in
ARPANET, if your machine was directly connected into that network.
But if you wanted to interconnect the ARPANET network with
another network, something had to change, because NCP
(08:32):
could not handle identifying and responding to errors and delivering
information to the computers that were outside of ARPANET.
NCP just didn't have that capability. ARPANET handled everything
within the network, but it didn't have anything designed to
handle stuff from outside that network. So at first Kahn
had planned to only work on the packet radio networks
(08:54):
and just concentrate on that, but ultimately his quest to
create a protocol that could ensure messages were arriving
at their destinations across different networks, expanded beyond just the
packet radio application. So Kahn wanted an open network architecture,
something that would allow any sort of networked system to
(09:17):
interconnect with another and still have rules in place to
ensure that the data was getting to where it needed
to go. So Robert Kahn and vent Surf were two
computer scientists who were working on this. They were the
authors of these protocols. Vent Surf had been one of
the people to create in CP well fun fact, by
(09:37):
the way, TCP did not originally stand for Transmission Control
Protocol back when Kahn and Cerf first proposed it. Instead,
it stood for Transmission Control Program, and that's a subtle difference,
to be sure. They wrote the first version of TCP
in nineteen seventy-three, and they published a fully documented and
(09:58):
revised version in nineteen seventy-four under RFC 675. It was specifically
titled Specification of Internet Transmission Control Program. Not long after
the initial creation, other folks began to realize that it
might be a better idea to break out the functions
of TCP into two sets of protocols, and
(10:19):
that's where we get TCP/IP. Because, again,
before it was all lumped together, and then they figured
this would make more sense if we separated them out
into two sets of rules. The creation of
TCP/IP predates the Open Systems Interconnection, or
OSI, layer model. And I've talked about the
(10:39):
OSI model in a past episode of TechStuff. But
the OSI model describes how different parts of a
telecommunications or computer system communicate with one another. They have
layers to describe the different functions. But
TCP/IP layers are pretty similar to OSI layers,
so we can talk about the two as
(11:00):
being at least somewhat analogous. It's an abstract idea that's
meant to describe how each layer fits within a grand scheme.
So layers that are near the bottom of the stack
support all the layers that are on top of them.
Layers at the top do not necessarily support any other layers.
They rely on the ones below them, but they don't
(11:22):
support any layers above them. And again, this is
an abstraction. There are not actual literal layers in these systems,
but within this framework, you could say TCP would be
on layer four. That would be the transport layer. The
Internet protocol is one layer further down. It's on layer three,
(11:42):
meaning it is a little closer to the basic hardware
layer of the system. The lowest layer is the hardware,
and that means that the IP protocol supports
the TCP protocol above it, and above TCP are the
application layers, where you have stuff like File Transfer Protocol,
(12:04):
email, and HTTP. Those are all on
top of it. I've got more to say about what
TCP is and what it does in just a moment,
but first let's take a quick break to thank our sponsor.
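The layering idea from before the break can be sketched in a few lines of Python. This is a toy model under loud assumptions: the header strings (`TRANSPORT seq=0`, `INTERNET dst=computer-b`) and the helper names are invented for illustration and look nothing like real TCP or IP headers. The only point is that each layer wraps what the layer above hands it, and unwraps only its own header on the way back up.

```python
from __future__ import annotations


def encapsulate(header: str, payload: bytes) -> bytes:
    """Prepend a one-line, made-up header to whatever the layer above handed down."""
    return header.encode("ascii") + b"\n" + payload


def decapsulate(data: bytes) -> tuple[str, bytes]:
    """Strip this layer's header and pass the rest up to the layer above."""
    header, _, payload = data.partition(b"\n")
    return header.decode("ascii"), payload


# Going down the stack on the sending side.
app_data = b"GET /index.html"                               # application layer (HTTP-like)
segment = encapsulate("TRANSPORT seq=0", app_data)          # layer four adds its header
datagram = encapsulate("INTERNET dst=computer-b", segment)  # layer three wraps the segment

# Going back up on the receiving side, each layer removes only its own header.
internet_header, inner = decapsulate(datagram)
transport_header, delivered = decapsulate(inner)
```

The application data comes out the far end untouched, which is why the layers above never need to know how the layers below did their job.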
So when laying out the rules for TCP, Bob Kahn
(12:27):
had a few requirements and this is from the Internet
Society's page on the History of the Internet, and they
were: each distinct network would have to stand on its
own and no internal changes should be required to any
such network to connect it to the Internet. Next, communications
would be on a best effort basis, so if a
(12:50):
packet did not make it to the final destination, it
would shortly be retransmitted from the source. So this is
the error correction part. If a packet on its way to
computer B never makes it, then computer A will retransmit
that same packet. Black boxes would be used to connect
(13:10):
these networks. These would later be called gateways and routers,
and there would be no information retained by the gateways
about the individual flows of packets passing through them, thereby
just keeping them very simple and avoiding complicated adaptation and
recovery from various failure modes. So they were really just
a means of controlling traffic flow, but not monitoring traffic flow,
(13:35):
and there would be no global control at the operations level.
Those were his requirements. Those would develop into more granular
requirements as the work would continue on the protocols. And
Vint Cerf gave a really, really good explanation about how
TCP works in a short video, and he used a
postcard analogy, and I highly recommend checking it out because
(13:59):
he just puts it very simply. I'm gonna kind of
paraphrase what he said here. He compared TCP to sending
a book to a friend, and you're using the postal service,
except your postal service is very peculiar. They will not
carry anything other than postcards. So you cannot actually send
the physical book as is to your friend because the
(14:20):
post office is not gonna carry that. So what you
have to do is cut your book up so that
you can fit maybe about half a page on a postcard,
and then you can send that postcard through the mail,
and then you have to send all of the book
in a series of postcards to your friend. But then
you realize, hey, wait, because of the way I have
(14:42):
to cut up this book, sometimes there's no indication there
about a page number, so there's no way of knowing
just on the page where this page fits in relation
to the rest of the book. So then you number
every single postcard, and that way your friend knows what
order they go in, they know the sequence. Now, there's
(15:02):
no guarantee that any one postcard will actually make it
all the way through. There's also no guarantee your friend
will receive the postcards in the same order that you
sent them. But by numbering, your friend will know which
postcards they have received. So if they get postcard number
eighty-three but they didn't get postcard number eighty-two,
they can send you a message alerting you that they
(15:24):
are missing one, and you can retransmit, or resend.
So your friend can send a postcard back to you that
essentially says, hey, I got all the postcards up to
number eighty-two or whatever, but that's it. And this
would let you know that you need to resend those postcards,
which means you have to keep a copy of the
postcards you've submitted. You can't just send your only copy
because you'd be up the creek if your buddy says, hey,
(15:48):
I didn't get that, and if nothing comes back to you,
if your friend never says, oh, I received everything, then
you would have to start resending postcards until you
finally got a message that says, hey, I totally got
all the postcards, thank you, I'm going to read the
book now, or whatever it might be. But that's how
TCP works. But instead of it being you know, physical postcards,
(16:08):
we're talking digital information. There's never a guarantee that the
information you send is actually going to get to your
destination or that it will all arrive sequentially. But these safeguards
mean that your computer will know when to send stuff
again to guarantee transmission. The United States Department of Defense
adopted TCP/IP as a standard in nineteen eighty.
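Going back to the postcard analogy for a moment, here is a rough Python sketch of the "keep a copy, number everything, resend until acknowledged" idea. It is a simplification under stated assumptions: the function name `send_book` and the `dropped_first_time` parameter are hypothetical, the friend's reply is assumed never to get lost, and each listed postcard is lost exactly once so the run is predictable. Real TCP uses timers, windows, and byte-based sequence numbers that this leaves out.

```python
from __future__ import annotations


def send_book(pages: list[str], dropped_first_time: set[int]) -> tuple[list[str], int]:
    """Send numbered postcards, resending until the friend has every one."""
    kept_copies = dict(enumerate(pages))   # the sender keeps a copy of each postcard
    received: dict[int, str] = {}          # what the friend holds, keyed by number
    already_dropped: set[int] = set()
    next_needed = 0                        # friend's reply: "I have everything before this"
    postcards_sent = 0

    while next_needed < len(pages):
        # Resend everything from the first postcard the friend reports missing.
        for number in range(next_needed, len(pages)):
            postcards_sent += 1
            if number in dropped_first_time and number not in already_dropped:
                already_dropped.add(number)    # lost in the mail this one time
                continue
            received[number] = kept_copies[number]
        # The friend replies with the lowest-numbered postcard still missing.
        while next_needed in received:
            next_needed += 1

    book = [received[number] for number in range(len(pages))]
    return book, postcards_sent


pages = ["page one", "page two", "page three", "page four", "page five", "page six"]
book, postcards_sent = send_book(pages, dropped_first_time={2, 4})
# First round: six postcards go out, numbers 2 and 4 are lost.
# Second round: the friend asked for 2 onward, so four more go out. Total: ten.
```

The sender ends up mailing more postcards than there are pages, which is the price of guaranteeing that the whole book arrives.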
(16:31):
DARPA was able to change over in advance of everyone else,
which allowed for partitioning of the military networks from non
military networks, and that would carry forward, so you have
MILNET, which is its own separate network that's based on
essentially the same architecture as the general Internet.
TCP/IP would become the official transport layer for
(16:53):
ARPANET on January first, nineteen eighty-three. This was called
a flag day. Now, that is an event that involves
incorporating a critical change in a very large system
in a simultaneous way, like it has to change throughout
the system at the same time, and that is really
tricky to do. The bigger the system, obviously, the harder
(17:14):
it is for you to make a global change all
at the same time. The transition had been planned out
for years in advance because this would require network administrators
to change over to the TCP/IP
protocol all at the same time, and surprisingly it went
off without any really major problems, So that's pretty cool.
By the mid nineteen eighties, the Internet was an established thing,
(17:38):
though really only a relatively small number of people were
aware of it. If you worked at DARPA, or if
you were at a university with a really good computer
science curriculum, or maybe you worked in a research facility,
or maybe you were in the military, then you might
know about it. A few other government offices also were
on the early Internet, but apart from that and a
(18:02):
few major businesses, it was largely a thing of mystery.
The general public was pretty much ignorant of the Internet
for almost a decade. It wouldn't be until the emergence
of the World Wide Web that more people would become
aware of the Internet, and in fact, at that point,
the World Wide Web and the Internet would often be confused
as meaning the same thing for a lot of people.
(18:24):
A lot of people would refer to the World Wide Web
as the Internet, not realizing that really the World Wide
Web is one application built on top of the Internet,
it is not itself the Internet. In the early nineties,
speaking of the Web, a guy named Tim Berners-Lee,
who was working for a little scientific research organization called CERN,
(18:46):
had a bright idea. And his idea was for an
application protocol on top of the Internet that would facilitate
communications between client computers and server computers, including file transfers
and the ability for a server to refer a
client to a different server. And that would be the
foundation for the World Wide Web. And just in case you
(19:07):
didn't pick up on my stupid joke. CERN is not
a little scientific research organization. It's the European Organization for
Nuclear Research and it is a huge, huge deal. Among
the many things it does is oversee the Large Hadron Collider,
so big, big organization. Ultimately, the purpose of
(19:30):
HTTP, which was created by Tim Berners-Lee, was
to create a means of linking different documents together through
what is called hypertext. And you've seen these. These are
those highlighted words in web pages, and when you click
on one, you go to a different web page. And
that's the whole point is clicking on hypertext sends a
command to navigate to a new page. And because the
(19:52):
rules for HTTP allow for one server
to refer a client to another server, those two web
pages don't have to, quote unquote, live on the same
server together. So we're talking about the very basic foundation
of how the World Wide Web works with the interlinking documents
that allow you to hop from one page or one
(20:13):
site to another. The features of HTTP version zero point nine,
which was the first one released to the public, included
the following: a client-server, request-response protocol; an ASCII protocol
running over a TCP/IP link. It
was designed to transfer hypertext documents, or
(20:36):
HTML, and the connection between server and client is closed
after every request. And that's it. It was bare bones stuff,
but this was the beginning of something truly transformational. In fact,
I could honestly say I would not have the career
I have without this invention. So from there, the
(20:58):
HTTP standard evolved pretty quickly. Tim Berners-Lee had
set the stage, and then a team at the National
Center for Supercomputing Applications, or NCSA, made
the first popular web browser, called Mosaic. One of the
programmers on that team was a guy named Marc Andreessen,
who went on to co-found the Mosaic Communications Corporation and
eventually publish a new browser called Netscape. Now, at that
same time, the Internet Engineering Task Force was organizing a
team called the HTTP Working Group dedicated to
improving this HTTP protocol, and it was quickly developing in
several different directions. And by that I mean a lot
(21:42):
of different people had started by taking the version zero
point nine HTTP and then tweaking it
independently of each other. So it's evolving in different directions simultaneously.
So while there's a shorthand that refers to
HTTP one point oh, there is not an actual
(22:04):
standard one point oh. There were many, quote unquote, flavors
of one point oh, because there were so many different
variations on that. The IETF working group
would publish a standard for HTTP version one point
one under RFC 2068, if you want
(22:25):
to read it. It's a little technical, but this version
would be tweaked and updated before it was officially released.
Version two point oh of HTTP
wouldn't come out until two thousand fifteen. That is a
long time between versions: one point one came out in the late nineteen nineties and
(22:47):
two point oh in two thousand fifteen, and only about
a third of all websites in the world today support
version two point oh as the standard. Most websites are
still using one point oh or one point one, So
it may come as something of a surprise to hear
that HTTP three point oh is right
(23:09):
around the corner when not even a majority of sites
are on the most recent version of two point oh,
and perhaps an even bigger surprise is that, unlike the earlier versions,
this HTTP protocol will not rely upon
TCP. I'll explain more in just a second,
but first let's take another quick break to thank our sponsor. Okay,
(23:39):
So why would HTTP three point oh
ditch TCP, which has been a part of the framework
of the Internet since the very beginning, since before there
was an Internet. Well, it mostly comes down to two
big things, speed and efficiency. So when Robert Kahn and
(24:00):
Vint Cerf were working on TCP, they were building
out a protocol to handle any sort of application that
would be built on top of what was to become
the Internet. So it has very much a
one-size-fits-all kind of approach to that. It
provided a useful, or really, I mean, at this point I
should just say necessary, set of features to facilitate communication.
(24:25):
But some of those are excessive or not as pertinent
to the types of traffic that happen over
HTTP, or they impede some of the functions that
HTTP handles. For example, in an effort to establish
a connection between a client and a server, TCP requires
(24:47):
a number of back and forth messages, essentially the client saying, hey
over there. Server says, yeah, what is it? Client says, I
want to talk to you. The server says, all right, hang
on a second. And the client says, now a good time?
And the server says, yeah, yeah, let's go ahead and do that.
It's far more technical than that, but there's this series
that goes back and forth in order for a communication
(25:09):
channel to be established between client and server. That gets
even more complicated if you want to have an encrypted
connection over Secure Sockets Layer, or SSL, using
a website. So you know the little lock that you
see in the address bar when you visit a secure website,
that's part of SSL. Well, to establish that kind
(25:31):
of connection between your computer or your computer's browser, which
is the client, and the server which houses the website
you're visiting, it requires even more round trips between
the two to first establish the connectivity and then establish
the encrypted communications. So the process is good for making
sure that there is an actual route for data to follow,
(25:54):
but it's not the most straightforward approach if you want
to use HTTP, particularly if you want to use encrypted connections.
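To put rough numbers on those round trips, here is a small back-of-the-envelope calculation in Python. The figures are assumptions for illustration, not measurements: a fifty millisecond round trip, one round trip for the TCP handshake before the client can send a request, and two more for a full handshake in older versions of the encryption layer. The function name is made up for this sketch.

```python
def ms_until_response(rtt_ms: int, setup_round_trips: int) -> int:
    """Setup round trips, plus one more for the request and its response."""
    return (setup_round_trips + 1) * rtt_ms


RTT_MS = 50          # assumed round-trip time between client and server
TCP_HANDSHAKE = 1    # client asks, server answers, then the client may send data
TLS_HANDSHAKE = 2    # assumed: a full handshake in older versions of the encryption layer

plain = ms_until_response(RTT_MS, TCP_HANDSHAKE)                    # 100 ms
secure = ms_until_response(RTT_MS, TCP_HANDSHAKE + TLS_HANDSHAKE)   # 200 ms
```

Under these assumptions, half of the wait for a secure page is spent on setup before the first byte of the actual request goes out, and the cost grows with distance because every round trip pays the full latency again.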
There is another protocol, however, called User Datagram Protocol,
or UDP, and that can serve as the
foundation for a new transport layer for
(26:16):
HTTP three point oh. UDP has a big
advantage over TCP. It is incredibly simple and it is
incredibly fast. It is a transport layer protocol just like TCP,
but unlike TCP, UDP does not have the same
features to ensure communications are established or successful, so that
(26:38):
could be a big drawback. Right. U DP transmissions are unordered,
so a later message can arrive ahead of an earlier one,
and that can be very confusing if you haven't built
in a way of dealing with that. There's also no
means for the receiving computer to know if something has
gone wrong, if packets go missing, like it doesn't know
(26:59):
if it doesn't have all the different pieces, if it's just
over pure UDP. But UDP can
act as a base to build upon.
The protocols don't have to be the
end-all, be-all. That's your starting point. So Google
has taken UDP as its starting point
and built upon it to create an experimental network protocol
(27:21):
called QUIC, for Quick UDP
Internet Connections. The IETF has taken this
experimental protocol and worked on creating a standardized version, which
in some ways has moved away from what Google's initial
design was all about. But the writing is on the
wall for TCP as far as the
(27:43):
HTTP standard is concerned. Moving forward, the transport
layer that was initially published in the nineteen seventies is
going to have to make way for a lighter, more agile,
and less cumbersome standard. In addition to the move away
from TCP, the new version of
HTTP will be more secure. QUIC, as designed by Google,
(28:06):
transports data by encrypting it by default. Encryption is not an
added layer on top of everything. It is the default layer.
Google's build, which is sometimes called HTTP
over QUIC, is supported in the latest versions of Google
Chrome and in the Opera web browser. Right now, only
(28:27):
a few websites actually make use of it, most of
them belong to Google, though Facebook has also been incorporating it.
It's going to be a really long path to travel
to get widespread adoption because right now less than two
percent of all websites support QUIC. Meanwhile, TCP will still
be in use. Just because it's being phased out of
(28:49):
future versions of HTTP does not mean
that this protocol is completely obsolete. As I mentioned earlier
in this episode, the World Wide Web is just one application
on top of the Internet. There are lots of others
that will still make use of that venerable set of rules,
and some websites may never move off of it, since
(29:10):
it requires work to make the transition, and let's be honest,
it's not always the highest priority for some businesses that
maintain websites out there. But it is interesting to me
to see this move away from TCP. I have always
associated TCP as being a truly integral part of the Internet,
and it still will be. It just won't necessarily be
(29:33):
as integral to web browsing as it used to be.
Fascinating stuff to me. I hope you guys enjoyed this episode.
I know it got a little more techie with
the protocols than usual, but I thought this was a
big deal and one that maybe, probably I'm guessing, is
not going to get widely reported outside of tech news circles.
(29:55):
I doubt that you're gonna, you know, turn on the
local news and some anchor is going to say, and in
other news, the World Wide Web is moving away from this
ancient set of rules. I just don't see that making
the news, but it is important anyway. If you guys
have any suggestions for future episodes of TechStuff. Maybe
(30:17):
it's a technology, a company, a person in tech. Maybe
there's someone you would want me to talk with about technology.
You should send me those thoughts. You can email the
show, it is techstuff@howstuffworks.com,
or you can go to our website, that's
techstuffpodcast.com, and you can find other ways to
contact me there. Don't forget to head on over to
(30:39):
our merchandise store, that's at teepublic.com/techstuff.
Everything you purchase there goes to benefit the show.
We greatly appreciate that, and there's some pretty cool things
over there. If you haven't checked it out, go see
if there are any designs that you particularly like. And oh,
remember we've been nominated in the Science and Technology category
(31:00):
of the iHeartRadio Podcast Awards. You can head
on over to the website for the iHeartRadio
Podcast Awards and vote up to five times a day.
You can dedicate all five of those votes to
TechStuff if that is your desire. But whatever you want
to do, you should go check out all of those
different categories see if there are any other shows you
(31:21):
really like. Maybe you can discover some shows you didn't
even know existed. I always love finding new podcasts. This
is a great way of finding some really high quality ones,
and I'll talk to you again really soon. For more
on this and thousands of other topics, visit
(31:41):
howstuffworks.com.