
June 8, 2022 49 mins

The word algorithm pops up a lot in tech discussions, but what the heck is an algorithm? From learning about an Arabic genius to discovering a way to figure out what day of the week falls on any given date, we look at algorithms.



Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:04):
Welcome to TechStuff, a production from iHeartRadio.
Hey there, and welcome to TechStuff. I'm your host,
Jonathan Strickland. I'm an executive producer with iHeartRadio.
And how the tech are you? I'm not doing so
great myself right now. I'm running a low fever. I'm
hoping that I just have like a stomach bug or

(00:24):
something and that it's not a second case of COVID,
because I don't want to be that guy. Uh. And
that's a good reminder that if you are in an
area where the COVID numbers have not really slacked off,
still a good idea to wear a mask, you know,
limit your exposure to other folks. Seriously, this is no fun.
You don't want to go through it if you can
avoid it. So yeah, just you know, I care about

(00:47):
you guys, so take care of yourself. Also, you know
that means that I'm not really able to write a
new episode because I can't read good when I got
a fever. So we're gonna have a little bit of
a rerun. I apologize for that. I wish I could
avoid it, I really do. But I also feel like
this is an episode that is timeless. It's one that

(01:07):
is important. It's called What Is an Algorithm?, which originally
published last year on July twenty-first. And the reason
why I think it's important is that a lot of
people know the word algorithm, but they don't have a
deeper appreciation of what that actually means. We almost just
use it as a placeholder. It might as well be magic, right.
The Facebook algorithm determines what you see and what you

(01:29):
don't see, but that doesn't really explain what an algorithm is.
So I hope that you find this episode interesting and
if you would like to leave a suggestion for an
episode topic for me in the future. There are a
couple different ways of doing it. One is you can
go over to the iHeartRadio podcast app, which
is free. You can just navigate over to the

(01:50):
TechStuff page, and there is a little microphone icon which
allows you to leave a talk back message up to
thirty seconds in length, so you can always leave me
a message if you like. You can indicate if I
could use the audio in an upcoming episode, which would
be a lot of fun. But I'm not gonna just
assume that you're cool with that. You'll have to tell me.
And a lot of people have been doing that.

(02:11):
They've been leaving messages. It's been great. You've all been
so kind and so enthusiastic, and I love it. Keep
them coming. Also, if you don't want to use the app,
I totally get it. You can always leave me a
message on Twitter. The handle for the show is TechStuff HSW. Now let's get to this episode that
originally published July twenty-first, two thousand twenty-one. What

(02:32):
is an algorithm? I frequently speak about algorithms on the show.
I talk about them all the time, about Facebook's algorithm
or YouTube's algorithm. We talk about search algorithms and recommendation
algorithms and tons of others. But I haven't really spent
a ton of time talking about what algorithms actually are.

(02:56):
Like, it's kind of become shorthand for do not look behind the curtain, right? It doesn't really help you if you don't know more about it. I might go
as far as to say it's a set of rules
or instructions to carry out a task, but that's pretty
reductive and it doesn't really help you understand what algorithms
actually do. So today I thought we'd talk a little

(03:18):
bit about algorithms and what they do in general, and
then maybe chat about a few famous examples and some fun,
little quirky stuff. So before I dive into this, we
do need to establish a few things, and arguably the
most important of those things is that I am not
a mathematician. I am not a computer scientist. I'm a

(03:41):
guy who loves tech, and I'm decent at research, and
I do my best to communicate topics in an informative
and, you know, on a good day, entertaining way.
So with that in mind, I fully admit that this
is a subject that is outside of my narrow expertise,
and so my descriptions and explanations will be from a

(04:04):
very high level. And you might even be able to get something like this just by glancing at a pamphlet about an introductory computer science course. So for
all you folks out there who are deep in the
computer sciences, this is going to be like a children's
story book for you, possibly one where you don't like

(04:27):
the illustrations very much. Another thing that I need to
establish is that algorithm itself is a very broad term,
you know, in that sense, it's kind of like the
word program or app or software, in that these words
are categories that can have a wide variety of specific instances.

(04:49):
For example, let's just take a look at software. If
I'm talking about software, I could be talking about a
very simple program that serves as a basic calculator that
can add and subtract and divide and multiply. That could be a piece of software, albeit a very simple one.
Or I might talk about something a little more complicated,

(05:10):
like a word processing program. Or I could talk about
an advanced 3D shooter video game, or an incredibly
complicated computer simulation program, or anything in between. All of
those could fall into the realm of software. Well, algorithms similarly come in a wide variety of functions and complexity.

(05:32):
There are some that are elegant in their simplicity, and
there are some that when I look at them written
out in what is clearly well-defined terminology, I'm just left wondering what the heck it is I just saw. They might as well be hieroglyphics to me, in other words. But let's start off by talking about

(05:54):
the word algorithm itself. Now, unlike a lot of words
that we use specifically in English, algorithm does not have
its roots in Latin vocabulary. There are Latin words like algor and things like that, but that's not where algorithm comes from. Instead, it's a Latinized

(06:18):
version of an Arabic name. Algorithm comes from Algoritmi, which is the name that Latin scholars gave to a brilliant man named al-Khwarizmi, and my apologies for the butchering of the pronunciation of that name. No disrespect is intended; it's merely American ignorance. Anyway, this particular mathematician, philosopher, and scholar

(06:44):
served as a very important astronomer and librarian in Baghdad
in the ninth century. His work in mathematics was truly foundational,
so much so that we get the word algebra from
a work that he published. He treated algebra as a
distinct branch of mathematics, when before it was just kind

(07:05):
of lumped in with other stuff. And he did a
lot of work in showing how to work various types
of equations and how to balance equations. And on a
personal note, this was something that I used to love
to do in my mathematics courses. I could really see
the beauty in determining if two sides of an equation
truly were equal. It was like an elegant puzzle, and

(07:27):
I really love that. But again, just to be clear,
my love of algebra and trigonometry is pretty much where
math and I parted ways, because when calculus came along,
I was pretty much done. So at that point it was beyond my ken, as they say. Anyway,
back to our Arabic scholar, he highlighted the importance and

(07:50):
effectiveness of using a methodical approach towards the solution of problems,
and that kind of gets at the heart of what
algorithms actually are. The basic definition, which I kind
of alluded to earlier is quote a process or set
of rules to be followed in calculations or other problems

(08:12):
solving operations, especially by a computer. End quote that's from
Oxford Languages. But you know this can apply to all
sorts of stuff. For example, if you've ever seen anyone
who can solve Rubik's Cubes, those puzzles that are in cube form where you've got all the little different colors of squares and your job is to make

(08:33):
each side of the cube the same color. Well, there
are people who can solve Rubik's Cubes in, like, less than a minute. It's
amazing to watch them go. It's so fast and it's
hard to keep up with what they're doing. They are
essentially following algorithms, these processes that will get to a

(08:56):
dependable and replicable outcome based on what you're doing. So,
an algorithm is really just a way for us to
problem solve. Let's say we have a particular outcome that
we want to achieve. The algorithm is the recipe that
we follow to go from whatever our starting condition is, our starting state, in order to get to the outcome

(09:19):
we want, our target state. And it needs to be
well defined. Algorithms have to be well defined because computers
are not really capable of implementing vague instructions. It needs
to have a pretty solid approach so that the computer,
when following the algorithm quote unquote, knows exactly what to
do based upon the current state of the problem. It

(09:41):
also needs to be finite. Computers are not capable of
handling infinite possibilities, so the algorithm has to work within
certain parameters, and those parameters depend upon the nature of
the problem you're trying to solve and the equipment that you have to solve it with. So let's take a problem that computers typically struggle with, and that is

(10:04):
random number generation, or RNG. Now, you would think that creating a random number would be easy. I mean, we can do it just by throwing some dice, for example. Right? You can throw a die, and whatever number comes up, that's your random number. But computers have to
follow specific instructions and execute operations that by definition need

(10:28):
to have predictable and replicable outcomes. So computers find
random number generation really tricky. Typically, the best we can
hope for when it comes down to this sort of
thing is pseudo random numbers. So these are numbers that,
to an extent, appear to be random, right Like, on

(10:48):
casual observation, it looks like the computer is generating random numbers. However, if you were to seriously dig down into the process, you might discover that they are not really random at all. But we just need them to be random enough, right? We need the output to be close enough to seeming random to fulfill the function we need,

(11:10):
even if it's not truly random. For a lot of applications, the difference ends up being, you know, negligible.
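To make that concrete, here's a minimal sketch in Python of one classic pseudo-random scheme, a linear congruential generator. The multiplier and increment below are well-known textbook constants; the takeaway is that the entire "random" sequence is completely determined by the seed.

```python
def lcg(seed, count):
    # Linear congruential generator: each value is a fixed,
    # deterministic function of the one before it.
    x = seed
    for _ in range(count):
        x = (1103515245 * x + 12345) % 2**31
        yield x / 2**31  # scale into [0, 1) so it looks like a random fraction

print(list(lcg(seed=42, count=3)))  # same seed in, same "random" numbers out
```

Run it twice and you get identical output, which is exactly the predictable, replicable behavior described above; it only looks random if you don't know the recipe.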
On a related note, there are some types of mathematical problems that have many degrees of freedom, that is, lots of variables. So a
problem that does have a lot of variables could have
a lot of potential solutions, and so it could be

(11:31):
a challenge for computers to solve. We need algorithms to
tackle these sorts of problems, to go at them in
ways that go beyond deterministic computation, which I guess I
should define. So, deterministic computation refers to a process
in which each successive state of an operation depends completely

(11:53):
on the previous state. So what comes next always depends
upon what came just before. And I'll give you an example.
Let's say I give you a math problem, and I
say you need to add these two numbers together. And
let's say it's like, uh, five and six, and then
from the sum of those numbers you have to subtract four. Well,

(12:14):
you would first add those two numbers together, so five
plus six you got eleven, and then you would subtract
four from that number, and so eleven minus four equals seven. And this process is deterministic because I gave you those specific instructions. Computers do really well with deterministic processes.

(12:34):
These are pretty easy to follow. These sorts of processes
are by their nature replicable. So in other words, if
I ask you to do that same operation, if I
tell you add five and six together and then subtract four,
it doesn't matter how many times you are you do
that that process, right, You're always going to come out
with the same answer. It's always going to come back

(12:56):
no matter what you do, You're gonna always of seven
as your answer, just based on that same input. So
there are a family of algorithms that collectively get the
name Monte Carlo algorithms, and they're named after Monte Carlo,
the area of Monaco that is famous for being the
location of lots of high-end casinos. I have been there.

(13:20):
It's not really my jam, but it is pretty darn flashy.
So the reason these algorithms get the name Monte Carlo
algorithms is that, like gambling, they deal with an element
of randomization. So gambling is called gambling because you're taking
a risk. You're gambling, you're betting that you're gonna beat

(13:43):
the odds in whichever game it is that you're playing, and that you're going to achieve an outcome that's favorable
to you, which you experience as a payout. So, whether
it's throwing dice or playing a card game, or placing
bets at a roulette wheel or playing a slot machine,
you are engaged in an activity in which there is
some element of chance, at least in theory if the

(14:06):
game is honest, and what you're hoping for is that
chance lands in your favor. Similarly, Monte Carlo algorithms often
focus on chance and risk assessments such as what are
the odds that if I do this one thing, I'm
gonna totally regret doing it later? But you know, more

(14:27):
of a computational problem and not a lifestyle choice. Well,
back in the late nineteen forties, scientists at the Los Alamos Scientific Laboratory, including John von Neumann, Stan Ulam, and Nick Metropolis, proposed what would become known as the Metropolis algorithm,
one of the Monte Carlo algorithms, and the purpose of

(14:50):
this algorithm was to approximate solutions to very complicated mathematical
problems where going through a deterministic approach to find the
definitive answer just wasn't feasible. So, in other words, for
certain complicated math problems, figuring out the quote unquote real
answer would really just take too long, perhaps to the

(15:12):
point where you would never get there within your lifetime.
So you need a way to kind of get a
ballpark solution that hopefully is closer to being correct than incorrect.
Doesn't sound like the sort of thing that computers would
be particularly good at doing right off the bat, right?
In fact, the algorithm allows computers to mimic a randomized process,

(15:33):
so it's not truly random, but it's random-ish if
you like, and it uses this to approximate a solution
to whatever the problem is that you're trying to solve
that falls into this category.
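The episode keeps this at a high level, so purely as an illustration of the mechanics, here is a bare-bones Python sketch of the Metropolis acceptance step: propose a random move, then accept it with a probability based on how the target density changes. The target here, a bell curve known only up to a constant, is my own example rather than anything from the episode.

```python
import math
import random

def metropolis(log_density, start, step_size, num_samples):
    # Metropolis algorithm: random-walk proposals, accepted with
    # probability min(1, p(proposal) / p(current)).
    x, samples = start, []
    for _ in range(num_samples):
        proposal = x + random.gauss(0, step_size)
        log_ratio = log_density(proposal) - log_density(x)
        if log_ratio >= 0 or random.random() < math.exp(log_ratio):
            x = proposal  # accept the move; otherwise stay where we are
        samples.append(x)
    return samples

# draw samples from a bell curve whose unnormalized log-density is -x^2 / 2
draws = metropolis(lambda x: -x * x / 2, start=0.0, step_size=1.0, num_samples=10_000)
```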
I once saw a simulation of a randomized process using the Monte Carlo method that I thought was pretty nifty, and it was determining what

(15:55):
is the value of pi. Let's say that you don't know the value of pi and you're just trying to figure it out. And by the way, I mean pi as in p-i, not as in the delicious pie dessert treat. So here's the scenario. Imagine that you've got a table, and the table is like, I don't know, three feet to a side, and positioned over this table

(16:17):
is a robotic arm. And this robotic arm can pick
up and drop large ball bearings onto the table, and
the arm can just move over the entire region of
the table, so it's got a three-foot by three-foot range of motion. And on this table you set down two containers. You've got one that's a

(16:38):
circular container and one that's square. And the square container
measures six inches to a side. The radius of the
circular container is also six inches, so that means it's
got a twelve-inch diameter, right? So those are your containers, and you've got your situation with your robot arm

(16:58):
above this three-foot-square table. So the robotic arm's job is to pick up ball bearings and then, at random, drop each ball bearing somewhere above
the table. So some of those balls are going to
fall into the square container, some of them are going
to fall into the round container, some of them are

(17:19):
not going to fall in either. And if the process
is truly random, meaning there's an equal chance that it
will drop a ball bearing over any given point of
that table, over time, we would expect to see more
ball bearings fall into the round container because it has
a greater area and there's more space for a ball

(17:39):
bearing to fall into. At the end
of repeating this process many, many times, we should be
able to take these two containers and then count the
number of ball bearings in each of them. And if
we divide the number of ball bearings in the circular
container by the number of ball bearings that fell into

(18:00):
the square container, we're gonna get a result that should be really close to pi. And you might think, wait, why would counting the number of ball bearings in one container versus another, and then dividing those counts, give you pi? Well,
if you have a process that is uniformly random across

(18:21):
an area, like our hypothetical three-foot-square table, the probability that a dropped ball bearing will land in one of those two specific containers is proportional to that specific container's area, or cross section. So then we just ask, all right, well,
how do we define area? Well, with the square, it's
really simple, right. We take one side of a square

(18:42):
in this case, it's six inches, and we multiply it
by itself six times six. Because all sides of a
square are equal, we know that the length and the
width are each going to be six inches. We multiply
those together. We get thirty six square inches. That is
the area or cross section of our square container. But
for our circular container, we take the radius, which again

(19:05):
this case, the radius is six inches, and we square it and then we multiply that number by pi. So the area for the circular container is six squared, that is, six times six, times pi. And if we divide the cross section of
the circular container by the cross section of the square container,

(19:25):
the two six-squared elements, you know, cancel each other out, and so all we're left with is pi. So the end result is that
when we count up these ball bearings that have landed
randomly in these containers, the quotient, the division that we get, should be a close approximation of pi.

(19:48):
Now it's not going to be precise. Chances are the
number of ball bearings in the two containers will just
get you close to pi. But it illustrates how a
randomized process can help you get to a particular solution
that might otherwise be difficult to ascertain.
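Here's roughly what that thought experiment looks like as code, a sketch of my own using the episode's setup: a thirty-six-inch table, a circular container with a six-inch radius, a square container six inches to a side, and uniformly random drop points.

```python
import random

in_circle = in_square = 0
for _ in range(1_000_000):
    # drop a ball bearing at a uniformly random spot on a 36 x 36 inch table
    x, y = random.uniform(0, 36), random.uniform(0, 36)
    # circular container: radius 6, centered at (12, 12); the placement is arbitrary
    if (x - 12) ** 2 + (y - 12) ** 2 <= 36:
        in_circle += 1
    # square container: 6 inches to a side, occupying (24, 24) to (30, 30)
    elif 24 <= x <= 30 and 24 <= y <= 30:
        in_square += 1

print(in_circle / in_square)  # settles near 3.14159... as the drops pile up
```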
There are videos on YouTube that show this particular simulation; it's kind of a famous one, so you can actually watch it and get

(20:10):
kind of an understanding. And if what I said to you makes no sense whatsoever, there are other options for you to look into that might help. That was a great example, but what are the actual algorithms that fall into the Monte Carlo method spectrum? What are they used to do? Well, they can be used to simulate things
like fluid dynamics, or cellular structures or the interaction of

(20:35):
disordered materials. So in other words, these are all processes that have a lot of potential variables at play, and
you can think of variables as representing uncertainty, something that
computers typically don't do well with. You know, from a
general stance, computers excel at specific instances rather than potential instances.

(20:57):
So algorithms like the Metropolis algorithm would lead computer scientists to develop methodologies to kind of work around the limitations of computers and leverage computational power in tackling very difficult problems. In the Monte Carlo family
of algorithms, we typically assume that the results we get

(21:19):
are going to have a potential for being wrong, like
there'll be a small percentage of error that we have
to take into account, and so keeping that in mind,
we need to frame results as being again approximations of
an answer, rather than the definitive answer to whatever problem
it is we're trying to solve. For me, this was
one of those things that was very difficult to grasp

(21:40):
for a long time because I was equating it to
math class, where you're supposed to get the answer. Now,
when we come back, we'll talk about a few more
famous algorithms and what they do. But first let's take
a quick break. Now, in our previous section, I talked

(22:02):
about a family of algorithms that trace their history back
to the late nineteen forties, and so does the next one I want
to talk about, which was created by George Dantzig, who
worked for the RAND Corporation. And Dantzig created a process
that would lead to what we call linear programming. So
at its most basic level, linear programming is about taking

(22:25):
some function and maximizing or minimizing that function with regard
to various constraints. But that is super vague, right, I mean,
it's not terribly helpful. It feels like I'm talking in
a circle. But this is one that's really important for, say,
business developers. So let's talk about a slightly more specific
use case. Let's say that you are launching a business

(22:46):
and you're trying to make some, you know, big decisions as you go about doing this. You're going to manufacture and sell a product of some sort, some physical thing. Now,
your whole goal as a business is to make money
from the work you do, and to do that, you
need to figure out what your costs are going to
be and how you can not just offset these costs but

(23:09):
make enough money to actually profit from it, so cover
all your costs plus some extra. So in this case,
what you're trying to do is figure out how you can minimize costs and maximize profits, right? The constraints come
into play as you start to look at specifics, like
maybe you're gonna partner with a fabrication company, and maybe

(23:32):
you've actually got a few choices. You've got different companies
that you could go with. Well, these choices could represent
constraints in your approach. Perhaps one is more expensive than
the others, but it can produce a greater volume of product,
so you'd be able to get more product out to
market in less time. Or maybe you've got one that's
more expensive but it's also less likely to have fabrication errors,

(23:57):
and errors can be both costly and time-consuming to address. So you have to quantify all these variables
and use them as constraints to start figuring out your
min-maxing approach. Here, linear programming simply looks for the
optimum value of a linear expression, and depending on the
problem you're trying to solve, like maximizing profit versus minimizing costs,

(24:20):
you're either looking for the highest value being optimum or
the lowest value being optimum. Like with costs, you want
that to be the lowest, and you have to look
at what constraints are at play with any given expression.
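As a small, made-up illustration of that kind of min-maxing, here's how you might pose such a problem with SciPy's linear programming routine (which today uses a modern solver rather than the original simplex method). Suppose each unit of product A earns forty dollars of profit and takes two hours of fabrication, each unit of B earns thirty dollars and takes one hour, and you have a hundred fabrication hours and can ship at most sixty units in total.

```python
from scipy.optimize import linprog

# linprog minimizes, so negate the per-unit profits to maximize 40a + 30b
c = [-40, -30]
A_ub = [[2, 1],   # fabrication hours: 2a + 1b <= 100
        [1, 1]]   # shipping capacity: a + b <= 60
b_ub = [100, 60]

result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(result.x, -result.fun)  # best production mix and the profit it earns
```

With these invented numbers, the optimum works out to forty units of A and twenty units of B, for twenty-two hundred dollars of profit.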
So Dantzig's approach to linear programming, called the simplex method, was efficient and elegant and far simpler than other

(24:41):
methodologies attempting to kind of do the same thing around
that time. But it also had some limitations, and that's
because the real world isn't as simple as math tends
to be. Like, math tends to be pretty, you know, firm; in the grand scheme of things, the world is a little more wibbly-wobbly. So, a linear programming

(25:02):
solution depends on the relationships between the different elements, like the requirements and the restrictions and constraints that you've set up with this problem. It only works if
all of those relationships are linear, or, in other words,
for a given change in one variable, we see a
given change in other variables. And that is super vague,

(25:23):
so we're going to talk about just two variables; we'll reduce it down to that. So we'll use the classic x and y as our variables. Those can stand in for all sorts of different values. And let's say that we see there's a relationship between x and y, and that if the value of x changes from one to two,

(25:43):
we then see that the value of y goes from three to nine. Well, that doesn't tell us anything on its own, right? We just know that if x is one, y is three; if x is two, y is nine. If we then see that when x is three, y is fifteen, and when x is four, y is

(26:04):
twenty-one, we start to see a linear relationship, because each time x increases by one, y increases by six. The rate of increase is constant; the rate of increase doesn't change. So we've got this linear relationship here. We can contrast this

(26:26):
with exponential relationships, in which we see not a constant increase, but a constant ratio between successive terms in one variable when you change another. So in this case, let's say that we find out that when x is one, y is three; when x is two, y is nine; when

(26:47):
x is three, y is twenty-seven. Now we see that y is not increasing by six each time. We're seeing that the value of y is multiplied by three each time x goes up by one. So we're seeing there's a constant ratio of change in y with regard to x. This gets into an exponential relationship.
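A quick snippet (mine, just re-using the numbers from the example) makes the difference easy to see:

```python
# linear: y climbs by a constant amount (six) each time x goes up by one
linear = [6 * x - 3 for x in range(1, 5)]    # [3, 9, 15, 21]

# exponential: y is multiplied by a constant ratio (three) each time instead
exponential = [3 ** x for x in range(1, 5)]  # [3, 9, 27, 81]

print(linear, exponential)
```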

(27:11):
So when someone describes something as having exponential growth, what this is supposed to
mean is that you should see a rate of growth
that is increasing by some exponent for a given unit of time.
So it's not just that the thing is growing. In fact,
it's not even that the thing is growing quickly. It's

(27:32):
that the thing is growing faster as time goes on,
so that you would say today it's growing, you know,
say twice as fast as it was growing yesterday, and
tomorrow it will grow twice as fast as it was
growing today. And a lot of people use this phrase incorrectly.
We often really just mean yo, that thing is growing

(27:53):
wicked fast, but we don't necessarily mean yo, that thing
is growing wicked fast, and the rate of growth increases
by a constant factor over time. So, you know, it's an important distinction to make. Anyway, that was a
bit of a tangent to say that linear programming works
if the relationships between all the variables are in fact linear.

(28:16):
But if it's not, if the relationship, say, is
exponential or logarithmic or anything like that, linear programming doesn't apply.
And the real world is pretty darn complicated. So linear
programming relies on sort of dumbing down what's really going on in the real world to, again, more
of an approximation, and the solution that you arrive at

(28:37):
might not be the ideal solution, but it might be
good enough, depending upon what it is you're trying to do.
Another limitation of linear programming is that as you add
more variables to a problem, and this is specific to the simplex method, well, it grows more difficult to lay out in a linear fashion. You're

(28:59):
adding more uncertainty, and the methodology has trouble dealing
with additional uncertainty. Perhaps ironically, the complexity of the linear
function would grow exponentially as more variables joined a problem.
So this process would be unsuitable for certain subsets of
computational problems because they just would get so complex that

(29:21):
even the most advanced computers in the world wouldn't be
able to handle them. Later on, other mathematicians would propose
algorithms that could compensate for this particular limitation of the
simplex method. For example, there are polynomial time algorithms. We're
gonna leave that for the time being, because ain't no
way I'm going to get that right. Anyway, I wanted

(29:44):
to talk about those two algorithms, in particular, the Metropolis
algorithm and the simplex method, because they are the foundation
for a lot of work that's been done in computer science.
But let's get some really basic ideas in with the
next one. Let's talk about sorting. Sorting is all about
arranging a list of things in order with regard to

(30:07):
some specific feature of those things. That seems pretty basic, right? I mean, there's a real-world example that I think
most of us have had in the past, which is
that let's say you're looking at an online store and
you're searching for a specific product or specific type of product.
Let's say it's a blender, because I had to do

(30:29):
this recently, right, So you go to some online store
and you're searching for blenders and you get results, Well,
you might actually want to sort those results by a
specific way. Maybe you want to sort it by price,
and you want to go low to high because you
have a specific budget, so you just want to look at,
you know, blenders that are an x price or less.
Maybe you want to sort it by high price too

(30:51):
low because maybe you're a fat cat tycoon type and
you're just thinking, I want whatever is the most expensive.
Or maybe you want to sort the results by customer
review because we know rice and quality don't always go
hand in hand. Maybe you just want to look at
all the blenders in alphabetical order, or maybe you want
to look at them in a reverse alphabetical order. You

(31:12):
do you. Sorting is one of those most basic functions
and heavily studied concepts within computer science, so much so
that a lot of programming language libraries have sorting functions
built into them. But it's still one of those things
that people look at to see if there are better
ways to sort various types of data, particularly as you

(31:36):
get into a larger number of of sortable features um
and it becomes really important. So let's take Google's search
algorithm for example. Now, that name suggests that the algorithm
we're talking about is related to the process of searching
the web for pages that relate to a given search query,
and in fact, Google does have a search algorithm that

(31:58):
does this. But most of the time I hear people
talking about the search algorithm, they actually mean that the
algorithm that Google relies upon to determine what the order
of search results are going to be that a person
sees when they search for, you know, whatever it might be.
This is critically important because multiple studies have shown that

(32:20):
most people never bother to go beyond the first page
of search results, which means that Google really needs to
do its best to make sure that the most relevant
results to any given search query show up toward the
top of this list, because most people are never going
to bother going any further than that. And if those

(32:41):
people decide that Google search results just aren't any good,
they could conceivably use some other search engine. And let
me tell you. Google hates that idea. So Google's goal
is to present to users search results that are most
likely going to meet whatever their needs are based on
their search criteria. And way back in the day, Google

(33:04):
relied pretty much solely on an algorithm called PageRank, and from what I understand, Google still relies on PageRank, but also makes use of several other algorithms to augment PageRank. But the PageRank algorithm would take a
ton of stuff into account in the process of determining

(33:26):
where within a list of search results any given example
should appear. So let's say that I'm searching for blenders
on Google for some reason, and the search algorithm has
gone out and indexed a ton of web pages that
deal with blenders. Let's say that one of those results

(33:48):
is in a blog post that isn't that old, it's
fairly recent, but from a mostly unknown source, and very
few other documents are linking into this blog. Well, chances
are that search result would appear really far down on
the list, probably pages and pages and pages down on

(34:09):
the results. But then let's say that there's another web
page that belongs to an established entity, one that has
a good reputation that's been around for a while, and
there are a ton of other pages that link to
this particular website, well, that one would probably rank fairly
high on the list. So the PageRank algorithm essentially

(34:31):
assigns a score to each search result, and the value
of that score depends upon lots of stuff I mentioned,
as well as tons of other variables, and these variables
can either boost a score higher or knock some points off. And then PageRank would take all
these different search results and order them based upon those scores.

(34:53):
Essentially, I'm really oversimplifying here, but the idea was that the more reliable, and therefore relevant, results would likely be the ones that were the most popular, so all those links that aimed at a page would count for a lot.
If you happen to run the definitive page on blenders

(35:13):
and everyone was linking to you, that would boost your
PageRank score significantly. Now, clearly, just because something is
popular doesn't necessarily mean it's the most relevant or valuable thing,
but it's generally more true than it isn't true. It

(35:33):
might not always be the absolute best result, but it
might be reliably relevant, and that was what was important
to Google and to Google's users. Again, it just needs
to be good enough. So the PageRank algorithm was using a pretty complex sorting method to determine which results should be the most visible.
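The production system is vastly more involved, but the core link-counting idea from the original PageRank paper fits in a few lines. Here's a toy sketch with an invented four-page web; each page's score is repeatedly rebuilt from the scores of the pages that link to it:

```python
# made-up link graph: each page lists the pages it links out to
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
pages = list(links)
rank = {p: 1 / len(pages) for p in pages}
damping = 0.85  # the damping factor suggested in the original paper

for _ in range(50):  # iterate until the scores settle down
    rank = {
        p: (1 - damping) / len(pages)
        + damping * sum(rank[q] / len(links[q]) for q in pages if p in links[q])
        for p in pages
    }

print(sorted(pages, key=rank.get, reverse=True))  # most "linked-up" page first
```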

(35:55):
You can, of course, page through search results. You don't have to go with whichever
one's are on page one, And in theory, if you
keep on scrolling through, going to page after page after
page of search results, then over time you should see
that the results you're getting are not as relevant as
those that appeared earlier on the list, assuming everything is

(36:18):
working properly. Again, this doesn't always happen, but generally it holds true. And yeah, I'm drastically oversimplifying. You could teach an entire computer science course on the PageRank
algorithm and how it works. We're not going to do that.
I would just get it wrong anyway. Speaking of search, however,

(36:40):
let's talk about a much simpler form of search that's
based on a basic algorithm, and this is called binary search.
This is a process that's meant to decrease the amount
of time it takes to search for a specific value
within a sorted array of values. Alright, so let's say
that you've got a really big array or list of

(37:03):
entries, right? And it's massive, and you're searching for a specific target that has a specific value associated with it. A computer could go through this list item by item, comparing each one with your target and looking to see if there's a match. If there's not a match, you

(37:24):
can go to the next item on the list and
do that. But if you're talking about a truly huge list,
this could take ages because if it happens to be
at the bottom of that list, you have to wait
for the computer to go through every single other entry
before it gets there. So a binary search cuts this short.
It takes a value that's in the middle of the array,

(37:48):
so like halfway down the list. Essentially it just pulls
that value and it compares it against your target, the
thing that you're searching for. And if the array is
a hundred thousand entries long, it's essentially looking at entry
number fifty thousand and one. It compares this to your search.
And let's say the target value of your

(38:09):
search query is lower than the entry that appeared at
fifty thousand and one. Well, now the binary search just
tosses everything that's fifty thousand and one or higher on the
list because the values are only going to go up
from there, so it can get rid of half the list.
Right there, You're left with fifty thousand items in this array,

(38:33):
and the algorithm repeats this process. It goes for an
entry smack dab in the middle of the array, compares
it to the target. If the target's value is lower
than the middle entry of this array, then you dump everything that's at a higher number than that. And if

(38:54):
it's the reverse, if the target value
is higher than the middle entry, then you would get
rid of all the previous ones, like one through twenty-five thousand. For example, let's say that, you know, your target value is higher than the middle entry at twenty-five thousand and one. It's gonna dump everything from one to twenty-five thousand and start over again and halve it again.

(39:15):
I'm kind of oversimplifying here, but the idea of binary search is that rather than go through every single entry, it keeps cutting the array in half, comparing the middle entry with the target, and doing it again, which reduces the number of comparisons it takes to find the specific answer.
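Here's that idea as a short Python sketch; on a sorted list of a hundred thousand entries it needs at most about seventeen comparisons instead of up to a hundred thousand:

```python
def binary_search(sorted_list, target):
    low, high = 0, len(sorted_list) - 1
    while low <= high:
        middle = (low + high) // 2
        if sorted_list[middle] == target:
            return middle           # found it; report the position
        elif sorted_list[middle] < target:
            low = middle + 1        # toss the lower half of what's left
        else:
            high = middle - 1       # toss the upper half of what's left
    return -1                       # the target isn't in the list
```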
You might have thought of, like, the experiment

(39:35):
where you're given two pennies on day one, two cents if you prefer. The next day you get four cents, the next day you get eight cents, and it doubles, and pretty quickly you see how that doubling can actually start to amount to what we would consider, you know, quote unquote real money. It's the same sort of thing in reverse.

(39:56):
By halving an array over and over again, you very quickly whittle down the stuff you have to go through, and it speeds up the process of searching for a specific value within a huge data list. Binary search is just one form of search for various
data constructs. There are lots of others. For example, there's

(40:18):
depth-first search and breadth-first search. I won't get too deep into this because it gets really technical, but I can describe what's happening here from a high level. So these search functions are used for data structures that consist of nodes. And this is
easier to understand if we use an analogy, and I

(40:40):
want you to think of a choose-your-own-adventure book. In case you're not familiar with those, these are
books where you would read a page and at the
end of the page, you would actually be given a choice,
like you could choose what the character you're reading about
would do next. So you might get to the end
of the first page of like a mystery book, and

(41:01):
the instruction might say, if you want your protagonist to
go down the hall, turn to page eight. If you
want her to shut a door and walk away, turn
to page twenty three, and you would make your choice,
turn to that page and continue the story that way.
And so the story branches, and you've got these branching pathways. Well, depth-first and breadth-first searches explore nodes in different ways.

(41:27):
A depth-first search follows each branched path all the way down from the start to the end of that one branch before moving on to look at a different branch. So in our example, with the choose-your-own-adventure book, a depth-first search would go to page eight to follow your protagonist down the hall. Then whatever the options

(41:50):
were at the end of that, it would go with the first option, continuing like this until it got to the end of that pathway. Then it would come back up to the next most recent branch, follow that all the way to the end, and so on until it had searched through every possible pathway in this graph, as we would call it.

(42:12):
A breadth-first search instead progresses equally across all possible paths. So breadth-first search would mean that you would go to page eight, and let's say at the end of page eight, it says you can then turn to page seventeen or page fifty-two to make your next choice. But instead of
continuing down that pathway, you back up to your starting

(42:32):
position and then you go down to page twenty-three, you know, the other option, where you would close the door and walk away. You would read that, and maybe at the end of that page it says you can choose to either go to page three or to page forty-six. Well, then you would go back to your first branch. You would go down the list of pages seventeen and fifty-two and three and forty-six; maybe those all

(42:56):
end in different choices. You would then do all of those across the board next. You're searching across the width of the graph, the breadth of it. Both are valid forms of searching, and the method used typically depends upon the type of data being searched and what you're trying to do.
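Here's a small Python sketch of both strategies, treating the book's pages as nodes. The episode only names the first few page numbers, so which pages the later choices lead to is my own invention:

```python
from collections import deque

# page -> pages you can turn to next (a made-up choose-your-own-adventure graph)
book = {1: [8, 23], 8: [17, 52], 23: [3, 46], 17: [], 52: [], 3: [], 46: []}

def depth_first(start):
    stack, order = [start], []
    while stack:
        page = stack.pop()                   # most recently discovered page
        order.append(page)
        stack.extend(reversed(book[page]))   # so the first choice gets explored first
    return order

def breadth_first(start):
    queue, order = deque([start]), []
    while queue:
        page = queue.popleft()               # earliest discovered page
        order.append(page)
        queue.extend(book[page])
    return order

print(depth_first(1))    # [1, 8, 17, 52, 23, 3, 46] -- all the way down one branch first
print(breadth_first(1))  # [1, 8, 23, 17, 52, 3, 46] -- level by level across the branches
```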
We've got a lot more to cover with algorithms, but I feel like I'm gonna have to

(43:19):
give you guys a break. So let's do that really quick,
and I'll just give you a very short example following,
and then we'll pick back up with algorithms again
in a future episode. But first let's take this quick break,
All right. I know I've waxed poetic about algorithms, so

(43:43):
much so that this episode is running longer than I anticipated.
I have a whole extra section to talk about with algorithms,
including things that are related to algorithms but aren't necessarily
specifically algorithms, stuff like hashing and Markov models and things
of that nature. But I want to finish up with

(44:03):
one that I think is kind of fun. And this
is called the Doomsday rule. And to be fair, this is not just a computational algorithm. This is something that you can do in your head if you learn the rule. I'm not saying I could do it; I haven't really committed it to memory yet, but it is possible. And the Doomsday rule sounds pretty ominous, right?

(44:26):
I mean, it's basically a way to figure out what day of the week any given date falls on. So if you were an expert with the Doomsday rule and I asked you, hey, what day of the week does June fifth, twenty twenty-five fall on?
You could use that rule and say, oh, that's gonna
be a Thursday. That also, by the way, happens to

(44:46):
be the day I'll turn fifty. Yikes, man, we're coming
up on that one. All right, so where did this
rule come from? And what the heck is this doomsday
stuff about? Well, the doomsday thing is kind of just
a cheeky name, and there's all these different ways we
could describe that. But during any given year, there are

(45:07):
certain dates that will always share the same day of
the week, which is no big surprise, right? I mean, you might notice that Christmas, that is, December twenty-fifth, falls on the same day of the week as New Year's Day for the following year, which happens, you know, January first. It always is that way. But with the doomsday rule, the dates we care about are called doomsdays. They're things like

(45:31):
April fourth, or four-four; June sixth, that's six-six; August eighth, that's eight-eight; and, you know, so on, like October tenth, that's ten-ten. These are anchor days; they're called doomsdays. They

(45:51):
change with each year, and with leap years the anchor day actually takes an extra leap, but once you know that day, you know it's the same for all of those dates. So for twenty twenty-one, as I record this, this year's doomsday, or anchor day, is a Sunday. So that means
every Doomsday is going to be a Sunday. So I'm

(46:12):
recording this in July, with our next doomsday being on August eighth, so that means August eighth should be a Sunday. And if you look at the calendar, you'll see that yes, August eighth is a Sunday. That means that October tenth should be a Sunday, and that December twelfth should be a Sunday.

(46:35):
So the doomsday rule involves a little bit more than just this, but not much more. And this means that if you learn the doomsday rule and you have a general idea of what the anchor day is for each year, you can calculate what day of the week any given date will fall on, including in past years.
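For the curious, here's a compact Python sketch of the full rule for the Gregorian calendar; this is a standard formulation of Conway's method rather than anything walked through in the episode. The list holds one easy-to-remember date per month that always lands on the year's doomsday (1/3, 2/28, 3/14, 4/4, 5/9, 6/6, 7/11, 8/8, 9/5, 10/10, 11/7, 12/12).

```python
def doomsday_weekday(year, month, day):
    # Conway's doomsday rule: 0 = Sunday, 1 = Monday, ..., 6 = Saturday.
    anchor = (2 + year + year // 4 - year // 100 + year // 400) % 7
    leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    doomsday_dates = [3, 28, 14, 4, 9, 6, 11, 8, 5, 10, 7, 12]
    if leap:
        doomsday_dates[0], doomsday_dates[1] = 4, 29  # January and February shift
    return (anchor + day - doomsday_dates[month - 1]) % 7

print(doomsday_weekday(2021, 8, 8))  # 0, a Sunday, matching the example above
```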

(46:57):
The guy who came up with this was a mathematician named John Conway, and reportedly, if you gave him a date, he could tell you what day of the week it was going to fall on in two seconds or less. Sadly, he passed away last year at the age of eighty-two after he contracted COVID-nineteen. But that doomsday rule

(47:20):
is one that I think is super cool. I've seen
people who can do this where you give them a
date and they're just able to tell you what day
of the week that falls on, and you can go
and check it and sure enough they're right. Well, this
is one process where you can do that. It's not
the only one, by the way, there are other algorithms
that also can help you determine what day of the
week a specific date will fall on. But the doomsday

(47:42):
rule is one that works. And again, I don't have an expertise in this. I haven't practiced it at all, so I can't do it. If you came up to me and said, hey, Jonathan, May first, nineteen, what day of the week was that? I'd be like, I don't know. It's probably a fine day. I don't know, because

(48:04):
I haven't done this yet. But I think it's really cool. It illustrates how, for certain types of problems, you can come up with a mathematical approach to solving those problems, and it's applicable for all problems that fall into that category. And that's really the elegant and cool thing about algorithms. Also, the other cool thing is that people are coming up with new ones all the time,

(48:27):
and they might be trying to refine earlier algorithms to
make them even more efficient and effective, or they might
be trying to come up with all new algorithms to
take advantage of emerging technology stuff like quantum computing. I
hope you enjoyed that episode. Like I said, it originally
published last year, in two thousand twenty-one, and hopefully

(48:48):
tomorrow I'll be back in shape and I'll be able
to do another news episode. We'll have to wait and see.
Like I said, I'm really hoping that this is just
a stomach bug thing that I'm dealing with and not
like round two of COVID. I do have a test
that literally I think just arrived at my door, so
I will check and make sure, because obviously I want

(49:08):
to be responsible and safe for all of my fellow human beings out there. I hope all of you are
well and staying safe and healthy, and that you're having
a great summer so far. Uh you know, I plan
on having a great summer once I feel better. For now,
I'm just having a kind of grouchy summer because, I mean, that's my go-to mood for pretty much

(49:29):
any situation, but particularly when I'm not feeling well. Anyway,
take care and I will talk to you again really soon.
TechStuff is an iHeartRadio production. For more podcasts from iHeartRadio, visit the iHeartRadio app,

(49:50):
Apple Podcasts, or wherever you listen to your favorite shows.
