Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:04):
Welcome to TechStuff, a production from iHeartRadio.
Hey there, and welcome to TechStuff. I'm your host,
Jonathan Strickland. I'm an executive producer with iHeartRadio
with a love of all things tech, and I frequently
speak about algorithms on this show. I talk about them
(00:25):
all the time, whether it's Facebook's algorithm or YouTube's algorithm.
We talk about search algorithms and recommendation algorithms and tons
of others. But I haven't really spent a ton of
time talking about what algorithms actually are. Like it kind
of becomes shorthand for do not look behind the curtain, right,
(00:47):
It doesn't really help you if you don't know more
about it. I might go as far as to say
it's a set of rules or instructions to carry out
a task, but that's pretty reductive and it doesn't really
help you understand what algorithms actually do. So today
I thought we'd talk a little bit about algorithms and
what they do in general, and then maybe chat about
(01:08):
a few famous examples and some fun, little quirky stuff.
So before I dive into this, we do need to
establish a few things, and arguably the most important of
those things is that I am not a mathematician, I
am not a computer scientist. I'm a guy who loves tech,
(01:28):
and I'm decent at research, and I do my best
to communicate topics in an informative and, you know, on
a good day, entertaining way. So with that in mind,
I fully admit that this is a subject that is
outside of my narrow expertise, and so my descriptions and
explanations will be from a very high level. And you
(01:52):
might even be able to get something like this just
by glancing at a pamphlet about a computer science
introductory course. So for all you folks out there
who are deep in the computer sciences, this is going
to be like a children's story book for you, possibly
one where you don't like the illustrations very much. Another
(02:15):
thing that I need to establish is that algorithm itself
is a very broad term, you know, in that sense
it's kind of like the word program or app or software,
in that these words are categories that can have a
wide variety of specific instances. For example, let's just take
(02:36):
a look at software. If I'm talking about software, I
could be talking about a very simple program that serves
as a basic calculator that can add and subtract and
divide and multiply. That could be a piece of software,
albeit a very simple one, or I might talk about
something a little more complicated, like a word processing program.
(02:57):
Or I could talk about an advanced 3D shooter
video game, or an incredibly complicated computer simulation program, or
anything in between. All of those could fall into the
realm of software. Well, algorithms similarly come in a wide
variety of functions and complexity. There are some that are
(03:19):
elegant in their simplicity, and there are some that when
I look at them written out in what is
clearly well defined terminology, I'm just left wondering what
the heck it is I just saw. They might as
well be hieroglyphics to me, in other words. But let's
start off by talking about the word algorithm itself. Now,
(03:44):
unlike a lot of words that we use specifically in English,
algorithm does not have its roots in Latin vocabulary. There
are Latin words that sound similar, but that's
not where algorithm comes from. Instead,
it's a Latinized version of an Arabic name. Algorithm comes
(04:07):
from Algoritmi, which is the name that Latin scholars gave
to a brilliant man named al-Khwarizmi, and my
apologies for butchering the pronunciation of that name.
No disrespect is intended; it's merely American ignorance. Anyway, this
particular mathematician, philosopher, and scholar served as a very important astronomer
(04:31):
and librarian in Baghdad in the ninth century. His work
in mathematics was truly foundational, so much so that we
get the word algebra from a work that he published.
He treated algebra as a distinct branch of mathematics, when
before it was just kind of lumped in with other stuff.
(04:52):
And he did a lot of work in showing how
to solve various types of equations and how to balance equations.
And on a personal note, this was something that I
used to love to do in my mathematics courses. I
could really see the beauty in determining if two sides
of an equation truly were equal. It was like an
elegant puzzle and I really love that. But again, just
(05:15):
to be clear, my love of algebra and trigonometry is
pretty much where math and I parted ways, because when
calculus came along, I was pretty much done, so
at that point it was beyond my
ken, as they say. Anyway, back to our Arabic scholar,
he highlighted the importance and effectiveness of using a methodical
(05:38):
approach towards the solution of problems, and that kind of
gets at the heart of what algorithms actually are. The
basic definition, which I kind of alluded to earlier is
quote a process or set of rules to be followed
in calculations or other problem-solving operations, especially by a
(06:00):
computer, end quote. That's from Oxford Languages. But you know
this can apply to all sorts of stuff. For example,
if you've ever seen anyone who can solve Rubik's cubes,
those puzzles in cube form where you've
got all the different colors of squares and
your job is to make each side of the cube
the same color. Well, there are people who can solve
(06:24):
Rubik's cubes in less than a minute. It's amazing to watch them go.
It's so fast and it's hard to keep up with
what they're doing. They are essentially following algorithms, processes
that will get to a dependable and replicable outcome based
(06:45):
on what you're doing. So, an algorithm is really
just a way for us to problem solve. Let's say
we have a particular outcome that we want to achieve.
The algorithm is the recipe that we follow to go
from whatever our starting condition is, our starting
state, to the outcome we want,
our target state. It needs to be well defined.
(07:08):
Algorithms have to be well defined because computers are not
really capable of implementing vague instructions. It needs to have
a pretty solid approach so that the computer, when following
the algorithm quote unquote, knows exactly what to do based
upon the current state of the problem. It also needs
to be finite. Computers are not capable of handling infinite possibilities,
(07:32):
so the algorithm has to work within certain parameters, and
those parameters depend upon the nature of the problem you're
trying to solve and the equipment that you have to
solve it with. So let's take a problem that computers
typically struggle with, and that is random number generation,
or RNG. Now, you would think that creating
(07:55):
a random number would be easy. I mean, we can
do it just by throwing some dice for example, Right,
you can throw a die, and whatever number
comes up, that's your random number. But computers have to
follow specific instructions and execute operations that by definition need
to have predictable and replicable outcomes. So computers find
(08:17):
random number generation really tricky. Typically, the best we can
hope for when it comes down to this sort of
thing is pseudo random numbers. So these are numbers that,
to an extent, appear to be random, right? Like, on
casual observation, it looks like the computer is generating random numbers. However,
(08:38):
if you were to really seriously dig down into the process,
you might discover that they are not really random at all.
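A classic way computers fake randomness is a linear congruential generator. Here's a minimal sketch; the multiplier and increment constants are a commonly published choice (from Numerical Recipes), not anything from this episode:

```python
def lcg(seed, count):
    """Generate `count` pseudo-random numbers in [0, 1) from a seed.

    The output looks random on casual observation, but it's entirely
    deterministic: the same seed always yields the exact same sequence.
    """
    modulus = 2 ** 32
    multiplier = 1664525     # constants from Numerical Recipes
    increment = 1013904223
    state = seed
    values = []
    for _ in range(count):
        state = (multiplier * state + increment) % modulus
        values.append(state / modulus)
    return values

# Same seed in, same "random" numbers out -- that's the pseudo part.
run_a = lcg(42, 5)
run_b = lcg(42, 5)
```

Run it twice with the same seed and you get identical output, which is exactly why numbers like these are called pseudo-random rather than random.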
But we just need them to be random enough, right.
We need it to be close enough to seeming to
be random for it to fulfill the function we need.
If it's not truly random for a lot of applications,
(08:59):
that ends up being, you know, negligible. On a related note,
there are some types of mathematical problems that have many
degrees of freedom, lots of variables. A
problem that has a lot of variables could have
a lot of potential solutions, and so it could be
a challenge for computers to solve. We need algorithms to
(09:20):
tackle these sorts of problems, to go at them in
ways that go beyond deterministic computation, which I guess I
should define. So, deterministic computation refers to a process
in which each successive state of an operation depends completely
on the previous state. So what comes next always depends
(09:43):
upon what came just before. And I'll give you an example.
Let's say I give you a math problem. I say
you need to add these two numbers together. And let's
say it's five and six, and then from
the sum of those numbers you have to subtract four. Well,
you first add those two numbers together, so five
plus six gets you eleven, and then you would subtract
(10:05):
four from that number, and so eleven minus four equals seven.
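That two-step recipe, sketched as a tiny Python function (my own illustration):

```python
def add_then_subtract(a, b, c):
    """Add a and b, then subtract c from the sum."""
    total = a + b       # step 1: 5 + 6 = 11
    return total - c    # step 2: 11 - 4 = 7

# No matter how many times we run it with the same input,
# the output never changes.
results = [add_then_subtract(5, 6, 4) for _ in range(1000)]
```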
And this process is deterministic, because I gave you those
specific instructions. Computers do really well with deterministic processes.
These are pretty easy to follow. These sorts of processes
are by their nature replicable. So in other words, if
(10:27):
I ask you to do that same operation, if I
tell you to add five and six together and then subtract four,
it doesn't matter how many times you do
that process, right? You're always going to come out
with the same answer. No matter what you do,
you're always going to have seven
as your answer, just based on that same input. So
(10:49):
there are a family of algorithms that collectively get the
name Monte Carlo algorithms, and they're named after Monte Carlo,
the area of Monaco that is famous for being the
location of lots of high end casinos. I have been there.
It's not really my jam, but it is pretty darn flashy.
(11:10):
So the reason these algorithms get the name Monte Carlo
algorithms is that, like gambling, they deal with an element
of randomization. So gambling is called gambling because you're taking
a risk. You're gambling, you're betting that you're gonna beat
the odds in whichever game it is that you're playing,
(11:31):
and that you're going to achieve an outcome that's favorable
to you, which you experience as a payout. So whether
it's throwing dice or playing a card game or placing
bets at a roulette wheel or playing a slot machine.
You are engaged in an activity in which there is
some element of chance, at least in theory if the
(11:51):
game is honest, and what you're hoping for is that
chance lands in your favor. Similarly, Monte Carlo
algorithms often focus on chance and risk assessments, such as:
what are the odds that if I do this one thing,
I'm gonna totally regret doing it later? But, you know,
(12:12):
more as a computational problem and not a
lifestyle choice. Well, back in the late nineteen forties, scientists at the Los
Alamos Scientific Laboratory, including John von Neumann, Stanislaw
Ulam, and Nick Metropolis, proposed what would become known as
the Metropolis algorithm, one of the Monte Carlo algorithms, and
(12:35):
the purpose of this algorithm was to approximate solutions to
very complicated mathematical problems where going through a deterministic approach
to find the definitive answer just wasn't feasible. So, in
other words, for certain complicated math problems, figuring out the
quote unquote real answer would really just take too long,
(12:57):
perhaps to the point where you would never get there
within your lifetime. So you need a way to kind
of get a ballpark solution that hopefully is closer to
being correct than incorrect. Doesn't sound like the sort of
thing that computers would be particularly good at doing right
off the bat. Right. In fact, the algorithm allows computers
to mimic a randomized process, so it's not truly random,
(13:20):
but it's random-ish, if you like. And it uses
this to approximate a solution to whatever the problem is
that you're trying to solve that falls into this category.
I once saw a simulation of a randomized process using
the Monte Carlo method that I thought was pretty nifty,
and it was determining the value of pi.
(13:41):
Let's say that you don't know the value of pi
and you're just trying to figure it out. And
by the way, I mean pi as in the Greek letter,
not as in the delicious pie dessert treat. So here's
the scenario. Imagine that you've got a table and the
table is like, I don't know, three feet to a side,
and positioned over this table is a robotic arm, and
(14:04):
this robotic arm can pick up and drop large ball
bearings onto the table, and the arm can just move
over the entire region of the table, so it's got
a three ft by three ft range of motion. And
on this table you set down two containers. You've
got one that's circular, and one that's square.
(14:26):
The square container measures six inches to a side, and
the radius of the circular container is also six inches,
so that means it's got a twelve inch diameter. Right.
So those are your containers, and you've got your situation
with your robot arm above this three ft square table.
So the robotic arms job is to pick up ball
(14:48):
bearings and then, at random, drop each
ball bearing somewhere above the table. So some of those
balls are going to fall into the square container, some
of them are going to fall into the round container.
Some of them are not going to fall in either.
And if the process is truly random, meaning there's an
(15:09):
equal chance that it will drop a ball bearing over
any given point of that table, over time, we would
expect to see more ball bearings fall into the round
container because it has a greater area and there's more
space for a ball bearing to fall into it.
At the end of repeating this process many many times,
(15:32):
we should be able to take these two containers and
then count the number of ball bearings in each of them.
And if we divide the number of ball bearings in
the circular container by the number of ball bearings that
fell into the square container, we're gonna get a result that
should be really close to pi. And you might think, wait,
(15:54):
why would you take the number of ball bearings in one
container versus another, divide them, and
get pi? Well, if you have a process that
is uniformly random across an area, like our hypothetical three
ft square table, the probability that a dropped ball bearing
will land in one of those two specific containers is
(16:16):
proportional to that specific container's area or cross section. So
then we just ask, all right, well, how do we
define area? With the square, it's really simple, right?
We take one side of the square, in this case
six inches, and we multiply it by itself: six
times six. Because all sides of a square are equal,
we know that the length and the width are each
(16:37):
going to be six inches. We multiply those together. We
get thirty six square inches. That is the area or
cross section of our square container. But for our circular container,
we take the radius, which again in this case is
six inches, and we square it, and then
we multiply that number by pi. So the area for the
(16:59):
circular container is six squared times pi.
And if we divide the cross section of the circular
container by the cross section of the square container, the
two six-squared
elements cancel each other out, and so all we're left
with is pi. So the end result is that when
(17:22):
we count up these ball bearings that have landed randomly
in these containers, the division that we
do should give a close approximation of pi. Now,
it's not going to be precise. Chances are the number
of ball bearings in the two containers will just get
you close to pi. But it illustrates how a randomized
(17:42):
process can help you get to a particular solution that
might otherwise be difficult to ascertain. There are videos on
YouTube that show this particular simulation. It's kind of a
famous one, so you can actually watch and get kind
of an understanding. If what I said to you makes
no sense whatsoever, there are other options for you to
look into that might help, and that is a great example.
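Here's a small sketch of that table experiment in Python. The container positions are my own assumptions (the host doesn't specify where on the table they sit); all that matters is that both fit on the table and don't overlap:

```python
import math
import random

def estimate_pi(drops):
    """Drop `drops` virtual ball bearings uniformly over a 36-by-36 inch
    table holding a square container (6 inches to a side) and a circular
    container (radius 6 inches), then divide circle hits by square hits."""
    random.seed(1)                      # fixed seed for a reproducible run
    square_x, square_y = 4.0, 4.0       # assumed lower-left corner of square
    circle_cx, circle_cy = 24.0, 24.0   # assumed center of circular container
    in_square = in_circle = 0
    for _ in range(drops):
        x = random.uniform(0.0, 36.0)
        y = random.uniform(0.0, 36.0)
        if square_x <= x <= square_x + 6 and square_y <= y <= square_y + 6:
            in_square += 1
        if (x - circle_cx) ** 2 + (y - circle_cy) ** 2 <= 6 ** 2:
            in_circle += 1
    # circle area / square area = (pi * 6^2) / 6^2 = pi
    return in_circle / in_square

estimate = estimate_pi(500_000)
```

With half a million drops the ratio lands near 3.14; changing the seed or the drop count nudges it around a little, which is exactly the approximate, not exact, character of Monte Carlo methods.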
(18:06):
But what are the actual algorithms that fall into the
Monte Carlo methods spectrum? What are they used to do? Well,
they can be used to simulate things like fluid dynamics
or cellular structures or the interaction of disordered materials. So
in other words, these are all processes that have a
lot of potential variables at play, and you can think
(18:28):
of variables as representing uncertainty, something that computers typically don't
do well with. You know, from a general stance, computers
excel at specific instances rather than potential instances. So algorithms
like the Metropolis algorithm would lead computer scientists to develop
methodologies to kind of work around the limitations of computers
(18:52):
and leverage computational power in tackling very difficult problems in
that way. In the Monte Carlo family of algorithms, we
typically assume that the results we get are going to
have a potential for being wrong, like there'll be a
small percentage of error that we have to take into account,
and so keeping that in mind, we need to frame
(19:14):
results as being again approximations of an answer rather than
the definitive answer to whatever problem it is we're trying
to solve. For me, this was one of those things
that was very difficult to grasp for a long time
because I was equating it to math class, where you're
supposed to get the answer. Now, when we come back,
we'll talk about a few more famous algorithms and what
(19:36):
they do. But first let's take a quick break. Now.
In our previous section, I talked about a family of
algorithms that trace their history back to the late nineteen forties, and so
does the next one I want to talk about, which
was created by George Dantzig, who worked for the RAND Corporation,
(20:00):
and Dantzig created a process that would lead to what
we call linear programming. So at its most basic level,
linear programming is about taking some function and maximizing or
minimizing that function with regard to various constraints. But that
is super vague, right, I mean, it's not terribly helpful.
(20:20):
It feels like I'm talking in a circle. But this
is one that's really important for say, business developers. So
let's talk about a slightly more specific use case. Let's
say that you are launching a business and you're trying
to make some you know, big decisions as you're going
to do this, and you're going to manufacture and sell
a product of some sort, some physical thing. Now, your
(20:42):
whole goal as a business is to make money from
the work you do, and to do that, you need
to figure out what your costs are going to be
and how you can not just offset these costs but make
enough money to actually profit from it, so cover all
your costs plus some extra. So in this case, what
(21:03):
you're trying to do is figure out how you can
minimize costs and maximize profits. Right. The constraints come into
play as you start to look at specifics, like maybe
you're gonna partner with a fabrication company, and maybe you've
actually got a few choices. You've got different companies that
you could go with. Well, these choices could represent constraints
(21:23):
in your approach. Perhaps one is more expensive than the others,
but it can produce a greater volume of product, so
you'd be able to get more product out to market
in less time. Or maybe you've got one that's more expensive,
but it's also less likely to have fabrication errors, and
errors can be both costly and time consuming to address,
(21:46):
So you have to quantify all these variables and use
them as constraints to start figuring out your min-maxing approach
here. Linear programming simply looks for the optimum value
of a linear expression, and depending on the problem you're
trying to solve, like maximizing profit versus minimizing costs, you're
either looking for the highest value being optimum or the
(22:09):
lowest value being optimum. Like with costs, you want that
to be the lowest, and you have to look at
what constraints are at play with any given expression. So
Dantzig's approach to linear programming, called the simplex method, was
efficient and elegant and far more simplified than other methodologies
attempting to kind of do the same thing around that time.
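As a toy illustration of the idea, here's a brute-force sketch that finds the optimum of a made-up linear profit function by checking the corner points of the feasible region. The numbers are my own invention, and this is plain vertex enumeration, not the simplex method itself, which walks between corners far more cleverly:

```python
from itertools import combinations

# Maximize profit = 30*x + 20*y, subject to linear constraints,
# each written as a*x + b*y <= c. (All numbers are made up.)
constraints = [
    (1, 1, 100),   # total capacity: x + y <= 100
    (2, 1, 150),   # labor hours: 2x + y <= 150
    (-1, 0, 0),    # x >= 0
    (0, -1, 0),    # y >= 0
]

def feasible(x, y):
    return all(a * x + b * y <= c + 1e-9 for a, b, c in constraints)

def best_corner():
    """Key fact of linear programming: the optimum of a linear objective
    sits at a corner (vertex) of the feasible region. So we intersect
    every pair of constraint boundaries and keep the feasible corner
    with the highest profit."""
    best = (float("-inf"), None)
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue                    # parallel boundaries: no corner
        x = (c1 * b2 - c2 * b1) / det   # Cramer's rule for the 2x2 system
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y):
            best = max(best, (30 * x + 20 * y, (x, y)))
    return best

profit, point = best_corner()
```

In this example the best corner is x = 50, y = 50 for a profit of 2,500, which beats making only the high-margin product because of the labor constraint.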
(22:31):
But it also had some limitations, and that's because the
real world isn't as simple as math tends to be.
Like math tends to be pretty, you know, firm in
the grand scheme of things, and the world is a
little more wibbly wobbly. So a linear programming solution only
works if the various relationships between the different elements like
(22:52):
the requirements and the restrictions and the constraints that you've
set up with this problem. It only works if all
of those relationships are linear. In other words, for a
given change in one variable, we see a given change
in other variables. And that is super vague. So we're
going to talk about two variables; we'll reduce it down
to that. So we'll use the classic x and y
(23:14):
as our variables. Those can stand in for all sorts
of different values. And let's say that we see there's
a relationship between x and y, and that if the
value of x changes from one to two, we then
see that the value of y goes from three to nine. Well,
that doesn't tell us anything on its own, right. We
(23:36):
just know that if x is one, y is three.
If x is two, y is nine. If we then
see that if x is three, then y is fifteen,
and when x is four, y is twenty-one, we
start to see a linear relationship, because each time x
increases by one, y increases by six. The rate of
(23:59):
increase for each variable is constant; the rate of
increase doesn't change. So we've got this
linear relationship here. We can contrast this with exponential relationships,
in which we see not a constant increase but a
constant ratio in successive terms in one variable when you
(24:22):
change another. So in this case, let's say that we
find out that when x is one, y is three;
when x is two, y is nine; when x is three,
y is twenty-seven. Now we see that y is
not increasing by six each time. We're seeing that the
value of y is multiplied by three each time x
(24:46):
goes up by one. So we're seeing there's a constant ratio
of change in y with regard to x. This gets
into an exponential relationship. So when someone describes something as
having exponential growth, what this is supposed to mean is
that you should see a rate of growth that is
(25:06):
increasing by some exponent per given unit of time. So
it's not just that the thing is growing. In fact,
it's not even that the thing is growing quickly. It's
that the thing is growing faster as time goes on,
so that you would say today it's growing, you know, say,
twice as fast as it was growing yesterday, and tomorrow
(25:28):
it will grow twice as fast as it was growing today.
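The linear-versus-exponential distinction from those x and y examples can be sketched in a few lines (my own illustration, reusing the same numbers):

```python
# Linear relationship: y increases by a constant AMOUNT (6) per step of x.
linear = [6 * x - 3 for x in range(1, 5)]        # 3, 9, 15, 21
diffs = [b - a for a, b in zip(linear, linear[1:])]

# Exponential relationship: y increases by a constant RATIO (3) per step.
exponential = [3 ** x for x in range(1, 5)]      # 3, 9, 27, 81
ratios = [b // a for a, b in zip(exponential, exponential[1:])]
```

The successive differences in the linear sequence are all 6, while the successive ratios in the exponential one are all 3, which is the constant ratio between terms that defines exponential growth.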
And a lot of people use this phrase incorrectly. We
often really just mean yo, that thing is growing wicked fast,
but we don't necessarily mean yo, that thing is growing
wicked fast and the rate of growth keeps multiplying by a
constant factor over time. So, you know, it's an
(25:51):
important distinction to make. Anyway, that was a bit of
a tangent to say that linear programming works if the
relationships between all the variables is in fact linear. But
if if it's not, if the relationship says exponential or
logarithmic or anything like that, linear programming doesn't apply. And
the real world is pretty darn complicated. So linear programming
(26:14):
relies on sort of dumbing down what's really
going on in the real world into, again, more of
an approximation, and the solution that you arrive at might
not be the ideal solution, but it might be good enough,
depending upon what it is you're trying to do. Another
limitation of linear programming, and this is specific to the simplex
method, has to do with adding more variables to a problem.
(26:37):
As you add more variables, it grows more difficult
to lay things out in a linear fashion. You're adding
more uncertainty, and the methodology has trouble dealing with additional uncertainty.
Perhaps ironically, the complexity of the linear function would grow
exponentially as more variables join a problem, so the process
(27:01):
would be unsuitable for certain subsets of computational problems, because
they just would get so complex that even the most
advanced computers in the world wouldn't be able to handle them.
Later on, other mathematicians would propose algorithms that could compensate
for this particular limitation of the simplex method. For example,
there are polynomial time algorithms, but we're gonna leave that
(27:23):
for the time being, because ain't no way I'm going
to get that right. Anyway, I wanted to talk about
those two algorithms, in particular, the metropolis algorithm and the
simplex method, because they are the foundation for a lot
of work that's been done in computer science. But let's
get some really basic ideas in with the next one.
(27:45):
Let's talk about sorting. Sorting is all about arranging a
list of things in order with regard to some specific
feature of those things. That seems pretty basic, right? I
there's a real world example that I think most of
us have had in the past, which is that let's
say you're looking at an online store and you're searching
(28:07):
for a specific product or specific type of product. Let's
say it's a blender, because I had to do this recently, right?
So you go to some online store and you're searching
for blenders and you get results. Well, you might actually
want to sort those results a specific way. Maybe
you want to sort by price, and you want
to go low to high because you have a specific budget,
(28:28):
so you just want to look at, you know, blenders
that are X price or less. Maybe you want
to sort by price high to low, because maybe
you're a fat cat tycoon type and you're just thinking,
I want whatever is the most expensive. Or maybe you
want to sort the results by customer review because we
know price and quality don't always go hand in hand.
(28:50):
Maybe you just want to look at all the blenders
in alphabetical order, or maybe you want to look at
them in reverse alphabetical order. You do you. Sorting
is one of the most basic functions and heavily
studied concepts within computer science, so much so that a
lot of programming language libraries have sorting functions built into them.
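Python's built-in `sorted` is one of those library functions. Here's a sketch of the blender-shopping sorts described above; the product data is entirely made up:

```python
# Hypothetical search results from our blender shopping trip.
blenders = [
    {"name": "WhirlMaster", "price": 89.99, "rating": 4.1},
    {"name": "BlendZilla", "price": 249.00, "rating": 4.8},
    {"name": "Aperture",   "price": 39.99, "rating": 3.2},
]

# Price, low to high -- for shoppers on a budget.
by_price = sorted(blenders, key=lambda b: b["price"])

# Price, high to low -- for the fat cat tycoon types.
by_price_desc = sorted(blenders, key=lambda b: b["price"], reverse=True)

# Customer rating, best first -- price and quality don't always match.
by_rating = sorted(blenders, key=lambda b: b["rating"], reverse=True)

# Plain alphabetical order, by product name.
by_name = sorted(blenders, key=lambda b: b["name"])
```

The `key` function tells the library which feature of each item to sort on, so the same data can be ordered any way the shopper wants.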
(29:12):
But it's still one of those things that people look
at to see if there are better ways to sort
various types of data, particularly as you get into a
larger number of sortable features, and it becomes
really important. So let's take Google's search algorithm, for example.
(29:32):
Now that name suggests that the algorithm we're talking about
is related to the process of searching the web for
pages that relate to a given search query, and in fact,
Google does have a search algorithm that does this, But
most of the time I hear people talking about the
search algorithm, they actually mean the algorithm that Google
relies upon to determine the order of search results
(29:56):
that a person sees when they
search for, you know, whatever it might be. This is
critically important because multiple studies have shown that most people
never bother to go beyond the first page of search results,
which means that Google really needs to do its best
to make sure that the most relevant results to any
(30:17):
given search query show up toward the top of this list,
because most people are never going to bother going any
further than that. And if those people decide that Google
search results just aren't any good, they could conceivably use
some other search engine, and let me tell you, Google
hates that idea. So Google's goal is to present to
(30:39):
users search results that are most likely going to meet
whatever their needs are based on their search criteria and
way back in the day, Google relied pretty much solely
on an algorithm called PageRank, and from what
I understand, Google still relies on PageRank, but also
makes use of several other algorithms to augment it.
(31:04):
But the PageRank algorithm would take a ton of
stuff into account in the process of determining where within
a list of search results any given example should appear.
So let's say that I'm searching for blenders on Google
for some reason, and the search algorithm has gone out
(31:26):
and indexed a ton of web pages that deal with blenders.
Let's say that one of those results is in a
blog post that isn't that old, it's fairly recent but
from a mostly unknown source, and very few other documents
are linking into this blog. Well, chances are that search
(31:47):
result would appear really far down on the list, probably
pages and pages and pages down on the results. But
then let's say that there's another web page that belongs
to an established entity, one that has a good reputation,
it's been around for a while, and there are a
ton of other pages that link to this particular website. Well,
(32:10):
that one would probably rank fairly high on the list.
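To make that concrete, here's a toy sketch in the spirit of PageRank, scoring a tiny made-up five-page web by power iteration. The pages, the damping factor of 0.85, and the iteration count are standard textbook illustration choices, not Google's actual setup, which is vastly more complex:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages with a simplified PageRank-style power iteration.

    `links` maps each page to the list of pages it links out to.
    Each round, every page keeps a small base score and receives a
    share of the score of every page that links to it.
    """
    pages = list(links)
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_scores = {page: (1 - damping) / n for page in pages}
        for page, outgoing in links.items():
            share = scores[page] / len(outgoing)
            for target in outgoing:
                new_scores[target] += damping * share
        scores = new_scores
    return scores

# A tiny made-up web: three fan pages and a little-known blog all link
# to an established blender site, which links back out to the fans.
toy_web = {
    "fan1": ["established"],
    "fan2": ["established"],
    "fan3": ["established"],
    "blog": ["established"],
    "established": ["fan1", "fan2", "fan3"],
}
scores = pagerank(toy_web)
```

The heavily linked established site ends up with a far higher score than the blog nobody links to, mirroring the search-result example above.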
So The page rank algorithm essentially assigns a score to
each search result, and the value of that score depends
upon lots of stuff I mentioned, as well as tons
of other variables, and these variables can either boost a
score higher or knock some points off. And
(32:31):
then PageRank would take all these different search results
and order them based upon those scores. I'm really oversimplifying here,
but the idea was that the more reliable and therefore
relevant results would likely be the ones that were the
most popular. So all those links that aimed at a
(32:52):
page would count for a lot. If you happened to
run the definitive page on blenders and everyone was linking to
you, that would boost your PageRank score significantly. Now, clearly,
just because something is popular doesn't necessarily mean it's the
most relevant or valuable thing, but it's generally more true
(33:16):
than it isn't true. It might not always be the
absolute best result, but it might be reliably relevant, and
that was what was important to Google and to Google's users. Again,
it just needs to be good enough. So the PageRank
algorithm was using a pretty complex sorting method to
determine which results should be the most visible. You can
(33:39):
of course page through search results. You don't have to
go with whichever ones are on page one. And in theory,
if you keep on scrolling through, going to page after
page after page of search results, then over time you
should see that the results you're getting are not as
relevant as those that appeared earlier on the list,
(34:02):
assuming everything is working properly. Again, this doesn't always happen,
but generally it holds true. And yeah, I'm
drastically oversimplifying. You could teach an entire computer science course
on the PageRank algorithm and how it works. We're
not going to do that. I would just get it
wrong anyway. Speaking of search, however, let's talk about a
(34:26):
much simpler form of search that's based on a basic algorithm,
and this is called binary search. This is a process
that's meant to decrease the amount of time it takes
to search for a specific value within a sorted array
of values. Alright, So let's say that you've got a
really big array or list of entries, right, and it's massive,
(34:52):
and you're searching for a specific target that has a
specific value associated with it. A computer could go through this list item by item, comparing each entry with your target to see if there's a match. If
there's not a match, it goes to the next item on the list and does the same. But if you're
(35:13):
talking about a truly huge list, this could take ages
because if your target happens to be at the bottom of
that list, you have to wait for the computer to
go through every single other entry before it gets there.
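To make this concrete, here's a toy sketch in code; the list and the numbers are my own made-up example, not from the episode. It shows the item-by-item scan just described, alongside the halving approach described next, with counters showing how many comparisons each one needs:

```python
# Toy sketch: item-by-item (linear) scan vs. binary search on a sorted list.
# Purely illustrative; each function returns (index, number of comparisons).

def linear_search(items, target):
    """Check every entry in order until the target turns up."""
    steps = 0
    for i, value in enumerate(items):
        steps += 1
        if value == target:
            return i, steps
    return -1, steps  # target wasn't in the list at all

def binary_search(items, target):
    """Repeatedly halve a sorted list, tossing the half that can't hold the target."""
    lo, hi = 0, len(items) - 1
    steps = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        steps += 1
        if items[mid] == target:
            return mid, steps
        elif items[mid] < target:
            lo = mid + 1  # target is bigger, so toss the lower half
        else:
            hi = mid - 1  # target is smaller, so toss the upper half
    return -1, steps

items = list(range(100_000))
print(linear_search(items, 99_999))  # (99999, 100000): every entry checked
print(binary_search(items, 99_999))  # (99999, 17): only 17 comparisons
```

On a hundred-thousand-entry list, the worst case drops from a hundred thousand comparisons to about seventeen, which is the halving effect at work.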
So a binary search cuts this short. It takes a
value that's in the middle of the array, so like
(35:34):
halfway down the list. Essentially, it just pulls that value
and it compares it against your target, the thing that
you're searching for. And if the array is a hundred
thousand entries long, it's essentially looking at entry number fifty
thousand and one. It compares this to your search. And
let's say the target value of your search
(35:55):
query is lower than the entry that appeared at fifty
thousand and one. Well, now the binary search just tosses
everything that's fifty thousand and one or higher on the
list because the values are only going to go up
from there, so it can get rid of half the
list right there. You're left with fifty thousand items in
(36:17):
this array, and the algorithm repeats this process. It goes
for an entry smack dab in the middle of the array, compares it to the target. If the target's value is lower than the middle entry of this array, then you dump everything that's at a higher number than that.
(36:39):
If, on the other hand, the target value is higher than the middle entry, then you would get rid of all the previous ones, like one through twenty five thousand. For example, let's say that your target value is higher than the middle entry of twenty five thousand and one. It's gonna dump everything from one to twenty five thousand and start over again and halve it again. Again, I'm kind of oversimplifying here, but
(37:02):
the idea of binary search is that rather than go
through every single entry, it keeps cutting this array in half,
comparing it with the target, and doing it again, which
reduces the number of comparisons it takes to find the specific answer. You might have thought of the experiment where you're given a penny
(37:23):
on day one, and the next day you're given two pennies, or cents if you prefer. The next day you get four cents, the next day you get eight cents, and it keeps doubling, and pretty quickly you
see how that doubling can actually start to amount to
what we would consider, you know, like quote unquote real money,
the same sort of thing in reverse. By halving the array
(37:44):
over and over again, you very quickly whittle down the
stuff you have to go through and it speeds up
the process of searching for a specific value within a
huge data list. Binary search is just one form of search for various data constructs. There are lots
of others. For example, there's depth first search and breadth
(38:06):
first search. I won't get too deep into these because it gets really technical, but I can describe what's happening here from a high level. So these search functions are used for data structures that consist of nodes. And this is easier to understand if we
use an analogy. And I want you to think of
(38:26):
a Choose Your Own Adventure book, in case you're not familiar with those. These are books where you would read
a page and at the end of the page you
would actually be given a choice, like you could choose
what the character you're reading about would do next. So
you might get to the end of the first page
of like a mystery book, and the instruction might say,
(38:48):
if you want your protagonist to go down the hall,
turn to page eight. If you want her to shut
a door and walk away, turn to page twenty three,
and you would make your choice, turn to that page
and continue the story that way. And so the story branches,
and you've got these branching pathways. Well. Depth first and
breadth first searches explore nodes in different ways. A depth
(39:12):
first search follows each branched path all the way down
from start to the end of that one branch, before
moving on to look at a different branch. So, in our example with the Choose Your Own Adventure book, a depth first search would go to page eight to follow
your protagonist down the hall. Then whatever the options were
(39:36):
at the end of that, it would go with the first option, continuing like this until you get to the end of that pathway. Then it would come back up to the next most recent branch, follow that all the way to the end, and so on until it had searched through every possible pathway in this graph, as we would call it. A breadth first search, instead, progresses equally across all possible paths.
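As a rough sketch of the two traversals, here's a tiny made-up graph of pages in the spirit of the book analogy; the page numbers are just for illustration:

```python
# Depth-first vs. breadth-first traversal over a made-up "page" graph,
# echoing the Choose Your Own Adventure analogy. Purely illustrative.
from collections import deque

# Each page maps to the pages you can turn to next (a tree, so no
# visited-set bookkeeping is needed here).
PAGES = {
    1: [8, 23],
    8: [17, 52],
    23: [3, 46],
    17: [], 52: [], 3: [], 46: [],
}

def depth_first(start):
    """Follow one branch all the way down before backing up (stack-based)."""
    order, stack = [], [start]
    while stack:
        page = stack.pop()
        order.append(page)
        # Push choices in reverse so the first option is explored first.
        stack.extend(reversed(PAGES[page]))
    return order

def breadth_first(start):
    """Visit all pages one choice away, then two away, and so on (queue-based)."""
    order, queue = [], deque([start])
    while queue:
        page = queue.popleft()
        order.append(page)
        queue.extend(PAGES[page])
    return order

print(depth_first(1))    # [1, 8, 17, 52, 23, 3, 46]
print(breadth_first(1))  # [1, 8, 23, 17, 52, 3, 46]
```

Notice the depth-first version reads pages 1, 8, 17, and 52 before ever touching page 23, while the breadth-first version reads both of page one's choices before going any deeper.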
(40:03):
So breadth first search would mean that you would go to
page eight, and let's say at the end of page eight,
it says you can then turn to page seventeen or
page fifty two to make your next choice. But instead
of continuing down that pathway, you back up to your
starting position and then you go down to page twenty three.
You know, the other option where you would close the
door and walk away. You would read that maybe at
(40:25):
the end of that page it says you can choose
to either go to page three or to page forty six. Well,
then you would go back to your first branch. You would go down the list of pages seventeen and fifty two and three and forty six, and maybe those all end in different choices. You would then do all of those across the board next, so you're searching across
(40:46):
the width of the graph, the breadth of it. Both
are valid forms of searching, and the method used typically
depends upon the type of data being searched and what
you're trying to do. We've got a lot more to
cover with algorithms, but I feel like I'm gonna have
to give you guys a break, so let's do that
(41:07):
really quick, and I'll just give you a very short
example following, and then we'll pick back up with
algorithms again in a future episode. But first let's take
this quick break. All right, I know I've waxed poetic
about algorithms so much so that this episode is running
(41:30):
longer than I anticipated. I have a whole extra section
to talk about with algorithms, including things that are related
to algorithms but aren't necessarily specifically algorithms, stuff like hashing
and Markov models and things of that nature. But I
want to finish up with one that I think is
kind of fun. And this is called the Doomsday rule.
(41:53):
And to be fair, this is not just a computational algorithm. This is something that you can do in your head if you learn the rule. I'm not saying I could do it. I haven't really committed it to memory yet, but it is possible. And the Doomsday rule sounds pretty ominous. However, it's basically a way to figure out what day of the week any
(42:15):
given date falls on. So if you were an expert with
the doomsday rule and I asked you, hey, what day
of the week does June five fall on? You could
use that rule and say, oh, that's gonna be a Thursday.
That also, by the way, happens to be the day
I'll turn fifty. Yikes, man, we're coming up on that one.
(42:36):
All right, So where did this rule come from? And
what the heck is this doomsday stuff about? Well, the
doomsday thing is kind of just a cheeky name, and
there's all these different ways we could describe that. But
during any given year, there are certain dates that will
always share the same day of the week, which is
no big surprise, right. I mean, you might notice that Christmas,
(43:00):
that is December twenty five, falls on the same day of the week as New Year's Day for the following year, that is, January one. It always is that way.
But with the doomsday rule, the dates that do this are called doomsdays. They're things like April four, that's four four, June six, that's six six, August eight, that's eight eight,
(43:24):
and, you know, so on, like October tenth, that's ten ten. The day of the week these doomsdays fall on is called the anchor day. It changes with each year, and leap years shift it as well. But once you know that day, you know it's the same
(43:45):
for all of those dates. So for twenty twenty one, as I record this, this year's doomsday, or anchor day, is a Sunday, so that means every doomsday is going to
be a Sunday. So I'm recording this in July with
our next doomsday being on August eight, so that means
August eight should be a Sunday. And if you look
(44:09):
at the calendar, you'll see that, yes, August eight is
a Sunday. That means that October tenth should be a Sunday,
that December twelve should be a Sunday. So the doomsday
rule involves a little bit more than just this, but
not much more. And this means that if you learn
the doomsday rule and you have a general idea of
(44:30):
what the anchor day is for each year, you can calculate what day of the week any given date will fall on, including past years. The guy who
came up with this was a mathematician named John Conway,
and reportedly he could give you an answer in less
than two seconds. If you gave him a date, he
(44:51):
could tell you what day of the week it was
going to fall on in two seconds or less. Sadly,
he passed away last year at the age of eighty
two after he contracted COVID nineteen. But that doomsday rule
is one that I think is super cool. I've seen
people who can do this where you give them a
date and they're just able to tell you what day
of the week that falls on, and you can go
(45:13):
and check it and sure enough they're right. Well, this
is one process where you can do that. It's not
the only one, by the way, there are other algorithms
that also can help you determine what day of the
week a specific date will fall on. But the doomsday
rule is one that works. And again, I don't have expertise in this.
(45:34):
I haven't practiced it at all, so I can't do it.
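For anyone who wants to poke at it, here's a rough sketch of the rule in code. This is my own reading of Conway's method, so treat the details as illustrative rather than authoritative; in the arithmetic below, zero stands for Sunday.

```python
# A sketch of the Doomsday rule for the Gregorian calendar.
# My own illustrative take on Conway's method; 0 = Sunday, 1 = Monday, ...

def is_leap(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def anchor_day(year):
    """Conway's anchor-day arithmetic for a Gregorian year."""
    return (2 + year + year // 4 - year // 100 + year // 400) % 7

# One easy-to-remember "doomsday" per month: 4/4, 6/6, 8/8, 10/10, 12/12,
# plus 5/9, 9/5, 7/11, 11/7, 3/14, and Jan 3 / Feb 28 (a day later in leap years).
DOOMSDATES = {1: 3, 2: 28, 3: 14, 4: 4, 5: 9, 6: 6,
              7: 11, 8: 8, 9: 5, 10: 10, 11: 7, 12: 12}

DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]

def day_of_week(year, month, day):
    """Hop from the month's doomsday to the requested date."""
    ref = DOOMSDATES[month]
    if is_leap(year) and month in (1, 2):
        ref += 1  # January and February doomsdays slide in leap years
    return DAYS[(anchor_day(year) + day - ref) % 7]

print(day_of_week(2021, 8, 8))  # Sunday, matching the episode's example
print(day_of_week(2025, 6, 5))  # Thursday
```

The whole trick is that once the year's anchor day is known, any date is just a short hop from its month's doomsday.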
If you came up to me and said, hey, Jonathan, May first, nineteen, what day of the week was that, I'd be like, ah, I don't know. It's probably a fine day. I don't know, because I haven't done this yet. But I think it's really cool.
I think it illustrates how for certain types of problems
(45:56):
you can come up with a mathematical approach to solving them, and it's applicable to all problems
that fall into that category. And that's really the elegant
and cool thing about algorithms. Also, the other cool thing
is that people are coming up with new ones all
the time, and they might be trying to refine earlier
algorithms to make them even more efficient and effective, or
(46:19):
they might be trying to come up with all new
algorithms to take advantage of emerging technology stuff like quantum computing.
But as I said, I'll have to do another episode
about algorithms to talk more about it, because this one
ran longer than I thought. Like I said, I've got
like three more pages of notes that I wanted to
go over, but I ran long, So we're gonna cut
(46:40):
this one short. But we will come back to algorithms
at some point in the future and talk about some
related stuff. I really want to talk about hashing, particularly
as it applies to stuff like bitcoin mining, because that's
a pretty big deal. If you have suggestions for topics
I should cover in tech stuff, whether it's something really
technical or maybe just a goofy thing in tech. It
(47:01):
could be a person in tech, a company, a specific
product and you want to know about its history and impact,
anything like that. Let me know. The best way to
get in touch with me is over on Twitter. The
handle we use is TechStuff HSW, and
I'll talk to you again really soon. Tech Stuff is
(47:24):
an I Heart Radio production. For more podcasts from I Heart Radio, visit the I Heart Radio app, Apple Podcasts,
or wherever you listen to your favorite shows.