
June 6, 2022 50 mins

Why is it so hard to define concepts like artificial intelligence and machine learning? What do those terms even mean? And how do they work? We take a very high-level look at AI and machine learning.

 

See omnystudio.com/listener for privacy information.


Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:04):
Welcome to TechStuff, a production from iHeartRadio.
Hey there, and welcome to TechStuff. I'm your host, Jonathan Strickland. I'm an executive producer with iHeartRadio. And how the tech are you? All right? Well, I'm still on vacation. I'll be coming back soon, so tomorrow you should expect a brand new episode unless something goes wrong while I'm
I'll be coming back soon, so tomorrow you should expect
a brand new episode unless something goes wrong while I'm

(00:27):
trying to get back. Hopefully nothing like that happens. And so we thought we'd have a little rerun. This episode originally published in April 2021, so just last year. It is titled Machine Learning 101. And I wanted
to do this one because, as always, we hear a
lot about artificial intelligence and machine learning in the news

(00:48):
and in media, and often those topics get a little confusing.
They can come across more broadly than some people intend, or they can be somewhat misguided in their interpretations.
So I thought it would be useful to have a
little refresher course on machine learning and artificial intelligence. I

(01:10):
hope you enjoy, and I will be back at the end. Back in the nineteen eighties, there was a comedy science fiction film that I saw in the theater about a robot that gains sentience and becomes a total goofball. It hit theaters in 1986 and it was called

(01:31):
Short Circuit. The movie starred Steve Guttenberg, Ally Sheedy, and, lamentably, a white actor named Fisher Stevens playing a non-white character, someone who is Indian. I should add that's not Stevens's fault. I mean, he auditioned to be in
a movie and he got a gig. He didn't cast
himself in the film, and he has since talked about

(01:53):
his experiences, realizing the problems with a white man playing a non-white character. But setting aside all the problematic whitewashing, the movie showed this robot, who in the course of the film names itself Johnny Five, learning. It
learns about the world around it, it learns about people,
It learns about human concepts like humor and emotion, and

(02:17):
the general idea was pretty cute. Now, the nifty thing
is machines actually can learn. In fact, machine learning is
a really important field of study these days, complete with
its own challenges and risks. I've talked about machine learning
a few times in the past, but I figured we
could do a deeper dive to understand what machine learning

(02:38):
is, what it isn't, how people are leveraging machine learning, and why I said that it does come with risks. So let's learn about machines learning. It will be impossible
to talk about machine learning without also talking about artificial
intelligence or AI. And this term artificial intelligence is a

(03:02):
real doozy. It trips people up, even people who have
dedicated their lives to researching and developing artificial intelligence. You
can get two experts in AI talking about AI and
find out that because they have slightly different takes on
what AI is, there are some communication issues. It's not

(03:24):
as simple as you might think: what does the AI even stand for? So when you really boil it down, it comes as no big surprise that there's a lot of ambiguity here. After all, how would you define intelligence? Just intelligence, not artificial intelligence,

(03:45):
just intelligence. Well, would it be the ability to learn,
that is, to acquire skills and knowledge? Or is it
the application of learning? Is it problem solving? Is it
being able to think ahead and make plans in order
to achieve a specific goal? Is it the ability to

(04:06):
examine a problem and deconstruct it in order to figure out the best solution, a more specific version of problem solving?
Is it the ability to recognize, understand, and navigate emotional scenarios? Now,
arguably it's all of these things and more. We all
have kind of an intuitive grasp on what intelligence is,

(04:29):
but defining it in a simple way tends to feel
reductive and it leaves out a lot of important details.
So if defining just general intelligence is hard, it stands to reason that defining artificial intelligence is also a
tough job. Heck, even coming up with a number of

(04:50):
different types of AI is tricky. And if you
don't believe me, just google the phrase different types of
artificial intelligence. Never mind, you don't really have to do that. I already did it, though. Feel free to do it yourself and check my work if
you like. When I googled that phrase different types of AI,

(05:13):
some of the top results included a blog post on BMC Software titled Four Types of Artificial Intelligence. But then there was also an article on Codebots that was titled What Are the Three Types of AI? And then there was an article from Forbes titled Seven Types of Artificial Intelligence. See, we can't even agree on how many

(05:35):
versions of AI there are, because defining AI is really hard. It largely depends upon how you view
AI and then how you break it down into different
realms of intelligence. Now we could go super high level,
because a classic way to look at AI is strong

(05:55):
versus weak artificial intelligence. Strong AI, sometimes called artificial general intelligence, would be a machine that processes information
and at least appears to have some form of consciousness
and self awareness and the ability to both have experiences

(06:17):
and to be aware that it is having experiences. It
might even feel emotion, though maybe not emotions that we
could easily identify or sympathize with. So this would be
the kind of machine that would think in a way
similar to humans. It would be able to sense its
environment and not just react, but really process what is

(06:40):
going on and build an understanding. It's the type of AI that we see a lot in science fiction. It's the type of AI of Johnny Five from Short Circuit, or HAL from 2001, or the droids in
Star Wars. It's also a type of artificial intelligence that
we have yet to actually achieve in the real world.

(07:02):
So then what is weak AI? Well, you could say it's everything else, or you could say it's the building blocks that maybe collectively will lead to strong AI. Weak AI involves processes that allow machines to complete tasks. So,
for example, image recognition software could fall into this category.

(07:25):
Once upon a time, in order to search photos effectively,
you needed to actually add metadata, like tags, to
those photos. So, for example, I might tag pictures of
my dog with the meta tag dog, and then if
I wanted to see photos of my pooch, then I

(07:46):
would pull up my photo app and search the term dog,
and all the photos that I had tagged with the
word dog would show up. But if I had failed
to tag some pictures of my dog, those pictures wouldn't
pop up in search because the computer program wasn't actually
looking for dogs in my photos. It was just looking
for photos that had that particular meta tag attached to them.
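To make that old approach concrete, here is a minimal Python sketch of tag-based search. The photo data and field names are invented for illustration; the point is that the search never looks at the image itself:

# Tag-based photo search: matches metadata, not image content.
photos = [
    {"file": "park.jpg", "tags": {"dog", "outside"}},
    {"file": "couch.jpg", "tags": {"dog"}},
    {"file": "mystery.jpg", "tags": set()},  # actually a dog photo, never tagged
]

def search_by_tag(photos, term):
    # Return only photos whose metadata contains the search term.
    return [p["file"] for p in photos if term in p["tags"]]

print(search_by_tag(photos, "dog"))  # ['park.jpg', 'couch.jpg'] -- mystery.jpg is missed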

(08:09):
But now we've reached a point where at least some
photo apps are using image recognition to analyze photos, and
these will return results that the algorithm has identified as
having a reasonable chance of meeting your search query. So
if I used an app like that and I put
in dog as my search term, it could pull up
photos that had no meta tags attached to them at all.

(08:33):
Because the search is relying on image recognition. Now, this
also means that if the image recognition algorithm isn't very good,
I could get some images that don't have a dog
in them at all, or it might miss other images
that have my dog in them. But my point is
that the ability to identify whether or not a dog
is in a particular photo represents a kind of weak

(08:56):
artificial intelligence. You wouldn't say that the photo search tool
possesses humanlike intelligence, because really it only does one thing.
It analyzes photos and looks for matches to specific search queries,
but it can't do anything outside of that use case. However,
that's just one little example. There are all sorts of

(09:17):
other ones, like voice recognition, environmental sensing, course plotting, that
kind of thing, and in some circles, as we get
better at making machines and systems that can do these things,
those elements seem to kind of drift away from the
ongoing conversation about artificial intelligence. A guy named Larry Tesler,

(09:39):
who was a computer scientist who worked at lots of
really important places like Xerox PARC and Amazon and Apple,
he once observed, quote, intelligence is whatever machines haven't done yet.
End quote. So his point was that the reason that
AI is really hard to talk about is that the
goalpost for what actually counts as artificial intelligence is constantly moving.

(10:06):
Now this pretty much mirrors how we think about things
like consciousness. Lots of people study consciousness, and the general
sense I get is that it's a lot easier for
people to talk about what isn't consciousness rather than what
consciousness actually is. And it seems like artificial intelligence is
in a similar place, which really isn't that big of

(10:28):
a surprise as we closely associate intelligence with consciousness. Now
this leads us to why there are so many different
takes on how many types of AI there are. It
all depends on how you classify different disciplines in artificial intelligence,
and over time, a lot of disciplines that were previously

(10:49):
distinct from AI have sort of converged into becoming part
of the AI discussion. Machine learning, as it turns out,
was part of the AI discussion, branched off from it,
and then rejoined the AI discussion years later. So I
am not going to go down all the different approaches
to classification because I don't know that they would be

(11:10):
that valuable to us. They would really just illustrate that
there are a lot of different ways to look at
the subject. So if you ever find yourself in a
conversation about AI, it might be a good idea to
set a few ground rules as to what everyone means
when they use the term artificial intelligence. That can help

(11:33):
with expectations and understanding. Or you could just run for
the nearest exit, which is what people tend to do
whenever I start talking about it. Anyway, what about machine learning? Well,
from one perspective, you could say machine learning is a
sub discipline of artificial intelligence, although like I said, it

(11:55):
hasn't always been viewed as such. I think most people would say that the ability to learn, that is, to take information and experience and then have some form of understanding of those things so that you can apply that to future tasks, potentially getting better over time, is part of intelligence. But

(12:19):
you could also be a bit more wishy washy and
say it's related to, you know, artificial intelligence, as opposed
to being part of AI, since the definition of AI
is, let's say, fluid. Either way of classifying machine learning works.
As far as I'm concerned, machine learning boils down to
the idea of creating a system that can learn as

(12:41):
it performs a task. It can learn what works and
more importantly, what does not work. You may have heard
that we learn a lot more from our mistakes than
we do from our successes, which is pretty much true
in my experience. When something goes wrong, it's usually, but
not always, possible to trace the event or events that

(13:05):
led to the failure. You can identify decisions that were probably the wrong ones or that led to a bad outcome.
But if you have a success, it's hard to figure
out which decisions were key to that successful outcome. Did
your decision at step two set you on the right path,
or was your choice at step three so good that

(13:28):
it helped correct a mistake that you made at step two.
But a good approach to machine learning involves a system
that can adjust things on its own to reduce mistakes
and increase the success rate. And another way of putting
it is that instead of programming a system to arrive
at a specific outcome, you are training the system to

(13:49):
learn how to do it by itself. And that sounds
a bit magical when you put it that way, doesn't it?
It sounds like someone just took a computer and showed
it pictures of cats and then expected the computer to know what a cat was. And this actually mirrors a real project that did just that, but I'm
leaving out some big important information in the middle. Now,

(14:13):
one big step is that computers and machines can't just
magically learn by default. People first had to come up
with a methodology that allows machines to go through the
process of completing a task, then making adjustments to the
process of doing that task, which would then improve future results.
We have to lay the groundwork in architecture and theory

(14:37):
and algorithms. We have to build the logical pathways that
computers can follow in order for them to learn. A
lot of machine learning revolves around patterns and pattern recognition.
So what do I mean by patterns? Well, I mean
some form of regularity and predictability. Machine learning models analyze

(14:58):
patterns and attempt to draw conclusions based on those patterns.
This in itself is tricky stuff. So why is that? Well,
it's because sometimes we might think there's a pattern when
in reality there is not. We humans are pretty good
at recognizing patterns, which makes sense. It's a survival mechanism.

(15:22):
If you were to look at tall grass and you
see patterns that suggest the presence of a predator like
a tiger, well you would know that danger is nearby,
and you would have the opportunity to do something about
that to help your chances of survival. If, however, you
remained blissfully unaware of the danger, you'd be far more

(15:44):
likely to fall prey to that hungry tiger. So recognizing
patterns is one of the abilities that gave humans a
chance to live another day, and, from an evolutionary standpoint,
a chance to make more humans. But sometimes we humans
will perceive a pattern where none actually exists. A simple

(16:05):
example of this is the fun exercise of laying on
your back outside, looking up at the clouds and saying,
what does that cloud remind you of? The shapes of clouds,
which have no significance and are the product of environmental factors,
can seem to suggest patterns to us. We might see
a dog, or a car or a face, but we

(16:28):
know that what we're really seeing is just the appearance of a pattern. It's not evidence of a pattern
actually being there. It's noise, not signal. But it could
be misinterpreted as signal. Well, it turns out that in
machine learning applications this is also an issue. I'll talk
about it more towards the end of this episode. Computers

(16:50):
can sometimes misinterpret data and determine something represents a pattern
when it really doesn't. When that happens, a system relying
on machine learning can produce false positives, and the consequences
can sometimes be funny, like hey, this image recognition software
thinks this coffee mug is actually a kitty cat. Or
they can be really serious and potentially harmful. Hey, this

(17:12):
facial recognition software has misidentified a person, marking them as, say,
a person of interest in a criminal case. And it's
all because this facial recognition software isn't very good at
differentiating people of color. That's a real problem that really happens. Now,
when we come back, I'll give a little overview of
the evolution of machine learning. But before we do that,

(17:35):
let's take a quick break. To talk about the history of machine learning, we first have to look back much, much earlier, long before the era of computers, and talk
about how thinkers like Thomas Bayes thought about the act

(17:58):
of problem solving. Bayes was born way back in the early 1700s,
so quite a bit before we were thinking about machine learning,
but he was interested in problem solving for problems involving probabilities,
and specifically the relationship between different probabilities. I think it's
easier to talk about if I give you an example.

(18:20):
So let's make a silly one, all right, So let's
say we got ourselves a plucky podcaster. Hey there, everybody,
It's Jonathan Strickland, and it's Tuesday as I record this,
And because of who I am, you know who this
podcaster is. And because it's Tuesday, there is a chance
I am wearing a They Might Be Giants T-shirt.

(18:43):
And we also know that if this podcaster is wearing a They Might Be Giants T-shirt on a Tuesday, there's a sixty percent chance that I'm going to end up wearing pajamas on Wednesday. But we also know that if I did not wear the They Might Be Giants shirt on Tuesday,

(19:04):
and remember there's a sixty percent chance I didn't, then we
know there's an eighty percent chance I'm going to be
wearing pajamas on Wednesday. Well, Bayes worked out a way
that described the sort of probability relationship between different discrete
events and using his reasoning, you can work forward or
backward based on probabilities. Bayes would describe wearing a They

(19:29):
Might Be Giants shirt on Tuesday as one event and
wearing pajamas on Wednesday as a separate event, and then
describe the relationship between the two, not only determining how likely it is
I'll wear pajamas on Wednesday, but if we start with
the later event, in other words, if we start with
the fact that it's Wednesday and I'm wearing pajamas, we

(19:50):
could work out how likely it was that yesterday, on Tuesday,
I was wearing the They Might Be Giants shirt. That was his contribution: that you can work this in either direction if you know these different variables. Now, Bayes never published his thoughts, but rather sent an essay explaining it to a friend of his, who then made sure

(20:11):
that the work was published after Bayes had passed away. A few decades later, Pierre-Simon Laplace would take this work that Bayes had done and flesh it out
into an actual formal theorem. It's an important example of conditional probability.
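As a concrete illustration, here is a minimal Python sketch of the T-shirt and pajamas example. The sixty and eighty percent figures come from the episode; the forty percent prior for wearing the shirt on Tuesday is just the flip side of the sixty percent chance of not wearing it:

# Bayes' theorem on the shirt/pajamas example: work forward to P(pajamas),
# then backward to P(shirt | pajamas).
p_shirt = 0.4                # P(TMBG shirt on Tuesday)
p_pj_given_shirt = 0.6       # P(pajamas Wednesday | shirt Tuesday)
p_pj_given_no_shirt = 0.8    # P(pajamas Wednesday | no shirt Tuesday)

# Forward: total probability of pajamas on Wednesday.
p_pj = p_pj_given_shirt * p_shirt + p_pj_given_no_shirt * (1 - p_shirt)

# Backward: given pajamas on Wednesday, how likely was the shirt on Tuesday?
p_shirt_given_pj = p_pj_given_shirt * p_shirt / p_pj

print(p_pj)              # 0.72
print(p_shirt_given_pj)  # about 0.33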

(20:36):
And a lot of what machine learning really boils down to is dealing with different probabilities, not certainties, which, when you get down to it, is what most of
us are doing most of the time. Right. We make
decisions based on at least perceived probabilities. Sometimes these decisions
might feel like they're a coin flip situation, that any
choice is equally likely to precipitate a good outcome or
a bad outcome. Other times we might make a choice
because we feel the probabilities are stacked favorably one way

(20:58):
over another. Sometimes we will make a choice to back
the least probable outcome, because, well, humans are not always super rational. And heck, sometimes the long shot does pay off, so that keeps Vegas in business. Bayes' theorem is just one
example of ways that mathematicians and philosophers figured out ways

(21:19):
to mathematically express problem solving and decision making, And a
lot of this was figuring out if there were a
way to boil down things that most of us approached
through intuition and experience. So it's kind of neat, and
also the more you look into it, the more likely
you might find it a little spooky, because it's weird to

(21:39):
consider that our approaches to making choices and solving problems
can be reduced down to mathematical expressions. But let's leave
the potential existential crises alone for now, shall we? So, moving on, we have another smarty-pants we need to talk about: Andrey Markov, a mathematician working in the early twentieth century.

(22:03):
He began studying the nature of certain random processes that
follow a particular type of rule, which we now call
the Markov property. That rule says that for this particular process,
the next stage of the process only depends upon the
current stage, but not any stages that came before then.

(22:24):
So let's take my ridiculous T shirt example and let's
build it out a little bit further. Let's say that
I've got three T shirts to my name. One of
them is the They Might Be Giants shirt. One is a plain blue T-shirt, and the third is a shirt that has the TechStuff logo on it. And let's say, based off of long observation, you've determined the

(22:48):
following facts. If I am wearing the They Might Be Giants shirt today, I definitely will not wear it tomorrow.
But there's a fifty fifty shot I'll wear either the
blue shirt or the TechStuff shirt. Now, if I'm
wearing the blue shirt today, there's a ten percent chance I'm

(23:09):
going to wear the same blue shirt tomorrow. Don't worry,
I'll wash it first. There's a sixty percent chance that I'll wear the TechStuff shirt, and there's a thirty percent chance I'll wear the They Might Be Giants shirt. But
if I'm wearing the TechStuff shirt today, there's a seventy percent chance I'll wear it again tomorrow, because I like

(23:29):
to promote myself. But there's a thirty percent chance I'll
wear the They Might Be Giants shirt, and there is
no chance that I'm going to wear the blue one
in this case. So those are our various scenarios, right? Which shirt I will wear tomorrow depends only upon which
shirt I am wearing today. What I wore yesterday has

(23:51):
no bearing on the outcome for tomorrow, So today is
all that matters. And depending on which shirt I wear,
you can make some probability predictions for tomorrow. So we
can actually use this approach to figure out the probability
that I might wear the TechStuff shirt, say, ten
days in a row, since there's a better than even

(24:12):
chance that if I'm wearing TechStuff today, I'll end up wearing it again tomorrow, and if I wear it tomorrow, then there's a better than fifty percent chance that I'm going
to wear it the following day. But at some point
you're going to see that the odds are starting to
be against you, for you know, increasingly long strings of

(24:33):
wearing the TechStuff shirt. Anyway, Markov chains would become
one of the types of processes that machine learning models
would incorporate, with some models looking at the current state
of a given process and then making predictions on what the next state will be, with no need to look back at the previous decisions. The Markov chain is memoryless.
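Here is a small Python sketch of that three-shirt Markov chain, using the percentages from the example (NumPy is used just for the matrix math):

import numpy as np

# Rows are today's shirt, columns are tomorrow's: [TMBG, blue, TechStuff].
# Tomorrow depends only on today's row; that's the memoryless Markov property.
P = np.array([
    [0.0, 0.5, 0.5],   # wearing the TMBG shirt today
    [0.3, 0.1, 0.6],   # wearing the blue shirt today
    [0.3, 0.0, 0.7],   # wearing the TechStuff shirt today
])

# Chance of the TechStuff shirt ten days in a row, given it's on today:
# nine more days of the 0.7 self-transition.
print(0.7 ** 9)  # about 0.04, so the streak quickly becomes unlikely

# Where things stand three days out if I'm wearing the blue shirt today.
today = np.array([0.0, 1.0, 0.0])
print(today @ np.linalg.matrix_power(P, 3))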

(24:57):
Now that's just a couple of the mathematicians whose work
underlies elements of machine learning. There's also structure we need
to talk about. In a man named Donald Hebb wrote
a book titled The Organization of Behavior, and in that book,
Hebb gave a hypothesis on how neurons, that is, how

(25:18):
brain cells interact with one another. His ideas included the
notion that if two neurons interact with one another regularly,
that is, if one fires, that the second one is
also likely to fire. They end up forming a tighter
communicative relationship with each other. Not long after his expression

(25:40):
of this hypothesis, computer scientists began to think of a
potential way to do this artificially, with machines creating the
equivalent of artificial neurons. The relative strength of the relationship between artificial neurons is something we describe by weight. That's going to be an important part of machine learning. Weight, by

(26:02):
the way, is W-E-I-G-H-T, as in this relationship is weighted more heavily than that relationship.
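A tiny Python sketch of that Hebbian idea; the activity traces and learning rate are invented, but the rule is the classic one, strengthen a connection whenever the two units fire together:

# Hebbian weight update: co-activation strengthens the connection.
pre  = [1, 0, 1, 1, 0]   # firing record of neuron A
post = [1, 0, 1, 0, 0]   # firing record of neuron B

weight, lr = 0.0, 0.5
for a, b in zip(pre, post):
    weight += lr * a * b  # only counts when both fire at once

print(weight)  # 1.0 -- two co-activations, each adding 0.5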
In the early nineteen fifties, an IBM researcher named Arthur
Samuel created a program designed to win at checkers. The
program would do a quick analysis of where pieces were

(26:23):
on a checkerboard and whose move it was, and then
calculate the chances of each side winning the game based
on those positions. And it did this with a minimax approach. All right, so checkers is a two-player, turn-based game. Player one makes a move, then player two
can make a move. There are a finite number of

(26:43):
moves that can be made, a finite number of possibilities,
though admittedly it's a pretty good number of possibilities. But
let's say a game has been going on for a
few moves, and you've got your two sides you've got
the red checkers over on player one's side and the black checkers for player two. Let's say it's player one's move. For the purposes of this example, we'll say

(27:06):
that player one really just has one piece that they
can actually move on this turn, and it can move
into one of two open spaces. So player one has
to make a choice. After that choice, it's going to
be player two's turn, so we can create a decision
tree illustrating the possible choices and the possible outcomes of

(27:28):
those choices. These choices are the children of the starting
position for player one, so player one's starting position has
two children. Player two will have their own choices to
make after that decision has been made, but those choices
are going to depend upon whatever move player one ultimately takes.
So we can extend out our decision tree showing the

(27:51):
branching possible moves that player two might make. And these
are the children of the two possible outcomes of our choice.
After player two's turn, it's player one's turn again, which
means we need to branch those decisions out even further.
And this is all before player one has even made

(28:12):
that first choice. We're just evaluating possibilities. At some point,
either when we have plotted far enough out that we
know all possible outcomes of the game, or we're just
reaching a point where it would be unmanageable for us
to go any further, we need to actually analyze what
our options are. The endpoints represent either a win, a loss,

(28:35):
or a draw for player one, or, if we haven't
extended out the tree all the way to the end
of the game, at least a change in advantage, whether
it would be in player one's advantage to make that
move or disadvantage. We could actually assign numerical values to
each end point, with positive values representing an advantage for

(28:56):
player one and a negative value representing an advantage for
player two, and once we do that, we can
see which pathways tend to lead to better outcomes for
player one. We work backward through the decision tree, so
on all the decisions that end in an advantage for
player one, we can say this is the choice that

(29:18):
player one would take. But then we know that player two is always going to choose whichever
choice has the greatest advantage for that player, so we
have to actually take that into account as we're working backward,
and this is how we can finally get to the
point where we decide which move we're going to make.
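Here is a minimal Python sketch of that backward-working minimax idea. The tree shape and the endpoint values are invented; positive numbers favor player one, negative numbers favor player two:

# Minimax: leaves are endpoint values; interior nodes are lists of choices.
def minimax(node, maximizing):
    if isinstance(node, (int, float)):   # an endpoint of the tree
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Player one picks move A or B; player two then replies to each.
tree = [
    [3, -2],    # outcomes after move A
    [5, -6],    # outcomes after move B
]

# Player two always takes the outcome worst for player one, so move A
# (worth -2) is the least bad choice, beating move B (worth -6).
scores = [minimax(branch, maximizing=False) for branch in tree]
print(scores, "-> pick move", "AB"[scores.index(max(scores))])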

(29:39):
Because these decisions, as you go backward up the tree,
they ultimately inform you which of those two choices is
going to give you the best result. Those values, well,
those are weights. So for player one, the goal is
to pick the path that has the highest positive value.
For player two, it's to pick the path that has

(30:00):
the lowest possible value, or the highest negative value if you prefer. So, in other words, player one might be thinking something like, if I move to Spot A, my chance of winning this game is such-and-such, but if I move to Spot B, it's only so much. Of course, those percentages will
also depend on what player two is going to do
in response. Some moves that player two might do could

(30:23):
end up guaranteeing a win for player one. This is
the minimax approach, and there's an algorithm that guides it.
It depends upon the current position within a game and
how many moves or how much depth it has to
take into account, and which player it's actually helping out. What happens if player one does this

(30:44):
evaluation and finds that both options are negative, well, then
this is something that happens in games, right, Sometimes you
find out there is no good move, like any move
you make is going to be a losing move. Well,
the only option at that point is to choose the
least bad one, so it would be whatever the
smallest negative value choice was. The next big development that

(31:06):
I need to mention is Frank Rosenblatt's artificial neural network
called Perceptron. Its purpose was to recognize shapes and patterns,
and it was originally going to be its own machine
like actual hardware, but the first incarnation of Perceptron would
actually be in the form of software rather than hardware.
There was a purpose built Perceptron later, but the original

(31:29):
one was software. Despite some early excitement, the Perceptron proved
to be somewhat limited in its capabilities, and interest in
artificial neural networks died down for a while as a result.
In a way, you could kind of compare this to
some other technologies that got a big hype cycle and
then later deflated. Virtual reality is the one I always

(31:52):
go with. Back in the nineteen nineties, the world was
really hyped for virtual reality. People had incredibly unrealistic expectations for what VR actually meant and what it could do,
and when it turned out that VR wasn't nearly as
sophisticated as people were imagining, a lot of enthusiasm dropped
out for the entire field, and with that dropped funding

(32:15):
and support, and as a result, development in VR hit
a real wall, with only a fraction of the people
who had been working in the field sticking around, and
they had to scramble just to find funding to keep
their projects going. So VR was effectively put on the
shelf and wouldn't make much progress for nearly twenty years. Well, artificial neural networks had a very similar issue, but other

(32:39):
computer scientists eventually found ways to design artificial neural networks that could do some pretty amazing things if they had
access to enough data. When we come back, I'll talk
a little bit more about that and what it all means,
but first let's take another quick break. So we left

(33:04):
off with the AI field going into hibernation for a
little bit. Theory and mathematics were bumping up against the
limitations of technology, which wasn't quite at the level to
put all that theory to the test. Plus there needed
to be some tweaks to the approaches, but those came
with time and more mathematicians found new ways to create

(33:26):
artificial neural networks capable of stuff like pattern recognition and learning.
So let's imagine another decision tree. We've got our starting position.
This is probably where we put some input. We would
feed data into a system, and let's say from that
starting position, we have a process that's going to transform

(33:47):
that input into one of two possible ways. So we've
got two potential outputs for that first step. Like our minimax example, we can go down several layers of possible choices, and we can weight the relationships between these
different choices. So if the incoming value is higher than

(34:08):
a certain amount, maybe the node sends it down one pathway,
But if the value is lower than that arbitrary amount,
the node will send it down a different pathway. This
is drastically oversimplifying, but I hope you kind of get
the idea. It's like a big sorting system, and the
goal is that at the very end whatever comes out

(34:30):
as output is correct or true. Ideally, you've got a
system that is self improving. It trains itself to be better.
But how the heck does that happen? Well, let's consider
cats for a bit, not the musical, and good heavens, definitely not the movie musical. That is a subject that

(34:56):
deserves its own episode. Maybe one day I'll figure out
a way to tackle that film with some sort of tech capacity. But honestly, I'm just not ready to do that yet, from like an emotional standpoint as well as a research one. No, let's say you're teaching a computer system to recognize cats in pictures, and

(35:16):
the system has an artificial neural network that accepts input
pictures of cats and then filters that input through the
network to make the determination: does this picture include a
cat in it? And you start feeding it lots of images.
The neural network acts on the data according to the
weighted relationship between the artificial neurons, and it produces an output.
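As a rough illustration of that flow, here is a tiny forward pass in Python. The weights are random stand-ins (training, described next, is what would actually set them), and the four "features" are invented:

import numpy as np

# One forward pass: input flows through weighted connections and an
# activation function to produce a "cat" score between 0 and 1.
def forward(x, w1, w2):
    hidden = np.maximum(0, x @ w1)            # weighted sums, ReLU activation
    return 1 / (1 + np.exp(-(hidden @ w2)))   # squash to a probability

rng = np.random.default_rng(1)
w1 = rng.normal(size=(4, 3))    # input-to-hidden weights
w2 = rng.normal(size=(3,))      # hidden-to-output weights
photo_features = rng.normal(size=(4,))

print(forward(photo_features, w1, w2))  # a value near 1 reads as "probably a cat"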

(35:40):
Now here's the thing. We already know what we want
the output to be, because we can recognize if a
picture has a cat in it or not. Maybe we've got
one thousand pictures. This is the training data we're going
to use for this machine learning process. We also know
that eight hundred of those pictures have a cat in them and two hundred don't, so we know what we want

(36:03):
the results to be. We've got an artificial neural network
in which some neurons or nodes will accept input and
perform a function based on that input, and then the
weighted connections that neuron has to other neurons will determine
where it passes the information down until we get to
an output. And this happens until we get that conclusion.

(36:23):
So what happens if the computer's answer is wrong? What
if we feed those one thousand photos to it and it says only three hundred of them have cats in them? Well, we have to go back and adjust those weighted connections, because clearly something didn't go right; the connections within the network need to be readjusted. We would likely start closest

(36:47):
to our output and see which neurons seem to contribute
to the mistake, which neurons were responsible, in other words, for it to say, oh, only three hundred of these pictures had
cats in them, and then we would adjust the weights,
the incoming weights of connections to those neurons in order
to try and favor pathways that lead to correct answers.

(37:10):
Then we feed it the one thousand pictures again and
we look at those results. Then we do this again
and again and again, every time, tweaking the network a
little bit so that it gets a bit better. Eventually,
when we have trained the system, we can start to
feed brand new data to the network, not the stuff

(37:33):
we've trained it on, but pictures that we and the
system have never seen before. And if our network is
a good one, if we have trained it well, it
will sort through these new photos and it will count
up the ones that have the cat pictures, lickety-split.
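Here is a toy Python sketch of that feed, grade, and tweak loop. It uses a single artificial neuron on invented two-number "photos" rather than real images, so it's the shape of supervised training, not a working cat detector:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))               # 1,000 fake photo feature pairs
y = (X[:, 0] + X[:, 1] > 0).astype(float)    # known answers: cat or no cat

w, b, lr = np.zeros(2), 0.0, 0.1
for epoch in range(20):                      # feed the same pictures again and again
    pred = 1 / (1 + np.exp(-(X @ w + b)))    # the network's current guesses
    error = pred - y                         # grade its homework
    w -= lr * X.T @ error / len(X)           # nudge weights toward correct answers
    b -= lr * error.mean()

pred = 1 / (1 + np.exp(-(X @ w + b)))
print(f"now correct on {((pred > 0.5) == y).mean():.0%} of the training pictures")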
This approach is called supervised learning because it involves kind

(37:54):
of grading the network on its homework and then working
with it to get better. Heck, with the right algorithm,
a neural network can learn to recognize and differentiate patterns
even if we never explicitly told the system what it
was looking for. Google discovered this several years ago when
it fed several thousand YouTube videos to an enormous artificial

(38:18):
neural network. The system analyzed the videos that were fed
to it and gradually recognized patterns that represented different types
of stuff, like people or like cats, because there are
a lot of cat videos on YouTube, and the network
got to the point where it could identify an image

(38:38):
of a cat fairly reliably, better than seventy percent of the time,
even though it was never told how to do that,
or it was never even told what a cat was. So,
as Google representatives put it, they said, it had to
invent the concept of a cat. It had to recognize

(38:58):
that cats are not the same as people, which I
think is a big slap in the face to some cats. Really,
what it said was, in effect: I recognize this particular pattern of features, and I recognize that these other instances of creatures that have a similar pattern seem to match it,

(39:20):
and so I draw the conclusion that this instance of
a thing belongs with all these other instances of things
that are similar in characteristics. So this was more of
an example of unsupervised learning, in that the system, when fed enough data, began to categorize stuff all on its own through its own parameters.
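For flavor, here is a small Python sketch of unsupervised grouping using k-means. The data is two invented blobs of "frame features"; nothing tells the algorithm what the blobs mean, yet it splits them into its own two categories:

import numpy as np

rng = np.random.default_rng(2)
blob_a = rng.normal(loc=[0, 0], scale=0.5, size=(100, 2))   # say, cat-like frames
blob_b = rng.normal(loc=[3, 3], scale=0.5, size=(100, 2))   # say, people-like frames
X = np.vstack([blob_a, blob_b])

centers = X[rng.choice(len(X), size=2, replace=False)]      # two random guesses
for _ in range(10):
    # Assign each point to its nearest center, then move centers to the mean.
    labels = np.argmin(((X[:, None] - centers) ** 2).sum(axis=2), axis=1)
    centers = np.array([X[labels == k].mean(axis=0) for k in range(2)])

print(centers)  # ends up near [0, 0] and [3, 3]: two categories it invented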

(39:43):
Now, one neat way that computer scientists will train up systems for certain types of
applications is through a generative adversarial network, which I admit
sounds kind of sinister, doesn't it? And I mean, it can be, but it doesn't have to be. Essentially, you're using two different artificial neural networks. One of the networks

(40:04):
has a specific job. It's to fool the other network.
So the other network's job is to detect attempts to
fool it versus legitimate data. So let's use an example.
Let's say you're trying to create a system that can
make realistic but entirely computer generated, that is, fabricated photographs

(40:25):
of people. So, in other words, these are computer generated
images that don't actually represent a real person at all.
We've got one artificial neural network, the generator, and its
job is to create images of people that can pass
as real photographs. Then we've got our other network, which
is the discriminator. This is trying to sort out real

(40:48):
photos of actual people from pictures that have been generated
by the generative system. And we pit these two networks against each other. The idea here is that both systems
get better as they test one another out. If the
generator network is falling behind because the discriminator can suss

(41:10):
out the fakes too easily, well, then it's time to
tweak some weights in that neural network that are leading
to unsatisfactory computer-generated images, and try it again. But then,
if the discriminator is starting to miss fakes, well, it's time to tweak the discriminator network so it's better at

(41:31):
spotting the false pictures. Now along the way, some pretty
extraordinary stuff can happen. There are photos of computer generated faces,
not altered pictures, not ones created by a human artist,
but entirely composed via a computer, and they can look
absolutely realistic, complete with consistent lighting and shadows. This is

(41:56):
only after lots of training sessions. The networks learn what
the giveaways are, like, what is it that leads the
discriminator to say, no, this is a fake photo, and
how can you fix that? It reminds me a bit
of how photo experts used to point out really bad Photoshop jobs and explain how certain elements like shadows or

(42:18):
edges or whatever, were a dead giveaway that someone had
altered an image. Well, similar rules exist for generated images,
and through training, the generator gets better at making really
convincing examples that don't fall into the traps that would
reveal it as a fake. Over time, generative networks can

(42:39):
get good enough to produce stuff that would be very
difficult for a human to tell apart from the quote
unquote real thing, and discriminators can get good enough to
detect fakes that would otherwise pass human inspection.
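Here is a deliberately tiny Python sketch of that tug-of-war. Real data is a bell curve centered at 4; the "generator" just learns to shift noise toward it, and the "discriminator" is a one-input logistic gate. All the numbers are invented, and real GANs use deep networks on both sides, but the alternating fool-versus-detect updates are the same idea:

import numpy as np

rng = np.random.default_rng(3)
sig = lambda t: 1 / (1 + np.exp(-t))
lr, steps, batch = 0.05, 2000, 64
a, c = 1.0, 0.0      # discriminator: sig(a * x + c) = "probably real"
shift = 0.0          # generator: fake = noise + shift

for _ in range(steps):
    real = rng.normal(4.0, 1.0, batch)
    fake = rng.normal(size=batch) + shift

    # Discriminator step: push scores up on real data, down on fakes.
    d_real, d_fake = sig(a * real + c), sig(a * fake + c)
    a += lr * np.mean((1 - d_real) * real - d_fake * fake)
    c += lr * np.mean((1 - d_real) - d_fake)

    # Generator step: move fakes toward whatever the discriminator calls real.
    d_fake = sig(a * fake + c)
    shift += lr * np.mean((1 - d_fake) * a)

print(f"fakes should now center near the real data's mean of 4.0: {shift:.1f}")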
An example of this is the current, ongoing battle with deepfakes. These are computer-generated videos that appear to be legit.

(43:01):
If they're done well enough, they can have famous people
in them. It doesn't have to be a famous person, but
it can show a video of someone doing something that
they absolutely never did, but according to the video, they did,
and it can be really convincing if it's done well.
A good deep fake can fool people if you aren't

(43:21):
paying too much attention. Some of the really good ones
can pass pretty deep scrutiny. So this requires researchers to
come up with solutions that are pretty subtle and beyond
the average person's ability to replicate, like looking at the
reflections in the person's eyes and whether or not they
seem realistic or computer generated. But that really just

(43:43):
represents another hurdle for the generative side. So in other words,
this is a seesaw approach, right. It's creating fakes on
one side and detecting them on the other side. It's
something we see in artificial intelligence in general. A similar story played out with the old CAPTCHA systems, where

(44:04):
you know, we saw back and forth between methods to
try and weed out bots by using CAPTCHA images that
only humans could really parse, and then we saw improved
bots that could analyze these images and return correct results,
which meant it was necessary to create more difficult CAPTCHAs. Eventually you get to a point where the CAPTCHAs are difficult

(44:25):
enough that the average person can't even pass them, and
then you have to go to a different method. We
also see this play out in the cyber security realm,
where you might say the thieves get better at lock picking,
and then security experts make better locks, and the cycle
just repeats endlessly. One thing that has really fueled machine

(44:46):
learning recently is the era of big data. Being able
to harvest information on a truly massive scale provides the
opportunity to feed that data into various machine learning systems
to search for meaning within that data. These systems might
scour the information to look for stuff like criminal activity

(45:08):
like financial crimes or the attempt to move some money
around from various criminal exploits. Or it could be used
to look for trends like market trends, or it might
be used to plot possible spikes in COVID-19 transmission, where those might occur, and where people should really be focusing their attention. But now we've got to think back on

(45:31):
what I said earlier about looking up at the sky
and seeing shapes in the clouds. There's a risk that
comes along with machine learning. Actually, technically there are a
lot of risks, but this one is a biggie. It
is possible for machines, like humans, to detect a pattern
where there really isn't a pattern. Systems might interpret noise

(45:54):
to be signal, and depending on what you're using the
system to do, that could lead you to some seriously dangerous,
incorrect conclusions. In some cases, it could just be inconvenient,
but depending on what you're working toward, it could be catastrophic.
And so computer scientists know they have to do a
lot of analysis to make sure that patterns that are

(46:15):
identified through machine learning processes are actually real before acting
on that information. Likewise, bias is something that we humans have, well,
it's also something that machine learning systems have too. Now,
sometimes bias is intentional. It can take the form of
those weighted relationships between artificial neurons. Other times, a system's architects,

(46:42):
you know, the people who put it together, might have introduced bias, not through conscious effort, but merely through
the approach they took and that approach might have been
too narrow. We've seen this pop up a lot again
with facial recognition technologies, many of which have a sliding
scale of efficacy. They might be more reliable with certain

(47:04):
ethnicities, like white people, over others. That points to a likely problem with the way those systems were trained. This
is one of the reasons why many companies have made
a choice to stop supplying certain parties like police forces
and military branches with facial recognition systems. The systems aren't
reliable for all demographic groups and thus could cause disproportionate

(47:28):
harm to certain populations. It would be a technological approach
to systemic racism, and this stuff is already out there
in the wild. You might think a computer system can't
be biased or prejudiced or racist, and sure, we're still
not at the point where these systems are thinking in
the way that humans do, but the outcome is still

(47:49):
disproportionately harmful to some groups. That's not to say that
machine learning itself is bad. It's not bad. It's a tool, just as all technology is a tool. Used properly, with a careful hand to make sure that bias is understood and, where needed, mitigated, and where work can be double- or

(48:10):
triple-checked before being acted upon, it is a remarkably useful tool, one that can help design and improve elements in
our lives if it's under the correct stewardship. But it
does require a bit more hands on work. We can't
just leave it to the machines just yet. Well, that

(48:32):
wraps up this look at the concept of machine learning
and some of the thought that underlies it. This really
is a very high level treatment of machine learning. There
are plenty of resources online if you want to dive
in and learn more. A lot of them get very
heavy into the math, so if that's not your bag,

(48:53):
it might be a little challenging to navigate. It certainly
is for me. I love learning about the stuff, but
a lot of it requires me to look up a term,
then look up a term that explains that term, and
so on, and I go down a rabbit hole. I
hope you enjoyed that classic episode. I guess not classic,
that rerun episode of TechStuff. You can't call it

(49:15):
a classic if it's just a year old, right? So anyway, I will be back again tomorrow, hopefully, and we will have a new episode, y'all. If you want to get
in touch with me and let me know what you
would like me to cover in future episodes. There are
a couple of ways of doing that. You can drop
a note on Twitter. Several of you have been doing
that recently, and I've got a list of topics.
So thank you so much. That's fantastic. I really appreciate it.

(49:39):
Keep them coming. The handle for the podcast Twitter feed is TechStuff HSW. If, however, you would like to leave me a voice message, you can go to the iHeartRadio app, go to the TechStuff page, and there's a little microphone icon. You click on that and you can leave a message of up to thirty seconds, and if you'd like me to include that message

(50:01):
in an upcoming episode, just let me know in the message.
Because I'm an opt-in kind of guy. That's it. Hope you all are doing well, and I'll talk to you again really soon. TechStuff is an iHeartRadio production. For more podcasts from iHeartRadio, visit the iHeartRadio app, Apple Podcasts, or wherever

(50:24):
you listen to your favorite shows.
