
December 4, 2024 73 mins
In this episode of Ruby Rogues, Chelsea Troy teaches us to hone our debugging skills to a razor-sharp edge. We learn how to actively improve debugging skills, train troubleshooting instincts, and pick up practical strategies for tackling brain-bending bugs.

Sponsors

Links

Picks


Become a supporter of this podcast: https://www.spreaker.com/podcast/ruby-rogues--6102073/support.
Transcript

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:05):
Hey, everybody, and welcome to another episode of the Ruby
Rogues podcast.

Speaker 2 (00:09):
This week, on our panel we have John Epperson.

Speaker 3 (00:12):
Hello, everybody. Luke Stutters. Hey.

Speaker 1 (00:16):
I'm Charles Max Wood from devchat dot TV working on stuff
over at Most Valuable dot dev.

Speaker 2 (00:21):
So go check that out. We have a special guest
this week, and that is Chelsea Troy. Chelsea, how's it going.

Speaker 4 (00:28):
Hey, it's going well. How about you?

Speaker 3 (00:30):
We're doing great. You want to introduce yourself real quick?

Speaker 4 (00:33):
Sure. Yeah, my name is Chelsea Troy. I am a
software engineer. By day, I work on a couple of
different projects that focus on saving the
planet, advancing basic scientific research, or providing resources to
underserved communities. And then in the evenings, I teach a

(00:53):
class called Mobile Software Development at the University of Chicago
in their master's program in Computer Science, and, I don't know,
organize a couple of conferences here in Chicago when we're
allowed to have conferences, which is, you know, not
right now. So I spend a lot of time lately
taking walks and going on bike rides because we only

(01:13):
get one hundred nice days in the city of Chicago.

Speaker 1 (01:17):
So nice. Have you had a heat wave out there
like we've had out here?

Speaker 4 (01:21):
So it's funny you ask. Last week was relatively warm,
but this week it hasn't been; it's not going to
get back to seventy degrees until the weekend. It's been
in the sixties, which is really nice for being outside,
with the exception of the fact that it's been raining
cats and dogs for a lot of that. And so,
I mean, people enjoy the temperate weather so

(01:43):
much that they just go out in it anyway
with their umbrellas and raincoats. But I like a cloudy
day without the rain when I can get it. Nice. Yeah.

Speaker 1 (01:53):
Well, the topic that I have on our
calendar is practical debugging, and it looks like you've given
some talks on this. I found a blog post about
it on chelseatroy dot com. I'm a little curious, as
we get going, what is your approach to debugging, because

(02:14):
it seems like a lot of people just kind of
go and tweak the code until it works.

Speaker 5 (02:19):
Well.

Speaker 4 (02:20):
Yeah, so I guess the way that I debug really does
depend on the situation. But one of the things that
I have noticed in the way that we teach people
to program is that we leave out certain skills when
we are educating folks, and that is in part by
dint of the resources that we make available and in

(02:40):
part by dint of what we think of as programming
as opposed to what I find that it really is,
at least in my case. So the vast majority of
the tutorials online and even the majority of the courses
that you'll see at undergraduate and graduate institutions, Udacity, you name it,
they're focused on teaching people how to do things. And
the vast majority of the tutorials that are available, the

(03:01):
developer has already practiced how to do this several times,
so they get it right on the tutorial, and then
they go through the tutorial getting it right. They don't
run into issues, or they don't run into a lot
of issues. They certainly don't run into the number of
issues that you would actually run into implementing something for
the first time. And that's for a number of reasons.
They want the tutorial to focus on how to do something.

(03:22):
They want the tutorial to be a relatively smooth experience
for people, which is good and it absolutely has its place.
One of the things that I think that we undervalue
in our programming education is all of the skills around
that that we end up picking up over time. So
relatively experienced developer might have an easier time debugging than

(03:45):
a relatively new one. But right now, the way that
we have it set up, the vast majority of people's
intuition and skill around debugging has to come from essentially
inductive reasoning from all of the situations that they personally
have encountered in their career. Because we don't have a
pedagogy around debugging, We don't have a praxis around debugging.

(04:06):
We just sort of like figure it out from all
of our individual experiences. Now, for that same reason, we
don't see particularly good translation between debugging on one stack
and debugging on another stack. Take an extremely experienced Rubyist
or extremely experienced Rails developer and they'll be absolutely excellent at
debugging in Rails. Put them on a mobile stack and
they're going to have a lot more trouble than you

(04:27):
would expect based on the amount of experience that they
have in Rails, because there isn't a generalized practice for debugging,
and so a lot of people's quote unquote debugging skill
comes from I have seen literally this exact bug before,
as opposed to I have seen bugs like this before,
and I know what might be causing it, or I

(04:48):
have some intuition about how to narrow down what might
be causing it, and so we end up with these
extremely specific, extremely personal, experience-based debugging strategies like
you're talking about, where people just kind of mess around
with things until it works, and the more experience they
have with a particular stack, the more they'll be able

(05:08):
to hone in on what to mess around with. But
then when they switch stacks, they lose a lot of that.
And I think that a lot of that can be
solved by taking a more deliberate approach to the way
that we debug and the way that we teach debugging,
because the vast majority of the time that we spend
as programmers isn't usually on cranking through solutions where we

(05:30):
already know what we're going to end up with at
the end, and in the situations where we are doing that,
where we're cranking a whole bunch of code, we know
exactly how it's going to work, a lot of times
developers think of that as rote they don't really like it.
What they want to do is they want to approach
new challenges and when we're approaching new challenges, we're spending
more than half of our development time looking at something

(05:51):
that's not working the way that we were expecting it
to work, and needing to figure out why it's not
working the way we expect it to work and how
to get it to work. And because we don't model that,
because we don't teach that, folks assume that a quote
unquote good developer spends less time doing that than they
actually do because we don't see it modeled anywhere, and they,

(06:14):
in addition, spend more time doing that than they might
need to if they had a strategy for going in
and debugging something in a general sense, Does that make sense?

Speaker 2 (06:24):
Yeah, that makes sense to me.

Speaker 1 (06:25):
I'm one of the people that, Yeah, I just want
to make it work right, and so if I can
shortcut even a well reasoned approach, I'm gonna try it.

Speaker 4 (06:37):
Yeah. It's a tough intuition
to fight, and I think that there's a reason for that.
I think that we think about programming as being single
modal when there's actually two modes to it. So the
first one we might think of as like building mode
or productivity mode. This is mostly what we see modeled,

(06:57):
This is mostly what we see in tutorials, and this
is mostly what we're striving for. This is also what
we reward software engineers for. If we see them spending
a lot of time in this like cranking building mode,
we think of them as a good software engineer. And
in that mode, we're focused on getting something working fast.
Our focus is on creating something that wasn't there before,

(07:19):
and speed is of the essence in this mode. So
the problem comes in that that mode is most effective
when we understand what our code is doing. In a case where
we're facing a bug, or in a case where
we're facing an issue in our code, by definition we
don't quite understand what our code is doing, because if
we did understand what our code was doing, it wouldn't

(07:41):
be doing the thing that we don't want it to do.
So the base assumption that makes the building mode work
is no longer true. But we continue to try this
building mode. We continue to move as if speed is
of the essence, as if the focus is on getting
something to happen, and it doesn't end up working particularly well,
and it works worse and worse the less we

(08:01):
understand exactly what our code is doing. Because speed relies
on making quick judgments. Speed relies on heuristics; heuristics rely
on assumptions, and when our assumptions are correct, then we
can move faster. But if we don't understand what our
code is doing, there's some assumption that we're making that's incorrect,
and the faster we try to move, the more likely

(08:22):
it is that we gloss over that assumption that we
don't take the time to re examine that assumption. So
the faster we try to move, the lower the likelihood
that we're going to be able to end up catching
an insidious bug, because we're not re-examining those assumptions
that we're making. So there's really a second mode that
we need to be aware of in programming, and it's
something that we need to switch into when we don't
understand what our code is doing. We need to be

(08:44):
able to switch from that building mode into an investigative
mode where we're no longer focused on creating something that
once upon a time wasn't there, and now instead we're focused
on understanding exactly what our code is doing, but maybe
more importantly, understanding our assumptions about what our code is
doing because somewhere in those assumptions something isn't matching up.

(09:07):
That's our opportunity to slow down, to compare what our
code is doing to what it is that we think
our code should be doing, and narrow down the location
in the code where the difference is happening so that
we can then fix it and bring our assumptions and
the codes function back into line to resolve the issue.

Speaker 3 (09:27):
So slow down, huh? Go ahead. Yeah. So, you...
all right?

Speaker 6 (09:32):
So, talking about these modes since we got here,
because I think this is super interesting, and it
appears that you have thought quite a bit about this,
because you're super eloquent at talking about it.

Speaker 4 (09:44):
Huh, thank you.

Speaker 3 (09:44):
I appreciate that. There's a lot.

Speaker 6 (09:46):
So there's a lot of language here, and I think
I think I vaguely understand what's going on. Hopefully I
understand it slightly better than that. How do you teach
this to somebody? Because as I was sitting here listening
to you talking, I'm like, okay, yes, I recognize myself all over this.
I mean, I've been doing this for almost fifteen years,
so at some point, hopefully I've gotten sort of good

(10:07):
at this, right, and I feel like I have stumbled
my way through sort of having these two modes, right,
or at least I can look back on my past
and like kind of feel like I recognize that.

Speaker 3 (10:20):
And as I.

Speaker 6 (10:21):
Think about like people that I'm mentoring, this
is kind of the frame that I'm thinking about right now.
This is definitely a stumbling block, right. So, for example,
I have one person that I'm mentoring right now, and
they're always in build mode every week, and I'm spending
all this time and energy trying to get this

(10:43):
person to slow down, right, So, like, are there ways
do you have tips?

Speaker 3 (10:48):
Maybe you don't. Maybe you're just like, look.

Speaker 6 (10:50):
I'm just talking about what we should do, not necessarily
how we get there, right, And that's fine, But I'm
curious if you have tips, how do you convey this
to somebody? Hey, look like here's the value of slowing down.
Maybe here's a framework for when you choose to slow
down and when you know, flip back to build mode.

Speaker 3 (11:08):
I know there's a lot of weird questions in there.
Take up whichever ones you want.

Speaker 4 (11:13):
Yeah, no, it makes sense to me. I think that
you're right. So one of the things that we need
to be able to identify in practice is when do
we need to shift between these two modes? What signals
can we use to determine that we should be in
a building mode versus an investigative mode. And so one

(11:37):
of the things that I use for this is that
there are a couple of different strategies that we can
use for debugging, right, and there's one in particular that
tends to show up really really commonly when we are
in build mode, because it is
effective for build mode. So I call it the standard strategy.

(12:00):
And what this is is that when you run into
a bug, you try changing the thing that you think
is most likely to be causing the bug, and you
see if it works or not. If it doesn't work,
you try changing the thing you think is second most
likely to be causing the bug, and hopefully that works.
And if it doesn't work, then you try changing the

(12:20):
thing that you think is the third most likely to
be causing the bug. This is perhaps the fastest kind
of shortcut strategy to get something working when you understand
what's happening in your code, Because when you do understand
what's happening in your code, the thing that you try
first is in fact the most likely thing to be
causing the bug, and you get through it relatively quickly.

(12:42):
That's not going to be the case with insidious bugs.
So that's not going to be the case with a
bug where you're making an assumption that's fundamentally inaccurate to
what it is that the code is doing, because the
things that you think are the most likely to be
causing the bug are like in line with your assumptions
about what the code is doing. So standard strategy great
for build mode. Here's where it falls down is that

(13:05):
if you don't understand what's happening in your code, you
can end up in this kind of vicious cycle with
the standard strategy where, for example, you try changing the
thing you think is most likely to be causing the bug,
and it doesn't work. So you try changing the
thing you think is second most likely to be causing
the bug, and it still doesn't work. So you run
out of ideas, and then you get pissed and you

(13:26):
go back and you try changing the thing that you
tried the first time to see if it works this
time around, and you get in this kind of like
vicious cycle with it. Right, that's the failure mode for
the standard strategy. That's where build mode fails us when
we don't understand what's going on with our code, And
that I think is a really strong signal when you're

(13:47):
going back and trying the thing again that you already
tried and saw it didn't work. Yeah, sometimes that works.
Maybe that's what it was. There was some config issue
or something like that. If it doesn't work the second
time, I think that's your like really strong signal
that it's time to switch into investigation mode. Because at
that point, clearly you've made some assumptions. Something somewhere is

(14:09):
not matching up. So then if the standard strategy is
not going to work for us, we have to have
strategies that we could use instead. Right, there's got to
be something other than the standard strategy. Figure out, you know,
instead of focusing on getting the thing working as quickly
as possible, we can instead pick a strategy that focuses

(14:30):
on something else. So the strategies that I use
at that point both focus on... there's two, I'll go
over them, but they both shift the focus from
how do I get this thing working to how do
I identify the place where my assumptions about what's happening do
not match up with the code and what it's doing right.

(14:53):
So there are a couple. One of them I try
to use. One of them is helpful for speed in
some cases; when it's not possible, I use the other one.
So the first one which I can use for speed
is I call it the binary search strategy. And basically
the way that this works is that I identify some
point where the code begins executing right and identify some

(15:15):
point either where the code finishes or at which point
I'm positive the bug has happened. So I have these
beginning and end points, right. Now I try to pick
some spot in between those two, kind of in the middle,
and at that point I use a breakpoint, or I
use print statements or any of a number of tactics
to get into the code and test that all of

(15:36):
the assumptions that I am making at that point are correct.
So I'm printing out variables determining what the flow of
the execution is. And suppose I test all of my
assumptions at that point in the code, whatever I'm
printing, logging, breakpointing, and everything's working as expected. If everything's
working as expected at that point, then I pick a

(15:57):
point halfway between there and wherever my code finished executing
or the bug happens. And so one of the things
that we know from executing binary search in
computer programs is that it's pretty fast. You can narrow
down in relatively few steps from a really wide range

(16:19):
of options what could be going on, even if you've
got tons and tons of code in the middle. We're
talking about, in a really complicated system using the binary search
strategy, five or six steps maybe, which sounds like a
lot until you consider that you might try like fourteen, fifteen, sixteen,
twenty, twenty-five different things in a really complex system

(16:41):
if you're just trying to move straight to getting the
thing working.
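
To make that binary search strategy concrete, here is a minimal Ruby sketch. The pipeline, step names, and expected values are invented for illustration, not from the episode; the idea is to check an assumption roughly halfway between where execution starts and where the bug shows up, then halve again on whichever side still disagrees with expectations.

```ruby
# Hypothetical three-step pipeline; only the shape matters here.
raw_orders = ["12.50", "7.25", "3.00"]

parsed     = raw_orders.map(&:to_f)          # step 1: parse strings
normalized = parsed.map { |n| n.round(2) }   # step 2: normalize
total      = normalized.sum                  # step 3: total is observed to be wrong

# Binary search strategy: check an assumption roughly halfway between
# where execution starts and where the bug is known to have happened.
# If the midpoint matches expectations, the mismatch lives in the later
# half; if not, it lives in the earlier half. Halve again either way.
warn "after step 2: #{normalized.inspect}"   # expected: [12.5, 7.25, 3.0]

# Suppose step 2 looked fine; the next checkpoint goes between step 2
# and the observed failure, and so on until the broken step is isolated.
warn "after step 3: #{total.inspect}"        # expected: 22.75
```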

Speaker 1 (16:45):
So I just want to point out, with your standard strategy,
I'm a little bit weirded out.

Speaker 2 (16:50):
That you've been watching me code all these years.

Speaker 4 (16:54):
You know what the problem is. I've been watching me
code all these years, and I do that constantly, you know,
the like maybe there was a ghost in the system
the first time. Let me try it a second time.
See if the ghost is gone. I've definitely done it.
I think everybody's done it, you know.

Speaker 1 (17:10):
Yeah, but what you're saying makes sense as far as
the binary search strategy, it's the same idea as a
binary search tree. And I'm assuming that you got the
idea from some structure like that. But yeah, you know,
I mean we talk about big O and everybody makes
a big deal out of it as an interview question.
But I mean, yeah, here it actually is a

(17:33):
measure of speed, right, because it's log of, you know, the
number of lines in the area you've identified as where
the bug may be occurring, so you know, And and
the fact that you can pick chunks of the code
instead of you know, specific locations within the code allows
you to narrow down well, it's either this chunk or
that chunk, and so you know, you might have to

(17:57):
log in, yeah, you know, to narrow it down. Yeah,
instead of you know, essentially you're just doing a random selection. Yeah,
you know, it's an informed random selection, but it's still
random ish and hoping that you hit the right thing,
because it's not just that you change it in the
right place.

Speaker 2 (18:16):
But you have to change it the right way too.

Speaker 4 (18:18):
Right, Yeah, it's absolutely the case. So yeah, I love
using the binary search strategy for that. I think I
started using it for that after I was using git bisect,
which essentially allows you to similarly use a binary search
strategy to narrow down what commit something went wrong in.
And what I, you know, I sort of like knew

(18:40):
this cerebrally from having studied computer science, but using git bisect,
I realized, like, it takes me, you know, a lot
fewer steps, let's say six-ish steps, to figure out
what's going wrong, like which commit out of a hundred
commits something went wrong in. One hundred commits sounds daunting at first,

(19:03):
but seven steps doesn't sound nearly as daunting, and so
this strategy could also be applied there. Now, the downfall
of this strategy: it's great for speed, but it sort of assumes...
it works best when you're talking about like a single
threaded application or a situation where you don't have multiple
processes where things can break off, where the order is deterministic,

(19:24):
and so when you're dealing with multiple threads or some
other kind of nondeterministic order type of situation, the binary
search strategy can be tough to use because you don't
necessarily know from beginning to end of execution, like what
the order is to do that. What you do know
is that you start somewhere, and then at some point
at the end something went wrong, and all this stuff

(19:44):
that happened in here, whatever order it happened in, something
went wrong. So in those cases I have to back
up and use potentially a slower, less flashy, less cool strategy,
which is essentially to start at the beginning of execution
and and like at the very beginning of execution, do
the same thing, print, log, breakpoint, test my assumptions and

(20:07):
if they're wrong, then maybe that's where my issue is
getting caused, and if not, then move a little later
in the execution, basically follow the code path through, which
feels really slow, especially if we do it like
we start at the very beginning, all our assumptions are right.
We start one step down, all of our assumptions are right.

(20:28):
We go to two steps down, all of our assumptions
are right. Okay, this is getting boring. When do I
get to skip ahead and then we get to step three?
Oh wait a minute, I was assuming I was just
skipping to step eight and all of this time, and
the problem was in step three. So suppose that there
are like fifteen steps there. That sounds like a lot.

(20:50):
But if your problem is in step three or step
four or step five, you only have to try as
many until you figure out what the actual problem is,
which at most is going to be fifteen, Whereas if
you're trying the same thing a million times at step eight,
that ends up still being more steps. Even though this
process feels really slow because kind of again by definition,
like we don't know what's wrong, so it makes sense

(21:13):
to figure out. It makes sense to systematically figure out
what's wrong. The other really nice thing about that strategy,
or both strategies, really is that we can feel like
we're making progress on a bug even if we haven't
fixed it yet, because psychologically the way it feels, well,
I'm interested in y'all's thoughts on this, actually, but my

(21:35):
feeling when I'm fixing a bug using something like the
standard strategy is my first assumption didn't work. I've gotten nowhere.
My second assumption didn't work. I've gotten nowhere. I feel
like I've made no progress until I have finished fixing
the bug, which really sucks right up until the moment
where I've finished fixing the bug. Whereas if I have
a systematic strategy, I could say, all right, well, I

(21:58):
haven't fixed the bug yet, but I have narrowed down
like half of the possibilities for where it could be.
And I know that I've narrowed down half of the
possibilities because I've quantified the possibilities and I'm moving through
them with some kind of like plan as opposed to guessing.
And if my guess is wrong, I don't really know
how much longer I'm going to be doing this.
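Since git bisect came up, here is roughly what that workflow can look like in a Ruby project; the script name, the Invoice class, and the expected total are all hypothetical. `git bisect run` drives the same halving process across commits automatically, using the probe script's exit status to mark each commit good or bad.

```ruby
# check_bug.rb -- a tiny, hypothetical probe script for `git bisect run`.
#
# Typical shell session (commit and tag names are placeholders):
#   git bisect start
#   git bisect bad HEAD        # current commit shows the bug
#   git bisect good v1.4.0     # a commit known to be fine
#   git bisect run ruby check_bug.rb
#   git bisect reset           # when finished
#
# git checks out a commit, runs this script, and interprets the exit
# status: 0 means good, 1-127 (except 125) means bad, 125 means skip.

require_relative "lib/invoice"   # hypothetical code under test

total = Invoice.new(items: [12.50, 7.25, 3.00]).total
exit(total == 22.75 ? 0 : 1)
```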

Speaker 6 (22:20):
You know, there's another downside to that old standard strategy too,
which is if you continue the standard strategy right through
the time, you're at the point where you don't know
what's wrong now, right, but you continue just playing guess
and check. Now, all of a sudden, you probably didn't
take good notes as you're doing this, and when you

(22:42):
do finally fix it, you don't know why it now works,
and so this just becomes a boogeyman that you're just
like scared to go back and touch again because you're
just like, well, I can't touch any of that stuff.

Speaker 3 (22:53):
I literally have no idea why it's working now.

Speaker 6 (22:56):
I mean, I have done this, and I can't
even like tell how many times I realized
somebody else was doing this because they're like, shoot, I
don't know why it's working, you know.

Speaker 4 (23:06):
Know, yeah, we uh. I think it's really funny when
people talk about how programmers, you know, oh, you know,
software engineers or engineers, they're like, they're not they're not
superstitious or any of that stuff. Everything's evidence based to them.
And I'm like, no, definitely, no, that's not how this is. Like,

(23:29):
the longer you work with software, the less you trust software.
There's a reason for that, and it's not. Yeah, I
think you know it's coming out.

Speaker 5 (23:39):
The other side. Personally, I feel like
my code is my friend now. Now, other people's code
is the enemy. My code, code I've written and no
one else has touched, is like a lovely, warm blanket,
like something to come home to. Uh huh. So yeah,
I do feel like there's a big difference between debugging

(24:02):
my code and debugging other people's code.

Speaker 1 (24:07):
Oh absolutely, it's like other people's kids. So obviously...
I'll tell you what I feel like.

Speaker 5 (24:15):
The difference is that when I'm writing something, I
tend to find the bugs at the time. Yeah, so
this probably isn't true, but it feels like it. I
feel like I'm finding the bugs when I'm writing it.
But when you've meen it happened. The other day, I
was asked to come and look at the roilse five
side and work out why it was being a naughty website.

(24:40):
And it's someone else's code, you know, and you've just
got no idea what's going on at all, and
it just feels like a much more difficult process. And
I also get a lot angrier when I'm debugging other
people's code. I'm a lot more forgiving of my own code.
Is there like an anger management approach to debugging?

Speaker 4 (25:02):
Oh man? So it's funny that you mentioned this because
so I'm working on an implementation right now of the
Raft distributed consensus algorithm, and I'm the only person who
has worked on this code, which means I got to
make all my own stylistic choices about it and everything,
you know, all the messages are like cutesy, definitely not

(25:23):
what the original Raft team would have had their servers do.
And they're they're very nice to each other. They say
please and thank you and all that stuff. It's very
Canadian of them. But I find that I like
the problems in that code. The idiosyncrasies in that code

(25:43):
are endearing to me because it's mine, you know, the
same way that my friends' idiosyncrasies are like endearing. Whereas like
if somebody else were to try to run this code,
they would probably be like, this is ridiculous. Like, why
does it... no, I don't like this at all,
but it's fun to me. And they did this. I'm

(26:04):
not gonna say this correctly. I'll go back and find
it and we can put it in the show notes.
But they did this study about people's opinions of their
own origami. Are you all familiar with this study? So
essentially they taught some people how to do origami, people
who were not particularly skilled at origami to begin with.

(26:25):
And what they found was this really interesting effect where
the objectively worse the finished product was, the more the
creator loved it because it was an indication of how
much they had struggled with the origami and then they
finally got something sort of working, and that struggle was

(26:47):
like that struggle played a role in their fondness for
the finished product more than the objective judgment of the
finished product. And I think that probably happens with code, too.

Speaker 5 (26:59):
I see, I see the way this is going.

Speaker 4 (27:06):
Well, it's that you have context on like all of
the problems that you've already solved in that code, right,
And other people they don't have that. They just come
in and they just assume everything that's working was always working,
and they focus on the things that that don't And
I think, you know, we struggle with that, and it's

(27:28):
of course, when you're debugging your own
code, too, you have a little bit more intuition for
what was happening, and you don't have to back translate
from the code to the original intuition the way that
you would with somebody else's code. And when we go
in and we maintain somebody else's code. I do this
a lot in my job. I find that many of
the projects that I take on are relatively complex, relatively undocumented,

(27:51):
relatively untested code bases where I'm going in and I'm
adding documentation, I'm adding tests. In some cases, I'm going
in and the original developer is gone and nobody knows
how it works, and I have to figure it out.
And in that case, we're not just talking about debugging.
We're not even at debugging yet. We're starting at like
forensic software analysis. Effectively, we have to be able to

(28:13):
go in and like CSI this code and figure out
from clues like what's what's going on? What's going on
in there, and figure out you know, based on like
these indicators in the code. I think originally this piece
of it was supposed to do this, And that's a
whole separate skill set from debugging, which is also you know,

(28:35):
forensic software analysis is a fun thing to get good
at too. But once again, like debugging, if we're not
if we're not promoting that to a first class skill set,
if we're not recognizing it and teaching it as a
first class skill set, it just feels to people like
either wasted time or like an indication that they're not

(28:56):
good at their jobs. And it's not that they're not
good at their jobs. It's that this is a skill
that takes time, and the fact that you need to
take time to do this thing doesn't mean you're bad.
It means that this is one of the things we
have to take time to do when we're doing this job.
The same way that a cowboy, you know, it's not
just about riding. You also have to be able to

(29:17):
saddle up your horse. You have to be able to
feed your horse. You have to have cattle sense or
whatever it is they have. You have to be able
to cook your beans in a can on the fire.
You know, you've got to be able to do all
this stuff that's not just riding your horse, right, and
it's not wasted time to do all those things. It's
stuff you gotta do to be, you know, a cattle person.

(29:39):
And I think it's the same with debugging, forensic software analysis.
We've got these skills that we need that we don't
factor into our mental model of what it means to
be a good programmer, and we really should.

Speaker 5 (29:52):
Can I ask you about reproduction?

Speaker 4 (29:55):
Sure?

Speaker 3 (29:56):
Because what's your question?

Speaker 4 (29:58):
So when a man and a woman love each
other very much...

Speaker 5 (30:02):
Glad you picked that up. The hardest bit I have is,
especially in Rudy because there's so much rails floating around,
is reproducing the expletive expletive bug. Because a lot of
the stuff I do is kind of an e commerce
site and you get the people with phoning up and

(30:23):
they said, your website had just done that, and then
the client phones me up and I said, like they say,
once or twice a day, the website does this right,
And for most people it's fine, but it's those edge
cases where you're hitting a bug which can't be easily reproduced,

(30:44):
and often once you have reproduction on that bug, solving
it's just like the easiest thing ever, it's just getting
it to do it again. Is there any tips for
getting it to do it again?

Speaker 4 (30:56):
Ah, that's a good question, because that can be a
really tough one. You know, it can be hard to replicate
the exact environment that it happened in, the exact state
of the database that it happened in, the exact, like,
whatever the configuration is that's going on. We know with

(31:17):
bugs that are tough to reproduce that if we're not
able to get it to reproduce, there's some aspect of
the configuration that is outside of the environment that we're
currently modeling that is different from what the person who
saw the bug originally provided. And I think there's kind

(31:39):
of two parts to this. The first one is recognizing
which aspects of the environment are missing from our understanding,
And this one is tough, but it's the one between
the two that I think software engineers have a little
more success with than the other one because the other one,
and this is an ongoing struggle is figuring out how

(32:02):
to get the details of the entire environment to be
able to replicate it one hundred percent, Because when somebody
phones in and they say, like, my app just did this,
it's not necessarily it's not necessarily true that that person
has access to the details of the entire environment where
the problem is cropping up, or that they're even like

(32:25):
aware of all of the environmental variables that might be
causing the issue. And so in those cases it can
be really really tough, and sometimes the solution ends up
being: all right, so it happens once or twice a day.
We take all of the issues of this type and
we put them in a bucket and suppose that it happens.
So we get two of them one day, two of

(32:47):
them the next day, and two of them the next day.
So at what point does this bucket have enough data
points in it that we can go through and systematically
compare it to all of the times when this works
correctly that aren't in this bucket? What environment variables do
we think could be different between those two situations? And so, unfortunately,

(33:08):
I think sometimes we got to like wait and figure
out from this collection of individual instances what might be
different about the individual instances from the way that like
normal circumstances operate. So there
may be some valuable transfer here between techniques that medical

(33:30):
researchers use for diagnosing and understanding rare conditions, in that
you just don't have that many people who have it happen.
But in order to get solid research on a condition,
you have to sort of have a minimum sample size,
And so what they'll do is they'll try to create
a record of everybody who has this condition and hopefully

(33:53):
get the sample size up to, you know, twenty, fifty,
one hundred, something where they can start using some aggregate
analysis techniques to figure out what the difference might be.
But I think sometimes with rare bugs, we, like I said,
it's tough, and if it's not happening all the time,
it can be difficult. But sometimes it's not that it's

(34:15):
tough in the first place, and we're making it tougher
on ourselves by not taking that systematic step. Either we
don't have a record of all the times this rare
bug happens so that we can start to implement some
aggregate analysis when it's happened twenty times fifty times, one
hundred times, and we can tell customers in the meantime, like,
you know, we know this is a really thorny issue.
We're not really sure why it's happening. We're in the

(34:37):
process of data collection right now to see if we
can figure out why it's happening, and on the back
end be collecting that somewhere so that we know when
we get to some certain number that we think like
we can try again and try to do some aggregate
analysis on this. Then we'll set aside time, we'll put
a ticket in the system for when

(35:00):
we get twenty examples of this, we'll go back and
take a look at this in the aggregate as opposed
to, each individual time it's happening, attempting to like debug it
based on that single instance without looking at any of
the instances in the past we have. This is a
little bit unrelated, so I promise we'll come back to
the actual subject of debugging. But we see a similar

(35:21):
tendency in organizational dynamics and teams on software teams, where
like if somebody's demonstrating a pattern of behavior of like
mild microaggressions or like one little bitty thing that's like
not really worth addressing on its own. Sometimes those situations
can be insidious because in any given situation, it's like

(35:41):
not worth bringing up. But over time, this person demonstrates
a pattern of doing this, and we
don't have any like aggregate strategies for saying, you know,
there's this teeny little thing. There are these teeny microaggressions
and any given one of them might be a mistake,
but it's happening like regularly over the course of time,
and so it's a pattern that we need to address.

(36:02):
I found that it can be really helpful in organizations
as well to somehow, even if we're not addressing those
things when they happen, keep some kind of record of them,
and then address the pattern when it's clear that it's
a pattern, because at that point it's no longer about
like whatever the latest incident is, which wasn't that big
a deal. It's about the fact that the individual incidents,

(36:24):
though not a big deal, individually aggregate to form a
pattern that is, you know, ever so slightly dragging on
our team, ever so slightly increasing our turnover, our churn
on who leaves the company, and it's it's a cost
that over time we can't afford, but it's tough to

(36:45):
address at individual instances. So I think there
is a parallel there. But I know we're not talking
about organizational dynamics. We're talking about bugs. So I'm happy
to come back to the code side of things.
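
As a rough Ruby sketch of that bucketing idea (the class, field names, and threshold are all made up, and a real version would persist occurrences in a database or error tracker rather than in memory): record each sighting of the rare bug with whatever environment details you can capture, and only start the aggregate comparison once the bucket is big enough.

```ruby
# Hypothetical in-memory bucket for occurrences of one rare bug.
class RareBugBucket
  THRESHOLD = 20   # arbitrary sample size before aggregate analysis

  def initialize
    @occurrences = []
  end

  # context: whatever environment details were captured at the time,
  # e.g. browser, locale, feature flags, record ids, request params.
  def record(context)
    @occurrences << context
    warn "bucket full: time for aggregate analysis" if @occurrences.size >= THRESHOLD
  end

  # Crude aggregate view: for each captured key, the distinct values seen
  # across failures, to compare against what healthy requests look like.
  def suspicious_dimensions
    @occurrences.flat_map(&:keys).uniq.to_h do |key|
      [key, @occurrences.map { |ctx| ctx[key] }.uniq]
    end
  end
end

bucket = RareBugBucket.new
bucket.record(browser: "Safari", locale: "en-GB", payment: "paypal")
bucket.record(browser: "Safari", locale: "fr-FR", payment: "paypal")
p bucket.suspicious_dimensions
# => {:browser=>["Safari"], :locale=>["en-GB", "fr-FR"], :payment=>["paypal"]}
```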

Speaker 5 (36:55):
People have bugs in their personality and they need to be fixed.
Oh yeah, absolutely. One place where we have a solution
for this...

Speaker 6 (37:03):
Right, So a clock, Your clock gets off by a
little bit like every moment, right, But you don't go
to your clock every two minutes to go fix it
by like the tiny milliseconds that it's off. You do
it like once or twice a year maybe or something
like that. Right, So we sort of this is the
same thing.

Speaker 3 (37:23):
Right. You set up a system that.

Speaker 6 (37:27):
You can check in or whatever it is, right like,
but you have to design a system around it. Like
it's not something that you can... you can't let it
go because otherwise your clock doesn't work anymore.

Speaker 3 (37:39):
But you also can't check it every moment. You have
to come up with some system that works for you.
I feel like it's kind of like the time boxing thing.

Speaker 6 (37:46):
Like everybody that I mentor I like teach to timebox
and they're like, well, how long do I timebox for?

Speaker 3 (37:51):
And I'm just like, that's actually a personal thing. You
just kind of say you just kind of figure it
out for you. Like I have my own time boxes
that work for me.

Speaker 6 (38:00):
I tend to break my days down into like half
days more or less, just because that works really well
for me. Like, you know, I eat lunch, there's like
a natural break, and because I'm like ADHD and I'll
like get super hyper focused on something like that, I
always come up for air at the half day, so I
can guarantee that I can at least time box for that.

Speaker 3 (38:20):
You know, it's all sort of the same thing.

Speaker 6 (38:24):
But my point is like you have to design the
system getting back to the subject or whatever, and I
don't feel like you're saying something different. I feel like
it's going to be a personal thing kind of thing here.

Speaker 4 (38:35):
Yeah, I would say so. I think. So this conversation
is making me wonder whether Sentry and similar error logging
platforms have a way for you to like automatically put
certain types of bugs in buckets and then like alert
you when the bucket has a certain number of issues
in it or something.

Speaker 3 (38:54):
I know that some of them let you bucket.

Speaker 6 (38:57):
Right. So if I have like the same kind
of bug happen again, it like buckets it together. It says, hey,
here's an instance, here's you know, here's a list of
ten instances that have happened.

Speaker 3 (39:08):
I don't know.

Speaker 6 (39:09):
I don't know if there's an alert. I mean I
can make it. I've used like three or four of them,
and I can. I mean, they usually give you an
option to like give you a daily briefing or something
like that.

Speaker 4 (39:20):
But right, like I know that Sentry has a button
that allows you to just mass ignore a certain type
of error. So I wonder if there's the opposite, you know, like,
if there are enough of these, tell me.

Speaker 6 (39:32):
Should we discuss mass ignoring errors? Because that's a... that's an ignore.
Like leaving the beeping signal on all
the time is also a problem.

Speaker 3 (39:42):
Right, so you just put.

Speaker 4 (39:45):
Electrical tape over the check engine light. I don't need
to know about this.

Speaker 3 (39:49):
That means when you really do have a problem, you
have no idea.

Speaker 4 (39:55):
That's absolutely true. So this is something that I've run
into a fair amount with end to end tests on
mobile applications, is like, and part of this is that
sometimes the end to end frameworks are like a little
flaky at their core. I get it, that's absolutely true,
but the signal to noise ratio is like low enough
that sometimes developers start completely ignoring their end to end

(40:15):
tests to the point that they don't even look at
why it's failing. And at some point the reason it's
failing like sneakily changes, but people don't notice because they
just see it fail and they're like, oh, that test
always fails. And then a few months later something was
wrong in the app for months. Oh, we didn't realize
this was wrong. Why didn't we realize it was wrong?

(40:36):
Or in the best case, somebody goes in and they
actually take a look at the end to end test
and they're like, wait a second, wait a second, it's
actually pointing out an issue. This isn't just flakiness, it's
like actually a problem.

Speaker 3 (40:48):
Oh my gosh.

Speaker 6 (40:49):
So Andrew Mason, who actually used to be on the show,
so I chat with him every week, and he was
just discussing this exact same problem last week,
right, like, and he was trying to deal
with a code base where, like, it was just failing. Sorry,
I'm totally about to say an opinion here, so it

(41:09):
could be controversial. But he was like, well, you know
which test do I delete? And I was like, dude,
I was like, if your test suite is failing, it's
providing zero value right now.

Speaker 3 (41:20):
I was like, start.

Speaker 6 (41:21):
Commenting stuff out until it's green, because a test suite
provides zero value until it's green. And you were talking
about like, I mean, I've been at places before where
they let these tests go like all the time,
where they just rerun their test suite like four times
till it passes, right. But that flaky test is telling
you something sometimes anyway.

Speaker 3 (41:41):
So my point is like I'm a.

Speaker 6 (41:43):
Big believer in deleting, in deleting like broken flashing lights.

Speaker 3 (41:48):
All right, I'm done.
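
For what it's worth, one middle ground between deleting a flaky spec and letting it stay red is to quarantine it explicitly. A minimal RSpec-flavored sketch, with an invented spec and ticket reference:

```ruby
# spec/checkout_spec.rb -- hypothetical flaky example, quarantined.
require "rspec"

RSpec.describe "checkout" do
  # `skip` metadata keeps the example visible as pending in every run
  # instead of failing the suite, and the reason string is the place to
  # leave a breadcrumb (ticket, owner) so someone actually returns to it
  # rather than rerunning the suite until it happens to pass.
  it "charges the card exactly once",
     skip: "flaky under parallel runs, see TICKET-123" do
    expect(charge_count).to eq(1)
  end
end
```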

Speaker 4 (41:49):
How do you feel about coming back to flaky tests
on some kind of regular interval, similar to like coming
back to rare bugs on a regular interval, and trying,
like, time boxing, attempting to fix it?

Speaker 3 (42:05):
I think that's the same thing.

Speaker 6 (42:07):
But I'm not really a big believer in... Okay. So
I've worked pretty hard on self-discipline for myself, right,
and you know I have friends and I trust their
self discipline to a point.

Speaker 3 (42:19):
But as far as like trusting like a.

Speaker 6 (42:21):
General random developer that I don't know to like come
back to a thing at a regular interval.

Speaker 3 (42:28):
I have almost zero trust for that.

Speaker 6 (42:30):
So my answer is I just delete it instead,
because I don't trust the other guy. I guess kind
of like how Luke hates the other guy's code.

Speaker 5 (42:40):
Oh man, it happens more as you get older.

Speaker 4 (42:44):
Oh really, the trust goes down.

Speaker 5 (42:46):
I used to believe in other people's code. Then I
started working on Rails.

Speaker 4 (42:52):
Oh man. So a friend of mine, Hillel Wayne.
He also keeps a software engineering blog, and in one
of his posts, he's talking about Uncle Bob and Uncle
Bob's approach to software resilience, and he in talking about

(43:16):
that approach, he points out that parts of the approach
rely almost entirely on telling developers to be more disciplined,
and that you can't... that doesn't... like, discipline as the
solution for making something not happen is never going to

(43:37):
make that thing not happen because you can't just get
an entire population to all exercise discipline to the threshold
that you would like them to. Catch your bugs, guys,
be better programmers, what is your problem?

Speaker 3 (43:50):
Then you don't have to debug.

Speaker 5 (43:53):
So Uncle Bob was saying we should be more disciplined,
and you're saying that's not possible.

Speaker 4 (43:58):
Well, what I'm saying is that you're uh, not that
it's not possible to be more disciplined, but rather that
it's not possible to command a bunch of people to
be disciplined and then guarantee that that's going to work.

Speaker 3 (44:17):
Right. Can't use it to validate your personal problems.

Speaker 5 (44:20):
Unfortunately. That sounds like... do you want to share?

Speaker 3 (44:24):
John?

Speaker 4 (44:25):
No.

Speaker 3 (44:25):
I actually was just saying that, like, you can't.

Speaker 6 (44:28):
You can't use the excuse of like, oh I can't
be more disciplined because they said it on Ruby Rogues.

Speaker 4 (44:35):
Oh man. Oh, if people always take everything that we
say on here as ironclad advice... And there are a
few statements from earlier that I need to retract.

Speaker 5 (44:44):
But that's a sign of a good podcast that you
wish you'd never done it?

Speaker 4 (44:49):
Oh man?

Speaker 3 (44:50):
Is it?

Speaker 4 (44:52):
I will say that for myself. Part of the reason
I use automated testing is precisely because I do not
trust my own discipline, because you know, there are certain
circumstances where now, let me go ahead and say that
I think unit TDD is very valuable in certain circumstances.
I also don't see unit TDD as a panacea for
software verification. I think there are other methods that we

(45:14):
can use in addition, and I happen to also think
some of those other methods are better suited to certain
problems that unit TDD doesn't address. However, one nice
thing about unit test driven development is that if I
write the test first, then I code for the API
that I want as a reader of the code, which
is good because code gets read many multiples of the

(45:37):
number of times it gets written, and so I can
be lazy at the point where I'm writing the test
and write for the API that I want. Then when
I'm writing the actual code, I'm held accountable for an
API that's relatively easy to read, as opposed to starting
with an API that's relatively easy to write, which ends
up being harder to read, which causes more strife over

(45:59):
the course of the life of the code base. And
that works precisely because I do not trust myself to
go the extra mile and make the easy to read
API without the accountability step of the test in the front,
and I imagine that my code would be harder to
read if I weren't using that, where I can, to
hold myself accountable to that API. So

(46:21):
it's a perfect example of using a system precisely because
I don't trust my own discipline. I can't. I cannot
rely on discipline, even in the one person system of myself,
to make things work the way that I ultimately, like
theoretically want them to work, because in the short term
it's harder for me.
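
A tiny illustration of that point, with an invented Invoice class rather than anything from the episode: the spec below is written first, against the interface the reader would want, and the implementation is then obliged to live up to it.

```ruby
require "rspec"

# Written first: the API I want to *read* later.
RSpec.describe "Invoice" do
  it "totals its line items" do
    invoice = Invoice.new
    invoice.add_line_item("coffee", cents: 450)
    invoice.add_line_item("bagel",  cents: 325)

    expect(invoice.total_cents).to eq(775)
  end
end

# Written second: the implementation is now held accountable to that
# readable API, rather than to whatever shape was easiest to type.
class Invoice
  def initialize
    @line_items = []
  end

  def add_line_item(name, cents:)
    @line_items << { name: name, cents: cents }
  end

  def total_cents
    @line_items.sum { |item| item[:cents] }
  end
end
```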

Speaker 3 (46:42):
Yeah.

Speaker 1 (46:42):
Well, the other reason that I do a lot of
that is just that, I mean, going back to the
assumptions on how things work, I'm not going to remember
that next week. I mean honestly, people are like, well
if I come back to it in six months, I'm like,
if I come back to it in six days, right,
And so, you know, by having some of these disciplines
it you know, just speaking to the larger idea here, right, Yeah,

(47:08):
if I can encode my assumptions, if I can make
sure that the things that I care about are things
that I am checking on, that's where it makes sense.

Speaker 2 (47:18):
Mmm.

Speaker 4 (47:19):
What's the adage? The more you cuss about a line
of code and how inscrutable it is, the more likely
it becomes that you wrote it.

Speaker 2 (47:25):
That is so true. git blame. Ah crap, yep, yep.

Speaker 1 (47:34):
And then the other half of the time it's well,
look John did it. And then you go and you
look at that commit and he ran the Linter and
it changed the indentation. I actually put the code on
that line.

Speaker 5 (47:46):
Anyway, I've got a theory that linters are making bugs
harder to find. Oh, because back in the day you
used to be able to do a side
channel attack on nice code bases where if you found
like different indentation and non standard formatting and stuff, I'd
just kind of zero in on that. You go, ah, this

(48:07):
person can't be bothered to put a space before the
curly braces, so the chances are they're a total loser.
But now, and this is where the bug is, I'm like,
you know, if it looks all over
the place, then I'm like, ah, there'll be bugs here.
And I just found this was really helpful for assessing
code quality: can they be bothered to indent properly?

(48:28):
But now with linters and RuboCop and everything,
then everyone's code looks the same, so you don't have
that kind of meta attack, that side channel attack
to spot the dodgy bits of a new code base.
Am I off on
one there?

Speaker 3 (48:45):
Or is this a real thing? Now you have to
read the code during code reviews is what you're saying.

Speaker 4 (48:51):
So your linter is eliminating some of your priors for
where your issues in the code might be.

Speaker 5 (48:57):
I think the linter is solving the easy bugs,
like you've misspelled a variable name and, yeah, you know, you've
got kind of RuboCop picking stuff up, like this
variable is never ever used, and that's it. But
by removing those low hanging bugs, what you have instead
is a kind of faceless wall of perfectly statically analyzed code,

(49:20):
and the kind of it just seems like bugs are
getting harder.

Speaker 2 (49:23):
That's it.

Speaker 5 (49:23):
Bugs are getting harder, and linters are to blame.

Speaker 3 (49:26):
That's exactly what we wanted it to do.

Speaker 6 (49:28):
We wanted it to take away all the easy
bugs so that we can work on the hard ones. Now
you're complaining that they're all hard. There's no easy ones
for you to point out. Some people just can't be satisfied. Okay,
So getting back on topic. I actually wanted to revisit
a thing, and I'm not I swear I'm not trying
to set you up for failure.

Speaker 3 (49:48):
I just felt like we should probably address this. But
you talked. You talked about your three strategies earlier.

Speaker 6 (49:54):
And how the sort of default one for us is
this standard strategy: try out the place where I think
the bug is, go to the next most likely place,
and so on, right, until I eventually find it.

Speaker 3 (50:04):
At some point you sort of have.

Speaker 6 (50:06):
To like bail out of that and be like, I'm
not getting anywhere, I should try a different strategy. Do
you have a sort of rule of thumb, even if
it's not perfect for like when you start to bail out,
I don't think that we actually called that out.

Speaker 4 (50:20):
Yeah, totally. When I am trying the same thing multiple
times even though I already saw it didn't work, is
usually when I like, if I haven't done it by then,
that's the time.

Speaker 3 (50:32):
So you wait until that point of anger, for...

Speaker 4 (50:34):
Me? Exactly. When you've got like the orange face emoji
as your actual face.

Speaker 3 (50:43):
That's fair?

Speaker 4 (50:46):
When do you usually bail out? How do you know
when to bail out?

Speaker 3 (50:49):
I don't, but I'm I'm big into timeboxing.

Speaker 6 (50:54):
So I'm usually like, you know, all right, well I'll
let myself If it's something that I start out in
the first place saying I literally have no sense of
where this thing is, right, then I'm just like, well,
I'll give myself like thirty minutes maybe to like Sometimes
I'll be like, I'll give myself like five or ten
minutes, right, to like play guess and check, and then

(51:15):
I like jump into like something else. But if it's
something that like I feel like usually that process happens
for me when I think I know where something is
and then I start down the rabbit hole like with
high confidence that I know exactly where this problem is,
and then I just keep discovering that I don't know
where this problem is.

Speaker 3 (51:36):
In that case, it usually.

Speaker 6 (51:37):
Happens that, you know, I either need to like get
up for a bio break or like to go eat
or something, and then I'm like, huh, I've been doing
this for a long time.

Speaker 3 (51:46):
I should probably like stop. That's yep.

Speaker 4 (51:50):
Yeah, I run into similar And another thing that tends
to happen to me is that I will be just
banging my head against a problem with no luck for
some extended period of time. And then convince myself somehow
to put it down and walk away. And in the
time when I've walked away, it's like I guess on

(52:13):
a background process somewhere, and then it occurs to me, Oh,
I haven't tried whatever this other thing. Maybe I should
do that instead. And it happens so frequently that now
I get into these mental battles with myself where I
can't figure something out and I want to keep working
on it because like, the more it doesn't work, the
stronger my resolve grows to get it to work. And
then I have to convince myself to walk away because

(52:37):
for some reason, the more things don't work, the more
determined I become that the next thing is going to work,
which doesn't match up with the data at all. But
I've gotten better about it. But it used to be
a real problem.

Speaker 5 (52:51):
What about caffeine and alcohol?

Speaker 4 (52:53):
Oh man, So I do like my coffee in the mornings.
I don't know if it makes me a better programmer.
I do. I have noticed that if I if I
try to like mess with the system after about five
thirty PM, I'm probably just gonna end up breaking it
in some kind of way where I have to come

(53:15):
back in the morning and start from like behind where
I would have started if I just stopped at five thirty,
because I have to fix whatever I broke after five thirty.
Like the best thing that's going to happen if I
commit after five thirty is I'm going to have the
system back at where it was when I started working,
which is an issue.

Speaker 5 (53:33):
I find I sometimes need both caffeine and alcohol.

Speaker 1 (53:38):
MM.

Speaker 5 (53:40):
The caffeine to motivate you to find a bug, because
it's always, like, you know, the big ones, always the
ones you don't want to look at right. So you
get the caffeine to breave you up, and then eventually
you reach the point of failure where you can't find
the bug, and then the alcohol lowers your inhibitions and
you start just trying crazier stuff.

Speaker 4 (54:02):
I like that idea as an inhibition inhibitor.

Speaker 2 (54:05):
I mean interesting.

Speaker 4 (54:07):
I wonder that's how it gets depicted in pop culture
a lot too.

Speaker 5 (54:11):
Right, I was reading about these brain studies where people
can't see stuff and they can spot spelling errors easier
when they're tired or they kind of impinge their brain
and suddenly they can kind of spot like the word
the appearing twice in the sentence more easily, you know,
that kind of stuff.

Speaker 4 (54:28):
I haven't heard about it, but I believe you this
is a okay.

Speaker 5 (54:32):
I'm sure this isn't something I've made up. But they
kind of do something to people's brains, either they make them
very tired or they give them something, and then spotting
single character spelling mistakes is easier because their brain
is no longer functioning on that higher level, that kind
of reading. It's no longer kind of speed reading. It's

(54:52):
kind of doing one step at a time, and then
certain activities become easier. So this is an idea I
really take to heart when I am
banging my head against a wall on bug hunting binges.

Speaker 4 (55:06):
I wonder if that speaks as well. In part of
why pair programming works is that you can have one
person focused on the overall strategy and code flow, and
then the other person's focused on like, is this word
in there twice? Does this match the API as it
currently is? Are we using this variable? You could sort
of employ two different levels of thought at the same time.

Speaker 6 (55:27):
That way, I don't have an answer one way or the other. I was just going to say, the way that I always thought that pair programming just worked is you had somebody writing and then somebody reading it, and the person that's reading it was like, what the heck am I reading? And I always felt like that sort of was how I saw it working. But I'm sure there's many things at play.

Speaker 3 (55:48):
It just works. I just trust it. So I did
have a question.

Speaker 6 (55:52):
I think this is probably my last one for the day,
but so this is kind of you've kind of touched
on this before. You said that you're sort of interested
in creating a language around debugging pedagogy, as you said earlier,
things like that, it seems like you're sort of interested
in this space there, I think. I mean, I take

(56:15):
your point, like I get it, Like you're right, Like
I literally have no words to describe my process, Like
today I have some new words and some new systems,
and that's really cool. Are there things that are like
missing from here? Like are there things that we need
to do to make this work? I was thinking as
you went along that this seems like a societal shift

(56:35):
you mentioned earlier. Actually I might be answering my first
question and opening a new one.

Speaker 3 (56:40):
So you said earlier.

Speaker 6 (56:42):
Hey, we actually, like, basically reward people that don't debug well, and we're not rewarding people that do, basically, by just rewarding people that are in build mode all the time. So are there, like, problems across the board that we should be addressing? How do we get to a better place?

Speaker 3 (57:03):
Is this a good question?

Speaker 4 (57:05):
It is a good question. I think that there are
a couple of different pieces to it, and one of
them might be about the incentive structure that we use
for measuring our, I don't know if value is the word, for measuring our work as programmers, or maybe as anyone, and finding ways to model that so that it doesn't

(57:27):
necessarily feel like a waste of time. And there's a
psychological component to that, and there's an actual organizational incentive
component to that. And then the other part would be
learning more about bugs. So I'll address both of those,
the first one being the organizational incentive part. How do
we go about, you know, providing accolades and career oriented

(57:50):
rewards to someone who is able to figure out and
resolve this, like, thorny issue that nobody else has managed to figure out, or that has existed for a long time.
How do we promote that to the level of like
getting this feature out on time, because in essence they

(58:11):
both do the same thing. For the goals of the software, we want working software, which means
the feature has to be there. But it also means
that the feature has to be working, and we want
to make sure that we are recognizing and rewarding both
of those things. And maybe that's at the organizational incentive level.

(58:34):
It's something for engineering managers and directors of engineering to think about. On the psychological level, so, I'm a big fan of, and this is from my mom growing up: my mom really loved mystery books and she kind of passed that on to me. She even wrote some mystery novels. She's very, very into mystery. And I had various

(58:54):
plans for what I wanted to be when I grew up: a detective, a spy, various types of things like that. I really did. It's a whole story. I'm not a spy. I already know somebody was about to ask me that, so no, I'm not.

Speaker 3 (59:08):
Things like that.

Speaker 2 (59:09):
That's exactly what a spy would say.

Speaker 4 (59:11):
That's immediately what people say when I tell them I'm not a spy. I'll let you come to your own conclusions about whether I'm a spy or not. But yeah. So one of the things that has kind of helped me psychologically make peace with long and onerous debugging processes is to imagine that I'm a detective

(59:34):
in those scenarios, and this is a case. And you know which cases detectives get rewarded for solving: the really hard-to-crack cases. So regardless of whether or not there's any actual organizational incentive for me to go after this bug, I get to picture myself as, like, Detective Chloe Decker or whoever,

(59:55):
and think of this as solving a case. This is helpful for me as an analogy that makes me feel like, you know, I'm getting something valuable done, I'm getting something cool done. This isn't like a yak-shaving type of task where, you know, it just has to get done and that's

(01:00:16):
the only reason that we're doing it. There's something for me to learn here, there's a skill for me to develop here. I'm going to be a better programmer at the end of this because I've resolved this bug. So that's the organizational incentive and psychological side of it, and then there's coming up with better ways to understand

(01:00:36):
and categorize bugs. So, because bugs are something that we have largely approached by an individual guess-and-check method, there's not a lot of good systematic research on bugs: where they come from, how they happen, what they look like. We do see, so there are papers about, like, does static type checking reduce bugs? Does this reduce bugs?

(01:00:59):
Does that reduce bugs? So I'll go ahead and say the denouement of that is that a lot of that research indicates that the two things that quote-unquote reduce bugs are code review and developers getting adequate sleep, which is an interesting result. But the thing is that as you look into these studies, I'll just

(01:01:19):
say this: looking into computer science studies can sometimes be a depressing endeavor, because as we look into them, we realize that, like, the sample size isn't anywhere near big enough to indicate a statistically significant difference a lot of the time, or there are multiple comparisons going on. The statistical rigor doesn't tend to be very good, and

(01:01:41):
in particular, in the case of debugging studies or studies
of bugs, we don't have a good handle on what
constitutes a bug, and so what happens is researchers come up with proxies. They try to come up with these proxies to indicate what a bug is in a code base. And it'll be things like, and this isn't a fake example, and I understand why they did it, I don't think

(01:02:02):
this is a, you know, I get that it's really hard to figure out a way to say what is a bug. And you need, once again, a
lot of samples to be able to do any research.
So how do you find a lot of samples in
a case where you're not really sure what a sample is?
You try to come up with something that represents a sample.
And so they'll say, we looked at these five code bases,

(01:02:24):
and we decided that any commit that changes two lines
represents a commit where there was a bug. Like, what
kind of proxy is that? I don't think that's a particularly accurate proxy, necessarily. There might be bugs where you
have to change a whole ton of lines of code,
or there might be a situation where like, I don't know,
we changed deployment hosting providers and that's one line and

(01:02:49):
it wasn't a bug. We just, like, changed, you know. We're making an assumption, it's an embedded assumption. We're assuming that a two-line change equals a bug. Okay, what's the accuracy of that assumption? What are the
false positives on that? What are the false negatives on that?
So we really just don't have solid research on bugs
in that way because in order to do that, it's

(01:03:11):
really really tough to go back and retroactively figure out
where the bugs were. You have to, in my opinion, or in my perspective at the moment, to get a good read on bugs, you have to have developers logging at the time of resolving bugs: what is
the bug, how much time did you spend on this bug,

(01:03:32):
what did the problem end up being? What are all of the things that you tried in the process? I think that a really solid, really insightful, illuminating study on bugs would require that kind of data collection, and that's a really, really tough thing to do. But I mean, I would love to do that at some point. I would
like to get a cadre of developers together who are

(01:03:54):
committed to logging our bugs and how we resolve them
and figuring out what patterns we can find in that.
What took a really long time? What didn't take a really long time? What prior experience really helped me out here? Where am I translating skills from one code base to another code base, and stuff like that? And the data collection
process would be super intensive, but I think it's something

(01:04:15):
that hasn't been done and something that I would love
to do and or see done. But I think that's
where additional terminology would come from. Yeah.
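
As a rough illustration of the kind of at-the-moment bug logging Chelsea describes, here is a minimal sketch in Ruby. Everything in it is hypothetical: the `BugLog` module, the field names, and the JSONL file path are invented for illustration. It only shows what capturing a structured record at the time a bug is resolved might look like.

```ruby
# bug_log.rb -- hypothetical sketch of logging a bug at the moment it is resolved.
require "json"
require "time"

module BugLog
  LOG_PATH = "bug_log.jsonl" # assumption: one JSON record per line

  # Record a resolved bug. The fields mirror the questions discussed above:
  # what the bug was, how long it took, what the cause turned out to be,
  # and everything that was tried along the way.
  def self.record(description:, hours_spent:, root_cause:, attempts:, prior_experience_used: nil)
    entry = {
      logged_at: Time.now.utc.iso8601,
      description: description,
      hours_spent: hours_spent,
      root_cause: root_cause,
      attempts: attempts,
      prior_experience_used: prior_experience_used
    }
    File.open(LOG_PATH, "a") { |f| f.puts(entry.to_json) }
    entry
  end
end

# Example usage (made-up values):
BugLog.record(
  description: "Background job silently drops records over 1 MB",
  hours_spent: 3.5,
  root_cause: "Queue payload size limit, error swallowed by a rescue block",
  attempts: ["re-ran job locally", "added logging around serialization", "checked queue config"],
  prior_experience_used: "similar payload-limit bug in a previous codebase"
)
```

Appending one record per resolved bug like this would give exactly the kind of raw material, time spent, root causes, attempted fixes, that such a study could later mine for patterns.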

Speaker 3 (01:04:22):
Cool, Well, I've got.

Speaker 1 (01:04:23):
To push this toward picks because I've got a hard
stop in about fifteen minutes.

Speaker 2 (01:04:26):
This has been really enlightening.

Speaker 1 (01:04:28):
And yeah, if people want to participate in the conversation
going forward, how do they get a hold of you?

Speaker 4 (01:04:35):
Oh? Man, So my name is Chelsea Troy. My site
is Chelseatroy dot com. My email is Chelsea at chelseatroy
dot com. My Twitter is Hey Chelsea Troy. I keep
it consistent on the name as much as I can.
But yeah, I mean I'd love to chat with people.
Those are probably the places where you would find the most.
I do have some blog posts about debugging already on

(01:04:57):
the site. I'm happy to provide a link to the
category or what have you for the show notes if
that's helpful. But yeah, I'd love to talk to people
about this kind of thing. I'm on a bunch of
slacks too. It's possible if I'm on a slack that
you're in. My handle is Chelsea Troy because that's what
it is on all slacks.

Speaker 2 (01:05:13):
So yeah, awesome.

Speaker 1 (01:05:15):
Yeah, if we can get links to those in the chat, then we'll put them in the show notes. All right, well, let's go ahead and do some picks. Luke, do you
want to start us off with picks?

Speaker 5 (01:05:24):
I've got a pick, I've got a pick. I've been working on a giant CCTV system that backs up about one hundred and forty gigabytes of video data a day, which is quite a lot. And it's all based on a Mongo database, and the Mongo database likes to corrupt itself on a regular basis. We found that bug. But we wanted to move some data off the server onto a kind

(01:05:46):
of off-site backup. We had the idea of sticking it in an S3 or a Google Cloud object store, and my colleague used rclone, which I had not really heard of, but it was so fantastically easy to get a database blob.

Speaker 3 (01:06:05):
Don't kind of.

Speaker 5 (01:06:05):
This was only one hundred meg, but any size works. It can be quite a bust to get data into a Google Cloud drive, and it supports all kinds of different things. It was so easy to use and set up. My pick for this week is rclone, rclone dot org. If you want to get data into the cloud from a development environment or a server, it is really, really good. It's fantastically easy.
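
For anyone curious what that workflow can look like in practice, here's a minimal sketch that shells out to rclone from Ruby. It assumes you've already run `rclone config` to set up a remote; the remote name, bucket, and local dump path below are made-up placeholders, not anything from the episode.

```ruby
# backup_blob.rb -- hypothetical sketch: push a local database dump to cloud
# object storage with rclone (assumes `rclone config` has already created a
# remote; the remote and paths below are placeholders).

LOCAL_DUMP_DIR = "/var/backups/mongo"            # made-up local dump directory
REMOTE_TARGET  = "gcs-backup:cctv-offsite/mongo" # made-up remote:bucket/prefix

# `rclone copy` uploads files that are new or changed at the destination.
ok = system("rclone", "copy", LOCAL_DUMP_DIR, REMOTE_TARGET, "--progress")

abort("rclone copy failed") unless ok
puts "Backup copied to #{REMOTE_TARGET}"
```

Passing the arguments separately to `system` avoids shell quoting issues, and the same call could be dropped into a cron job for the kind of regular off-site backup described here.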

Speaker 2 (01:06:31):
Awesome, John, Do you have some picks for us?

Speaker 3 (01:06:34):
I have two this week.

Speaker 6 (01:06:36):
So I think this is probably more well known in the gamer community or whatever, because I think that most of us in the developer community are using laptops, but I have my laptop and then also have desktop machines, so I use a mouse, and I got one of these gigantic mouse pads that

(01:06:56):
like goes under your mouse and like under your keyboard, just like takes up everything or whatever. And it's
like I don't even know how to describe, like how
much better it is than having a little like you know,
a few inches by a few inches wide square for
a mouse. It's just completely different experience, Like I don't
run my mouse off the edge of my mousepad constantly

(01:07:19):
kind of thing.

Speaker 3 (01:07:20):
It's great.

Speaker 6 (01:07:21):
I highly recommend it. You can get them fairly inexpensive too. I mean, they have plenty of expensive ones, but you can get inexpensive ones as well. I got a Razer one, and then I have, like, this Corsair one that I got for like five bucks actually
because it was like on sale, Like you can get
them for fairly inexpensive. So but yeah, I totally totally

(01:07:44):
recommend getting like a gigantic freaking mouse pad.

Speaker 3 (01:07:47):
They're awesome.

Speaker 5 (01:07:48):
I second that I've got a massive one, and it's
also wonderfully absorbent of anything you spill.

Speaker 3 (01:07:55):
All right, So there's that.

Speaker 6 (01:07:59):
And then the other thing that I was thinking about this week, I was just reminded because I got dragged back in.

Speaker 3 (01:08:03):
So I've been a member of this Discord server for
a really long time.

Speaker 6 (01:08:07):
It's apparently become popular, but it's like, so I'm really
into mentoring and stuff, and some friend was like, hey, you should come join the server. Like, uh, it's just people learning how to code, you can answer their questions occasionally, blah, blah, blah, that'd be really cool. So I've
been in the server for like, I don't know a

(01:08:28):
few years at this point. But it's called the Coding Den.
It appears to me on the surface to be like
filled with mostly probably like college kids and stuff.

Speaker 3 (01:08:37):
But if you're, like, into mentoring, you know, they
could probably use some more mature people to do some
of that.

Speaker 6 (01:08:43):
There are some of those people there, so if you're
like into that, I'm just throwing this out there.

Speaker 3 (01:08:47):
It's a thing. So that's out there. Apparently, I found out the other day,

Speaker 6 (01:08:53):
It's like apparently one of like the most populous Discord
servers on Discord too, so for whatever reason. But yeah,
there's a whole bunch of people on there, asking questions
and helping other people.

Speaker 3 (01:09:05):
So I got nice.

Speaker 1 (01:09:06):
I don't know if I'm more mature; I'm more seasoned. Anyway, there we go, right? I like spicy. Yeah, there we go.
I've got a few picks and I'm going to push
out there. Lately, I have been listening to the Wheel
of Time books on Audible and they're terrific. And I
don't remember who the narrator is, but he's also pretty terrific.

(01:09:29):
There are two of them, there's a man and a woman, and they kind of alternate depending on which point of view you're getting the story from. Right, if you're getting it from a male character, then it's the male narrator, and if it's a female character, it's the female narrator. But they're really well done. I really, really enjoy those,
so I'm going to pick those on Audible. And then yeah,

(01:09:53):
I've been onboarding with a company. I'm not going to
announce where I'm working, but things kind of slowed down
with the podcast network to the point where I in
order to pay the bills, you know, I had to
go find some other work. So anyway, the onboarding process
has been excruciating. But they did send like a box

(01:10:14):
full of hardware right with the laptop and stuff, and
they had like this really inexpensive Logitech wireless.

Speaker 2 (01:10:23):
Mouse, and I really like it, so I'm going to
pick that.

Speaker 1 (01:10:26):
I'll find it on Amazon and put a link in
the show notes, and yeah, those are my picks. And
then also, of course, Most Valuable dot dev. We're going
to be doing monthly Q and A calls, probably get
an expert or one of my co hosts from one
of the shows on to talk about stuff and teach
us stuff. And then I'm putting together a summit in

(01:10:47):
December and I'm going to have a whole bunch of
experts essentially get on.

Speaker 2 (01:10:51):
And the idea is, you know, if you woke

Speaker 1 (01:10:54):
up tomorrow where nobody knew who you were, and you were kind of a mid-level developer who was competent at their job and didn't really know what to learn next, you know, what would you do to become the most valuable developer on your team? So it's just going to be a series of interviews. But that's what I'm looking at doing for that summit. And if you want to just

(01:11:17):
get the videos as they come out, that'll be free.
They'll be free for a day or so. And then
if you want, you know, longer access and access to
a Slack channel and scheduled q and a's and stuff
like that with them in Slack, then you know, you
can buy a ticket.

Speaker 2 (01:11:31):
So that's that's how I'm looking at doing that.

Speaker 1 (01:11:34):
That way, the folks who are in areas where they
can't afford it, or if you're between jobs. I mean,
you can just get the free ticket and show up
and participate. So yeah, that's what I'm looking
at doing there, Chelsea. Do you have some things you
want to shout out about?

Speaker 4 (01:11:49):
Yeah? Absolutely, I got some picks. So the first one
is a book that I've been reading recently. It's called
The New Education by Cathy Davidson, and it is about the American education system, chiefly higher education: where it
came from, why it is the way it is, and
what we might potentially change about it. Now, I'll issue

(01:12:11):
a reservation on this book, particularly with regard to tech.
Some of the tech claims that it makes are not
accurate exactly. But with regard to the teaching portion specifically,
there are a lot of case studies and examples in
this book of teachers who've done really interesting things in
their classroom to help engage the students and help the

(01:12:31):
students actually participate in meaningful projects that make positive changes
in the world, in areas that the students care about, while they're in the class. It's really cool. I took
a whole bunch of notes from it. I'm going to
be doing some stuff with the exercises from that book, I think, as experiments in my mobile software development class

(01:12:52):
this fall, so I'm pretty excited about that. It's called The New Education. And I guess the other thing, maybe this is too cliché for the Ruby Rogues podcast, but at RubyConf twenty twenty this year, remote edition, I am going to be co-leading a workshop about

(01:13:13):
software maintenance and risk mitigation mechanisms that I'm pretty excited about. The case study that we're doing is a three-device system where some software all collaborates together to help a parrot emergency room keep track of all of its parrot patients. So parrots' lives are going to be on the line,
and mitigating risks in the software is going to be

(01:13:36):
in your hands as participants. I'm really excited about it,
and I hope that you will join me there in November.

Speaker 2 (01:13:42):
Awesome, looking forward to it, and thanks for coming, Chelsea.

Speaker 3 (01:13:45):
This was really fun.

Speaker 4 (01:13:46):
Yeah, absolutely, I had a great time. Thanks for having me.

Speaker 1 (01:13:49):
Yeah, thanks for coming. All right, folks, we'll wrap this
one up until next time.

Speaker 2 (01:13:53):
Max Out