All Episodes

January 1, 2025 · 53 mins
Kyle d'Oliveira (Clio) shares his survival tips for dealing with tens of thousands of commits, massive migrations, and the very limits of databases. We discuss the lessons learned from Rails megaprojects, how to apply them in your own projects to reduce technical debt, and the tools that keep your monolith majestic when the code won't stop coming.

Links

Picks


Become a supporter of this podcast: https://www.spreaker.com/podcast/ruby-rogues--6102073/support.

Episode Transcript

Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Speaker 1 (00:05):
Hi, everyone, welcome to another episode of Ruby Rogues. I'm
David Kimura, and today on our panel we have Matt Smith,
Luke Sutters II, and we have a special guest, Kyle
d'Oliveira. Did I say that right?

Speaker 2 (00:19):
That's d'Oliveira, yeah.

Speaker 1 (00:22):
So, Kyle, would you mind telling us a bit about
who you are, who you work for, and some of
the things that you're doing.

Speaker 2 (00:28):
Sure. My name is Kyle. I've been working for a
company named Clio. It's a legal practice management SaaS company
based out of Vancouver, Canada. It makes practice management
software aimed at lawyers. We're looking at transforming the legal space.
Our mission is to transform the practice of law for good.
There's a nice little double pun in there. And it's

(00:51):
been really interesting seeing some of the changes in legal
that we've kind of made an impact on over the
last few years. I've been working on Ruby on Rails for
the better part of the last decade. But when I
started working on Rails it was Rails version zero, and
I've been upgrading Rails ever since, and so now finally

(01:12):
up to Rails 6, and so touching all of the
major versions. My major focus at Clio, which I've been
at now for eight years, has been on the back
end infrastructure side of things. So the main focus is
scalability for the codebase, but also in terms of
the organization, like what happens when we have two hundred

(01:35):
developers working, what happens when the dataset sizes get to
the size where we can exhaust regular integers and we
need to actually go to bigints. We look at approachability.
How easy can we just take a new developer and
dump them into the codebase and have them up and running?
Because as things go to scale, there's obviously new patterns
that need to be adhered to that, you know, we

(01:58):
don't necessarily need to focus on with a small project, but
we do need to focus on for large projects, and
my team has focused a lot on making
the effort and experience for all of the developers easy
and fast.

Speaker 1 (02:10):
Yeah. Absolutely. One thing that kind of rings true is
you always have to think about scalability when you're developing,
but don't actually write for scalability when you're developing. So
keep it in the back of your head saying is
this going to come back and bite me later or
is it you know, a really non issue. I remember

(02:31):
one time I had a situation where I was storing
just three kilobytes of data in a database, and I thought, Okay,
this is going to get used a little bit. They
were images, so you can kind of see where this
is going. I'm like, you know, that's not a big deal.
It's only three kilobytes. But unexpectedly the consumers loved the

(02:53):
feature that it was supporting. And now that single table
is over thirty gigabytes and it has millions upon
millions of records. I'm like, oh, that was unexpected.
But I guess that's kind of where I did not
think of scale at the time, or not in the proper way. So
introducing that kind of technical debt kind of painted us

(03:15):
in a corner because now transitioning away from that model
is going to be a pain when you're dealing with
that much data.

Speaker 2 (03:22):
Yeah. Absolutely, it's hard to know what you don't know.
So if you don't think about the scale at that
point in time, it's hard to know what problems you're
even going to run into.

Speaker 1 (03:31):
So you gave a talk last year about Death by
One Thousand Commits. Could you give us a high level
overview of that talk and kind of some of the
things that entails.

Speaker 2 (03:44):
Yeah. So, working at Clio, the code base is quite large.
We have tens of thousands of commits that we go through,
and it's really easy to see patterns of developers working
on features. The features go live, and at some point
in the next six months a year, those features come

(04:06):
back to bite us. So the first commit
is great, the tenth commit is you're starting to notice
some things. By the hundredth, there's maybe some problems. And
by the thousandth commit, right, you've
stopped, because now you have to completely refactor and rebuild
a lot of this technical debt that you introduced. So
my talk was talking about some of the lessons that

(04:28):
we've learned. And although the lessons are very specific
to specific problems, there's kind of a generalized idea of
what some approaches are that you can take to dealing with
technical debt in your own projects. If you are able to,
for instance, automate technical debt
away entirely, well, now there's a whole classification of problems
you no longer need to think about, and you can

(04:50):
feel confident that those are just automatically protected. And if
you are cleaning up after yourself as you go, and
making it easier when there are curve balls being thrown
at you, fixing technical debt and dealing with it when
you hit scale doesn't have to stop you entirely. It
just becomes a constant, small tax that you pay. But
if you invest in the tools, you can actually start

(05:10):
moving faster even as you.

Speaker 1 (05:12):
Scale. Right, and so would you mind also explaining what
technical debt is? What would you consider technical debt, and
what are some things that you would maybe not consider
technical debt? Kind of debunking some myths about technical debt,
I would.

Speaker 2 (05:28):
Say technical debt is like an accumulation of decisions that are
made while coding that you eventually need to correct in
the future. And as developers, I think we're always making
these decisions. Can we cut a corner here to deliver
a feature a little bit early? And I think
technical debt isn't bad. I think when

(05:50):
you are willing to get something in front of the
users and deliver value earlier by incurring a little bit
of this technical debt that you then have to clean up,
I think that's totally okay. But I think technical debt
often comes in the situation of developers making a decision
that a framework needs to be super generic and it's

(06:11):
gotten a little bit speculative, and then they come to
implement something in the future and it's just really difficult
to deal with because it's so generic and hard to
understand that new developers have to then unpack that and
wind it back just to implement something new in it.
Some things that I think are not necessarily technical debt
can kind of come from maybe decisions that actually made

(06:33):
sense at the time and aren't necessarily cutting a corner.
So I mean, it may make sense to build a
system that is very generic, and maybe that is the
correct choice, and you build through and then things change,
and when things change, that's when you might have to
have like the technical debt comes back. But until the

(06:55):
things change, it actually might not be. I think that's
a bit of a generic answer, but it's hard
to pin down a concept like technical debt, because almost
everything we write is debt of some form.

Speaker 1 (07:08):
Yeah, I definitely have to agree with that. So what
are some of the real world examples that you guys
have experienced over your years, where at the time you
made a decision and you or the team thought like
this was a great choice, this is the right way
to do it, but then later you found that it

(07:28):
became more troublesome or more of a headache than it
was worth.

Speaker 2 (07:32):
One of the things that popped up is actually something
that, you know, we decided on because the Rails community
pushes for it and this is what comes out of the box.
So if you think about Rails migrations, if you think
about how they're often applied, if you think about some
examples that you've worked on, there are often times where
you use a tool like Capistrano, which deploys
some code, and as part of the deploy a database migration

(07:54):
gets run. And for most projects that's fine; the
migration that runs is fast and it's not a problem. So this is
an example of a decision where we kind of were like,
let's just inherit what the community uses. But as we
started scaling out, we started encountering problems with it. So,
for instance, we had a table where if you ran a migration

(08:16):
on it took thirty minutes. This means that our deployment
took thirty minutes. It also timed out so we lost
all of the context of it. But also during this
period of time, the table locked, so any developer or
any queries that started going to that table stopped being answered.
So all of our servers shut down, and we couldn't
kill the altered table because it was already mid progress.

(08:39):
And after it finished, we now had a table
with a new state, but the code hadn't
actually finished deploying, so now we're running into different problems.
So this is a little bit of a decision that
it makes a lot of sense when you're small, like
go really quick because you can, and it makes sense.
But when you hit a certain piece of scale, well

(09:00):
you can no longer run with those assumptions and you
need to change those. So a new process needs to
be built. And for database migrations, we need to build
them in a way that is entirely asynchronous to
the deployment process.

Speaker 3 (09:11):
Thirty minutes, that's quite a migration.

Speaker 1 (09:14):
Yeah.

Speaker 2 (09:15):
I think this table stores all of the activity
that users do, and it was the first table
we ran into that exhausted thirty-two bit
integers, and we needed to flip the IDs to be bigints.
We didn't think that would be a problem either, and
it's leaps and bounds bigger than any of the
other tables we have in our system.
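The "exhaust regular integers" point refers to MySQL's signed INT primary keys, which top out just past 2.1 billion. A quick back-of-the-envelope check in plain Ruby; the row rate and table names in the comments are made up for illustration:

```ruby
# MySQL's signed INT primary keys max out at 2^31 - 1.
INT_MAX = 2**31 - 1
puts INT_MAX             # 2147483647

# At a hypothetical 5 million new rows a day, the runway is easy to estimate:
rows_per_day = 5_000_000
days_left = INT_MAX / rows_per_day
puts days_left           # ~429 days from an empty table

# Flipping the IDs to bigints in a Rails migration would look roughly like
# (hypothetical table and column names):
#   change_column :activities, :id, :bigint
#   change_column :comments, :activity_id, :bigint  # plus every FK column
```

The foreign-key columns are the part that is easy to forget: every column referencing the table's ID has to grow along with it.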

Speaker 3 (09:37):
I'm going to ask you the obvious question now, which
is: how do you make your system capable of asynchronous
table migrations?

Speaker 2 (09:46):
That's a good question, and there's actually a lot
of tools that exist that we don't necessarily need to
build ourselves. GitHub has a tool called gh-ost. There's another
tool by Percona in the Percona Toolkit. I can't remember,
it's maybe online schema replacement? Can't remember the exact name.
But the general strategy is to instead of changing a

(10:07):
table with an ALTER TABLE, you actually create a
brand new table and populate that table with various mechanisms. Some
of them use triggers, some of them use the binary logs.
You get the new table in sync with the old one,
and then do quick renames. So you rename the
old table out of the way, you
change the new table to be the current one, and
the new queries start flowing into this new table. And

(10:28):
you can do this as long as you want. It's
entirely non-blocking, but it has to be in a
process that exists entirely outside of the deployment itself.
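For reference, the Percona Toolkit tool Kyle is reaching for is pt-online-schema-change. A hedged sketch of how a wrapper script might assemble its invocation; the database, table, and ALTER string here are hypothetical, and the flag set should be checked against the tool's documentation:

```ruby
# Builds a pt-online-schema-change invocation instead of a blocking
# ALTER TABLE. The tool creates a shadow table, backfills it via triggers,
# and atomically renames it in at the end -- no long table lock.
def online_schema_change_cmd(database:, table:, alter:, execute: false)
  cmd = ["pt-online-schema-change", "--alter", alter, "D=#{database},t=#{table}"]
  cmd << (execute ? "--execute" : "--dry-run")  # dry-run by default for safety
  cmd
end

cmd = online_schema_change_cmd(
  database: "app_production",
  table:    "activities",
  alter:    "MODIFY id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT"
)
puts cmd.join(" ")

# In a real rake task you would hand this to system(*cmd) and run it
# completely outside the deploy, as discussed above.
```

Keeping dry-run as the default is the safety choice the tool itself encourages: you only add `--execute` once the plan has been reviewed.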

Speaker 1 (10:37):
Yeah, and that could have its own issues if you
have you know, thousands of requests per second coming in.
So yeah, definitely not a fun problem to solve. And
it's also, I guess, good to know what kind of
migration, or really what kind of SQL statements, will cause
a table lock. So adding an index or adding a

(10:58):
column and stuff can lock your table. So being aware of
what actually is going to lock the table is really
good information to know.

Speaker 2 (11:07):
Some of them seem obvious, like I think if you're
dropping a column or adding a column, that could potentially lock,
But some of them are not. Like if you changed
a VARCHAR from a VARCHAR(100) to a
VARCHAR(200) and you're just increasing it, does
that lock? Maybe? I actually don't know off the top
of my head. What if you change the character set?
What if you changed the collation? I don't know.

Speaker 3 (11:25):
Is this on MySQL or Postgres?

Speaker 2 (11:28):
This was in... We use Percona Server, which is just an
offshoot of MySQL, so it'll also be different
between databases. So Percona might have made different decisions.

Speaker 3 (11:36):
Shout out to the Percona guys. I've done some work
in a place where we had some Percona consultancy. They
were really good, really delivered.

Speaker 1 (11:44):
So that kind of covers the database and schema
side of things. To step away from the code, you
had mentioned onboarding people with a larger code base.
What does that process look like for you guys, and
how do you really bring a junior or mid developer
into the company and have them productive quickly?

Speaker 2 (12:05):
Yeah, so a lot of this comes from tooling and education, right?
As senior developers, or people who have just
different experience from different places, we've accumulated huge amounts of
knowledge and it's kind of all tribal. And I think
if you join a company that doesn't have a
great strategy, a lot of the strategies for sharing that

(12:27):
knowledge are like: just work together, go submit pull requests
and have them code reviewed and learn from the code review.
And I think that's okay, you can learn that way.
But there are better ways to push information to people.
And this is a concept about just-in-time education.
An interesting example of this can
be through linters. So I did a talk about

(12:50):
this as well for the 2020 Couch Edition of
RailsConf, called Communicating with Cops, that focused on using
RuboCop as a mechanism to provide education. It did
a little bit of a deep dive into how RuboCop works
and how to build your own custom cop. But one
of the things that we approach with that at Clio is,
as people make mistakes and learn about bad patterns, we

(13:13):
try to codify those patterns so that it doesn't
happen again, and people get education about it right as
it happens. A good example of this, that is super
trivial and doesn't often bite people until there's just an
unexpected case, would be the Rails convention of naming files.
We've seen cases where people maybe make a User model,

(13:35):
but then make a typo in the spec. So
rather than calling it user_spec, they called it users_spec,
plural, or something along those lines. And
know this is like the spec is still run, but
there might be some tooling that we expect to adhere
to the Rails convention and it doesn't quite line up.
So you can have a linter that basically checks the
name of the files and the name of the classes

(13:56):
and make sure that they're in line, and if not,
alert people and do that as part of their editor
or do that as part of them committing code. And
they get warnings and they get education as they're writing code.
So they just wrote something, they save the file, they
get a little warning popped up being like, hey, you
may have made a typo here. And this goes
even as far as behavior. If we know that

(14:17):
there exists bad patterns, so for instance, making an HTTP
call inside of a transaction, which we know is going
to be potentially bad, we can actually automatically prevent that
as soon as that starts happening, as soon as we're
able to detect it. So it might be in a test,
might be as part of a linter. We provide that
education right back to the developer so that they understand

(14:38):
what they did wrong and the avenues of what they
need to do to fix it. So now when a
junior developer enters the company, they can actually just feel
free to start writing code, and write even code in
kind of a way that maybe breaks some patterns, and
a lot of time they're going to start getting education
right away, and then we can do all of the
usual things as well. As pull requests come in, we

(14:59):
can review them and provide more education that way. And
if we find constant patterns of every junior developer we
come in makes the same mistake, let's codify that so
that they get the feedback immediately.
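At Clio this kind of check would live in a custom RuboCop cop, but the core filename/class-name comparison can be sketched in plain Ruby; the camelize helper here is a simplified stand-in for ActiveSupport's:

```ruby
# Simplified stand-in for ActiveSupport's String#camelize.
def camelize(snake_case)
  snake_case.split("_").map(&:capitalize).join
end

# Returns true when a file like "user_spec.rb" matches the class under
# test, "User", per the Rails naming convention.
def file_matches_class?(path, class_name)
  base = File.basename(path, ".rb").sub(/_spec\z/, "")
  camelize(base) == class_name
end

puts file_matches_class?("spec/models/user_spec.rb", "User")   # true
puts file_matches_class?("spec/models/users_spec.rb", "User")  # false: the plural typo
```

Wired into an editor hook or pre-commit check, a mismatch becomes a warning the developer sees the moment they save the file, which is the just-in-time education idea from above.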

Speaker 1 (15:10):
Yeah, that's kind of one of my pet peeves. I
guess you could say with linting is that if a
particular project has a set of practices it likes to follow,
maybe it is no more than one hundred characters on a line,
that kind of feedback should never happen in a code review.

(15:31):
If you have those kinds of expectations, then they
need to be known expectations via a linter, whether it's
RuboCop or Standard RB, and it should never be an unknown
expectation to the developer. So I'm
definitely on board with that, And that's something that I've

(15:51):
had to fight and struggle with: going through code
reviews and having everything kind of nitpicked. Because, one, it
decreases the morale of the developer if every pull request
they're making is just getting bombarded with styling quirks or
requests to change. So I definitely agree with that point.

(16:14):
And I think that every project should adopt some kind
of linter if there are expectations of what they're doing.
Even if you bring in RuboCop, you disable everything by
default and then you just start adding in, or allowing,
the expectations your team follows on that particular project.
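The "disable everything, then opt in" approach maps onto RuboCop's DisabledByDefault setting; a minimal .rubocop.yml along those lines (the enabled cops here are just examples) might look like:

```yaml
AllCops:
  DisabledByDefault: true   # start from zero; opt in cop by cop

# Enable only the expectations the team has actually agreed on:
Layout/LineLength:
  Enabled: true
  Max: 100

Style/StringLiterals:
  Enabled: true
  EnforcedStyle: double_quotes
```

Each cop the team later agrees on becomes one more explicit entry, so the config file doubles as the written record of the project's style decisions.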

Speaker 2 (16:34):
Yeah, absolutely. And I think there's even one step farther:
a lot of linters can do auto-correcting. So
if you, you know, care about having
one line of space between methods, don't even have RuboCop
or a linter warn about that. Just auto-fix it.
That's something that a developer just doesn't need to worry about.
And you know, it also removes a lot of this

(16:55):
argument over, should I use double quotes, should I
use single quotes? If it just automatically fixes it and
a developer can write whatever they want, that's fine. But
I've also run into issues of having pull requests being
bombarded by style and it really distracts from the code
review about the behavior.

Speaker 1 (17:11):
Yeah. Absolutely, although you do have to be careful about
the auto correction. I remember one time in my earlier
days of development, when RubyMine came out, I tried
out RubyMine's code refactoring thing. I forget what they
call it, but I had some really poorly written classes
and it just absolutely broke everything. Like, I have no

(17:34):
idea how that happened, but things just were not working
the way they were before, and I had to pull
that merge back out. Because, you know, of course, as
an early developer, I didn't have any tests on the application,
so I didn't really notice that things were broken until
they got deployed.

Speaker 2 (17:50):
Yeah, yeah, you definitely need to be careful there.

Speaker 1 (17:54):
So you also previously mentioned, not necessarily
onboarding developers, but having a lot of developers work on
the project. At what point do you go from a
small shop to a large shop, where you have to
start putting different kinds of practices in place? And what
are those kind of practices when you're dealing with a
lot of developers on a single code base.

Speaker 2 (18:16):
So actually, it's not clear where that point exists.
I think it's probably going to be different for every organization,
and probably different for exactly the work that you're running into.
I think the thing is to listen to the
pain points of the developers. So if you notice that
there are, you know, pieces of friction that occur
between developers, that's the point where maybe there's actually

(18:38):
some tools that need to be built to make this easier.
So one thing that I think comes up really quickly
in organizations is often the concept of like a testing server.
So you've got your developer environment, you've got
maybe your CI, but maybe you want a production-
like environment for things, and so you have a staging server.
You know, when there's five developers, it's easy to just coordinate

(19:00):
and be like, oh, staging is mine now, I'm going
to test something. When it's done, I will hand it
off and maybe reset it back to whatever the master
branch and let people work that way. But that really
falls apart when you have one hundred developers. How do
you coordinate one server where everyone is trying to test something.
If you have one hundred developers fighting for that resource,

(19:21):
you can kind of budget a little bit by maybe
having a fixed number, and, you know, round-robin
them out. But again, at some point that's going to
break down. So if you think about what's the
problem here: every developer wants to potentially test
something on an asynchronous schedule. Maybe it actually makes sense
to build some tooling so that you can spin up
staging servers on Amazon EC2 or

(19:45):
on Google Cloud on demand and just route them there. And
so that's something that we ended up having to do
really early: building our own tooling, so that we can have,
we call them beta environments, an
arbitrary number of them. Someone spends the effort to basically say,
like, this branch on GitHub, I want a clone of

(20:06):
the site on Amazon, and within like ten minutes, you've
got a domain that points to it. You've got the
full stack, you can you have full control, you can
do whatever you want, you can break it, and it
gives developers a lot of autonomy to test things that
they want, and you know, removes a lot of this, Oh,
let's deploy it and see what happens. You have a

(20:27):
full environment that you have full control over, Go test it,
go see it with as much data as you want,
and then see what happens. Another example kind of along
those lines is deployments. Do you have a handful
of senior developers who can deploy, or do you do
a big deployment every Monday? That's
going to start really breaking down when you have
a lot of developers. You know, at Clio, everyone has

(20:51):
the ability to deploy, everyone has the ability to merge code.
So we give the power to the developers, and now,
you know, a junior developer can come in, write a
fix to a README, merge the code, and deploy it,
without having to really bother people outside of getting a
code review. And now we're deploying code probably upwards
of thirty-ish times a day, and that number is

(21:12):
just only going to go up. And so as we're
running into these issues, we are just looking at what
can we do to build tooling so that it's no
longer frustrating for developers. And the important part of this
is developers need to voice things, and you know, managers
and companies need to listen. If we're wasting five hours
a week per developer on this one thing that's frustrating,

(21:32):
like build tooling around it.

Speaker 1 (21:34):
Yeah, that's one of the things that I did, just
for my own hobby project and just continual learning:
a self-hosted GitLab instance. And I set
up a Kubernetes server which will automatically create the infrastructure
for the application that got pushed, so it always happens
on any kind of development or master branch push, and

(21:57):
then also on each commit up to the repository, and
they'll spin up an entire infrastructure within Kubernetes with an
FQDN so that that feature can then be tested. So it
works on smaller applications. I don't know how it would
work on applications that consume thirty gigs of RAM of resources,
but I think on smaller applications that kind of thing

(22:21):
can really save you from having to have dedicated test
servers that are shared by several people.

Speaker 3 (22:27):
When are you going to do an episode on that, Dave?

Speaker 1 (22:32):
I do have a Drifting Ruby episode on Kubernetes, which
is where I got the inspiration from. On that episode,
I just didn't tie it into the CI/CD portion.

Speaker 3 (22:44):
I've got a question for you, Kyle.
It sounds like you've got a lot
of data if you're running thirty minute migrations, and you've
got a lot of developers, and you've got good testing,
good infrastructure. What I've found is a lot
of kind of real memorable problems I've had is where

(23:04):
you get something running and it feels like it's going
to be fine, but then it gets deployed to the
master database and that's the point at which there's some
bad data in there. There's something in there from ages ago,
from a previous version, and it absolutely sinks you. And
these days, whenever I possibly can, I just pull the

(23:27):
entire production database out and test against that. Do you
do that, or is your database just so huge
you can't do that, especially with a
lot of developers?

Speaker 2 (23:40):
It used to be something that we did. We
used to call it the snapshot, and
you could point environments at the snapshot and run test
queries on it. But it actually did hit
a size where the time it took to set up
the snapshot every day was taking longer than it would
take to actually back it up. So it was just
starting to become unfeasible for us. And we're also dealing

(24:01):
with sensitive data, and we don't necessarily want to give
free access to all of that data of our clients,
so we instead try to invest in a little bit
of tooling. We definitely still have issues where everything looks
good in development, everything looks good in beta or
test, and we deploy to production and something is wrong. So
we think about what can we do to make that better?

(24:24):
And so we you know, if it's about a lack
of index on like a database query or something like that,
we can try to check that ahead of time and
build some tooling and alert people when something goes wrong.
But also in production we can say, like, hey,
this query took thirty minutes, that's unacceptable, this query took
five minutes, and return that information as an exception

(24:46):
to the developers that they need to fix, but without
interrupting the actual request behavior. And if things go really south,
just roll it back. We don't really do blame: if someone
deploys something and it goes south
and they quickly roll it back, we just try to
take that as a learning opportunity. And how can we
take that learning opportunity and share it with everybody, so

(25:07):
that everyone learns from it? Does that answer your question?
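The "surface it as an exception without interrupting the request" idea can be sketched as a timing wrapper. In a real Rails app this would more likely hook into ActiveSupport::Notifications around SQL events; the threshold and reporter below are made up for illustration:

```ruby
SLOW_QUERY_THRESHOLD = 1.0 # seconds; a made-up cutoff

# Runs the block and, if it was slow, reports the problem out-of-band
# instead of raising into the user's request.
def with_slow_query_report(label, reporter:)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result  = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  reporter.call("#{label} took #{elapsed.round(2)}s") if elapsed > SLOW_QUERY_THRESHOLD
  result
end

reports = []
value = with_slow_query_report("activities query", reporter: ->(msg) { reports << msg }) do
  sleep 1.1 # stand-in for a slow database call
  :rows
end

p value    # :rows -- the caller still gets its result
p reports  # the slow call was reported on the side
```

The key design point is that the reporter fires on the side (in production it would feed an error tracker), so the request itself never pays for the diagnosis.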

Speaker 3 (25:10):
Yeah, I mean, you must be dealing with a lot
of data, and I've worked with, you call it HIPAA
data in the States, where it's kind of confidential data,
and that hugely complicates testing, because you have
to either heavily anonymize or write your own
tools to kind of replicate a few hundred thousand medical records.

Speaker 2 (25:34):
Yeah. What we can also do, as I mentioned earlier
when we talked, is we have these beta environments that
we can spin up. You just use like a SQL
dump to store data in there. And although this isn't
necessarily production data, developers have full control over what that
data looks like. And so, you know, if we wanted
to see what happens if there are tens of thousands
of something in a table, or more, we can just

(25:56):
build little scripts that can seed that
database and then test it outside of production. It's not
perfect, because it doesn't always match the same shape as production,
but it's an iterative process, and that information
gets codified, so you can keep adding to the seeds
in that manner, so that it becomes a better and
better representation as

Speaker 1 (26:15):
We go forward. Yeah, so kind of back to the
technical debt. I have an unfortunate story of something that
I inherited one time. I think metaprogramming is awesome
and can do a lot of really cool things and
can really get you out of a bind in certain situations,
but then it can also be overly abused. And I

(26:37):
was searching for a function that was not working properly
within Ruby, and I couldn't find it in the code
base at all. So I thought, okay, well, surely
this is in a gem or something. So I started
looking at all the gems included in this Rails application,
started tearing apart the gems, opening them to search for

(26:57):
this function. Still couldn't find it. Turns out they were
doing a class_eval on something that's pulled from the database.
So they actually stored Ruby methods as data
within a column in the database, and that's what was
getting executed. That's where the function was defined. So to me,

(27:21):
that's a what's that?

Speaker 3 (27:23):
What's wrong with that?

Speaker 1 (27:26):
Yeah, so, you know, other than you could not possibly
even test that bit of code with any kind of
reason... it was a nightmare. So just a
warning to when you think that you're doing something really
cool and elegant that's avoiding code duplication or whatever. I
would much rather have code duplication all across my application

(27:49):
than having that level of obfuscation where you're never going
to be able to even remotely troubleshoot it.
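For anyone who hasn't run into the pattern David describes, it looks roughly like this; the stored source, class, and method are hypothetical stand-ins for whatever the application actually kept in that column:

```ruby
# Imagine this string came out of a database column, e.g.
#   SELECT body FROM dynamic_methods WHERE name = 'discount';
stored_source = <<~RUBY
  def discount(total)
    total * 0.9
  end
RUBY

class Order; end

# A grep for "def discount" finds nothing in the repo -- the method only
# exists after this eval runs, which is why it was impossible to track down.
Order.class_eval(stored_source)

puts Order.new.discount(100)  # 90.0
```

Because the method body lives in data rather than in version control, it escapes code review, tests, grep, and static analysis all at once.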

Speaker 2 (27:55):
Yeah, metaprogramming is actually one of the
best strengths of Ruby. You can do so much with it.
But once you have it, it's the
hammer and everything is a nail, and you want to
use it. And that's often a trap that new
developers, when they learn about metaprogramming, really want to
go into. I think a good lesson to come out
of that story is that if you think about code,

(28:19):
it's written once but read countless times, and so if
you can take the little things to optimize the code
for the reader, that is much better than sacrificing readability
to optimize for the writer. So if it takes you
an extra thirty minutes to write a whole bunch of
cookie cutter methods, but now those methods are in place
and they're static and it's easy to read and reason

(28:42):
about and test, that is well worth that thirty minutes,
because you're going to lose more than that reading that
piece of code in the future.

Speaker 1 (28:50):
Yeah. Absolutely, And it could even be taken into something
like private methods where if you have a class which
has a bunch of methods, start sorting them out which
ones are private methods so they do not need to
be accessible to the consumer, because I've had situations where
I've worked on a class that grew over a thousand

(29:13):
lines and there were hundreds of methods in there, and
I had no idea which ones were publicly accessible, that
were truly supposed to be publicly accessible, and which ones
were really meant to be private. Without that
level of encapsulation, so to speak, you lose a lot
of visibility into what this class exposes to the consumer.

Speaker 2 (29:37):
Yeah. Absolutely, anything that you can do to make those
kind of classes easier to understand and read for a
new person is great. And also just backing up a
little bit to your example, this is also an instance
where metaprogramming bit you, but metaprogramming is also interesting that
it could save you because you can also ask Ruby
about Ruby. So, if anyone didn't know, this is

(29:59):
a tactic that I use all the time for debugging
pieces of code that I've never been familiar with.
If you have access to a console, you
can ask Ruby what methods are available with a
.methods call. You can also get access to the
method object itself and then ask it, what is its source,
where does it live? That can make life easier
to track down methods that may be dynamic or created

(30:22):
by gems.
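That console trick, sketched (the class here is hypothetical, standing in for a gem that defines methods dynamically):

```ruby
# A gem-like class whose methods are created dynamically, so they
# never appear as literal `def`s anywhere in the source.
class Report
  %w[title body].each do |attr|
    define_method(attr) { "#{attr} of the report" }
  end
end

report = Report.new

# Ask the object what it responds to.
p report.methods.grep(/title|body/).sort  # => [:body, :title]

# Grab the method object and ask where it was defined.
m = report.method(:title)
p m.owner            # the module/class that defines it: Report
p m.source_location  # the file and line of the define_method block
```

`source_location` is what makes dynamically created or gem-provided methods trackable: it points at the exact file and line, even when that line is a `define_method` call.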

Speaker 3 (30:23):
I recently learned how to use the ls command in Pry,
and now I just live out of the
Pry ls command. My Ruby API doc traffic's dropped off considerably.
I find the dot methods output to be quite noisy.
It's very verbose if you're kind of trying
to pick out which method it is. And I really

(30:45):
like the Pry ls command.

Speaker 4 (30:47):
Yeah, one thing you can do to make that less
noisy is take like Object dot new and subtract the
methods out of that and sort it and all that
sort of stuff, and you can do it all on
a one liner because we're in Ruby. But yeah, ls
is another great option.
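The noise-reducing one-liner being described is roughly:

```ruby
require "date"

d = Date.new(2025, 1, 1)

# d.methods alone includes everything inherited from Object: noisy.
# Subtracting Object's methods and sorting leaves just what Date adds.
interesting = (d.methods - Object.new.methods).sort

p interesting.include?(:year)       # true: Date-specific method survives
p interesting.include?(:object_id)  # false: Object noise filtered out
```

Pry's `ls` does essentially this grouping for you, which is why it feels so much more approachable than raw `.methods`.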

Speaker 3 (31:03):
My documentation reading has suffered for it, I must admit. Now
my attitude is just, I can just ls the class
and see what's going on, man.

Speaker 2 (31:10):
I think that's another example of someone making some tooling
that, you know, improves something. Yeah, if you knew
to call dot methods and subtract Object dot new dot
methods or Object dot methods, great. But now it's
two characters and it's nice and easy, and it's much
more approachable, and then you can have access to things
that you may not have known existed.

Speaker 3 (31:30):
Can I ask you about, can we turn back the
clock and ask you about Rails zero?

Speaker 2 (31:36):
Oh, it's been a long long time since I've worked
on Rails zero. I can try to answer questions, but.

Speaker 3 (31:41):
So it sounds like you've been on a bit of
a journey with scaling things up. What did you do
before Rails zero?

Speaker 2 (31:48):
Oh, actually most of my career has been working
with Rails. So before Rails zero, I was working at
like an enterprise Java shop that I don't remember a
lot of details of it anymore. It's kind of too
far in the past. But I think I've been working
with rails now for eleven years. I think, so it's

(32:09):
been just a long time just rails. I don't remember
a lot of the pre-Rails world, to be honest.

Speaker 3 (32:13):
That is the correct answer. There is no other system. I ask
because we were talking about the N plus one queries,
and my complaint is that Rails makes it too easy
to do N plus one queries, because if you just
kind of follow all the guides, that's what you get,
if you kind of do a dot all dot each. Then

(32:35):
you're going to be there for a while, and you
start noticing that when you start getting into a few
thousand objects. So you can be sitting there prototyping something
and think this is great, and then when people start
using it, you drop it in, that's when you start
hitting these gotchas. But I think people forget what the
bad old days were before you had the Rails tooling. The amount
(33:00):
of time it took when you had to write your
own queries, it's really quite significant. And you mentioned enterprise
Java; there was not a whole lot of object-relational
mapping going on in that. So it is a
double-edged sword. When you're operating at the scale you do,
what are the parts of Rails that start to bite?

Speaker 2 (33:23):
We've definitely been bitten by how easy it's been to
make N plus one queries in the past. I think
pretty much any Rails shop is going to be doing it.
Rails offers tooling to help with that, but the tooling
still requires a lot of effort. You have to kind
of know what N plus one query you're introducing and
fix it. So that's where you can build some

(33:44):
more tooling. There's a gem that we built, JitPreloader.
There's also another community gem called Goldiloader
that removes stuff like N plus one queries, and
those are ways to basically eliminate those kinds of problems.
Some other things that kind of bite in Rails
as we are building are things like discoverability of templates. So

(34:06):
I think one of the previous episodes of Ruby
Rogues was talking about this, but as it scales
up, Rails ERB makes it really easy to render
partials all over the place, but it's really hard to understand,
like if you're looking at a page, where are those
partials actually coming from? And how can you dig back
into them? So that's a challenging thing with

(34:27):
Rails as well. There are also some things with the community
for things like paging that can be problematic at scale.
If you look at what some of the basic gems
offer, it often comes down to a limit offset,
which is really fine on small data sets, but
as you get to data sets that are really really
large and you're going to page really deep into them,

(34:49):
it actually starts really falling apart and breaking down, and
that's something that you might not know until you actually
hit that scale. I think some of the Rails
conventions also start becoming a little bit problematic, and you
see a little bit of discussion about this. You know,
Rails at one point said throw all the logic into
the controller, and then eventually the controllers became skinny and

(35:10):
all the models became really fat. And I'm sure everyone
has that god object that exists in their project, the
user object or the account object that is five thousand
lines and really difficult to reason about. And people are
offering opinions of having like service classes or various different
patterns to try to combat that. But we're still trying

(35:31):
to unpack some of the things that you started your
Rails projects with.
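The deep-paging problem can be sketched in plain Ruby; the array below stands in for a table, and in SQL the keyset version corresponds to a `WHERE id > ?` predicate on an indexed column (the method names here are illustrative):

```ruby
# A stand-in "table" of 10,000 rows; in a database this is a real table.
ROWS = (1..10_000).map { |i| { id: i } }

# LIMIT/OFFSET style: the database generates and throws away `offset`
# rows before returning the page, so deep pages get slower and slower.
def offset_page(offset, limit)
  ROWS.drop(offset).first(limit)
end

# Keyset style: remember the last id you returned and seek past it.
# With an index on id, the database jumps straight to the right spot
# instead of re-walking everything it already paged over.
def keyset_page(after_id, limit)
  ROWS.select { |r| r[:id] > after_id }.first(limit)
end

page1 = keyset_page(0, 50)
page2 = keyset_page(page1.last[:id], 50)
p page2.first[:id]  # 51
```

Both return identical pages; the difference only shows up as you page thousands of rows deep, which is exactly the "fine on small data sets" trap described above.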

Speaker 4 (35:35):
One question on that, as far as what you've seen
in the progression of the companies you've worked at, around
documentation, right? Like, on the one hand we've just talked about,
you can use cops, you can use linters and say
go out and try things, break things, autocorrect things, experiment basically,

(35:55):
then there's self documentation making sure you're writing good method names,
good class names that are intuitive, and then there's inline documentation,
and then there's high level documentation of hey, we're using
this set of conventions and everything else. This is a
big question, but what do you think is the
right thing to put in each of those buckets in

(36:18):
order to make an intuitive project that scales across you know,
more than twenty developers up to one hundred developers.

Speaker 2 (36:25):
Yeah, and you know, here's a little bit of
my kind of thoughts on it. But I'm not going
to say my thoughts here are perfect. I think
everyone's mileage will vary, because documentation is a tricky thing.
So if you're getting to
gotchas, like if you ever tell someone, oh, if
you see this pattern, don't do it, or
if you have code reviews where it's like, oh, I've

(36:45):
been bitten by this before, that should be something that
falls into like the linting or the just
in time education, where you try to codify that.
you see people that have inline comments and code that says,
you know, like this next few lines are going to
iterate over something and do these operations, that's probably an
indication that their code is not written well to describe it,

(37:07):
and that comment is not super valuable, so that actually
it might be something like that comment shouldn't exist, and
instead we should maybe extract a method that describes it
better and kind of move in the direction of code
describing itself. When you are implementing something that's specifically tied
to code, it should probably exist at the code level.
So if you have like a module

(37:30):
that you want things to include, and developers need
to implement certain methods in there, maybe the module should
define those methods and raise like a NotImplementedError
that has a very clear: this is what this method
should do, this is what it should return. Here are
some examples, and just link to them in your own codebase.
And so now when a developer looks at that specific
piece of code, it's still tied to the codebase. But

(37:53):
all of that's, you know, at the codebase level.
There still needs to be something at a higher
level, like a README in the documentation or
in something else entirely. So we have stuff that exists
in READMEs that's kind of more about process,
but process specifically related to our codebase. So

(38:13):
a good example of this would be how do you
do these asynchronous migrations? So this isn't really
super tied to code, because you might make a migration,
but then what's the process for getting that live? So
we have like a step by step guide
for Clio: if you want to do a migration,
here are the steps that you need to take, and

(38:33):
as much as we can, we just link back to
code rather than re implement the code. But we'll also
just describe things in English and offer templates there and
then we go one level higher to things that exist
more at like a process level for the organization. So
for that we use a tool called Confluence. There's lots
of tools that exist that kind of do similar things,
but for those, that's things that exist outside of

(38:56):
the codebase. So if an incident happened, how do
you do a postmortem or root cause analysis on that,
and there'll be documents for that. Or you know, if
you wanted to propose a new style or a new
feature that you wanted to get some buy-in on, using
some new architecture, just wanted to make sure that the
approach is correct, you can do like a design doc
in Confluence and get people kind of bought in

(39:19):
well before you've actually written the code. But once the
code is written, that document is less relevant.
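The module pattern described above, where the mix-in defines the required methods itself and raises NotImplementedError with guidance, might look like this (all module, method, and file names are illustrative, not from Clio's codebase):

```ruby
# A mix-in whose contract is written in code: includers must implement
# #export_row, and the error says exactly what is expected and points
# at an example to copy.
module Exportable
  def export_row
    raise NotImplementedError,
          "#{self.class} must implement #export_row and return an Array " \
          "of column values. See app/models/invoice.rb for an example."
  end

  def export
    export_row.join(",")
  end
end

class Invoice
  include Exportable

  def export_row
    ["INV-1", 100]
  end
end

class Draft
  include Exportable # forgot #export_row: fails loudly, with guidance
end
```

The documentation lives at the exact place a developer will be looking when they hit the problem, so it can't drift away from the code the way an external wiki page can.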

Speaker 4 (39:24):
Absolutely. I was kind of going from the standpoint of,
like we were talking about bringing a new developer in
and getting them used to the whole environment, and you've
definitely tackled some of that in terms of, you know,
here's the process migration example there. What about just getting
them used to the entire structure of your application where
certain logics live, certain design paradigms that you've talked about.

(39:47):
Some of those can be encapsulated in linters, but some
of them are larger than linters. And so is that
where you're doing the specific guide for walking them through
that process.

Speaker 2 (39:58):
Yeah, so there are definitely things that linters aren't going
to be able to do. Like, a linter won't be able
to tell whether this thing should be a model or
a service class or something, right? It's not really going
to be able to, it doesn't understand the business logic
of it. So for things like that, we kind
of have to rely on little handbooks, being
like, here, we've codified our style guide.
try to make sure that we keep that up to date.

(40:20):
There are some things that we still teach through kind
of tribal knowledge and code reviews. Like, if someone submits
a pull request and we notice it, we'll still correct
it there and we'll do a lot of pairing, so
we'll get developers up to speed by working with people
as opposed to just going off on their own. But
I think this is just a learning process. Like,
I don't think we are perfect at getting developers onboarded,

(40:41):
and I don't think anyone is. And I think that's
the important distinction, that it's just an iterative process.
If you bring in three developers and
they all have the same issue, that's probably when you
might need to introduce some new documentation and be like, hey,
here's our new developer handbook. You might want to read it.

Speaker 4 (40:58):
Absolutely, and you've been on top of that.
You have personalities too, and you know certain people gravitate
towards certain things. What's your method for when you
have what I would consider external documentation, whether that's living
in a README, it's not in a Ruby file
or an HTML file or something like that. Do

(41:18):
you guys have any triggers so that, hey, if
something happens over here and we decide on a new paradigm,
make sure you go update that guide documentation? Or is
it, oh, we just brought a new person in
and we've got this new convention that's not documented, oh geez,
we've got to go update that documentation? Is it kind
of an only-when-you-discover-it type of issue?

Speaker 2 (41:42):
So I think the answer is both. I definitely think
we still have places where our documentation drifts and then
somebody notices and we're like, oh shit, we
gotta fix that. But we also do leverage tools like
Danger, Danger JS, with GitHub, where it can look at
code and it's not necessarily like a linter of basically
saying like hey, this is bad, but it can make
a comment of being like, hey, you're doing something maybe

(42:05):
this is related to this link over here, and
direct developers or whoever's reviewing to go take a look
at the documentation. Maybe there are no changes required there.
And we definitely need to be careful about how much
noise we generate. But you know, if in the case
of like a migration, if a developer writes some
migration and then submits it, we could basically say, hey,
did you add a new file to like the db

(42:25):
migrate folder. If so, make sure you're following the
steps in here and make sure that it aligns,
and kind of point them back at the documentation, both
for the writer of the pull request and the reader,
and then kind of helps make sure that things stay
in sync. Not a perfect process. I think we're
slowly getting better at making sure that documentation stays
up to date.
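The core of a Danger rule like that might boil down to something like this, written here as a plain function over the pull request's added files (in a real Dangerfile you would feed it Danger's `git.added_files` and surface each message with `warn`; the path pattern and message text are placeholders):

```ruby
# Match files added under the Rails migration directory.
MIGRATION_PATH = %r{\Adb/migrate/}

# Given the files added in a pull request, emit reminder messages that
# point the author (and the reviewer) back at the migration guide.
def migration_reminders(added_files)
  added_files.grep(MIGRATION_PATH).map do |file|
    "#{file}: new migration detected, please follow the online-migration checklist."
  end
end

p migration_reminders(["db/migrate/20250101000000_add_index.rb", "app/models/user.rb"])
```

Because the rule only nudges with a comment rather than failing the build, it stays on the right side of the noise budget Kyle mentions.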

Speaker 4 (42:45):
Yeah, that's always the painful part. Those are great insights.

Speaker 3 (42:49):
What do you think about DHH? The guy's a bit of a
weirdo, isn't he? Was that... did I denounce that? Just me? No,
I love DHH. He did a book quite a
few years ago called Rework, which was prophetic really in
the current situation about working from home. He did a

(43:10):
RailsConf keynote, you know, I think it was a couple
of years ago, where he said that at Basecamp
they have never had a DBA, so they've never employed
a person whose job it was to administer the database.
This is something which Rails has just magically scaled
up, and the database has scaled up. Are

(43:33):
you in the same situation? Have you never employed a
DBA for your very large Rails database?

Speaker 2 (43:41):
Yeah, actually we are in the same situation. I believe
we were going to hire a DBA
this year prior to the pandemic, and then I think
there were some complications. But prior to that, the company
has been operating for over eleven years, and so far,
no DBA. We definitely have some DevOps folks that are
a little bit focused on making sure that the

(44:02):
database is running and making sure that you know, we've
got replication set up and proper statistics. But we kind
of put the onus on everyone, like you don't have
one person who is the guru of SQL, you have everyone,
and so everyone tries to teach everyone these things and
we try to do our best to share that knowledge
where we can to make everyone as expert as we can.

(44:25):
So we've managed to go, you know, eleven years with
no DBA, and I think we're only getting to wanting
one now because we're trying to do really customized
processes of, how do we do this online schema
migration stuff, how do we make that completely automated, which
is actually going to be a completely distinct system to
the Rails system, because we're going to want to apply

(44:46):
it to any of our projects. Or maybe some gotchas
with like upgrading MySQL; there are probably some things
that they might actually have really good insight into. But
think our general approach is, even in that situation, we're
going to have one DBA and hundreds of developers, and
we want to make sure that, you know, they may
have knowledge and might be useful for talking through things

(45:08):
and sharing things, but the work is still going to
fall to the developers, and, you know, we need
to make sure that everyone is learning as much as
they can and not just blindly hoping that the DBA
is going to handle it.

Speaker 3 (45:20):
Yeah, I mean, the way DHH
presented it, it was kind of, this is a
necessary evil; the mindset was, instead of having a database specialist,
Rails enables developers to kind of handle
this themselves and not just kind of blame the database

(45:40):
man or woman when the thing goes wrong.
Surely as a company gets bigger, you have more specialized
roles and not less specialized roles.

Speaker 2 (45:52):
Yeah, I would agree, and I think there are more
specialized roles, but I think there are skills that apply
to everyone. So you know, as the company grows, you
may have more specialized roles that have more specific knowledge.
But I think probably with that specific knowledge comes the
responsibility that they are not gatekeepers of that knowledge, right?
They may be experts and they may be building content,

(46:14):
but I would say part of their job is to
make sure that that content is consumable by everyone.
And you know, if they're answering the same questions over
and over and over, they're not doing their job to
educate people on how to self serve and do it themselves.
And that's how we learn and grow as a community
and get better is just by sharing this knowledge.

Speaker 3 (46:36):
It's a really quite interesting situation. I
don't know what it means for the DBAs, but I
think there's definitely more database work out there. But
I think because Rails just makes it so easy to
work with databases at scale, you kind of tend
to hit that stage much, much later on.

Speaker 2 (46:57):
Yeah, I agree, you don't necessarily have to have everyone
custom-building SQL; Active Record does a pretty good
job of being an ORM that lets developers just do
the things they need to do. And you know, there are
notifications available to easily add tooling that you don't
need the DBA for. But you know, as things grow,

(47:18):
there are things that Rails doesn't yet have tooling for, and maybe
that's something where, if you have a DBA who
is well versed in Rails, maybe they can contribute
back to the framework or add their own gems that
can help everybody get better at working with databases. And
you know, it doesn't necessarily invalidate their job, but their

(47:38):
job becomes more of a knowledge producer and they try
to share that knowledge and make the community better.

Speaker 4 (47:45):
Yeah, we're in the same boat. We like to push
that knowledge down as far as possible, but there certainly
are opportunities when you're deep in the materialized views and
windowing in Postgres or something like that, where you're just like,
I really want to phone a DBA friend.
And that's the consideration, I would suppose. Yeah.

Speaker 2 (48:04):
I think that's like the role of the specialists,
the people who have the specialized knowledge; they're probably more consultants.
And you know, you have someone who's like, I've got
a really gnarly problem, I don't know what to do, yeah,
like get them to sit down and help you.
And that's a big asset that they can help with
people and you know, if that's a one off, it's
a one off. But if they do this ten times
in a week, maybe there's education there, or maybe there's

(48:26):
tooling. And I think it goes for pretty much
any role that you ever feel like you're just throwing
something over the fence. If you push that responsibility also
to the developers, you can also end up with a
much higher quality project.

Speaker 4 (48:39):
Teach them how to use includes and avoid some of
those massive queries and N plus one problems.

Speaker 2 (48:46):
Or use some of the gems available and have the
N plus one queries just automatically avoided for you. You bet.

Speaker 1 (48:53):
Yeah, I've had some include statements which span fifty lines
on some projects that I inherited. It's insane, the kind of
data that they're trying to return. But yeah, that's crazy
good advice.

Speaker 2 (49:07):
Also, is there anything else

Speaker 1 (49:09):
that we want to talk about? I know we're getting
to about that time.

Speaker 2 (49:13):
I'm just going to mention one thing about includes, because
I think this is another gotcha of Rails:
they don't really teach you what happens with includes, and
includes actually does two things in the background. It
either uses a preload or an eager load, and a
preload splits it off into a different query entirely, where
you do something like select star from table where ids

(49:34):
in this big list. But then there's eager load, which
tries to smash it into one big query. This is
something where Rails always suggests using includes because it'll handle
that distinction for you. But that distinction actually makes a
difference at scale, and when you're dealing with large tables,
eager load is almost always significantly worse, and so it's

(49:57):
almost all the time you actually want to use preload,
same interface, but it's just this interesting little gotcha that
you don't really realize until it starts biting you.
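That preload-versus-eager-load distinction can be simulated in plain Ruby, with arrays standing in for tables; the ActiveRecord calls named in the comments are the real APIs, everything else here is illustrative:

```ruby
USERS = [{ id: 1, name: "Ada" }, { id: 2, name: "Grace" }]
POSTS = [
  { id: 1, user_id: 1 }, { id: 2, user_id: 1 },
  { id: 3, user_id: 2 }, { id: 4, user_id: 2 },
]

# preload style (`User.preload(:posts)`): a second query, roughly
# `SELECT * FROM posts WHERE user_id IN (1, 2)`, stitched together in
# Ruby. Rows transferred: number of users plus number of posts.
def preload_style(users, posts)
  ids = users.map { |u| u[:id] }
  by_user = posts.select { |p| ids.include?(p[:user_id]) }.group_by { |p| p[:user_id] }
  users.map { |u| u.merge(posts: by_user.fetch(u[:id], [])) }
end

# eager_load style (`User.eager_load(:posts)`): one LEFT OUTER JOIN,
# so every user row comes back once per post, with the user's columns
# duplicated on each joined row. On wide tables with many children,
# that duplication is what makes it fall over at scale.
def eager_load_style(users, posts)
  users.flat_map do |u|
    matches = posts.select { |p| p[:user_id] == u[:id] }
    matches.empty? ? [u.merge(post: nil)] : matches.map { |p| u.merge(post: p) }
  end
end

p preload_style(USERS, POSTS).size    # 2 result records, each with its posts
p eager_load_style(USERS, POSTS).size # 4 joined rows (one per user-post pair)
```

Both strategies load the same associations; the join version just pays for it in row duplication, which is the "almost always worse at scale" behavior described above.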

Speaker 4 (50:05):
And you got to remember everything is just a tool,
and you can either smash your finger with that hammer
or you can build what you want to build with it.

Speaker 2 (50:15):
Exactly.

Speaker 1 (50:17):
If people want to follow you and some of the
stuff that you're doing online, where should they go?

Speaker 2 (50:22):
I don't really have a huge online presence. I do
have a GitHub account, but that's mostly work on
public gems for the company. But what I'm trying
to do is be a little bit more present in
the community. So I do have some talks available at
RailsConf, and my goal is to
be pushing out a little bit more written content, which

(50:42):
is available on the blog that Clio provides, so
I can provide a link for that in the future,
as well as a link to any of the talks
that I have. But unfortunately I'm not a super user
on Twitter, but I can also provide my LinkedIn where
I sometimes post new information there as well.

Speaker 1 (51:00):
Awesome. Well, I'm going to move us over to some picks. Luke,
do you want to start us off?

Speaker 3 (51:05):
Yeah, listen, listen to this. Listen to this. Can you
hear that?

Speaker 2 (51:12):
I can't hear anything.

Speaker 3 (51:14):
That is the sound of me signing up for Drifting
Ruby dot com, which is a quite excellent series
of Rails screencasts, including the transition from jQuery to ES
six episode. I am a notorious jQuery user, almost an
unrepentant one, but Drifting Ruby has let me see the

(51:36):
light and I'm a newly reformed character. So my pick
is Drifting Ruby dot com.

Speaker 1 (51:43):
I must say that's a great pick. So all right, hey, Matt,
you want to chime in with some picks.

Speaker 4 (51:50):
Well, my pick comes out of this. I'd say that
Danger JS is something that I really want to look into.
We're significantly investing in CI infrastructure and deploying
those branches like you were talking about, Dave, and so
that looks like a really great way to tie back
to documentation and check that best practices conform

(52:11):
with the rest of the company. And that's my pick
for today. I'll let you know what I discover. Awesome.

Speaker 1 (52:18):
I'll jump in with a couple of picks. One is
from Google. It is the Titan Security Key. Other
companies have similar products, like the YubiKey. It's a
USB or an NFC key that will do your authentication
for you. So actually, I have a couple of these
arriving in the mail today in preparation for another Drifting

(52:40):
Ruby episode that I want to do on these
things. So that should be a pretty interesting one. I don't
think it's going to have too much depth, because I'll
never have used one of these keys before later today. And
the other is I have now in front of me
a little rack of Raspberry Pis, eight gigabytes of

(53:01):
RAM, that I'm building into a tiny Kubernetes cluster, well,
just because I can, really. So I love Raspberry Pis,
and they just released their eight gigabyte versions, which actually
makes it nicer to run some heftier things on them now.
Still slow, but still a lot of fun. All right, Kyle,

(53:24):
do you want to join in with some picks?

Speaker 2 (53:26):
I didn't prepare anything, so I actually don't have anything
that's off the top of my mind here for things
to just call out.

Speaker 1 (53:32):
All right, fair enough, Well, it was great talking to you, Kyle,
and I always like talking about technical debt because I
am notorious for introducing it.

Speaker 2 (53:41):
I'm always happy to, like, build tools to fix these
things so that we can make things better.

Speaker 1 (53:46):
All right. Well, that's a wrap for this episode. We
appreciate you coming and talking with us. So it was
a lot of fun.

Speaker 2 (53:53):
Yeah, it was wonderful. Thank you, Bye, take care.