21: Everything Is A Database Problem

Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Wolf (00:05):
Hello, everybody.
Um, it is time for anotherepisode of Runtime Arguments.
I am Wolf, and as always, I'mjoined by my best friend Jim.
Jim, say hi.
Hey Wolf, how you doing?
I'm doing good.
Um This is gonna be an episodewhere Jim happens to be an

(00:27):
expert, not just research, buthe really is an expert.
Um it's gonna be aboutdatabases.
This is our episode number 21,uh, which means it's our 22nd
episode, because we started withzero.
Um like you're supposed to.

(00:48):
Like you are supposed to.
Thank you very much, Lua.
I don't know what's going onthere.
Um and MATLAB, by the way,starts at one.
What the hell is wrong withMATLAB?
What's up with that?
Anyway.
Uh well, Cobalt.

SPEAKER_01 (01:03):
There is stuff as well.

Wolf (01:05):
Oh, I didn't remember that.
There is stuff we normally sayat the beginning, like uh that
we have a website to go to,runtimearguments.fm, that lists
the episodes, and you can sendus feedback.
There's gonna be show notes andtranscriptions.
Um we'll talk about that at theend.
Uh in the meantime, uh let'sstart off with uh how's your

(01:28):
week?
How was your week, Jim?

Jim (01:30):
Oh gosh, I've had a pretty good week.
Um I I I may have mentioned itbefore.
I do all of my developmentremotely.
I I've got this big fancy Macstudio with lots of RAM and CPU,
and I use it as an SSH uhterminal to log into machines
located elsewhere.
Uh I'm finally uh gonna move uhas much development locally as I

(01:54):
can.
Uh our our in my day job, wehave this giant uh web-based
application.
It's a lot of uh Perl and a lotof JavaScript and and uh various
other things.
And um uh it relies on patApache and HAProxy and many,
many other technologies.

(02:15):
And I've always been hesitant tobring it home because I'm on a
Mac, it's running a uh M1 chip.
Yes, I've got one of the olderones, uh, but it's it's an
ARM-based chip, not an Intel.
So I've been hesitant to try todo this.
Plus, I didn't want to pollutemy machine with all the stuff
that I need for development.
Uh, and then I did some seriousthinking.

(02:37):
We're using Docker to deploy forour customers.
Um, so I thought, why not useDocker on my desktop and and do
it that way so I don't have toload up my machine with all
kinds of Perl libraries and andPostgres stuff and everything.
Why don't I just do it all in ain a Docker container?
So I started setting that up andI stumbled upon something that I

(03:00):
am just so amazed at.
Uh, and that is Docker lets youcreate an image that's
multi-architecture.
So you you create a Docker file,that's how you you build the
image, uh, and in that Dockerfile you list all the things
that you need.
Uh, and then when you run it,you I use Docker build X with a
bunch of arguments.

(03:20):
There's an argument you can passon that dash dash platforms, and
you list the platforms you want.
And in my case, I want x86 andARM.
So I list that, and it's amazingbecause it builds the image for
both.
In fact, it does it in parallel,so it's kind of neat.
Now I'm on I'm on the ARM chip,so it builds the ARM stuff

(03:41):
really, really quickly, but itit builds the x86 stuff fairly
slowly because it's doing someemulation.
I don't know if it's using X uhQEMU or or what, but that's kind
of slow, but that's all right.
Uh it it builds these images,and and when I push those up to
a registry like uh GitHub, I useGitHub as a registry, and I also
run my own local registry.

(04:02):
Um when I push it to myregistry, it pushes the whole
image up.
But when I run it, it only pullsdown the the layers that it
needs for that architecture.
So the the the pull is fast andit's really nice.
And this thing is workingperfectly.
I just I just love it.
So I'm starting to dodevelopment locally.
Now, uh uh Wolf, uh you and Ihad the conversation this

(04:24):
morning about the issue I raninto and and the way I develop
with Docker.
We're we're gonna do an episodeon Docker at some point, but um
you know Docker can the Dockercontainer is all of the tools
and stuff that I need, but Idon't have the source code in
there.
I mount the source code in adirectory, it's in a directory,
I mount that as a volume toDocker.

(04:44):
So it sits outside of the Dockercontainer.
Uh, but the Docker container cansee it because I mounted it.
I ran into a problem.
I I know this is getting kind oflong, right?
Uh I ran into a problem in that.
It was I was making changes tomy source code.
Um, but my my application wasn'tseeing those changes.
Uh you know, I I would hit a CGIscript and it wasn't getting the

(05:07):
right script.
And sometimes it looked like acorrupt file, even though I look
uh you know locally and thefiles are just fine.
Well, it turns out uh bindmounting doesn't actually exist
on Mac.
Uh it's a great thing for Linux.
You bind mount a volume toanother location, and it's the
same data, just with twodifferent references to it.

(05:31):
Um inside the Docker container,you see you see the data,
outside the container, you seethe data.
It's all the same data.
Well, you can't do that on Macbecause Mac doesn't have bind
mounts in the kernel.
Uh, what they have to do then issynchronize the data.
Uh, it actually copies the datafor use by Docker.
Um, and and you get out of syncto the point where I would have

(05:56):
to uh restart the Docker enginein order to pick up my changes.
That became unworkable.
Um, but I did find a trick, itcost money.
I actually had to upgrade andstart using the Docker Pro um uh
subscription, which gives mesome kind of fancy schmancy
synchronization, and that'sworking so far.

(06:16):
I've only been doing it foreight hours, but it's working,
and I'm happy, so I'm developinglocally.
This is going to be uh a wholedifferent way for me to work.
Uh and and the nice thing is Iwon't be relying on the internet
anymore uh or my customer sitesor or uh anything.
Uh so that's kind of neat.
So yeah, in a long-winded way,that was like five minutes of me

(06:38):
talking about Docker.

Wolf (06:41):
To me, the amazing part about that is not you moving
from remote development tolocal.
It is stuff you didn't say.
It is a Vim user who uses VSCode additionally.
It's not like you switched to VSCode, but Pro code.
Now you're using it.

Jim (07:01):
Yeah.
Yeah, I am.

Wolf (07:02):
I am.

Jim (07:03):
That's a whole world of difference for me.

Wolf (07:06):
You using um a bunch of AI stuff.
So lots of changes for you.

Jim (07:13):
Uh it's it's overwhelming and fun and and uh it's kind of
neat.
And uh you you didn't think Icould do this, did you?

Wolf (07:21):
I was based on the All I can say is watching you grow in
this way absolutely warms myheart.
And I'm gonna say the words thatwhen I say them to my wife, uh,
because she names something uhthat is like what what it's
already like for me, and I saythese words, she becomes

(07:42):
furious.
Uh I'm gonna say them to youwith the hope they don't make
you furious.

Jim (07:47):
Okay.

Wolf (07:48):
Welcome to my world.

Jim (07:51):
Thank you.

Wolf (07:52):
You're you're you're you're welcoming me with open
arms.
I am.
So how was your week?
Let me tell you, let me tell youabout my week.
Um lots and lots of stuffhappened this week.
Uh one of the things is um Jimpointed me to some uh Claude

(08:12):
Best Practices documents onAnthropics site.
And a consequence of that, firstof all, I try to get better
every day.

Jim (08:25):
Whatever things that you do, you are so good at that.
I tend to stick with what worksand don't venture out much.

Wolf (08:36):
So um I'm always trying to reduce friction.
I'm always trying to figure outwhat's the smarter way to do it
than what I'm doing today.
Can I do it that way tomorrow?
And uh I got pointed to thisbest practices for for a Claude
Code document.
Jim, you pointed me.
And I have slimmed down myClaude Code files.

(09:01):
I have taken things that I don'tneed Claude to keep in mind
because that's the that's thething that's a limited resource
for Claude.
When you're working with Claude,there's a thing called the
context.
And the context is everythingit's got in mind solving your
problem right now.

(09:22):
Um it's it's all in one session,and when it gets to as you
develop more and more, Claudeslows down.
Eventually you have to clear it.
Claude sort of forgets thingsunless you tell it to remember.
Things get pushed off the end.
So your Claude Your Claude MDfiles, um, they're they're in

(09:44):
memory, they're in your contextthe whole time.
So they need to be supervaluable and super slim.
And if there's stuff you onlyneed Claude to know about and
understand for just a moment,like for instance, when you
start up a new task or closedown an old task, uh, for

(10:06):
instance, uh, that it needs toupdate your work log.
That's a thing I do, that youwant it to um uh uh commit and
push or uh figure out whatthings might be.
Any anything that you don't needClaude to keep in mind the whole

(10:26):
time, instead of making that bepart of your Claude MD, you make
it into a skill.
And skills are things that youinvoke by hand with a name.
Uh there's built-in skills, or Idon't even know if they're still
called skills when they're builtin, but for instance, in the
latest Claude, you can say slashinsights, and it will tell you

(10:50):
about all the things you've everdone with Claude Code and
whether you succeeded or failedand how you can be better.
It does that with a web page,it's really great.
But I have um new skills formoving between tasks and
remembering what I've done sofar, things like that.
Um, and I've pulled out thespecific languages that I work

(11:16):
with.
I work a lot with Python, I worka little bit with Bash, and I'm
learning Rust and not giving itnearly enough time.
It doesn't need to rememberthose things all the time.
For instance, at work, I'm onlyusing Python.
It doesn't need the other two atall.
So why put these in my Claude.mdfile?

(11:37):
Don't.
Um, but also it doesn't need toknow the things I care about
with Python, like for instanceto always use type annotations
and stuff like that, until Iactually want it to maybe write
some Python.
So in my ClaudeMD file, I say,and if you're going to write

(11:59):
some Python, go look at thislanguage file.

SPEAKER_01 (12:02):
Oh neat.

Wolf (12:02):
That way, not in my context, giving me more and more
room.
So I'm getting better and betterand better at um reducing
friction in the places whereClaude can help me.
Um, and mostly that is about umcontext switching, my contexts,
not Claude's context.

(12:24):
Moving from one task to another.
Not nearly as much about, hey,Claude, write some code for me
as everybody wants.
Everybody's all about, oh,Claude can write code for me.
Claude, at least right now,Claude does not write code that
is good enough for me.
It's good enough for tests,right?
But it's not good enough for youknow the principal code.

(12:47):
Um always read.
Always read what Claude isdoing, always know what it what
it means.
Make sure you understand.
Um but that's what I did duringmy my uh cool.

Jim (13:00):
I I I've got something I want to say.
We uh we've mentioned this many,many times.
This whole podcast came out ofWolf and I meeting for lunch
every Saturday at a sushi placeuh near Ann Arbor.
Uh so we're still doing that.
And we've uh started invitingmore friends.
It used to be just me and Wolffor years, it was just the two
of us.
Uh, but now it's it's become agroup.

(13:21):
There's like at least four of usthere every week.
And uh if you're in the area,please come come to Biwaco uh uh
in Saline, uh, 11 o'clock onSaturday mornings.
Um when we were there last week,uh uh one of the things that
Wolf says a lot, and it's kindof funny, is he says he talks
too much, right?

(13:41):
We we would you just say that'saccurate, Wolf.
You say you talk too much.
And I'm gonna go on record hereand say, yeah, Wolf talks a lot,
but you know, when he does talk,it's always good information.
I I've been around a lot ofpeople, or at least a handful of
people, who talk and talk andtalk.
They talk too much, and I walkaway from the conversation

(14:04):
without any additional knowledgethat I had going in.
With Wolf, he he talks a lot.
I'm not gonna say it's too much,because when I walk away from
that conversation, I've learnedsomething.
Uh, and if you join us forlunch, you can learn too.
And we'll learn, we'll learnfrom you, whether you want to or
not.

(14:25):
But it's always uh uh enjoyableuh to to especially when Wolf uh
talks about something he'spassionate about.
Uh it's a lot of fun.
And I do have a lot of passions,yeah.
So he does not talk too much, hejust talks a lot, but it's good
talk.
So yeah.
It is too loud though.

(14:46):
It can get loud.
Uh the more passionate you areabout, the louder it gets.
But again, did you ever see mecomplain about it?
No.
No.
It's it's fun.
It's fun to watch him get woundup.

Wolf (14:59):
Um, you know what it's time for, though?
Uh feedback?
It is time for feedback, and uh,I know you've got a couple
things, but let me start.
Um last episode, we were talkingabout Git.
Yes, and I I misspoke.
I said something, and Iimmediately knew it was wrong,

(15:23):
and I beat myself up about itafterwards when I re-listened to
the episode, and that was this.
Um before the days of Git andMercurial, um, it was the case
that uh in Git and Mercurial,the central repo is exactly what

(15:45):
you have on your local machine,um, except maybe on your local
machine you also have sourcefiles, um, not just a database
of objects.
Um in the old days, the thingthat you had on your machine was
source files, but it couldn't beused as a database.

(16:07):
You couldn't go back in time.
Anytime you wanted to find outabout histories or previous
versions or whatever, you neededa network connection and you had
to get to the central repo uhrepo.
Now I talked about what wasactually at that central repo,
and I said something very closeto these words.
I said CVS stores um theoriginal source file in a way

(16:36):
where it is the text plus umhunks of changes that will get
you back and forth between thedifferent versions, but all in
one file.
And the thing that was uh amistake about the words that I
used there are that I said CVSwhen what I should have said was

(16:58):
RCS.
RCS uh, of course, was one ofthe earlier the revision control
system.
Um CVS uses uh uh hidden historydirectories for each uh
directory that you're in uh thatgets you the changes, and the
source file that you that isactually there is the latest

(17:21):
version.
So it's easy to get the currentthing, and you can go back in
time if you want.
Um so that was my mistake.
I said a thing with CVS, but itwas really RCS.
That's that was feedback fromme.

Jim (17:33):
From the last episode, episode 20, uh on uh Git.
What do you got, Jim?
So I've got a couple of things.
Um episode three, way back inMay, we talked about
WebAssembly, uh WASM.
Um I just came across a reallygood article just last week
about the history of it.
Um and I I I thought it wasreally interesting.

(17:55):
The the the way the browserdevelopers all got together uh
and pushed for it, um, withoutmanagement really knowing what
was going on.
Uh uh the the Microsoft peopleuh working on uh their browser
and trying to include WASM andthe Mozilla people and the
Chrome people, um, they all sortof colluded together, and Apple
also with Safari, they allcolluded together and they all

(18:18):
told their management that theother guys are doing it.
And I thought it was prettyfunny.
That's how they got it bymanagement because you know, try
to tell management you want todo something that that that it's
gonna take a while and um you'renot really sure about the
benefit yet.
Uh you might get pushback frommanagement, but they just told
the management, yeah, the otherguys are doing it.
We have to do it.
Uh anyway, there's a really goodarticle.

(18:39):
I'm including the link to thatarticle in the show notes so
that uh you can go read thearticle as well.
I thought it was kind of neat.
Uh, the second piece of feedbackI have is um uh it's sort of
related to the our very firstepisode, episode zero, pass
keys.
Um, uh, you know, I'm I'mventuring into the world of uh

(19:01):
of AI and I'm using Claude Code.
And uh our good friend Marlonsuggested that I ask Claude Code
to write me a pass key serverand client.
Uh and of course, for me theserver is in Perl.
For Wolf, it would have been inPython, I'm sure.
Uh created the server in Perland the client in JavaScript.

(19:24):
Obviously, JavaScript for theclient makes sense, right?
So I did that, and 10 minuteslater, I had a working
implementation of a pass keysetup uh where I can take that
server code and incorporate itinto my web app and the client
code.
Of course, I gotta inclincorporate that into the into
the um uh uh the the client sideof my app.

(19:46):
But had I had to go searchingfor that and and piece together
all the little pieces to makethat work, I I would still be
doing it two weeks later.
Uh, I was just really impressed,and thanks to Marlon for
suggesting that it was a reallynice exercise.
Uh and to top it off, I tookthat code that I generated uh
and I put it into a repo, andyou can find it on our GitHub

(20:09):
repository, the runtimearguments GitHub repository.
And I've included a link forthat in the show notes as well.

Wolf (20:16):
I just want to make sure, Jim, when Claude wrote Perl for
you.
Yes.
You did read that Perl, right?

Jim (20:28):
Am I supposed to read it?
I took a look at it.
I'm not incorporating this intomy application until I really
read it.
But just the fact that it workedwas pretty amazing.
That's as far as I got with it.
Uh if I do take that code anduse it, I will read it.

Wolf (20:44):
A long, long, long time ago, when I was in my early 20s
and I was working on my firstreal application, which was a
word processor for theMacintosh, the when the
Macintosh looked like a toasteror whatever it was.
Yeah, the Fat Mac.
That was the one we were writingfor.
And it was called Full WrightProfessional.
Yeah.
Um, Full Wright Professional,uh, the code name for Full

(21:08):
Wright Professional was DancBear.
And the reason it was calledDancing Bear uh is very strongly
related to one of the sentencesyou just said, and that is this
the amazing thing about adancing bear is not how well it

(21:28):
dances, it's that it dances atall.
Yeah.

Jim (21:33):
That's that's good, and that's kind of where I'm at with
that.
Before I incorporate any of thatcode into my system, I'm gonna
make sure I understand it.
And I quite likely won'tintegrate it directly as it was
written.
Uh, this is more of an exercise.
In fact, I know that they'reusing some of the tools that I
don't use in my Perl code, buthere's a working example that I
can steal from now.

Wolf (21:54):
All right.
With that, uh, let's get to themeat.
Um Jim?
Yeah.

Jim (22:01):
Yeah.
Today we're talking aboutdatabases.
As Wolf uh mentioned, this issomething that he calls me an
expert.
I don't know about that.
I do like databases.
Um, I I enjoy working with them.
Uh, and uh I'm gonna start rightoff by saying the database that
I use professionally for severalyears now, gosh, for 25 years
now is Postgres.

(22:22):
Uh so a lot of what I say aboutdatabases is gonna be slanted
towards Postgres because I don'thave the experience with others.
I I've played with SQL Serverand MySQL, but it's all Postgres
for me.
But that's not really what we'retalking about.
We're talking about databases ingeneral.
Um a lot of times when I do anepisode like this, I go deep

(22:43):
into the history and I spend 20minutes talking about the
history of databases.
I'm not gonna spend that muchtime on it.
I I will mention that the uh theterm database goes back to 1962.
Uh uh that's according to theOxford Dictionary.
I would have thought it wentback much further than that.
Um but uh it it it goes back to1962, and that happens to

(23:03):
coincide with the available ofuh hard disks, uh rotating
media, um, which again seemslike that would go back further.
But I guess you know, back inthose days they weren't storing
on hard disks, they were storingit on on tape and punch cards
and and those kinds of things.

Wolf (23:21):
But the name database um just reminds me that uh the the
older I get, the more time Ispend programming, every problem
you face is at some level oranother a database problem.
Maybe it doesn't take Postgresto solve the problem.
Maybe it's SQL, uh SQ Lite,maybe it's not even um uh an

(23:47):
official database, it's just aspecial data structure, but it's
it's a data problem, it's adatabase problem.
It is anyway.
I I didn't mean to interrupt,keep going.
A data problem, yes.

Jim (23:57):
Uh maybe that's why I I gravitate towards databases
because I sort of like how theywork and and they they they've
proven very, very useful for me.
I I I'm fascinated with data ingeneral, and a database is just
a huge collection of data.
Um gosh.
Uh interesting, one of thethings I found in the research
was uh one of the firstdatabases actually created was

(24:20):
the IBM IMS InformationManagement System.
And guess what?
They created that for the Apolloprogram.
Remember we talked about uh thekey fob episode about is your
key fob more powerful than thethan the rock than the computers
that took a man to the moon.
Um uh yeah, that's where that'swhere IMS came from, the Apollo

(24:41):
program, uh running on uh on asystem 360.
So that was kind of neat.
Uh it wasn't until 1970 that avery, very important figure in
the world of databases, uh EdgarCod, uh EF Codd, he wrote a
paper uh titled Relational Modelfor of Data for Large Data
Banks.
Here he's talking about databanks, we you know, uh

(25:03):
databases.
Uh he lays out the whole new wayof organizing and accessing uh
data uh using tables and rowsand columns.
That sounds familiar, doesn'tit?
Um queries would join the tablesbased on relationships between
the tables using a set ofoperations based on the
mathematical system ofrelational calculus.

(25:24):
I I don't think of relationalcalculus calculus when I'm
working on a database, butthat's underneath the hood, uh
the the methods that they used.
Um he powered the the the theidea of normalized tables.
Uh and if you're in databasedevelopment, you know about
normalization, and that is whereyou don't repeat data.
Uh you you have uh like ifyou're a classic example, uh a

(25:49):
sales order, right?
You got the header, whichcontains who the client is and
many other things, and then yougot the detail table that
contains all the details for theorder, the line items for the
order.
Um that comes out of uh EF COD'suh uh normalization of tables.
Get the data in the right placeso you don't have to repeat it.

Wolf (26:10):
I would love, I would love to be somewhere where I where
the database actually wasnormalized.
I don't yeah, I don't know thatI've ever seen it.

Jim (26:21):
It's uh to me, it's trouble if you try to go for a hundred
percent normalization, right?
Um, but you should strive forwhat you can.
Um again, you know, think of thethe header uh uh detail tables,
uh that kind of a thing.
Put the data in the right place,uh, and it and it all makes

(26:41):
sense.
Um by basing this thing onmathematics, um it's all in
mathematical terms.
So queries can be rewritten andto it to optimize them uh and be
proven correct throughmathematics.
Uh you know, I was a math majorin college, and this is still

(27:01):
way beyond what I understand.
How they prove that a query, arewritten query is correct.
I I don't know, but it it's it'sthe basis of query optimization,
which is huge these days.
I mean, the database I workwith, Postgres does an awesome
job at optimizing the query whenI tell it what I want.
It figures out the better way toget it.

(27:21):
Um, and that's again going backto back to mathematics.
Uh, and all of this work wasdone before SQL was invented.
Uh the document, by the way, Ido have uh our show notes are
going to be very, very busy thistime because I have a link to
that document from EFC in theshow notes.
So definitely check that out ifyou get a chance to take a look

(27:43):
at it.
Um his paper, by the way,inspired uh a lot of work at a
lot of teams at variousuniversities, including uh
here's another name that'simportant in databases, and
that's Michael Stonebreaker.
Uh he created Ingress, which wasa uh a relational database.
Um he was at the uh Universityof uh of Berkeley in California.

(28:08):
He created that.
Uh he left the university and uhtook his Ingress code and
created a new project um calledProgress.
I'm sorry, not Progress,Postgres.
Progress is another database.
He created a database calledPostgres.
Um that's the beginnings of thedatabase that we're using today.

(28:35):
Um I I thought that was kind ofneat.
Uh it didn't include SQL rightout of the box.
Uh that was something that gotadded a few years later, and
that's when they changed theproject named up Postgres QL.
That's the official name of whatI use uh these days.
Um in the mid-70s, though,before Michael Stonebreaker did

(28:58):
his thing, uh IBM created adatabase called System R.
That was the firstimplementation of SQL, the
structured query language.
Um so that's that's important.
And I know a lot of people arescared of SQL because I I
believe because they don'tunderstand it.
Um people shy away from it.
Um I I I sort of embrace it.

(29:19):
I love the way SQL works.
It I can I can write some crazyqueries and get that kind of
data out that I want.
Uh, before SQL though, a lot ofpeople were writing code and
they were relying on ISAM andVSAM.
You remember those terms?
Wolf, you've run across thatbefore, haven't you?

Wolf (29:36):
Uh-huh.

Jim (29:36):
ISAM uh index sequential access method.
Uh there was files with indexes,uh, they were all separate files
with different data in differentfiles, and your program had to,
for instance, read the salesheader record out of one table.
Uh, once it had the sales uhheader record, then it could go
read the details out of anothertable.

(29:58):
And then programmatically, youwould piece that data together
and produce a work order or asales order or an invoice or
whatever you had.
Um, it was a really common wayto work uh if you did COBOL.
And and I did a lot of COBOL uhback in the day, and it was all
using ISAM.
Uh VSAM is kind of similar as avirtual storage access method.

(30:19):
They created some more accessmethods, so you could uh read um
uh sequentially, uh, you couldread indexed, you could read
relative.
So if you're on one record, youcould get to the next record,
uh, that kind of a thing.
Um and that was kind of neat.
Um, but but then you know let'stalk about real databases, the

(30:40):
kinds of databases that we usetoday.
Um maybe a little hit uh uh alist of some of the old
databases first, is the olddatabases first, and then we'll
talk about the the the currentones.
Uh but way back when there wasIBM DB2.
It was a big, big database thata lot of people used.
Uh Informix uh database.

(31:00):
Do you remember?
Have you have you ever playedwith that?
That was the first database.

Wolf (31:05):
Not Informex, but the very next one you're gonna do.
Okay.
If you know, yeah, okay.
Well, I'll tell you when you getthere.

Jim (31:11):
Yeah, okay.
Uh Informix, that was the firstreal production level database
that I used back in the late80s.
I wrote a uh uh uh tickettracking system uh for the
company I worked for and foranother company.
We we both shared it.
Um and and that was for you knowtaking customer uh uh tickets

(31:32):
and storing them in the in thedatabase.
Um and it used a language calleduh Informex 4GL, um which we
don't see much of these thingsanymore.
Um the the uh 4GL is fourthgeneration language.
Uh the whole idea was a wholelanguage and database built
together so that you had formsand reports and queries and data

(31:55):
all in one package, um, so youcould write your programs and it
would access the data and showit on the screen or print it on
paper or whatever.
Uh informics was one of the bigplayers in that world.
Um uh for uh corporationsanyway.
Uh let's talk a little bit aboutthe some of the smaller players
though.
Like like Wolf, you you said youused this D Base.

(32:19):
Remember that in the uh I doremember it.

Wolf (32:22):
And the reason I remember it is because that word
processor I talked about.
Yeah.
Uh my little company andArpersoftworks, we got bought by
Ashton Tate and moved out toCalifornia, and we were in the
building with the D base guys,and we um we used D Base, um,

(32:42):
learned all about it.
It was interesting and new andum Wow.
So I had close-up knowledge.
That's pretty neat.

Jim (32:51):
Um that was uh like a 4GL, right?
It had a programming language,it had a forms generation thing,
it had a report generator, uh,of course, it had the database.
Uh all that stuff built into onething.

Wolf (33:03):
We actually um kind of have something today that is not
unlike that.
It isn't really one piece, butto a user, to a to the to the
engineer, to the programmer, itacts like one piece.
I'm dying to know what's Ruby onRails.

Jim (33:22):
Yeah, okay.

SPEAKER_01 (33:24):
Yeah.

Jim (33:25):
But Ruby on Rails doesn't really include the database,
does it?
Um you had to set up a database.

Wolf (33:30):
The whole point of Ruby on Rails is uh that it is the
database and the ORM and all theaccess and the views and like
that's the point of Ruby onRails.
That's why people love it.
Because it solves that problemthat everybody has.

Jim (33:50):
Well, that's neat.
Um there were some variations toD Base, uh, and they were all
really built on uh uh they werelike clones of D Base.
I think they even worked with DBase files, and that was uh
Clipper.
Remember that one?
Uh Foxbase, Fox Pro.
These are some tools I I'venever used, but the next tool I

(34:10):
used quite a bit, and that wasuh Microsoft Access.
That was um that was a wholeenvironment.
You had a database uh using theJet database engine, you had the
query language, the forms, umeverything.
Um, and you could write programsthat interacted with the screen
and saved the data in thedatabase and allowed you to
print reports.

(34:31):
Um, I did quite a bit of work inthat.
Um, it was it was kind of neat.
In fact, I was playing aroundwith that and I wanted to access
the data outside of Access.
Uh wanted to get access to itright in a C program, uh, the
data in the Access database.
And I saw the announcement thatMicrosoft made about uh ODBC.

(34:54):
Remember ODBC?
It's still out there.
Um, it was very, very early on,and I I I think this might have
been predated um uh email even.
I think I had to call a phonenumber because I wanted to find
out what it was.
I called the phone number, itwas Microsoft in Seattle, and
the guy was really interested totalk to me, and he didn't know
how little I really knew.

(35:16):
Uh he he he was all interested,and several days later a package
showed up uh on my door, um,which was this thick printed
manual, and it was it was not anicely bound manual, it was like
somebody printed it out and puta cover on it and mailed it to
me, and uh uh uh some floppydisks with the code on it, and

(35:38):
basically anything I needed toget started uh writing C
programs to access a JETdatabase or access database.
And I was so somebody sent methis.

Wolf (35:49):
It's amazing you what you can get when you just ask.

Jim (35:53):
Yeah.
I just asked, and I was flooredthat Microsoft was sending me
this, didn't charge me anythingfor it.
I think they were looking forpeople that had interest in this
kind of thing and wanted to getit out there.
And I I I'm still impressed tothis day that that that
happened.
I thought it was pretty neat.
Um anyway, another database thatI think it's got its roots in

(36:13):
the in the 80s, maybe, and Imentioned it earlier.
Uh I misspoke.
Progress, the progress database.
That's another 4GL.
Uh, and I mentioned that becauseum I have some friends that have
done a lot of work in progress.
In fact, uh big companies likeQuicken Loans, which what are
they now?
Rocket Mortgage, they're athey're a progress shop, they're

(36:36):
using progress uh in their backend, and I find that really,
really interesting because it Iknow my friends like it.
I I was never all that impressedwith it.
I did some work on it and and II never really cared much for
it, but it did have a fulllanguage with forms and reports
and and all that kind of stuff,and it was pretty neat.
Anyway, let's get into somemodern database stuff.

(36:58):
Um, there's a few terms that Ithat I'm I'm gonna talk about
ahead of time before we talkabout the databases themselves.
And one of the one of the bigones is acid compliance.
Have you seen this?
A C I D.

Wolf (37:10):
Um I've seen it mentioned.
I don't actually know what itmeans.

Jim (37:15):
What are the tasks?
Um the letters A C I D.
A is for atomicity.
Uh transactions are atomic.
You you do a transaction and itall works or it doesn't.
And if it doesn't work, it'sit's never partially committed.
Uh when you commit atransaction, the whole
transaction, uh, that data getswritten to disk.

(37:37):
It's part of your database.
If anything fails along the way,uh you get a duplicate key
violation or some other uhconstraint violation, or there's
a hardware failure, it won'tleave your database in a in a
half-written state.
Uh that's atomicity.
Um the C is for consistency.
The data must be valid.

(37:57):
You can you can set upconstraints on your data.
You can say that this field is adate field, and it better
contain a valid date, or it'sit's not gonna update, right?
Um you can have triggers thatfire when the when the when when
the update happens.
Uh a really big one forconsistency is referential
integrity.

(38:18):
Remember, I talked about theexample of the sales order where
you've got the sales orderheader and then the detail
lines.
Uh well, with referentialintegrity, you would never be
able to write a detail line ifthe header didn't exist.
That's if you've set up yourforeign keys properly, like you
should.

Wolf (38:35):
Okay, so now you just said a new word.
Yeah, a word that happens to bevery important to me, a phrase,
and that is foreign keys.

Jim (38:44):
Yeah.
Well, okay, a foreign key isthat the detail table in this
case has a field in it as one ofits columns uh that is a pointer
to the master or the the uh theheader table, uh sales order ID.
So every detail record has uh asa foreign key setup uh that

(39:04):
points to the uh uh headerrecord.
So if you're working on ordernumber one, two, three, four,
all your detail lines containthat field, uh contain a column
with that value in it, one, two,three, four.
And then you've add a headerrecord with the ID of one, two,
three, four.
Uh you can't insert a row thatuh points to a uh uh a header

(39:28):
that doesn't exist, and youcan't delete the header record
if you have detail recordspointing to it.
That's reference referentialintegrity.
It's hard for me to say, but I II use it all the time.
Um that's what what helps keepyour data consistent.
You won't have uh dangling uh uhdetail lines out there with no

(39:52):
header.
Um it's really quite important.
Um the I in ACID is isolation.
Uh that is uh really it involvesuh locking and concurrency.
And here's a big one for youthat's part of that MVCC,
multiversion currency control.
That's a whole uh we could spendhours talking about that.

(40:15):
Um, that's this whole idea thatwhen I start a transaction and I
select some data for it and Istart updating the data, I have
a picture of what that datalooks like throughout the entire
uh processing of thattransaction until I finally get
to the point where I do myupdate and I commit my
transaction.
Uh that data looks the same tome.

(40:37):
Other people outside of thattransaction, they might query
the database, they're not gonnasee uh the partially uh uh
complete um uh updates that I'vedone.
Like I might in my transaction,I might do inserts and updates
and deletes and stuff.
But people outside of thattransaction, other processes,
they don't have any idea thatthat's going on.

(40:58):
They they won't see it at all.
They're isolated from me and I'misolated from them.
Until I commit that database, uhthat that uh transaction, uh
they won't see any of that.
Now, once I commit it, if theydo a select, they're gonna get
the updated database, uh theupdated data.
Um so that's that's yourisolation.

(41:19):
And then finally, the D in assetis durability.
Uh, once a transaction has beencommitted, it'll remain
committed even in the case ofsystem failure.
Um that's really important.
When you write data to thedatabase, you want to make sure
it's there.
Um, if a system uh were to fail,uh you want to make sure that

(41:42):
the database management systemtakes care of you.
And uh while the data might notbe in the database tables, what
the what they do is they writeit to a a wall file, a
write-ahead log.
So this this file is a list ofall the transactions that
happened on the system.
Every single uh insert, update,delete, uh, even the DDL stuff,

(42:04):
the the database definitionlanguage stuff, like create
table.

Wolf (42:08):
And it has log in its name, but it's not the kind of
log you're thinking of.

Jim (42:14):
It's not a log you can go look at.
Um it it gets written to the logfirst before it it does the
actual updating of the tables.
Um and when it does that write,it does a synchronous write or a
uh uh an F sync uh on the on theuh table uh on the file on disk

(42:35):
to make sure it's written sothat you you don't lose data.
If you uh die in a in a crash,when you bring the database back
up, it will look at the data andit'll look at the wall file and
it'll make sure that all of thethings in the wall file have
been applied to the database.
And if they haven't, it willapply those for you.

(42:56):
And it may take a few seconds,it it may take minutes, um,
depending on your system and howlarge your database is.
But your data uh has durability,it will be there.
And and you know, in my world,I'm I'm in the healthcare
industry.
Uh when we see a patient and weupdate their their medical
records, we want to make surethat data is there.

(43:16):
Uh we need we need confidencethat it's there.
So that's the durability part.
So that's acid.
Um so now, Wolf, you know whatacid is, right?

Wolf (43:24):
We and with a lot of the things you just said um sound
like the things that Git broughtto the table over CVS.
Like, for instance, isolation uhand atomicity.
Uh a commit in Git might change50 files, but when you're
looking at the server, somebodyelse who's not you, even if

(43:49):
you're in the middle of a commitand that person isn't, they're
either gonna see exactly what itwas before your commit, or
they're gonna see Exactly whatit was after your commit.
Right.
They're not going to get halfwaythree of the 50 files, which
with CVS, they could have gottenjust three of the 50 files.

Jim (44:11):
And notice notice how git uses the word commit.
Just like Databas.
Right?
Yep.
You do git add, and that's uhlike that's like an insert or an
update in SQL or a delete, andyou do commit and it it commits
that data to the to the uh tothe you know to the git uh file

(44:32):
store, whatever that is.
Um another term uh I'm gonnatalk very briefly about, and
that is uh you've probably seenOLTP and OLAP, right?
Most of the databases, in fact,all the databases that I ever
work with are OLTP, onlinetransaction processing.
Uh keyword there is transaction,uh, and that's that's what makes

(44:52):
them uh uh ACID compliant.
You you write data in an OLTPdatabase and it's gonna be uh uh
uh atomic and consistent andisolated and durable.
Um the other one though is OLAP,O L A P.
That's online analyticalprocessing.
That's what you do uh for um uhif you want to do uh data

(45:18):
analysis.
You're not doing much in the wayof updating that database.
You just have a pile of data andyou want to analyze it using um
whatever tools you have.
Um uh that would be an OLAPdatabase.
Those are typically very, very,very fast at reading and
probably quite slow at writing.
Uh, but what you're interestedin is the reading.

(45:40):
You load up this database, nowyou just start analyzing the
data.
That's what an OLAP database isfor.
Um, and here's maybe the biggestuh uh term we're gonna talk
about, and that is SQL,structured query language.
That's the whole language, uhincluding select and insert and
update and delete and uh begintransaction and commit and or

(46:03):
begin work rather that startsthe transaction and commit and
and all those rollback.
If you decide halfway throughupdating uh you you don't want
to continue, uh you issue arollback statement and it'll
undo everything you've done andnobody will be the wiser.
It's just it's like it neverhappened.
Um that's SQL.

(46:25):
Um stored procedures.
Um that's an interesting one,and that's that's really kind of
why I picked Postgres over someof the others uh back when I was
choosing it back in the early2000s.
Uh stored procedures are theability to write little snippets
of code and sometimes bigsnippets of code and store those
in the database and call them asfunctions on your data.

(46:47):
Like I've got stored proceduresto do some fairly simple things
like uh format a phone number.
Uh you know, there's aninternational standard for phone
numbers.
Uh there's a number that goesalong with it, I forget what it
is.
But a phone number would besomething like plus the plus
character, a country code in theUS it's one, and then an area
code and a uh uh uh theseven-digit phone number.

(47:10):
Um so I I've got a function thatwill take a phone number and and
print it that way for me.
Uh I've got other things forformatting dates and formatting
names.
I've got a lot of storedprocedures that I use.
And I just call those in myselect list.
Like I'll say select um uhformat phone number, and then in
parentheses, the the phonenumber column.

(47:31):
And then the output of that isgoing to be a nicely formatted
phone number.
So let's store.

Wolf (47:35):
This immediately gives me a question.
Yep.
Um and and that is um I love thefact that uh at least with
respect to databases thatunderstand SQL, SQL is roughly
the same for all of those.
Um pretty close.

Jim (47:53):
There's little standard body that they all try to comply
with, but they all have theirown extensions to the language
as well, including Postgres.
They they've got their ownPostgres isms.
But stored procedures Those areall over the place.
Right?

Wolf (48:11):
Yeah, what the what the hell's going on there?

Jim (48:13):
Well, there's no real standard to that, I don't think.
In fact, there might be astandard, but Postgres has their
language, uh PLPG SQL.
Uh SQL Server's got uh transactSQL.
MySQL has stored procedurelanguage, they don't give it a
name.
Uh the thing that I findinteresting though, at least
Postgres and MySQL, um, I thinkMySQL does this.

(48:36):
They allow you to uh use otherlanguages for your stored
procedures.
Postgres lets you write a storedprocedure in Python if you
enable that, if you load thatlanguage extension.

Wolf (48:45):
I like it.
That's yeah, you like italready, don't you?

Jim (48:48):
Uh I I can write stored procedures procedures in Perl if
I want.
Um, I I've I've opted never todo that.
I just use the tool that theygive me.
I don't want to go embeddingPerl into my database.
Um it doesn't seem right to me.
Uh and there's a lot of peoplethat say you should never
include business logic in yourstored procedures.

(49:09):
I I'm not sure I agree withthat.
I think the one thing aboutstored procedures is it lets you
um um set a constraint where thedata has to uh uh fit a certain
thing.
Uh and the stored procedure isthat test of of uh uh of the
fitness uh which sounds likebusiness logic.

(49:33):
It does.
It does.
Now some people would wouldimplement entire businesses in
stored procedures.
I I've not gone that far, but Ido implement some constraints uh
it in there, and I do I do uhstored procedures that like
well, let me give you an ex asimple example.
Uh every one of our c of ourtables there has a created and

(49:54):
an updated timestamp in it.
So I've got a really simplestored procedure.
It's a trigger.
I'll talk about that in a sec.
But every time I do an insert oran update on a table, it updates
those columns for me.
An insert's gonna uh gonna fillin the created and the updated
column.
An update isn't gonna touch thecreated column, but it's gonna
fill in the uh the updatedcolumn so that anytime anybody

(50:16):
updates data in the database, weknow when that row is created
and we know uh when it was lastupdated.
That's a stored procedure.
Uh I've got others that you youyou insert into a table and it
might take that data and insertit into other tables, like a log
or something.
That's closer to business logic.

Wolf (50:35):
Um The thing that uh I feel is great about stored
procedures is they're sort ofthree kingdoms.
There's the database machine,there's the client machine, and
there's the pipe in between thetwo.
And uh the database machine isprobably big and fast.

(50:58):
And the client machine probablyisn't big and fast.
And the pipe is the skinniestpart of the whole system.
Yeah.
So anytime you can not send dataacross the pipe and you can
attack that data with thebiggest, fastest machine that's

(51:18):
available, which by the way, didI say that was the database
machine?
Because it is.
Yeah.
Um, that seems better to me.
So if there's a place where youcan do work and not send the
data to the client to do thework, that can be a win.

Jim (51:38):
It it could be argued both ways, I think, because if you
have a thousand users connectingto the database and they're all
pushing data to the database,and then it's causing lots of
processing to happen on thedatabase, besides the normal
write the data to a file ondisk.
Um, that might not be a greatthing.
Yeah, sure, it's a big powerfulmachine, but a thousand users

(51:58):
pushing their data at it andexpecting that machine to do all
kinds of complex logic, maybenot the best idea when you could
just let the clients do that,right?

Wolf (52:09):
Yeah, I do hear you.
I think the reason myperspective is slightly
different from yours is in myworld, uh in my database,
there's not nearly as muchwriting.

Jim (52:25):
Um, doing a lot of reading and writing, and and and I have
to be careful of the the loadthat I put on the database
server.
We don't have a problem with it.
We do have a big, really fastmachine uh for the database, and
it performs really, really well.
But I I'm I'm uh I recognize thefact that I don't want I don't

(52:46):
want to put too much non-data uhnon non-file writing stuff in in
there.
Um but here's another reallygood reason.

Wolf (52:55):
We're both thinking about the same thing.
Yeah.
I mean, even though we havedifferent constraints, the fact
that we are thinking about thisproblem, that's important.

Jim (53:05):
Yeah, and I think I I'm fine with putting some some
business logic in there becausethink about it, you've got a
database server and it's it'sgot the data.
It's the one thing in common.
You you might have lots ofprograms out there, lots of
clients out there that could bewritten in all kinds of
different languages.
Um and the database is where youget to impose the strict

(53:26):
checking of the data.
Um you know, somebody can writea Perl program to access the
database, somebody could write aPython program.
Somebody could sit there atPSQL, the command line tool for
interacting with the database,and they could type in the SQL
to insert a row into this table.
All of those things, uh, the onething they have in common is the

(53:47):
database.
And if the database is in chargeof making sure what data is
correct and what data is not, uhI like that idea.

Wolf (53:55):
Yeah, in a video game, um there's almost always a server,
in a multiplayer one, there'salmost always a server.
The server has to decide, ohyeah, you shot the guy, he's
dead.
If the client decides that it'sjust an invitation to cheating.

SPEAKER_01 (54:13):
Right.

Jim (54:14):
Good point.
Uh so I said I'd mentiontriggers.
Triggers are a way to cause codeto run when something happens.
Think of it as an event.
Like an event happens when youinsert a row into a table, or
when you update a row or youdelete a row.
Those are all things that cancause a trigger to fire.
And all that means is it runsthat attached code when that

(54:35):
happens.
Uh, and you can have triggersthat are either row level or
statement level.
So uh for instance, if I if I uhupdate a thousand rows in my
table, a row level trigger willfire once for each row that gets
updated.
If I have a statement leveltrigger, that trigger will run

(54:55):
once, no matter how many rows Iupdate.
One row, a million rows, it'sonly gonna fire that thing once.
So, you know, think about whatyou're trying to do in your
stored procedure, whether youwant row level or statement
level.
Um next thing that I think isreally important.
I I hold this one near and dearto my heart, and that is
replication.

(55:16):
I I love replication.
I've talked lots about replicatereplication.
I've given talks at our usergroup on it, uh, I've talked to
people about it.
I can't I can't stress thisenough about how important
replication is.
And that is the ability to haveyour main database uh with all
your data in it, and then have areplica someplace else.

(55:37):
Preferably, uh well, I I likehaving one on site and I like
having one remotely.
So this is your disasterrecovery plan right here.
If something were to happen tothe main office, uh that
alternate site, the remote site,has a replica of the database,
and you're good to go.
You haven't lost any data.

(55:57):
Um, this is not in place of abackup.
Backups are still really, reallyimportant, and there's always
mechanisms to back up adatabase.
And I highly suggest you dothat.
In fact, I I would really,really encourage that.
Uh, but replication I just lovebecause I'm not gonna get a call
uh that uh the data, the the theserver crashed and the data's

(56:19):
gone.
You know, what are we gonna do?
And a backup is only as good ashow often you do it and and how
good your backup is.
Uh with a replica, though, it'sit's at the transaction level.
Every transaction that getscommitted gets pushed out to the
replica.
And and that's just that givesme the ability to sleep at
night, knowing that I've got aI've got multiple copies of that

(56:41):
data out there.
Um it's really useful.
And there's different kinds ofuh replication.
Uh the one that I use is MasterSlave.
I've got one writer that's themain node, um, the main database
server.
All the data gets written there.
And then I've got slave uhreplicas that are uh I've got uh

(57:01):
in fact, I've got my bigcustomer, I've got two local
replicas, and I've got oneremote replica.
Um that's a master slave.
Uh all I can do on those remotereplicas is read the database.
I can't write to it becausethere's no mechanism to push the
data back.
The other type of replication ismulti-master.

(57:22):
That's where you have two sitesor more.
You might have 50 sites, andthey're all able to you're able
to write data to any one ofthem, and that data gets pushed
to all the others.

Wolf (57:33):
Um that's which is exactly like our episode about how the
blockchain works.

Jim (57:41):
Yeah.
Yeah, the the replication or thethey didn't use the term
replication for that.

Wolf (57:45):
Uh it was just they didn't, but the idea is that um
each blockchain is a thing, anduh when you m make a write to a
blockchain that turns out to bethe winner, it goes everywhere.
Every one of them is just asgood as any other.

Jim (58:01):
Yeah, yeah.
Um and that's neat.
I I don't do it that way.
I haven't really needed to.
I I I'm not I don't havethousands or tens of thousands
of writers.
I've got a fairly small,constrained set of uh users that
can actually write to thedatabase.
So write into the masterdatabase, that's that's what I
need.
And the replicas are purely forfor uh safety for the durability

(58:23):
part of what I'm doing.
Um but if we're talking aboutreplicas, we got to talk about
the CAP theorem.
That's CAP.
And of course, it's anotheracronym that it means something.
Um, if you have a distributeddata store store, like a
replicated database withmulti-master capability where
anybody can write to anydatabase or uh and the data

(58:46):
eventually gets pushed to allthe others, uh caps is for three
things.
The C is for consistency, everyread receives the most recent
write or an error, no matterwhich node they connect to.
Uh all the data is consistenteverywhere.
Uh availability.
Every request received mustresult in a response.

(59:08):
Um, you know, if you want thethe the five nines uh uh
reliability mark, you want tomake sure your data is always
available.
Um that's the availability partof it.
And then finally, you gotpartition tolerance.
What do you do if the networkgoes down between the nodes?
Right?
Uh what when a network failurehappens, uh updates to one date

(59:31):
one node aren't going to getpropagated to the other node,
and and vice versa.
So now you're no longerconsistent.

Wolf (59:39):
This immediately makes me ask about a phrase that I have
heard many times, um, whichseems to relate uh uh to the
very first thing you said,consistency, and to this
partition tolerance thing, whichI think is a horrible name, but
like a very important concept.
They wanted to and they wantedto create an acronym.

(01:00:00):
I I don't like that.
Um But the phrase that I wantyou to tell me about is eventual
consistency.
What what is that?
Explain.

Jim (01:00:09):
Yeah, that sounds kind of bad, doesn't it?
But that's if you have two nodesand and either it's a slow link
or or the link goes away forsome reason and network failure
or or something, uh people areupdating both and the data's not
consistent, but the promise isit'll eventually be consistent,
right?
If you're if you're a companylike Facebook and you're you're

(01:00:32):
updating somebody's timeline,somebody's typing posts and
stuff, if that if that postdoesn't arrive at all the other
nodes, and believe me, Facebookhas a lot of nodes, uh, if that
if that doesn't arrive at allthe other nodes instantly, it's
not that big a deal.
Eventual consistency is goodenough.
Um there's other cases whereit's it's not good enough.

(01:00:55):
So the thing about the CAPtheorem, uh one of the one of
the parts about that I findinteresting is um you can't
guarantee all three all thetime.
All three of those things.
The consistency, availability,and partition tolerance.
You can have any two, but youcan't guarantee all three.
Uh because if there's a networkfailure, you're gonna lose out
on the partition tolerance.

(01:01:17):
Uh if if the uh a node goesdown, uh you're not gonna have
availability for that node.
Um if you know consistency.
Um if somebody's updating onedatabase in one place and it
doesn't get to the other place,you're not consistent.
So you can't have all three atthe same time.
You can have any two.

(01:01:38):
You can have consistency andavailability, you can have
availability and fault toleranceor partition tolerance.
You just can't have all three atthe same time.
You can't guarantee it.
In normal operation, you'regonna get all three, right?
Uh if there's no errors, youyou're gonna be just fine.
Um, so let's move on.
This is uh, of course, takinglonger than I expected.

(01:01:59):
So sorry for that.
There's still a few things Iwant to cover.
Uh ORMs.
I know, Wolf, you've got somethoughts about object relational
mappers.
Oh my god.
Tell us what that is, first ofall.

Wolf (01:02:09):
An object relational mapper is um this idea that SQL
is too hard, and that probablyhowever you deal with whatever
the native thing is inside yourlanguage, so in Python objects,
regular old objects, or I guessit's called objects in a lot of

(01:02:30):
different languages, um, thatwhen you connect to a database,
instead of you executing SQL andgetting back rows and looking
into the columns of those rowsto see what things you have,
what you ought to get back, saysthe ORM, is things that look

(01:02:53):
just like objects.
Um and so I have a sales orderand I can ask for the products
that are in it.
And and those would be, youknow, child, whatever, it
doesn't matter.
The point is you don't know thatit's SQL underneath, but it is,

(01:03:13):
and that means you're notwriting the queries, some piece
of code is, and it also meansthat probably when you ask for
certain things, like forinstance, let's say you're using
in Python a thing called SQLAlchemy.
Um SQL Alchemy defines a thingcalled a relationship.

(01:03:35):
A relationship looks exactlylike a property.
Like um maybe you have children,you're a parent and you've got
children.
Well, children isn't an actualfield in you, it's not a column.
Um, it didn't get read.
What happens is when you ask forchildren, SQL Alchemy does

(01:03:56):
another query right then, behindyour back, totally opaque.
You didn't even know it.
Please, Jim, tell us about the Nplus one problem.

Jim (01:04:09):
We covered that in another episode.
Uh it it gets out of control.
And the fact that uh uh the ORMis is behaving that way um
doesn't give you a chance tooptimize the queries the way you
might like to.
I don't use an RM an ORM.
I I've always just written myown SQL and embedded it right in
my programs.
And you know, some people don'tlike that, but I have the

(01:04:32):
opportunity now to write queriesthat are fast.
I can do joins in my queries andget back all the data that I
know I'm gonna need for whateverit is I'm doing.
An ORM sort of sidesteps thatand it doesn't give you the
chance to tune it.

Wolf (01:04:46):
Um worse, um there's a zillion different ORMs, uh, and
not just one per language, notone for Perl and one for Python.
There's you know, 10 for Pythonand 20 for who knows?
Yeah, there's only one SQL.
If you write it in SQL, you'regonna see the right thing

(01:05:08):
everywhere you look.
So I have an opinion, and nowyou've heard it.

Jim (01:05:13):
Well, I hope I'm not talking down to people to say uh
I think ORMs are good for peoplewho don't want to learn or don't
know SQL.
Um maybe, maybe it makes sensein a lot of cases.
Uh uh, it's just not the way I'mgoing.
Uh if if if if you the listenerhas any feedback on that, please
let us know.

(01:05:33):
You might have good reasons whyyou'd want to use an ORM.
Uh, it's just not the directionthat I've gone.
Uh so let's move on.
Um, let's talk a little bit.
I haven't even mentioned thisthing yet, but we talked about
SQL.
Let's talk a little bit about uhNo SQL.
Um No SQL seems to be amovement, right?
Um at least it was, I don'tknow, ten years ago.

(01:05:56):
I don't hear s that much aboutit anymore.
I think it's sort sort of there.
And one of the things I wassurprised to learn was No SQL
doesn't mean no SQL.
It means not only SQL.
Most of these NoSQL databasesprovide an SQL interface for you
to query the data.
So let's talk a little bit aboutwhat No SQL is to begin with.

(01:06:20):
You know, we talk aboutrelational databases where you
got rows and columns and tablesand all that kind of stuff, and
there's this relationshipbetween the different tables.
NoSQL is really more aboutstoring things that maybe don't
have that kind of structure.
Think of it as the ones I seeall the time store a JSON

(01:06:42):
object.
That's what you give it.
And you might give it a key orsomething so that it can find it
quickly.
But it's just like a JSONdocument.
And you know, if you're writinga uh a social media site where
you just have um a profile for auser that has a whole bunch of
fields in it.

(01:07:03):
You don't want to lay out uh awhole uh relational database
schema to handle that.
You just store it.

Wolf (01:07:10):
Especially if they're variable, like if they might
have more or less information.

Jim (01:07:16):
Yeah, variable.
Uh and you want to addinformation later, you you know,
you come up with new things youwant to track.
Uh you don't have to do the olddatabase uh schema change to
handle that.
It's just more fields in the inthe uh the JSON document.

Wolf (01:07:32):
Um Yeah, at um at SourceForge, when we replaced
the old Forge with the newForge, I was working on the new
Forge.
Um we used Mongo for that, anduh that's that's a No SQL
database.
And and it's JSON, just like youjust said.

Jim (01:07:49):
Yeah, there was there were two of them uh kind of coming up
in the in the mid-2000s, uhMongoDB and CouchDB.
And CouchDB is still out there,but it seems like everybody's
using MongoDB.
I've not used either one.
Um I I just choose to usePostgres.
Uh, but those are two very, verypopular ones.
Now, some of the uh in fact allof the the web hosting sites,

(01:08:12):
the the the cloud serviceproviders, uh Amazon and Azure
and those, um they have theirown uh No SQL databases too.
And I think in many casesthey're built on top of Mongo or
or CouchDB.
Um but it's a it's a reallyinteresting way to store data.
Uh and they've got all kinds ofreally neat indexing strategies

(01:08:35):
and stuff, and you can use SQLto query it.
So you know it might look likeSQL when you're all done, so
it's kind of neat.
I I'm just I I I just don'tdon't use that.
Um but before we uh before weend this little section, I
didn't really mention some ofthe SQL database.
Certainly I talked aboutPostgres.
I I could talk about that allday long, and there's others,

(01:08:56):
MySQL.
In fact, way back in the early2000s when I chose Postgres, I
looked at MySQL and I looked atthe list of features that
Postgres had, and it supporteduh transactions and triggers and
um you know the things that Iknew I needed.
And MySQL supported those too.
But when I looked deeper, theydid this weird thing where MySQL

(01:09:19):
had a couple of differentengines for the back end, and um
some of the uh back ends, I Idon't even remember the names of
them, but there was one thatsupported transactions, but it
didn't support triggers, and theother one supported triggers,
but didn't support transactions.
So you if if you weren'tcareful, you'd get pretty deep

(01:09:40):
into MySQL uh into the world andthen realize well you can't do
the the the transactions likeyou wanted to do.

Wolf (01:09:47):
When I looked at the and there were no stored procedures
and there was no replication.

Jim (01:09:52):
Yeah, yeah, and and uh but uh uh I think there were stored
procedures if you use the rightback end.
And and I thought that was kindof nonsense.
What they had, I think they hada checklist of does MySQL
support this feature?
Check it does, check it does.
But the checklist wasn't clearabout what you had to do and the
fact that some of them were werenot compatible with others.

(01:10:15):
They were you you couldn't haveall of them at the same time.
In Postgres, you could, and thatseemed like the the clear winner
to me.
And I stuck with it, and I'mstill very, very happy with that
with that choice.
Um, there are other databases,certainly.
Um you mentioned SQL Lite.
Um, I'll mention it again.
That's a that's a great, greatlittle database.

(01:10:36):
Um and it's in everything.

Wolf (01:10:38):
It's in it might except for Excel, it might be the most
popular database in the world.

Jim (01:10:43):
Yeah, yeah.
It's it's everywhere.
It's in your phone.
Whether it's an iOS phone or anAndroid phone, you've got uh SQL
Lite in there.
Um there's something like Ithink the number was something
like six billion installationsof SQL Lite in the world.
That's a lot, you know.
You might have many of themalready, you know.
Your watch probably has it forall I've done.

Wolf (01:11:05):
Everything has it.

Jim (01:11:07):
Yeah.
Um that's kind of neat.
And uh Wolf turned me onto thisthing, uh Terso.
It's a Rust implementation ofSQL Lite.

Wolf (01:11:17):
Uh which I tried to use uh when I was I added a database of
the things that you'veinstalled.
I have a dot file manager named.NET.

SPEAKER_01 (01:11:28):
Yeah, yeah.

Wolf (01:11:29):
And I have been growing it, and I added to it a database
of what has been installed soyou can more easily uninstall
and check health and things likethat.
And I thought, well, I'll I'llstart with Terso.
Um and I ran into instantproblems with that, uh, just
installing the package, uh,something having to do with

(01:11:52):
whether it was synchronous orasynchronous.
It it uses an SQLite databaseunderneath a file.
The date the it's compatiblefile-wise.
Yeah.
So I just moved to SQL Lite, butwhat what does that even mean?
Is this an appropriate time toask you what asynchronous versus
synchronous means?
Or are we too short on time?

(01:12:14):
You tell me.

Jim (01:12:14):
Let's let's keep going, right?
Uh asynchronous versussynchronous.
It has there's three meaningsthat I know of.
Uh uh one of them is uh whetherthe database writes data to the
disk uh asynchronously, so itcan sort of dispatch the write
and come back and continue onwith whatever you were doing,
assuming that the database isgonna that that the server is

(01:12:35):
going to update or the operatingsystem is gonna actually write
that data to disk.
It's not my problem.
It says it'll do it, so it doesit.
That's asynchronous.
Synchronous would be where youdo a write and you wait for the
response.
You wait for the write toreturn.
Um, and there's a setting inPostgres to enforce that uh
Fsync writes or something likethat, uh, so that every write to

(01:12:57):
the to the disk is confirmed,written, before you can move on,
before that that query returns,um, the update or the insert
query.
Um that's that's one way to talkabout synchronous versus
asynchronous.
Uh another way is if you'retalking about replication, uh,
you can set up synchronousreplicas where if I if I write

(01:13:18):
my data, it's it's one thing tohave confirmation that it's been
written to the disk, but now youcan also get confirmation that
the uh remote replica haswritten it to the disk.
And you you can't continue onuntil some number of your
replicas have confirmed thatthey have written the data to
the disk.
Uh, or or you can uh I I don'tset my replicas up that way.

(01:13:42):
I certainly turn on uhsynchronous rights for the local
store, but for my replicas, Idon't because the network
latency uh is kind of bad.
And and I know it's kind of likethis eventually my replicas will
be uh uh consistent.
Um uh but then finally the thirdway to talk about uh synchronous

(01:14:04):
versus asynchronous is uh theasync await paradigm.
Um you know, the thing you usein your language.
Uh uh progress uh uh uh Pythonsupports that, right?
And um uh Python, JavaScript?
Yeah.
Of course, Perl doesn't, butthat's all right.
JavaScript certainly does, andthat is you can call a function

(01:14:29):
and either just let it go do itsthing while you move on, or you
can do an await on it and waittill that function returns.
Uh the the functions have to bewritten that way.
The library has to be written toallow that.
And it wasn't that one of thereasons why you couldn't use
Terso because it was doing thatand it told you it needed it or
something.

Wolf (01:14:50):
Um I I didn't wait around to diagnose it.
It gave me trouble.
It's and there and I instantlymoved on to SQL Lite to
investigate this problem later.

Jim (01:15:02):
Okay, yeah.
One thing I noticed when I wasexamining the the features of
Tirso was um there's a lot ofthings that aren't compatible.
You know, there's a long list offeatures that SQL Lite has, and
Terso does not implementeverything.
It it implemented it missedseemed to me like half of the
things.
And maybe they're not importantto you, uh, but you know, to say

(01:15:24):
it's compatible, you you gottayou gotta be careful with that.
It it may or may not be.
Right.
One more thing, one more thingabout SQL Lite.
Um, I've listened to a podcastepisode from the Co-Recursive
Podcast, which is one of thebest out there.
If you get a chance to listen tothat, it comes out monthly, I
think.
And uh Adam Gordon Bell justdoes a fabulous job with his

(01:15:46):
interviews.
They're like hour-longinterviews, and he interviewed
the guy that wrote uh SQL Lite,and it was it was a fascinating
conversation about how he didit.
He was he was doing acontracting job for the U.S.
Navy at the time, and theyneeded something that was going
to work, and and he wrote that,and it's pretty neat.
I I've of course include a linkto it in the show notes so you

(01:16:09):
can go listen to that episodebecause it was really, really
interesting.
I highly advise listening tothat.

Wolf (01:16:14):
I I guess this does give me one more question about SQL
Lite.
I I think I know the answer, butI'm I'm asking this one on
behalf of the listener, and thatis um SQL Lite's uh different
than Postgres in this way.
Postgres is running somewhereand you ask it questions.
The question is in your code,but the action of answering that

(01:16:38):
question is happening in someother process.
SQL Lite's a library that's inyou and it all happens inside
you.

Jim (01:16:46):
Right.
Well, inside your program.
It's called in process, it'srunning as part of your your
application.
Uh, it doesn't send out anetwork request or through a
Unix socket or anything likethat to go ask the database
manager to do something for you.
It is your application is thedatabase manager using that SQL
Lite library.

(01:17:07):
That's how it works.
And also, uh, it's a single filedatabase.
Your entire database is storedin a single file in SQL Lite.
Uh as compared to Postgres,every table is one or more files
on disk.
Uh the files I think are limitedto a gigabyte in size.
So if your database takes morespace than that, it just creates

(01:17:29):
another file and it just keepson going.
Um uh so if you look at a uh atthe the database uh at the file
layout for Postgres, there canbe tens of thousands of files uh
because of that.
Every index is a file.
I think every view, uh I'm notsure if views are a file or not,
but the indexes are a file.

(01:17:50):
Um anything you do is prettymuch gonna create a file.
Uh and the wall, the right-aheadlog file, those are files.
Um whereas SQLite it's it's asingle file.
Um so yeah, you know, there's alot of other databases that are
out there.
Certainly you've all heard ofthese Oracle, um, uh Sybase, SQL

(01:18:10):
Server, uh, DuckDB.
We can talk about that for asecond, but let's go back to
Oracle for a second.
I've never used Oracle, but inmy research, I came across two
pretty funny things.
Uh their first customer was aCIA in 1977.
And uh they they had a choice uhto develop for the CIA.
They had two choices.
They could either develop adatabase or a compiler for the

(01:18:34):
PDP4.
They flipped a coin and it cameup database.
Uh an actual coin?
An actual coin.
This is this is it's on theinternet, so it must be true,
right?
They they flipped a coin and thecoin came up.
Uh, we're gonna do a databasefor the CIA.
And that's what they did.

(01:18:55):
And now look, uh, Larry Ellisonis one of the richest men in the
world, um, because Oraclecharges so much for their
database.
Um uh and it was uh the companywas founded by three people, uh
Larry Ellison, um uh Bob Minor,and Ed Oates.

(01:19:16):
Ed Oates came from IBM from backin the old IBM database days.
Anyway, the three of them uhdecided that Ellison was the
worst programmer among thebunch, so he became the
salesman.
And look, you hear about Ellisonall the time.
You'd never hear about the otherguys.
So I guess the real money's insales, right?

Wolf (01:19:34):
Mysteriously killed.

Jim (01:19:38):
No, probably not true.
That's made up.

Wolf (01:19:40):
I I those are my that's a pretend thing.

Jim (01:19:44):
Right.
All right, so uh SQL Server,that's from Microsoft.
I've dabbled a little bit withthat.
I had to do some for a for aclient project.
I I didn't like it much,probably because I'm just not
that familiar with it.
But you know, it it's SQL.
I could read the tables and dothe stuff that I needed to do.
Um DuckdB, there's a database.

(01:20:04):
Wolf mentioned this one to me.
That's an OLAP database.
That's for analyzing data.
Uh, it's a single file, there'sno transactions, uh, it's not
ACID compliant.
You know, there's no guaranteethat the data is going to be
there when you write.
In fact, you do very few writes.
It's really all about reading,and it's incredibly fast at
reading.
So you use it for analyzingdata.

Wolf (01:20:25):
And one great thing about DuckDB is it's not just about
its own database files.
You can point DuckDB at allkinds of different data
structures and then just use SQLon those things.
For instance, DuckDB can giveyou SQL access to your file

(01:20:48):
system, to a pendant data frame,uh to CSV files, whatever you
want.
Um so it's kind of neat.

Jim (01:20:58):
Yeah, I know it's used by Facebook, Google, and Airbnb.
Are you guys using it at work?
Is that where you come across?

Wolf (01:21:05):
I'm not, but I am advocating that it's a thing we
could use and might speed us upin places and or give us uh more
consistency.
Uh for instance, why should weuse the uh special API for
pandas data frames or polar'sdata frames when instead we

(01:21:27):
could use SQL, which everybodyknows.

Jim (01:21:31):
Right.

Wolf (01:21:32):
Except the people using ORMs.

Jim (01:21:36):
Right.
So uh that's that's you know,that's kind of a roundup of of
those databases.
There's a lot of cloud servicesthat offer databases.
Certainly, Amazon offers um uhsome No SQL databases, uh Simple
DB, and I think Dynamo DB.
Uh they've got their S3 scalablesecure storage.
That's that's a database.
It's a key value store, but youcan still think of it as a

(01:21:58):
database.
There's no SQL access to it,though.
Uh Azure's got uh a couple ofdifferent choices for NoSQL, the
Cosmos DB, MongoDB, Cassandra.
Uh they've certainly you knowAzure supports uh SQL Server,
MySQL, Postgres, MariaDB.
Um so those are all there.
And I I've talked in the pastabout how I'm on Azure, and we

(01:22:20):
were having Azure host thedatabase for us.
Uh Azure managed database.
And I talked about the problemswith that.
I couldn't really get control ofthe data very well, so I moved
off to my own self-hosteddatabase.
Uh, but Azure, you can do iteither way.
Uh Google, they've got theirthings, the Big Table and
Firestore and Firebase and um afew others.

(01:22:41):
Uh and of course they supportMySQL and Postgres.
One of the things you have to becareful of, I think all these
database services, uh, they havekind of their own database that
claims to be uh Postgrescompatible or MySQL compatible.
Those bother me because you knowthat whole compatible thing is,
yeah, they're compatible untilthey're not.

(01:23:02):
You know, they're compatible forsome features, but probably the
features I want, they're gonnafall down on.
So I've stayed away from those.
Um, there are some things thatyou wouldn't think of as a
database.
Wolf already talked about Git.
Git's a database, right?
You're you're storing thedatabase.
The data in this case is yoursource code.

Wolf (01:23:23):
It's a key value store, and the values are hunks of
content, um, not named files.
Like the name is in a differentobject, a tree object, which is
a directory.
Um, it's just a hunk of contentthat is inside some file, maybe
multiple files, um, maybe filesthat used to exist and no longer

(01:23:46):
exist in the current version.
And the key is um the SHA-1 hashof uh that underlying content
object.
Um so a simple database, yes,but that's what it is.

Jim (01:24:02):
Sure.
A CSV file.
You could think of that as adatabase.
It contains data, right?
Uh probably the second mostpopular database in the world
behind uh SQLite would be aspreadsheet, uh Microsoft Excel
spreadsheet.
Think of all the corporate datalocked away in spreadsheets out
there on people's desks uh thatit may or may not get backed up.

(01:24:22):
Umly the person actually usingthat computer really knows
what's in it.
And it might be importantcorporate data.
Uh, I'm not saying spreadsheetsare a great place to store your
data, but it does happen.

Wolf (01:24:33):
Um I so I I don't want to uh accidentally waste time, but
there is a database you didn'tmention.
Oh, it's open source.
Yeah, it's called Firebird.
Um, and a long, long, long timeago, uh and Firebird still
exists today, but it it was itwas around back then too, when

(01:24:58):
we first decided to make a thingfrom Mozilla that was more
user-focused, the first namethat we thought of to call it
was Firebird.
Yeah, and that drew instant, notnice, instant aggression from
the Firebird people.
They were not nice at all.

(01:25:19):
Really?
Like, okay, this is a mistake.
A problem happened, we need toresolve the problem.
Um, but they weren't like, hey,we already had this name, let's
figure out a way where we canthey weren't like that at all.
They were, you guys are jerks,and you're trying to steal from

(01:25:40):
us, and you're bad people, andyou're gonna go to a bad place,
and you need to fix yourselfright now, you arrogant asshole.
Yeah, and this is I guess we'regonna have to beat that.

Jim (01:25:50):
Yeah, you were you were working at uh at um uh Mozilla.
Mozilla.
Yeah, that's right.

Wolf (01:25:57):
And it was a big deal.

Jim (01:25:59):
Yeah, yeah.
All right, so uh let me let metell you just uh I I've got to
keep this brief, but I want totell you a story about something
I did with the database.
You know, uh I I've beencomputing uh professionally
since the mid-80s, and uh we hadthis system, it was all written
in COBOL, and uh eventually itwas Accu COBOL and the the

(01:26:21):
database there, it's not adatabase, it was ISAM files.
We did reads and writes and andupdates, and every file, you
know, the header file was a wasa file, the detail file was a
file, and we had to handle allthis in program code of how to
uh you know access a whole salesorder, that kind of a thing.
Um, and we started to migrate tothe web.

(01:26:43):
And of course, you're not gonnause COBOL when you do web
development, uh, you're gonnause a uh some other language.
In our case, we chose Perl forvarious reasons.
I know Wolf wishes we had chosenPython, but this is before I
knew by I knew Wolf.
Had I known him, I definitelywould have gone Python, I
guarantee it.
Anyway, we use Perl.
And um uh so we had lots andlots of data tied up in these

(01:27:06):
Accu COBOL ISAB files, and we wecouldn't access them from
outside COBOL.
Uh we couldn't write a Perlprogram that could read and
write that data and interactwith it in any way.
Uh, so the neat thing about AccuCOBOL was it had a runtime, it
was it was kind of like a uh aJava uh a Java type program
where you compile the code downto bytecode and then you run it

(01:27:27):
with a runtime.
And the runtime we had uh theability to link C uh uh objects
into uh the runtime.
And Perl is great at embeddingin another system, especially if
it's C.
So we actually embedded the Perlinterpreter in the COBOL

(01:27:47):
runtime.
So from COBOL we could callPerl, and that was really
useful, and especially becausein Perl I could interact with
with Postgres, so that meantfrom COBOL I could read.
Read and write Postgres uhdatabase tables.
Really, really useful.
Um, and and uh this is back whenwhen our friend Dave was working

(01:28:10):
for me, and we wrote uh aconversion, we had over a
million lines of code, and allthose all those programs they
were doing um uh uh you know,read from this table, write to a
table.
Uh you didn't have select andand that kind of stuff.
You had cobalt verbs to interactwith the data.
We wrote some programs uh tosweep through that million lines

(01:28:32):
of code and uh convert all ofthose uh COBOL read write update
verbs into calls to subroutinesthat would behind the scene uh
use a Postgres, it use an SQLcommand.
And um it worked amazingly well.
This allowed us to move all ofour data out of those COBOL ISAM

(01:28:56):
files into a Postgres databaseand still use all of our COBOL.
And Wolf, I know you've saidthis many times.
Why is it a bad idea to do acomplete rewrite of your system
all at once?

Wolf (01:29:07):
Uh it would be a giant surprise if you did a complete
rewrite and finished.
Yeah.
But does that ever actuallyhappen?

Jim (01:29:16):
So a little bit at a time.
By doing this, uh, you know, thebig important step was moving
the data into a database, into aproper database.
And we could have used any otherdatabase too, because Perl has
has uh uh uh bindings to all thedatabases.
So what we chose chose Postgresby moving all the data into
Postgres, like we did it over aweekend for the client, and and

(01:29:38):
that you know, the Monday theycome in and now they're
interacting with our system, butall the data is no longer in
files on disk the way it alwayshad been, it's now in a
database.
And much to our surprise, it ranfaster than you know the old
way.
And it was it was fantastic.
And what this meant now is wecould start replacing pieces of

(01:29:59):
functionality in our system withnew programs, new web-based
programs.
And it you know, it accessed thesame database.
So now, you know, whether we'recoding in COBOL or Pearl, we
could write and read the samesame data.
It's fantastic.
Uh, took about two years uh tocomplete the project where we

(01:30:20):
finally threw away the lastpiece of COBOL.
But let me tell you, it wasabsolutely worth it.
And and the customer, you know,we kept delivering new
functionality over time.
We didn't have to say, we're notgonna be able to do anything for
you at all for two years whilewe write this whole new system
uh that may or may not work.
So it really was great.

Wolf (01:30:40):
Uh so I know I know you are you are proud of lots of
things that you've done in yourlife, um, but I have to say, of
the different problems that youhave solved where I know how you
solved them, this is one of themost admirable uh solutions.
Thank you.
It was ever.

Jim (01:30:59):
I I I love doing it.
And and and uh it worked outreally well.
And it I felt a little badhaving to throw all that old
COBOL code away because wedidn't need it anymore.
It meant now we're strictly inthe database, but that was the
end goal anyway, right?
Uh so that all that uh linkingof the Pearl runtime into the
COBOL, we threw that all away.
We didn't need it anymore.
But it was really, really cool,and I love doing it.

(01:31:21):
But look, I've been talking forway too long now.
You know, we always estimate howlong these episodes are gonna
be, and I said I'd be happy ifit was 50 minutes, and here we
are at just over 90.
So I think we've got to wrapthis thing up.
Uh so I I do have a couple oflines uh uh of summary that I
want to talk about.
First off, pick the rightdatabase for the job you're
trying to accomplish, right?

(01:31:42):
If it's an SQL database youneed, definitely pick one.
Pick the right one.
If you if you go towardsPostgres, great.
If SQL Server is your answerbecause you're in a Windows
world, great.
Pick the right one and thenunderstand it.
Make sure you understand all theoptions that are available in
that database.
Like I mentioned in an optionoption in Postgres where we
turned on uh synchronous writes.

(01:32:02):
That's really important in aproduction environment.
It may impact the performance alittle bit, but it's worth it
because your data is gonna besecure.
Um uh something I didn't mentionat all that is uh uh tuning your
queries.
Uh, there's a there's a uhfeature in SQL, it's a standard
part of SQL, it's calledexplain.
You can do that in front of yourselect statement or in front of

(01:32:23):
your query, and it'll tell youwhat the query planner figured
out was gonna use, and it'lltell you if you use explain
analyze, it'll tell you how longit's gonna take to run.
And believe me, I've takenqueries that take minutes and
I've cut them down tomilliseconds by looking at the
output of explain analyze andrealizing, oh yeah, it's doing a
sequential scan of the tablethere.

(01:32:45):
I better create an index on thatcolumn so that it doesn't have
to do this sequential scan.
Like I said, milliseconds versusminutes, huge, huge win.
Um, and if you can set upreplication, do it, it'll help
you sleep at night.
Uh and finally, like Wolfmentioned at the very beginning,
when you look at it, allproblems are database problems.

(01:33:06):
So, yeah.
I talked a lot.
I hope you got something out ofthis.
I could talk a lot longer aboutthis stuff.
There's a lot of things I didn'teven cover, but uh, thanks for
listening.

Wolf (01:33:16):
Wolf?
Yeah.
Uh first of all, uh, I want tosay to everybody, thank you so
much for listening.
Uh, we love talking about thisstuff, and we love talking to
people who care what we'resaying.
Uh, and if you cared, and if youwanted us to say something

(01:33:36):
different, or you saw somethingthat or heard something that was
wrong and you want to fix us, orif there was something that we
said that uh struck a chord withyou and you want to say, hey,
that was right, send usfeedback.
Um so feedback atruntimearguments.fm um in the
show notes.

(01:33:56):
Uh you'll find uh individualemail addresses for us and the
website uh which linkseverything.
Uh we're gonna have notes on allof the things that we talked
about.
I think we should add, just forJim's story about migration, a
link to the Ship of Theseus uhuh story.

(01:34:21):
Um and if we include that link,you'll understand why Jim's
story relates to that.
Um I think that's basically Ithink that's basically all you
need to know.
Lots and lots of notes on thisone.
There's gonna be a transcript.
Um and uh I hope you gotsomething out of it.

(01:34:45):
Um Jim is absolutely an expertin this.
I do want to say one thing.
Um I'm always trying to getbetter, and one thing I learned
in the last uh 12 calendarmonths was the difference
between like I learned aboutCTEs and using them in a query

(01:35:06):
and why that is really good andeasy to understand.

Jim (01:35:09):
Common label expressions.

Wolf (01:35:12):
Uh and then within the last maybe month or so, someone
showed me that it can be thatsubqueries are faster than CTEs
in in some cases.
Yeah.
Um so boy, I sure like CTEs, butI also like speed.
Anyway, um thanks for listening,everybody, and uh send us send

(01:35:36):
us feedback and ideas andeverything you want.
We'll talk to you soon.
Uh bye from me, Jim.
Yeah, thank you.
Bye, everybody.

Jim (01:35:44):
And uh hey, join us for lunch sometime.
I'm gonna include informationabout that in the show notes if
you want to join us on aSaturday in the Ann Arbor area.
So, yeah, thanks a lot,everybody.
Bye bye.

All Episodes

Episode Transcript

Host

Jim McQuillan and Wolf

Popular Podcasts

Stuff You Should Know

The Joe Rogan Experience

Betrayal Season 5

.css-15opob5{left:0;position:absolute;top:0.8rem;} All Episodes

.css-14f5ked{margin:0;word-break:break-word;display:-webkit-box;-webkit-box-orient:vertical;box-orient:vertical;-webkit-line-clamp:2;overflow:hidden;}21: Everything Is A Database Problem