Episode Transcript
Available transcripts are automatically generated. Complete accuracy is not guaranteed.
Unknown (00:13):
Hello, and welcome to Podcast.__init__, the podcast about Python and the people who make it great. When you're ready to launch your next app or want to try a project you hear about on the show, you'll need somewhere to deploy it. So check out our friends over at Linode.
With their managed Kubernetes platform, it's easy to get started with the next generation of deployment and scaling powered by the battle tested Linode platform,
including simple pricing, NodeBalancers, 40 gigabit networking, and dedicated CPU and GPU instances.
(00:40):
And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover.
Go to pythonpodcast.com/linode
today to get a $100 credit to try out their new database service, and don't forget to thank them for their continued support of this show. Your host, as usual, is Tobias Macey. And today, I'm interviewing Florian Wilhelm about PyScaffold,
(01:02):
a Python project template generator with batteries included. So, Florian, can you start by introducing yourself?
Yeah. Sure. Thanks for having me. So, yeah, my name is Florian.
I'm head of data science
at inovex. We are an IT project house.
I got a PhD in numerical mathematics,
and then I started in the field of data science. And throughout the whole time,
(01:26):
even also in my studies, I was using Python.
And, yeah, it's cool to be here at this podcast.
And do you remember how you first got started working with Python?
Yeah. Actually, it was even during my school times, if I remember correctly. So when I was doing an internship
at SAP, which is, like,
(01:46):
one of the largest software companies in Germany. And there, I had to learn Perl
for scripting. It was Perl 5.4, I guess. And
back then, I was quite excited about Perl, but one of the colleagues told me that I should definitely also look into Python.
Yeah. So he convinced me to take a look into Python, and
(02:09):
I pretty much directly loved it and preferred it to Perl because it was way more readable
and easier. And this is what got me started.
One of my favorite jokes about Perl is that most programming languages are write once, read many, and Perl is write once, read never.
Yeah. Yeah. Yeah. It's exactly what they say about Perl. Yeah.
(02:32):
And so in terms of the PyScaffold project, can you describe a bit about what it is and some of the motivation that you had for building it and some of the overall goals that you had when you started it?
Yeah. Sure.
So
when I was doing my PhD,
so it was in numerical mathematics.
So we also started to use Python as a kind of glue code for all our numerics code in C++ and Fortran.
(03:00):
And in this research group, I was the guy who was really excited about Python and told everyone about Python, and people started using it.
But then
everyone was more like scripting, and I wanted to kind of
help everyone to
write proper packages so that we could share our software components.
And the first thing I did, I was writing a huge wiki
(03:24):
page. I think it was in Redmine or whatever kind of wiki there was at the time. So I was writing, like, a whole documentation
about, yeah, have this setup.py, and you should put this in there. And don't forget to add this, and this is how you should do it.
And
when I later joined the first company after my PhD, I was also kind of taking this wiki with me, all the knowledge I had gathered. And also
(03:52):
in the first company at Blue Yonder, we also started to use Python for C++ code.
And then I realized
after a while, okay, no one is actually reading my documentation anymore because it has grown to, like, 20, 30 pages.
And I thought, okay. Why not just
create some project generator
(04:13):
for helping people to apply all the best practices,
everything from the docs really easily because, yeah, the actual
barrier was for them to read the docs, and having some kind of project generator that bootstraps your code so you can easily start using it is way easier to just use.
(04:35):
And this is basically when,
it was back in 2014.
Blue Yonder at the time was the company I was working for where I had this idea. I pitched it to my boss. He gave me some time to work on this
and also allowed it to be open sourced,
which I'm really grateful for. And, yeah, this is how it all got started.
(05:00):
And then after I mean, now it's, yeah, it's 8 years ago,
it started to grow and constantly
evolve.
And, yeah, the rest is history.
The context of the fact that you started it 8 years ago is interesting because if you were to go today and look for, I just need a template for X project, there are
(05:22):
all kinds of different options out there. There are different Cookiecutter templates. Some of them are one-off. There are some kind of generalized options.
And I'm wondering, at the time that you started the project, what the kind of available landscape looked like for somebody who just wanted to say, I just want a template that gives me a Python package that has all of the basic setup that I need so that I don't have to read through, you know, reams of documentation to figure out which configuration parameters to put where and what the, you know, the package architecture needs to look like?
(05:51):
Yeah. So back at the time, I think it was also about the time when Cookiecutter started, actually. And they also have a Python template.
But at the time, I was not really convinced, so I had different ideas how to tackle
this problem.
And
yep. So,
basically,
the main idea was, so Cookiecutter, you can basically use for everything. Right? So you can roll your own templates.
(06:18):
It's really easy to configure your own templates, but you still have to have the kind of knowledge. Right? Or if you look on the web page now, you have, like, tons of different templates, and somehow you need to decide. So what is the right template for me and what are the pros and cons of each template? And still, you have to kind of read a little bit in the documentation
(06:41):
what it is really for and how you should use it.
And the idea with PyScaffold
is more or less that you have something really opinionated. That's kind of a best practice approach that comes with everything that you need to get started
developing your own package really easily.
And then there are, like, extensions that kind of deviate,
(07:03):
change a little bit
the kind of basis, but it's not a complete new template.
It's a different approach. So there's this core template, and you can alter it a little bit,
which is in contrast to Cookiecutter where you can have, like, completely different templates that are completely independent of each other. And I like the PyScaffold approach more.
(07:26):
So the mission, the goal always was to have, like, this one go-to tool for beginners
of packaging
that you can just, you know, you install PyScaffold. You run putup my_project,
and
you have already a really nice scaffold there that comes with everything you need.
(07:46):
You can easily build your package. You can upload it to PyPI.
You can run tox for your unit tests using pytest.
You can build your documentation
using Sphinx. You have pre-commit, for instance, configured to have, like, Black and Flake8, and so on.
(08:07):
And everything,
like, is included.
And, also, you have this nice documentation
that we built over the years. So the web page explaining what everything is, what everything does,
and then you can kind of start step by step,
like, using all those best practices.
And I think this helps people a lot because
(08:30):
if they weren't using PyScaffold,
they would have to make a lot of decisions.
So what they should use, what kind of build system. I mean, nowadays, you can also use Poetry
instead of setuptools.
You can use PDM, then there's Hatch. So there are many different tools and also for documentation,
different tools, and how to configure those. I think you can easily
(08:54):
spend more than one week,
like, just configuring your package and your whole development setup. But in the end, what you wanna do is to develop, to work on your project, your idea.
And so PyScaffold helps you to get
really fast to the point where you can start just hacking away on your idea,
(09:15):
and everything else is already, with sane defaults,
covered in PyScaffold. So this is
our main goal.
As you mentioned, there are a large and growing number of options for any, you know, particular use case or any aspect of the kind of environment setup or the way that you structure your projects.
(09:35):
And before we get too much into the details of
which choices you've made where,
I'm interested in discussing kind of the categories or types of applications that you're focused on supporting with PyScaffold and some of the ways that
those areas of focus developed both from your own usage and some of the evolution of the project over time?
(09:57):
One main focus, you can build a library with PyScaffold,
also some real Python application where you have console scripts and so on. So this is all supported.
Then for specific applications, maybe thinking about, like, more data science or data oriented application,
(10:24):
We basically
we developed those extensions whenever we needed them. Since I'm in the field of data science, I wrote the dsproject extension for data science projects,
and everyone can just contribute
extensions whenever they want. But, yeah, we are completely open
(10:44):
in this regard to whoever wants to contribute a new extension for some really specific
use case. For instance, I saw last time I checked that someone
had, like, an extension for Visual Studio
Code together with a container
configuration
because
one person seems to, yeah, develop
(11:06):
with the help of Visual Studio Code and, some containerized
setup.
And, basically, everything's possible with the help of the extension
system. And I think at that point,
huge shout out to Anderson
Bravalheri
because
so he's right now the core maintainer of PyScaffold, and
(11:26):
he introduced the extension
system back then. So he changed a lot to make this possible to have really
yeah. To have extensions easily available in PyScaffold.
I'm definitely interested in digging into the extension system a bit. But before we get there, as far as the
(11:46):
kind of decision making aspect of saying for this type of project, so for a Django project,
these are the, you know, different
basic plugins that you want to use. This is how we're going to default the project structure.
You know, we're going to use Poetry or PDM. I'm just curious if you can talk through what your process was for
(12:08):
deciding on what constituted best practices, both
across all of the templates that you use in PyScaffold and for kind of each specific
use case for a given project.
Yeah. So, of course, everything's kind of opinionated.
So we had huge discussions
with several people from the Python community, or sometimes it was also just,
(12:31):
like, trying out different tools. And so PyScaffold
uses
right now, has been using setuptools because it's like the big default
way of building
Python packages.
And since, also, setuptools evolved a lot,
it's still
used a lot and has all the features that are needed. So we looked at different tools, for instance, Poetry. But on the other hand, Poetry itself also comes with a kind of minimal
(13:03):
project setup. So and it also directly includes
managing your whole virtual environments, and we wanted to have those separated that you, for instance, can also use
Anaconda,
which is maybe a little bit more preferred in the field of data science for managing isolated environments.
(13:23):
And yeah. So, basically,
we evaluated
different ways and looked, of course, at a lot of other open source tools, how they are doing, and this kind of
yeah, then helped us to kind of extract what we think are the best practices
from
different projects. And, of course, this evolves over time. For instance, when we started,
(13:49):
like, in
2014,
it was completely normal that you would write a setup.py. Right? Because at first, Python was like, yeah. We can configure everything with the power of Python itself.
And then later on, the whole community realized, okay. But being declarative is actually much better. So we should give up some of the huge power of Python
(14:13):
and go over to some configuration file. And at first, this was setup.cfg.
And, yeah, we welcomed
this and implemented this too in PyScaffold and always, yeah, we are always keeping up with what setuptools does so that our users always have the current and most modern way of setting up setuptools. And now, I think since a few months, it's also possible to completely
(14:43):
configure setuptools with the help of pyproject.toml.
This is also gonna be, like, the next step, what we're gonna do. Then for documentation,
I think,
yeah, it was, from the start, pretty
clear that Sphinx
is the one go-to tool.
Right now, we are also evaluating MkDocs,
(15:05):
where it seems like the community is, like, growing more on the MkDocs side.
When we started, most of the docs were written in reStructuredText. But nowadays, Markdown
is
much more preferred. So we first started this as an extension where you could say, okay. I wanna rather write it in Markdown.
(15:26):
And now we are thinking about maybe just completely switching over so that your default project directly is all in Markdown. But these are just ideas
because we kind of want a way to have it standardized.
And unit tests, I think, in the beginning before
pytest became really big, there were also a lot of other
(15:47):
unit testing tools that we first checked out.
And, yeah, this is how we kind of look
what happens in the community. We also have, for instance, Anderson. He's, like, reading all those PEPs before going to sleep, it seems. So he knows all the PEPs and what we should implement next and so on.
(16:08):
And then there's also a guy from setuptools_scm.
So he developed this tool, yeah, setuptools_scm,
and this tool allows you to do semantic versioning
and kind of bridges the gap between
Git
(16:28):
and
the version strings of your Python package. So you can just tag a certain commit,
and this
version is then also used directly in the metadata of your project.
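As a rough illustration of that idea (a sketch, not the actual code of PyScaffold or setuptools_scm): once the build tool has written the Git-derived version into the package metadata, the package can read it back at runtime instead of keeping a second, hard-coded copy. The function name `read_version` and the distribution name below are hypothetical and only used for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def read_version(dist_name: str) -> str:
    """Return the version recorded in the metadata of an installed distribution.

    Falls back to "unknown" when the distribution is not installed,
    for example when running from a plain source checkout.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return "unknown"


# Hypothetical distribution name; it is not expected to be installed,
# so this call takes the fallback branch.
print(read_version("my-project-that-is-not-installed"))
```

Because the version string lives only in the metadata, there is a single source of truth: the Git tag.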
Yeah, this allows you to avoid any kind of conflict
because you want this to be always consistent. Right? What you have in Git and what kind of packages you are building. And he's also
(16:54):
giving a lot of really valuable
suggestions
to PyScaffold.
For instance, he
suggested that we change to the src layout so that you don't have the actual Python package at the root of your project,
but that you rather have a src
directory
and the package is a subdirectory of src because this has a lot of advantages. And so this is basically how those best practices
(17:22):
so we listen to different ideas. We discuss them,
and then we look, okay, who else is doing this? And is this already kind of a mainstream?
And then we
include it into PyScaffold.
The kind of determination of what is mainstream is also interesting because it also depends a bit on who you're looking to for kind of signals where
(17:46):
some kind of areas of the community are focused on, you know, everything should be in poetry, and we use pyproject.toml
for everything. Other people are looking to some of the newer PEPs where I think PDM is one of the package managers that uses the
directory based dependency management so that you can have sort of like an npm style where you just install your packages into
(18:09):
the project directory. You don't have to worry about managing your virtual environment.
And I'm interested in your
general kind of philosophy and approach for how you
determine what your
opinions are about how you want things to be done and how you
balance that against the tendency for a project as it goes on to just add more and more configuration flags so that
(18:36):
by the end of it, you might as well just do all the work yourself from scratch rather than rely on the, you know, so called opinionated
approach because the opinions become diluted because there are so many options.
Yeah. If we were actually
now to switch setup tools for something else,
then we would maybe over some
time have a kind of,
(18:57):
yeah, an extra flag to go back to setup tools.
But then we kind of would drop setup tools if this ever happens. So we are still in discussion, and setup tools for me is still a it's still a good way to build your Python packages.
But the idea is not to just add more and more and more, as you said, but to kind of replace them
(19:22):
after a while, like, one thing with the other because you don't need two build tools, actually. So you have to decide anyway for a single one. And since we wanna be a tool that you can just run, run putup my_project, and then
someone makes the decision for you and gives you the default one where we think it's a good choice for you. And, I mean, right now, all those different
(19:48):
things, build systems and also environment managers like PDM, Hatch, Poetry.
There are so many. And right now, it's really hard to say who's gonna win in the end.
And this is where we just kind of take our time and wait.
And, also, actually,
right now, PyScaffold
(20:09):
doesn't really come with
a way of managing your virtual environment. So it's completely agnostic, which I think is good. So you can
do whatever you want. You can even use pipenv if you wanna use this, or you can use Anaconda, whatever you want because we are concentrating on this single thing, which is
(20:31):
giving you a nice
Python project setup. So it's not about managing your environments.
There, you can still choose whatever you want. Maybe Poetry, of course, would be a little bit
difficult because, I mean, we use setuptools. So
you would have to replace then some
(20:52):
yeah. But in the end, after we have done the switch to pyproject.toml,
it should hopefully be only, like, one line
in the beginning of pyproject.toml,
where you define your build system.
And then you could even use Poetry. So, actually,
if you wanna develop
a Python library or application,
(21:14):
you should not care that much about the build system in the best case because this should be completely transparent for you. You should
just know the command how to build the wheel and not decide, okay, should I rather go for, there's also Flit.
There are so many different tools, and
some implement those PEPs. There's also pyflow
(21:38):
or something that only comes with this new way of the
__pypackages__ directory. So this way of not having
isolated
environments at all.
And yeah. So this is still something where we kind of say, okay, we are agnostic.
Who knows if we're later gonna adopt something or not? So I would rather actually keep or stick to the UNIX philosophy, which is do one thing and do it really well.
(22:08):
And a project setup has actually or should, in best case, have nothing to do with how you
manage your virtual
environment. Right? It should be kind of orthogonal to it.
Absolutely. You should just build the wheel, not reinvent it.
Yeah.
Absolutely.
And then in terms of the actual implementation of PyScaffold, I'm wondering if you can talk to some of the ways that the project is designed and some of the
(22:38):
choices that you've had to make in terms of how it is managed and structured and the evolution that it has gone through as the overall ecosystem and the adoption of best practices has changed?
Yes. So implementation
perspective,
we always try to keep things as simple as possible. Right? So we also have nice documentation about this for contributors, for beginners who want to contribute to PyScaffold.
(23:04):
And,
yeah, basically, we have
a module for the command line interface,
then we have a module for our internal
APIs, so how to create a project.
And then the base principle is
that the structure of your project is a kind of nested dictionary
with the files
(23:25):
and the content of the files.
And
this is put into a pipeline called the action pipeline.
And each step of this action pipeline, which can be creating a file or checking the consistency
of flags or making sure
that Git is there or initialized,
is then applied, one after the other, to this structure.
(23:49):
And the final structure at the end of the pipeline is then
kind of deployed on your hard drive. So this is the basic idea.
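To make the idea a bit more concrete, here is a minimal toy sketch of such a pipeline. It is not PyScaffold's real API: the action names, the `(struct, opts)` signature, and the helper functions are all assumptions made for this illustration. Nested dictionaries stand for directories, strings for file contents, and nothing touches the disk until the very end.

```python
import tempfile
from pathlib import Path


def add_readme(struct, opts):
    """Hypothetical action: add a README to the in-memory structure."""
    struct["README.md"] = f"# {opts['name']}\n"
    return struct, opts


def add_package(struct, opts):
    """Hypothetical action: add an empty package under src/."""
    struct["src"] = {opts["name"]: {"__init__.py": ""}}
    return struct, opts


def run_pipeline(actions, opts):
    """Apply every action in order, starting from an empty structure."""
    struct = {}
    for action in actions:
        struct, opts = action(struct, opts)
    return struct


def write_structure(struct, root):
    """Write the final structure to disk: dicts become directories, strings files."""
    for name, content in struct.items():
        path = root / name
        if isinstance(content, dict):
            path.mkdir(parents=True, exist_ok=True)
            write_structure(content, path)
        else:
            path.write_text(content)


structure = run_pipeline([add_readme, add_package], {"name": "my_project"})

with tempfile.TemporaryDirectory() as tmp:
    write_structure(structure, Path(tmp))
    created = sorted(p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*"))

print(created)
```

Keeping the structure in memory until the last step means every action can inspect or change what earlier actions produced before anything is written.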
And this also gives the flexibility.
So if your
extension
introduces another flag,
then you have access to that flag. And, basically, you define a new action in this pipeline. You can say it should be run after another step in this pipeline,
(24:17):
and this gives you the whole flexibility
to change at that point the project structure however you want it.
And this is basically the whole magic. So I think it's kind of really simple to understand what's going on.
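A small sketch of how an extension could hook into such a pipeline, again as a toy model rather than PyScaffold's actual interface. The helper `register_after`, the action names, and the `notebooks` flag are hypothetical; the point is only that an extension contributes one extra action and says which existing step it should follow.

```python
def define_structure(struct, opts):
    """Hypothetical core action: create the basic files."""
    struct["README.md"] = "# " + opts["name"] + "\n"
    return struct, opts


def init_git(struct, opts):
    """Hypothetical core action: add Git-related files."""
    struct[".gitignore"] = "*.pyc\n"
    return struct, opts


def add_notebooks_folder(struct, opts):
    """Hypothetical extension action, only active when its flag is set."""
    if opts.get("notebooks"):
        struct["notebooks"] = {".gitkeep": ""}
    return struct, opts


def register_after(actions, new_action, after):
    """Return a new pipeline with new_action placed right after the step named `after`."""
    names = [action.__name__ for action in actions]
    position = names.index(after) + 1
    return actions[:position] + [new_action] + actions[position:]


core_pipeline = [define_structure, init_git]
pipeline = register_after(core_pipeline, add_notebooks_folder, after="define_structure")

struct, opts = {}, {"name": "my_project", "notebooks": True}
for action in pipeline:
    struct, opts = action(struct, opts)

print([action.__name__ for action in pipeline])
print(sorted(struct))
```

The extension never copies the core steps; it only describes where its own change goes.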
And, yeah, throughout the time, a lot
changed. So when we started,
(24:38):
we wanted to have this versioning
feature in there. I think it was
way before setuptools_scm was actually invented, so we were using Versioneer for that.
And, also, we wanted to have declarative
configuration options. So there was a time when we used this Python Build Reasonableness.
(24:59):
So it's from OpenStack,
PBR.
So we were using this, and we were vendoring
those packages.
And, yeah, later on, we decided to rather
use, like, real dependencies
for this. So the reason for vendoring it
was that we were, at that time,
a kind of development
(25:20):
requirement.
So we tried not to have that many dependencies. So we would just package everything into PyScaffold. So this changed in
version 4.
So there were a lot of kind of architecture
changes
throughout the time. And
I think the biggest 1 was
the change to have an extension system
(25:43):
that Anderson introduced. And he also
helped to restructure
and reorganize
the source code a lot, which helped us to maintain
this over so many years. Because, I mean, when I started it, 8 years is already kind of long for a Python project. Right? So much happened even to the Python language. There was also the switch from Python 2 to Python 3. And in the beginning, we wanted to support this. So we had 6
(26:10):
also in the templates
to help people develop a package that runs for Python 2 and 3
at the same time.
This then also got dropped
after a while because nowadays, you basically don't care about Python 2 anymore. Right? So and this is how we
evolve
this. And for version 4,
(26:31):
we also
took the time to just see what happened over those 8 years. So I don't know if you know Gource,
if it's pronounced like this. This is a visualization
tool for
Git where you just put in your whole Git repo, and then it creates some nice visualization.
We put the PyScaffold
(26:52):
repo in this tool, and I think it's a 4 minute video where you can easily see how much has changed
over
those 8 years.
Really, a lot has changed over those 8 years. And sometimes it was a real challenge to keep up with all the changes and all the PEPs.
And sometimes it's also hard
(27:14):
to say no to people for kind of suggesting new features
and saying, okay. Please, could we also have this and that? And then you think about, okay. It's not only
implementing it. It's also maintaining it over many years, and kind of removing something from a package is much harder
than actually adding something. So you also always have to keep this kind of balance
(27:39):
between not adding too many features because you have to maintain everything.
On the other hand, you also wanna be, like, welcoming to new
contributors
and to also allow them to, yeah,
implement their ideas and help improve PyScaffold.
Another aspect of the work that you're doing is that
(28:01):
by being opinionated, you have to make your own determination about what constitutes best practices. And as you mentioned, that means
staying up to date with the different conversations that are happening, the PEPs that are being adopted, the changes to the Python language and ecosystem.
And
in your work of doing that, I'm curious what you see as being the
(28:21):
kind of noisiest or most volatile elements of what constitutes best practices and what are the pieces that are
largely stable or widely agreed on. So I know that Black, for instance, is
generally accepted as a good choice. Obviously, there are people who have their own opinions who dissent, but it's
fairly uncontroversial
(28:42):
versus the question of kind of environment management that you were discussing earlier of Poetry versus pipenv versus PDM, etcetera.
I'm wondering what you see as kind of the, you know, most stable versus most volatile elements of that ecosystem.
So definitely something like Black
and Flake8.
(29:02):
And, also, I would say, pre-commit. So also having this end-of-line fixer and whitespace and everything that helps you kind of
have a certain standardized
style in your Python project. There is, I think, a lot of agreement that those tools make sense. Also, for unit tests,
pytest
(29:23):
is still the go-to tool, I would say, and also tox
for running tests.
For a long time with regard to documentation,
it used to be Sphinx. But, yeah, as we mentioned before,
people also use MkDocs. And what I like about MkDocs is that it also
(29:43):
has a declarative
way of configuration.
And Sphinx, on the other hand, has this conf.py where you also use Python for configuration.
So then
more and more people are adopting
type hinting.
So this
is still,
like, not 100%
(30:04):
used, but there's a lot of agreement that people really like it. And,
I mean, if you used
Python
docstrings to document your source code, you would anyway write if it's a list of int or a list
of characters or whatever, a list of strings.
And, yeah, having this now directly
in the parameters of your function and the return value, I mean, it's it's much easier to read, and you have many, many benefits for
(30:34):
instance, like static type checking. And there are also tools that even compile your Python code, like the mypyc compiler and so on. So there's a lot
of things going on in this regard. So,
yeah, mypy is definitely something
I would say or it doesn't have to be,
(30:54):
mypy, I mean, also other tools.
But type hinting,
I would say, is yeah. Many people agree with it, and this is sure to stay.
We also wanna support this more in PyScaffold.
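To picture the difference described here, a small sketch with two hypothetical functions: the first documents its types only in the docstring, the second states the same information as type hints, where a static checker such as mypy can read it. Both behave identically at runtime. The built-in generic `list[float]` assumes Python 3.9 or newer.

```python
def mean_documented(values):
    """Return the arithmetic mean.

    :param values: list of float
    :return: float
    """
    return sum(values) / len(values)


def mean_annotated(values: list[float]) -> float:
    """Return the arithmetic mean."""
    return sum(values) / len(values)


print(mean_documented([1.0, 2.0, 3.0]))
print(mean_annotated([1.0, 2.0, 3.0]))

# The hints are plain metadata on the function object; Python itself
# does not enforce them when the function is called.
print(mean_annotated.__annotations__)
```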
Then when it comes to the build system,
this is, like, really hard to say what's gonna happen there in the future.
(31:18):
So
I actually hope that setuptools
is gonna be the winner because, I mean, it also does this one thing, and it does it also well. It has a kind of legacy code base.
So this is maybe why people
have their problems with it. And sometimes
yeah. It has some really rough edges,
(31:39):
so to say. So I had a lot of fun trying to understand what's going on in setuptools,
but I think it's improving a lot. So the Python Packaging Authority, the PyPA,
I mean, they also have Flit,
which is like a lightweight version of setuptools, so to say. But it also is missing a lot of features, like
(32:02):
you can only have pure Python packages, so no compiled code.
You don't have an extension system. For instance,
what I explained before, this having semantic versioning directly from Git
in your metadata
of your Python package
is not possible with Flit, which I think is a downside. So
(32:24):
I'm not a huge fan of it, I must say, but
there are many others. I haven't really evaluated PDM so much right now,
and this new PEP
with the local environments, with the __pypackages__.
But then again, there's also Poetry
and Hatch. Hatch is definitely cool, I would say, because it already comes with a lot of features. But this is, I would say,
(32:50):
the part
where there's the most fluctuations,
where it's not really sure what's gonna happen. And I think,
on one hand, it's kind of nice that people are inventing new stuff. But on the other hand,
it's also a huge distraction
because
in the best case, you just wanna have this one tool that you just use, and it does whatever you want. And
(33:16):
you develop. Like, other languages like Rust, for instance, come with Cargo,
where you have this one go-to tool for how to handle everything.
And it's actually
a pity that Python had over,
like, almost decades now
so much pain with handling your dependencies,
(33:36):
with installing
things, and so on.
And,
yeah, with respect to other things like, I would say, the src layout of a Python package, there are also,
yeah, many benefits of it so that you, for instance, cannot accidentally
test something completely different because you import
(33:58):
the package directly,
not whatever you installed.
So
this, I think, is also kind of
yeah. It makes sense. And
most people agree that you should set up your project like this.
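The benefit mentioned here can be checked with a small experiment. This is only a sketch under stated assumptions: the directory names `flat` and `with_src`, the package name `my_project`, and both helper functions are made up for the example. It builds two throwaway projects and asks Python's import machinery whether the package is importable when only the project root is on the import path.

```python
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def make_project(root, package_parent):
    """Create root/<package_parent>/my_project/__init__.py."""
    pkg_dir = root / package_parent / "my_project"
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text("")


def importable_from(root, package):
    """Report whether `package` is found when only `root` is searched."""
    return PathFinder.find_spec(package, [str(root)]) is not None


with tempfile.TemporaryDirectory() as tmp:
    flat = Path(tmp) / "flat"
    with_src = Path(tmp) / "with_src"
    make_project(flat, ".")       # flat layout: package sits at the project root
    make_project(with_src, "src")  # src layout: package sits under src/
    flat_result = importable_from(flat, "my_project")
    src_result = importable_from(with_src, "my_project")

print("flat layout importable from project root:", flat_result)
print("src layout importable from project root:", src_result)
```

With the flat layout, a test run started in the project root picks up the checkout directly; with the src layout it cannot, so the tests have to exercise whatever was actually installed.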
Also, that you should have a LICENSE
file or a
file
called CONTRIBUTING
(34:18):
and a README.md or README.rst.
And that your docs folder should be named docs, and you have a tests folder. So certain things are already kind of,
yeah, pretty
defined. And I don't think that the overall,
like, structure from a directory perspective is gonna change a lot. But, of course, you have some tools,
(34:43):
yeah, some tools, especially the build systems. I think they
will surely change a lot. And, yeah,
let's see who's gonna win in the end.
In terms of
the extension system that you mentioned earlier, I'm wondering if you can talk to some of the ways that
that
enables
(35:03):
people to
use PyScaffold
as a utility and build their own templates and maybe some details
on sort of how the templates are implemented as compared to maybe a Cookiecutter template.
The templates
of PyScaffold
are
just Python packages
themselves, so you can just pip install them.
(35:26):
And, yeah, they're using the
Python
hook system,
so the entry points, basically,
so that PyScaffold can discover those
extensions.
And they are discovered, and then you define
your flags and your actions.
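As a rough sketch of what discovery through entry points looks like with the standard library (not PyScaffold's own discovery code): installed packages advertise objects under a named group, and a host tool lists everything registered in that group. The function name is hypothetical, and the group name `pyscaffold.cli` is an assumption that is not stated in this conversation; the result depends entirely on what is installed in the current environment.

```python
from importlib.metadata import entry_points


def discover_extensions(group):
    """Return the sorted names of all entry points registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points can be filtered by group.
        found = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict of group -> entry points.
        found = eps.get(group, [])
    return sorted(ep.name for ep in found)


# Assumed group name; prints an empty list when no extensions are installed.
print(discover_extensions("pyscaffold.cli"))
```

Because registration happens in the package metadata, a plain pip install of an extension is enough for the host tool to see it.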
(35:47):
And then
you can also define where in the pipeline
this extension, your action, is gonna be placed. So
it's really not that hard to write your own
extension. And we also wrote an extension
to write PyScaffold extensions.
So, you can just pip install
(36:08):
pyscaffoldext-custom-extension,
and then you have a special flag. And then you get a setup, a project setup that already includes all the necessary subfolders
and has some stubs
with, not pseudocode, but, like, all the boilerplate code
that you need to write your own
(36:31):
PyScaffold
extension.
And, of course, also the documentation, I think, at that point is really great. So it's not that hard to write your own extension.
What Cookiecutter does is clone some
GitHub
repo,
which uses
Jinja or Jinja2 to define
(36:52):
the overall structure, and then there's some substitutions
going on.
So I think this is, like, really different to how we do it. So
I must say that, of course, it's easier. So if you are already using Jinja a lot, then it's easier
to write your own
Cookiecutter template.
But on the other hand, I mean, for a Cookiecutter template,
(37:17):
you would then need to decide first, okay, which of the many templates that are out there do I wanna kind of clone, and then do your changes.
And in case of PyScaffold, you always know that you have this
core setup that is also gonna evolve over time.
And then your extension
(37:38):
just
defines
how you differ from this core
definition of your project. And so this also means that maintaining
your extension
is way easier for PyScaffold.
And
it's a different
way of thinking in this regard, that you only define the changes and not, like, I copy everything
(38:00):
and modify in place.
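The difference between copying a template and only defining the changes can be made concrete with a small sketch. Everything here is hypothetical and simplified for illustration: core_template stands in for an evolving core setup, and my_extension for an extension that only describes how a project differs from it.

```python
import copy


def core_template(version):
    """Stand-in for the core setup, which evolves between versions."""
    files = {"setup.cfg": "[metadata]", "src": {"__init__.py": ""}}
    if version >= 2:
        # A later core release adds a new file.
        files["pyproject.toml"] = "[build-system]"
    return files


def my_extension(structure):
    """Only describes the difference from the core: one extra docs folder."""
    structure = copy.deepcopy(structure)
    structure["docs"] = {"index.md": "# Docs"}
    return structure


# Copy-and-modify: a snapshot of version 1 that is edited and then kept as-is.
forked = my_extension(core_template(1))

# Changes-only: the same extension re-applied on top of the newer core.
layered = my_extension(core_template(2))

print("pyproject.toml" in forked)   # False: the copy missed the core update
print("pyproject.toml" in layered)  # True: the delta rides on the new core
print("docs" in forked and "docs" in layered)  # True: both keep the customization
```

The copied snapshot has to be patched by hand whenever the core changes, while the delta keeps working as long as the parts it touches still exist.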
For somebody who's getting started, you mentioned that you have this putup command to generate the scaffolds.
For
somebody who maybe already has an existing project and wants to migrate to adopt some of the opinions that you have encoded in PyScaffold, or for somebody who started a project in PyScaffold,
(38:22):
but the underlying template has evolved, what are the options for being able to
incorporate those changes into that existing project?
We have a migration
documentation
about this.
What you can basically do, it's, again, really simple. So just make sure that your current project
(38:42):
already uses Git. I mean, this is, for instance, also something
which is here to stay. So everyone uses Git. Right? There's no doubt about it anymore. So almost everyone agrees on this. This is also a requirement for PyScaffold. So let's assume that in your current project, you're using Git.
Just make sure that you have a clean
(39:03):
repository, everything's
checked in.
And then you just go to the parent directory and type putup --force
and the name of your project.
And then PyScaffold is kind of overwriting
your
current project setup with its own
(39:23):
files.
And
then you can just use git mergetool
and git diff
to kind of
change
whatever you need to change. So it depends if you already have a kind of similar
project setup. Of course, there won't be
that many changes.
Yeah. If it's completely different, for instance, if you don't have the src layout, there are some steps you can do before, but everything is written in the documentation.
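The first part of that workflow (check that the Git working tree is clean, then run putup --force from the parent directory) could be wrapped in a small helper along these lines. This is a sketch under assumptions: the function names are made up for this example, and it presumes that git and putup are available on the PATH. Reviewing the result with git diff or git mergetool remains a manual step.

```python
import subprocess
from pathlib import Path
from typing import List


def is_clean(porcelain_output: str) -> bool:
    """`git status --porcelain` prints nothing when everything is checked in."""
    return porcelain_output.strip() == ""


def update_command(project_name: str) -> List[str]:
    """Command to run from the parent directory of the project."""
    return ["putup", "--force", project_name]


def refresh_scaffold(project_dir: str) -> None:
    """Re-run the scaffold over an existing project, refusing to start if it is dirty."""
    project = Path(project_dir).resolve()
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=project, capture_output=True, text=True, check=True,
    ).stdout
    if not is_clean(status):
        raise RuntimeError("Commit or stash your changes before updating.")
    # The scaffold files are overwritten in place; review them afterwards
    # with `git diff` and keep or revert changes as needed.
    subprocess.run(update_command(project.name), cwd=project.parent, check=True)
```

Calling refresh_scaffold("path/to/project") would then leave the overwritten files as uncommitted changes, ready to be inspected.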
(39:52):
Of course, a process like this cannot really be automated because we never know what your project might look
like. But I would say
the process shouldn't cost you more than, let's say, 2 or 3 hours depending on how large your project is.
And, yeah, it's kind of easy.
In your experience of building PyScaffold
(40:15):
and using it and kind of evolving the project and now handing it off to a new maintainer,
I'm wondering what your
overall approach to community engagement has been as far as
do you have any
maybe
voting process for deciding what are the right opinions to have or sort of who to listen to and what your approach has been for kind of bringing on your new maintainer and handing over the reins because that's one of the kind of least
(40:45):
formalized or least understood approaches in open source as far as, like, how do you, as somebody who just starts a project because it's a problem that you want to solve, then say, okay.
This is how I am going to manage the
kind of ownership
and life cycle of this project once I have moved on to other things?
That's a hard question. So I also
(41:07):
yeah. I'm really open to all kinds of ideas and suggestions how to do this. Our current approach, and I think it worked quite well, is,
first of all, always be welcoming to new contributors
and trust new people always. So you should not be too hesitant.
And, yeah,
listen to their ideas,
(41:30):
kind of then also challenge their ideas, but in a nice
way, and discuss first new ideas
before
you get some huge pull requests
that change, like, 80% of your code base.
This can be formalized.
So we have a lot of documentation
about, as I said before, how our
(41:51):
code looks like, also how to do your first pull requests. So, really, like, from the beginning, like, you click the clone button
in GitHub and so on.
So everything
is included, and this helps new people
making their first pull request. And we also had pull requests
just fixing typos, which I really liked. And
(42:14):
then
I find it really important because, of course, when writing open source code, in most cases
nobody is paying for it. Right? So whenever
someone
wants to contribute,
they are kind of giving you their
spare time in most cases. Right? And I think this should always be appreciated. So if someone
(42:36):
takes the time to fix our
typos in PyScaffold, we are really grateful for this. And I think it's important to show this in a community,
that you write, hey, thanks a lot.
We also mention those people
as authors, as contributors to PyScaffold,
and so on. So this is really important to have this, like, welcoming
(42:59):
way for new contributors and always also explaining things when there are questions, of course.
Also important is that
when someone tries to make a first pull request, not to just think, okay. I can now
implement whatever they ask for
myself in maybe,
2 hours because I know the code base, but rather to say, okay. You could implement it like this.
(43:24):
And why don't you make a pull request for it so that people kind of get motivated? Okay. There's someone helping,
and I write a small,
maybe new unit test or whatever. And they fix a little problem, and then they have their first pull request. And this gets many people motivated then to
go on and to maybe write a huge
(43:47):
feature for PyScaffold
and so on.
And then there's always, like, trust. For instance, I trust Anderson, like, 100%.
And I've
known him now for many years. And, of course, if he has an opinion on something, this opinion
is, like, really valuable. And this is also why
(44:07):
I said, like, okay. I wanna also
go on to other projects. So still, I want to contribute. But since you have been doing, like, the most work, and you have implemented the greatest features in the last years, you should be the core maintainer
of PyScaffold.
And this
is, like, the usual
(44:29):
kind of, if you call it, ring of trust, where you have certain people,
you've known them for many years, and they have done an awesome job. And, of course,
you trust them more
in contrast to someone
coming
in and saying, okay. I wanna change
completely the goal of PyScaffold.
Still, in this case, one has to, like, really also, again, appreciate
(44:52):
that they have cool ideas,
but in a nice way,
saying no. And this, like, learning to say no
in an open source project for me was the hardest thing because you have motivated people, and they say, okay. Let's add this, and here's the pull request. And I've already put,
like, maybe days of work into my pull request, so please accept it. And then saying, no. This changes the whole project into a direction
(45:21):
that we kind of don't wanna go. It's the hardest part, but I think it's also
something
you have to do as an open source maintainer
because, otherwise,
you're kind of losing the goal of your project. So
yeah. But I think there's a lot to learn still for us and for me how to do this. And as you said, I think maybe there's no best way how to work with the open source community.
(45:50):
Yeah. But, again, some best practices, I assume.
In your work of building and using the project and engaging with the community, what are some of the most interesting or innovative or unexpected ways that you've seen PyScaffold used?
I really like seeing that people are using the custom extensions,
building their own, like, really specific extensions
(46:12):
for their certain needs, like, something like on the b project
and this Visual Studio Code.
And someone also
added an interactive
shell for
PyScaffold.
So beforehand,
everything was just that you define the configuration flags for PyScaffold.
(46:32):
So one person,
he built this interactive way, which I found really
surprising because I actually never needed it. But we realized
then due to him and also that other people were using this project that he built, that we should have this as a core feature.
(46:53):
Yeah. But other than that,
yeah, I think,
yeah, most people
are using PyScaffold
as it is intended,
and we are always happy to hear
when people are using it.
And yeah. So
if you're using PyScaffold
so to the listeners, yeah, let us know about it.
(47:14):
Just use the GitHub discussions
and tell us about it. We are really happy to hear
what kind of projects you are using PyScaffold
for,
yeah, if it helps you or how it can be improved.
In your work of building the project
and growing it and now
adding a new maintainer, what are some of the most interesting
(47:37):
challenging lessons that you learned in the process?
For me, this was my first open source project.
All the community aspects that we discussed before, this was really challenging.
And
we have a few core contributors,
but I would love to see a lot more people contributing to PyScaffold
and building this up. So this was really challenging.
(48:00):
Also, keeping up with all the PEPs, of course, and that Python itself is
really changing
so rapidly
for the whole language.
I rather thought it's gonna be, like, maybe in C++ or so, that you have, like, a new standard every 4 or 5 years or whatever so that things are not changing so quickly.
(48:24):
But Python has such a huge speed. Also, the whole ecosystem is changing.
And this is definitely
yeah. It has been really surprising to me.
It's still a huge challenge to keep up with this.
For people who are looking for ways to be able to quickly scaffold or bootstrap a project or they're looking for an example of best practices
(48:49):
for how to
approach their application
structure, what are the cases where PyScaffold is the wrong choice?
I think it's the wrong choice
if you
wanna have an application that deeply integrates into your operating system, because then you would maybe rather go for some
(49:09):
operating-system-specific
packaging solution like a Debian package or Flatpak,
Snap, whatever. But it's also
the case
for
other Python packaging systems.
And
I think if you work, for instance, for a company or for some reason,
(49:30):
the context you're working in demands
a really
specific setup,
and you're kind of deviating too much from what PyScaffold is giving you, then it really might make sense
to use, for instance, Cookiecutter instead and roll your own
template.
And, yeah, this can be really a case where
(49:53):
your own template
that you then also, of course, have to maintain
makes more sense.
But if you
kind of just wanna get going, then PyScaffold is the better choice.
In terms of the near to medium term, what are some of the things that you and your co-maintainer have planned for, you know, new capabilities
(50:14):
or new areas of use or just some of the projects that you're excited to dig into?
In the next version, we definitely
wanna
go to, yeah, wanna adopt
pyproject.toml
completely because this is now possible with the new setuptools version.
So setup.cfg
is completely gonna disappear, and all the metadata of your project, everything's gonna be nicely
(50:40):
in pyproject.toml.
Also, we heard from the community
that people
rather have or rather want to have, like, 1 file with all the configuration
in it instead of tons of small files, like a file only with the flake8 definitions
and only with your coverage
(51:00):
settings and so on. And most of those tools now integrate with pyproject.toml
and we also wanna integrate this. So this is definitely
the near term goal
for PyScaffold
and also to have an update path for this next version.
We are evaluating
(51:21):
right now, for instance, MkDocs, if this could be something, a switch to MkDocs. So this is rather an idea I'm having right now, but, yeah, there still is gonna be a lot of discussion. Yeah. Anderson is the core maintainer. I'm gonna discuss it
with him, so it's not yet decided.
This is something then
more integration
(51:42):
of mypy, to make it easier
for people
to then start with static type checking.
So this would be something
that's, like, midterm.
And,
also,
we've really seen that
people like Markdown much more than reStructuredText, and you can have it already in PyScaffold
(52:05):
with the Markdown plugin, which is, for instance, also
installed by default if you use
the data science project plugin
or extension.
I think this should
be, like, already in the default
core setup
so that you have markdown
by default. So this is also
(52:26):
rather near term.
And then long term
yeah, I really don't know, with so much pace in Python's development.
Let's see.
Are there any other aspects of the PyScaffold
project or the overall space of
kind of managing the structure and best practices around the development of an application in Python that we didn't discuss yet that you'd like to cover before we close out the show?
Yeah. I'd just like to mention also that if you want to contribute to PyScaffold
(52:56):
and if you have new ideas or if you wanna evaluate,
especially
tools
like Hatch, PDM, and so on,
and maybe also have a sister project or something that maybe works together with PyScaffold and, yeah, helps,
yeah, or make suggestions
on how to use
(53:19):
virtual environments
and isolated environments or wanna help us with
adding documentation about this. I mean, this would be
really valuable.
Other than that, I think this whole
Python
packaging
and structuring your project is really a
huge rabbit hole. So if you start looking into this, you will find
(53:41):
a lot of EuroPython
and PyCon
talks
going back
to 2000-something
where people discuss
about the different ways of doing things. So,
actually, the less you care about this, the more you can concentrate on your actual work or whatever you wanna do.
(54:03):
So I think I have no more
suggestion
that you should look into something.
So there's a few blog posts.
I,
yeah, could maybe add to the references
of this podcast.
I think one important thing
that I see,
especially in the field of data science, so people are using a lot of Jupyter and JupyterLab.
(54:26):
And this brings
its own challenges
of how to integrate this in a nice way
into larger projects. So people start with one notebook,
and then they realize, okay. The code is growing and growing.
And at this point, how to migrate this into a proper Python project with the help of PyScaffold?
(54:48):
Yeah. It helps a lot. And we've also written
one blog post about this.
This is definitely
worth a read, I would say, for people in the field of data science because there,
the,
yeah, like, the software
craftsmanship,
so to say. Yeah. There's still a lot of potential.
(55:08):
Let's say it this way.
Well, for anybody who wants to get in touch with you and follow along with the work that you're doing and contribute to PyScaffold, I'll have you add your preferred contact information to the show notes. And with that, I'll move us into the picks. This week, I'm going to choose the Daredevil TV series I just started watching. Originally, it was aired on Netflix, and now it's on Disney+.
(55:31):
But definitely a very well put together show, a lot of, you know, interesting themes and interesting characters, so definitely recommend giving that a shot if you're into any of the superhero
shows or genre. And with that, I'll pass it to you, Florian. Do you have any picks this week?
Yes. So we started watching The Peripheral
on Amazon Prime.
(55:51):
It's from the creators of Westworld, which was already,
like, a really cool TV show. I think there's season 4 also running right now. I haven't checked it out yet. But, yeah, it's the second episode we have seen, and it's really interesting,
really cool show,
kind of sci-fi, but also a little bit
(56:12):
mystery. You really have to think
what's going on. So if you are, yeah, a fan of Westworld where you also don't know
how things are, and it's a little bit mysterious.
So it's a really cool show. So
I'm gonna continue.
Thank you again for taking the time today to join me and share the work that you've been doing on PyScaffold. It's definitely a very interesting project, and it's always great to have people contributing
(56:38):
to helping to
cement and
share their thoughts on what constitutes best practices for how to structure your Python projects. I appreciate all the time and energy that you've been putting into helping save time and energy for everybody else who wants to just build something. So thank you again for your time, and I hope you enjoy the rest of your day.
Yeah. Thank you. You too. And thanks for having me. It was really a pleasure. Thanks.
(57:04):
Thank you for listening. Don't forget to check out our other shows, the Data Engineering podcast, which covers the latest on modern data management, and the Machine Learning Podcast, which helps you go from idea to production with machine learning. Visit the site at pythonpodcast.com
to subscribe to the show, sign up for the mailing list, and read the show notes. And if you learned something or tried out a project from the show, then tell us about it. Email hosts@pythonpodcast.com
(57:27):
with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.