by Brandon Rhodes
---
categories: Computing, Python
date: 2009/04/22 16:34:49
permalink: http://rhodesmill.org/brandon/2009/pyron-no-return/
tags: ''
title: 'pyron: Making Python package development DRY to the point of no return'
---
I finally snapped last week.
After years of writing verbose and repetitive setup.py files
for my Python packages,
I am unable to write another.
Instead, I have started writing
Pyron,
a tool that gathers the same information
by inspecting a Python package itself.
Not only does this mean that I get to
stop repeating myself,
but that my projects will become much more uniform
because package metadata will be represented through
common conventions instead of explicit (and repetitive) configuration.
Though Pyron is still very primitive,
it has already allowed me to reduce simple packages
to only a README.txt plus their actual Python source code.
The start of the trouble
What happened is that I wanted to create a simple Python package
full of tools for professional authors
working with rst
documents, so that they could monitor their word count while writing,
and convert their rst files into the proprietary formats
used by various publications.
But just to start a new Python project
required me to create four entire files,
and almost as many directories:
./cursive.tools/setup.py
./cursive.tools/cursive/__init__.py
./cursive.tools/cursive/tools/README.txt
./cursive.tools/cursive/tools/__init__.py
The setup.py file itself
repeats the project name over, and over, and over again,
reminding me of the old Adventure game's
“maze of twisty passages, all alike”:
#!python
from setuptools import setup
setup(
    name = 'cursive.tools',
    version = '0.1',
    description = 'Tools for restructured text files',
    author = 'Brandon Craig Rhodes',
    author_email = 'brandon@rhodesmill.org',
    packages = ['cursive.tools', 'cursive'],
    namespace_packages = ['cursive'],
)
The first __init__.py file shown above of course looks like:
#!python
import pkg_resources
pkg_resources.declare_namespace(__name__)
Meanwhile, my stub README.txt and __init__.py files
down in the bottom directory contained just enough information to get
me started, whether I wanted to start by writing documentation and
tests or get started by writing actual code:
``cursive.tools`` -- Tools for restructured text files
------------------------------------------------------
The routines in this ``cursive.tools`` package are
designed for authors. They provide command-line tools
that can examine Restructured Text files.
"""Command-line routines for Restructured Text authors."""
__version__ = '0.1'
And, having created these files, I stopped, and stared in horror.
For an entire hour I tried to move on.
I tried to start writing actual code and actual documentation.
I tried to just ignore the stupidity of what I had just written.
Or, in the case of setup.py, what I had just
written by cutting and pasting
from another project on my hard drive —
yes, it's actually become that bad,
that we cut-and-paste file contents between Python projects
because our boilerplate requires so much repetition
while carrying so little information.
But, try though I might, I could not move on to writing code;
I was finally defeated.
The Python language has done such a wonderful job over the past decade
of honing my aesthetics and sharpening my senses
that I am now unable to use its own standard packaging techniques!
This new package would have to wait
until I had resolved the problems
that sat staring me in the face.
Let us review them, one by one.
-
After stating so carefully
that this package was named cursive.tools,
I then had to inform setup()
that the project name would also be —
who would have guessed? —
cursive.tools as well!
This is idiotic.
Of course I am giving this project the same name
as the package it contains;
that is a best practice from which modern Python projects
have no excuse to dissent.
Who wants to have to remember
that you need the ZODB3 package
when all you want to do is import persistent?
Who wants to remember to depend on pyephem
when all you want is to import ephem
(a problem that I, myself, created
in my own misguided Python youth)?
Not me.
And not, if they have any sense, my users.
-
This package is named cursive.tools.
Of course I want cursive to be a namespace package!
That is so painfully obvious that it should not even require mention;
it should be inferred.
-
Similarly, the mention that cursive is a package
in the packages declaration is redundant.
Of course if a.b is a package
then a is going to be a package as well!
There's not even a way to avoid that in the Python language,
so far as I know.
Why even make me type it?
-
The entire top-level __init__.py file —
the one inside of the cursive directory —
is utterly and entirely a boilerplate cut-and-paste.
Given that cursive is already stated
to be a namespace package,
it should not even be necessary to provide the contents
of its __init__.py;
it's standard
and can be copied straight from
PEP-382.
-
The package, you will note, has started out lacking
a long_description
despite the fact
that it has a perfectly serviceable README.txt file.
Many packages jump through the hoops of path manipulation
just to find their own README.txt
so that they can include it as their long description;
but why, in the absence of an override,
shouldn't its inclusion as the long description be the default?
-
This raises the larger question of where, exactly,
should a project README.txt even go —
where on the filesystem, that is, should it be placed?
There seems to be no consistency on this
between different Python packages.
Some people place it directly at the project top-level,
next to the setup.py file,
which is friendliest to developers
checking out the source code from a public repository —
but which makes the README.txt invisible
to users!
Others place it down inside of the package directory itself
so that it will be included in their distribution,
which is better;
and still other Python projects
have two separate README.txt files
so that they have both bases covered!
-
The package version is kept in two different places here:
in the setup.py and also in the __version__
symbol of the module itself.
When the version advances,
both places will have to be updated —
if the developer remembers!
The alternative is for the setup.py to grow
more complex by including its own bootstrap code
that uses path manipulations
to find and introspect the __version__ symbol
inside of the module.
-
The name of the package occurs
both at the top of README.txt
and inside of setup.py.
-
The short description appears twice:
once in the title of the README.txt
and once in the setup() stanza of the setup.py.
-
Finally, the directory structure of this project is ridiculous.
If, as the setup.py clearly states,
I am writing the cursive.tools module,
why should I even include both a cursive
and a tools directory?
Since the only legitimate activity
that I can undertake in constructing this module
is to place files inside of cursive.tools,
why do directories exist where files could collect outside
of this one depository?
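As a rough illustration of how much of the setup() call is implied by the dotted name alone, here is a minimal sketch in modern Python. The function name `infer_packages` is hypothetical and is not taken from Pyron; it also assumes that every parent of the final package is a namespace package, which is the convention argued for above rather than a rule of the language.

```python
def infer_packages(project_name):
    """Return (packages, namespace_packages) implied by a dotted project name."""
    parts = project_name.split('.')
    # Every dotted prefix is a package: 'cursive', then 'cursive.tools'.
    packages = ['.'.join(parts[:i]) for i in range(1, len(parts) + 1)]
    # Assumption: every prefix except the full name is a namespace package.
    namespace_packages = packages[:-1]
    return packages, namespace_packages


packages, namespace_packages = infer_packages('cursive.tools')
print(packages)            # ['cursive', 'cursive.tools']
print(namespace_packages)  # ['cursive']
```

A name without dots, such as `ephem`, yields one package and no namespace packages, so the same rule covers ordinary top-level packages too.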
Obviously,
the above arguments hold only for pure-Python packages;
when C extensions and other special effects come into play,
then excellent reasons arise for a complicated directory structure,
sophisticated metadata,
and possibly documentation above and beyond that distributed
with binary versions of the package.
But for normal packages,
I am finished with writing and distributing a setup.py
by hand.
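To make the cost of the version bootstrap mentioned in the list above concrete, here is one way such code might look. It is a sketch in modern Python with a hypothetical helper name, `read_version`, and it reads the source with the standard `ast` module rather than importing the package. A real setup.py would also need the path manipulation to locate `__init__.py` in the first place.

```python
import ast


def read_version(init_source):
    """Return the value assigned to __version__ in module source, or None."""
    tree = ast.parse(init_source)
    for node in tree.body:
        # Look only at simple top-level assignments: __version__ = '0.1'
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == '__version__':
                    return ast.literal_eval(node.value)
    return None


source = '''"""Command-line routines for Restructured Text authors."""
__version__ = '0.1'
'''
print(read_version(source))  # 0.1
```

Parsing instead of importing means the version can be read even when the package's own dependencies are not installed yet, which is usually the situation setup.py runs in.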
Toward perfecting Pyron
My new tool for Python package building,
Pyron —
which, for those keeping score, is my very first
bitbucket-hosted project
(and I am very much enjoying these first few weeks of using Mercurial,
since Guido made the
big decision at the end of PyCon last month) —
is not yet mature enough to warrant a first release on PyPI.
Please check out the development version
if you want to take a first look at Pyron.
And, yes, Pyron currently has to include a setup.py
of its own,
which will not disappear until I release the first version
and it can become self-hosting!
Please note that Pyron is only for developers!
The sdist archives and the eggs produced
for a Pyron-powered project
are completely standard;
the end users and developers installing a module
will not be affected by your choice to use Pyron.
It simply keeps your project repository cleaner
by inferring package metadata on the fly
rather than making you maintain a setup.py
in version control along with your Python package.
A package developed with Pyron
only needs two files:
README.txt and __init__.py.
The two files quoted above will work just fine.
These simply need to sit in the same directory,
like this:
./cursive.tools/README.txt
./cursive.tools/__init__.py
See?
All of the actual meat of the cursive.tools module
remains when the files are stored like this,
while the repetition and boilerplate disappear!
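To give a feel for how a tool could recover metadata from a README.txt like the one quoted earlier, here is a small sketch. It is not Pyron's code. The function name `parse_readme` is hypothetical, and the title shape it expects (a name in double backticks, then ` -- `, then the short description) is an assumption based only on the sample README in this post.

```python
def parse_readme(readme_text):
    """Split a README into name, short description, and long description."""
    lines = readme_text.splitlines()
    if not lines:
        raise ValueError('README is empty')
    title = lines[0]
    # Assumed title shape: ``package.name`` -- Short description
    name_part, separator, description = title.partition(' -- ')
    if not separator:
        raise ValueError('README title lacks a " -- " separator: %r' % title)
    return {
        'name': name_part.strip().strip('`'),
        'description': description.strip(),
        # The whole README doubles as the long description.
        'long_description': readme_text,
    }


readme = """\
``cursive.tools`` -- Tools for restructured text files
------------------------------------------------------

The routines in this ``cursive.tools`` package are
designed for authors.
"""
metadata = parse_readme(readme)
print(metadata['name'])         # cursive.tools
print(metadata['description'])  # Tools for restructured text files
```

With the name, the description, and the long description all taken from one file, none of them has to be typed a second time.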
Check out the Pyron README.txt
(or, of course, the same information as formatted in its
project page on PyPI)
for more details about how it works;
here, I will just make three last observations:
-
Sometimes I had to choose between best practices
when deciding how Pyron would operate.
Where, for example, should it find the package name?
Instead of looking at the title of the README.txt,
as it currently does,
one could imagine my having written it
to look somewhere in __init__.py
(but there seems to be no agreed-upon place
for a package to name itself),
or even at the name of the directory
in which the package is sitting
(but often the directory will not be named cursive.tools,
but something like branches/0.1
or even just trunk).
In each case,
I have tried to choose the most obvious
and easy-to-maintain convention,
and the real point is that there be some common idiom
for everyone to fall into line with
as more and more packages in the future
abandon their setup.py files
and start using Pyron.
-
Sometimes no best practice existed,
and I had to, frankly, make things up.
Where should the author of a package go,
without a setup.py file?
In a special metadata file that I would have to invent?
In some formatted region of the README.txt file?
By choosing instead that it go
inside an __author__ symbol in __init__.py,
I hope that I have at least preserved symmetry
with an existing best practice
while, again, making future Python projects as readable as possible
should Pyron use become widespread.
-
Pyron should become more sophisticated in the future,
and eliminate even more repetition.
It currently needs project dependencies,
for example,
to be defined as a __requires__ constant
in a package's __init__.py file.
In the future,
Pyron will hopefully gain the ability
to inspect a project's import statements
and make intelligent guesses about its dependencies
that could often eliminate any need
for explicit dependency declarations.
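The import inspection imagined in the last item could start from something like the following sketch. It is only an illustration of the idea, written in modern Python with the standard `ast` module; the function name `imported_modules` is hypothetical and nothing here reflects how Pyron does or will work.

```python
import ast


def imported_modules(source):
    """Return the sorted top-level module names imported by some source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split('.')[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0); they point inside the package.
            if node.level == 0 and node.module:
                names.add(node.module.split('.')[0])
    return sorted(names)


source = '''
import os.path
from docutils.core import publish_parts
from . import helpers
'''
print(imported_modules(source))  # ['docutils', 'os']
```

Turning these names into real dependencies would still take guesswork: standard library modules such as `os` have to be filtered out, and an import name does not always match the project name, as the `ephem` and `pyephem` case above shows.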
Thanks to Pyron,
I am now happily working away on my cursive packages,
and they should soon see their first releases.
I can now sleep at night,
knowing that boilerplate and repetition
have finally vanished from my development code.