by Brandon Rhodes

pyron: Making Python package development DRY to the point of no return

Date: 22 April 2009
Tags: computing, python

I finally snapped last week.

After years of writing verbose and repetitive setup.py files for my Python packages, I am unable to write another. Instead, I have started writing Pyron, a tool that gathers the same information by inspecting a Python package itself. Not only does this mean that I get to stop repeating myself, but that my projects will become much more uniform because package metadata will be represented through common conventions instead of explicit (and repetitive) configuration. Though Pyron is still very primitive, it has already allowed me to reduce simple packages to only a README.txt plus their actual Python source code.

The start of the trouble

What happened is that I wanted to create a simple Python package full of tools for professional authors working with rst documents, so that they could monitor their word count while writing, and convert their rst files into the proprietary formats used by various publications. But just to start a new Python project required me to create four entire files, and almost as many directories:

./cursive.tools/setup.py
./cursive.tools/cursive/__init__.py
./cursive.tools/cursive/tools/README.txt
./cursive.tools/cursive/tools/__init__.py

The setup.py file itself repeats the project name over, and over, and over again, reminding me of the old Adventure game's “maze of twisty little passages, all alike”:

from setuptools import setup
setup(
    name = 'cursive.tools',
    version = '0.1',
    description = 'Tools for restructured text files',
    author = 'Brandon Craig Rhodes',
    author_email = 'brandon@rhodesmill.org',
    packages = ['cursive.tools', 'cursive'],
    namespace_packages = ['cursive'],
    )

The first __init__.py file shown above of course looks like:

  import pkg_resources
  pkg_resources.declare_namespace(__name__)

Meanwhile, my stub README.txt and __init__.py files down in the bottom directory contained just enough information to get me started, whether I wanted to begin with documentation and tests or with actual code:

``cursive.tools`` -- Tools for restructured text files
------------------------------------------------------

The routines in this ``cursive.tools`` package are
designed for authors.  They provide command-line tools
that can examine Restructured Text files.

"""Command-line routines for Restructured Text authors."""

__version__ = '0.1'

And, having created these files, I stopped, and stared in horror.

For an entire hour I tried to move on. I tried to start writing actual code and actual documentation. I tried to just ignore the stupidity of what I had just written. Or, in the case of setup.py, what I had just written by cutting and pasting from another project on my hard drive — yes, it's actually become that bad, that we cut-and-paste file contents between Python projects because our boilerplate requires so much repetition while carrying so little information.

But, try though I might, I could not move on to writing code; I was finally defeated. The Python language has done such a wonderful job over the past decade of honing my aesthetics and sharpening my senses that I am now unable to use its own standard packaging techniques! This new package would have to wait until I had resolved the problems that sat staring me in the face. Let us review them, one by one.

  1. After stating so carefully that this package was named cursive.tools, I then had to inform setup() that the project name would also be (who would have guessed?) cursive.tools! This is idiotic. Of course I am giving this project the same name as the package it contains; that is a best practice from which modern Python projects have no excuse to dissent. Who wants to have to remember that you need the ZODB3 package when all you want to do is import persistent? Who wants to remember to depend on pyephem when all you want is to import ephem (a problem that I, myself, created in my own misguided Python youth)? Not me. And not, if they have any sense, my users.
  2. This package is named cursive.tools. Of course I want cursive to be a namespace package! That is so painfully obvious that it should not even require mention; it should be inferred.
  3. Similarly, the mention that cursive is a package in the packages declaration is redundant. Of course if a.b is a package then a is going to be a package as well! There's not even a way to avoid that in the Python language, so far as I know. Why even make me type it?
  4. The entire top-level __init__.py file — the one inside of the cursive directory — is utterly and entirely a boilerplate cut-and-paste. Given that cursive is already stated to be a namespace package, it should not even be necessary to provide the contents of its __init__.py; it's standard and can be copied straight from PEP-382.
  5. The package, you will note, has started out lacking a long_description despite the fact that it has a perfectly serviceable README.txt file. Many packages jump through the hoops of path manipulation just to find their own README.txt so that they can include it as their long description; but why, in the absence of an override, shouldn't its inclusion as the long description be the default?
  6. This raises the larger question of where, exactly, should a project README.txt even go — where on the filesystem, that is, should it be placed? There seems to be no consistency on this between different Python packages. Some people place it directly at the project top-level, next to the setup.py file, which is friendliest to developers checking out the source code from a public repository — but which makes the README.txt invisible to users! Others place it down inside of the package directory itself so that it will be included in their distribution, which is better; and still other Python projects have two separate README.txt files so that they have both bases covered!
  7. The package version is kept in two different places here: in the setup.py and also in the __version__ symbol of the module itself. When the version advances, both places will have to be updated — if the developer remembers! The alternative is for the setup.py to grow more complex by including its own bootstrap code that uses path manipulations to find and introspect the __version__ symbol inside of the module.
  8. The name of the package occurs both at the top of README.txt and inside of setup.py.
  9. The short description appears twice: once in the title of the README.txt and once in the setup() stanza of the setup.py.
  10. Finally, the directory structure of this project is ridiculous. If, as the setup.py clearly states, I am writing the cursive.tools module, why should I even include both a cursive and a tools directory? Since the only legitimate activity that I can undertake in constructing this module is to place files inside of cursive.tools, why do directories exist where files could collect outside of this one depository?
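Items 1 through 4 amount to a small computation on the dotted project name. The sketch below is only an illustration of that idea, not Pyron's actual code: the function name infer_packages and the constant NAMESPACE_INIT are hypothetical, and it assumes that every dotted prefix shorter than the full name should be a namespace package.

```python
def infer_packages(project_name):
    """Return (packages, namespace_packages) implied by a dotted name.

    Hypothetical helper for illustration; not part of Pyron or setuptools.
    """
    parts = project_name.split('.')
    # Every dotted prefix is a package: 'cursive', then 'cursive.tools'.
    packages = ['.'.join(parts[:i]) for i in range(1, len(parts) + 1)]
    # Assumption: every prefix except the full name is a namespace package.
    namespace_packages = packages[:-1]
    return packages, namespace_packages


# The stock namespace __init__.py body from item 4, kept as a constant so
# that a tool could write it out instead of the developer pasting it in.
NAMESPACE_INIT = (
    "import pkg_resources\n"
    "pkg_resources.declare_namespace(__name__)\n"
)

packages, namespaces = infer_packages('cursive.tools')
print(packages)    # ['cursive', 'cursive.tools']
print(namespaces)  # ['cursive']
```

A name without dots, such as ephem, yields a single package and no namespace packages, so the same rule covers both kinds of project.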

Obviously, the above arguments hold only for pure-Python packages; when C extensions and other special effects come into play, then excellent reasons arise for a complicated directory structure, sophisticated metadata, and possibly documentation above and beyond that distributed with binary versions of the package. But for normal packages, I am finished with writing and distributing a setup.py by hand.

Toward perfecting Pyron

My new tool for Python package building, Pyron — which, for those keeping score, is my very first bitbucket-hosted project (and I am very much enjoying these first few weeks of using Mercurial, since Guido made the big decision at the end of PyCon last month) — is not yet mature enough to warrant a first release on PyPI. Please check out the development version if you want to take a first look at Pyron. And, yes, Pyron currently has to include a setup.py of its own, which will not disappear until I release the first version and it can become self-hosting!

Please note that Pyron is only for developers! The sdist archives and the eggs produced for a Pyron-powered project are completely standard; the end users and developers installing a module will not be affected by your choice to use Pyron. It simply keeps your project repository cleaner by inferring package metadata on the fly rather than making you maintain a setup.py in version control along with your Python package.
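To give a feel for what inferring metadata on the fly could involve, here is a minimal sketch that recovers the name, short description, long description, and version from the two stub files quoted earlier. It is not taken from Pyron: the helpers parse_readme and parse_version are hypothetical, the title pattern is an assumption based on the README shown above, and the version is read by parsing the source with the standard ast module rather than importing the package.

```python
import ast
import re

# Assumed title convention, taken from the stub README quoted earlier:
#   ``name`` -- short description
TITLE = re.compile(r'^``(?P<name>[\w.]+)``\s+--\s+(?P<description>.+)$')


def parse_readme(text):
    """Return name, description and long_description from README text."""
    lines = text.strip().splitlines()
    match = TITLE.match(lines[0].strip())
    if match is None:
        raise ValueError('README title does not follow the assumed convention')
    return {
        'name': match.group('name'),
        'description': match.group('description').strip(),
        'long_description': text,  # the whole README doubles as long text
    }


def parse_version(source):
    """Find a literal __version__ assignment without importing the module."""
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == '__version__':
                    return ast.literal_eval(node.value)
    raise ValueError('no __version__ assignment found')


readme = """\
``cursive.tools`` -- Tools for restructured text files
------------------------------------------------------

The routines in this ``cursive.tools`` package are
designed for authors.
"""
init_source = '''"""Command-line routines for Restructured Text authors."""

__version__ = '0.1'
'''

metadata = parse_readme(readme)
metadata['version'] = parse_version(init_source)
print(metadata['name'], metadata['version'], metadata['description'])
```

Reading the version from the parsed source avoids the path manipulation and import side effects that a hand-written setup.py would otherwise need, and each value is stated in exactly one place.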

A package developed with Pyron needs only two files: README.txt and __init__.py. The two files quoted above will work just fine. These simply need to sit in the same directory, like this:

./cursive.tools/README.txt
./cursive.tools/__init__.py

See? All of the actual meat of the cursive.tools module remains when the files are stored like this, while the repetition and boilerplate disappear! Check out the Pyron README.txt (or, of course, the same information as formatted in its project page on PyPI) for more details about how it works; here, I will just make three last observations:

Thanks to Pyron, I am now happily working away on my cursive packages, and they should soon see their first releases. I can now sleep at night, knowing that boilerplate and repetition have finally vanished from my development code.


©2014