Grok has book. Book good!

I little suspected the great chasm that lies between the simple act of agreeing to review a book, and the actual exercise of sitting down later to write the review. It feels quite pleasant, really, to jot off a positive reply to the publisher's polite question. One feels magnanimous for agreeing to help advance our civilization by reviewing a book about Python, and for helping out the publisher in what, after all, are such hard economic times. It is fun when the free copy arrives, crisp and smartly bound.

But then, eventually, one has to write the actual review.

And so, a full four months after that friendly email from Packt Publishing, it is time that I sit down and put together some thoughts about Carlos de la Guardia's first book, Grok 1.0 Web Development. Carlos is a long-time veteran of the Zope and Plone communities, and Grok, of course, is the web framework that places a simple and agile convention-driven engine atop the otherwise notoriously XML-ridden Zope application framework. Grok is an important project, because it packages the technology of Python's oldest and most experienced community of web developers in a way that makes it easy to extend and use.

(more...)

Posted in Books, Computing, Grok, Python, Zope | No Comments »

Python multiprocessing is different under Linux and Windows

One of the great recent advances in the Python Standard Library is the addition of the multiprocessing module, maintained by Jesse Noller, who has also blogged and written about several other concurrency approaches for Python: Kamaelia, Circuits, and Stackless Python.

I have wanted to try the multiprocessing module out for some time, and now have a consulting project that will really benefit from multiple processes: they will let our application run third-party plugins without having to worry that any bugs or indiscretions which they commit might damage or hang our main server, which can remain safe in another process.

First, one can only stand in awe at the achievement — and the amount of work — that the multiprocessing module represents. I cannot imagine the time that it would have taken our team to figure out all of the differences between Linux and Windows when it comes to processes, shared memory, and concurrency mechanisms. In fact, the approach we are taking might not even have been feasible under those circumstances. By figuring out how to get locks, queues, and shared data structures all working cleanly on such different architectures, the multiprocessing authors save Python programmers out on the street like me from reinventing a dozen wheels when we need to support multi-platform concurrency.

Well, almost.

There is one rather startling difference which the multiprocessing module does not hide: the fact that while every Windows process must spin up independently of the parent process that created it, Linux supports the fork(2) system call that creates a child process already in possession of exactly the same resources as its parent: every data structure, open file, and database connection that existed in the parent process is still sitting there, open and ready to use, in the child. Consider this small program:

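from multiprocessing import Process

f = None  # module-level global; the parent rebinds it before starting the child

def child():
    # Report whatever this process sees in the global "f".
    print(f)

if __name__ == '__main__':
    f = open('mp.py', 'r')  # this script opens its own source file
    p = Process(target=child)
    p.start()
    p.join()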

On Linux, the open file f keeps its value in the child process; the child has inherited an open connection from its parent:

$ python mp.py
<open file 'mp.py', mode 'r' at 0xb7734ac8>

Under Windows, however, where the multiprocessing module has to spawn a fresh copy of the Python interpreter to which it gives special instructions to just run the function child(), the module is a clean slate without an open file inside:

C:\Users\brandon\dev>python mp.py
None

Now, my complaint is not exactly that the multiprocessing documentation is misleading on this point; under its section on Programming guidelines, it makes it quite clear that:

On Unix a child process can make use of a shared resource created in a parent process using a global resource. However, it is better to pass the object as an argument to the constructor for the child process.

I have no quarrel with this advice; if I am careful to pass everything the child needs in its list of arguments, then I can be sure that my code will work under both Linux and Windows.
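A minimal sketch of that advice, written in Python 3 syntax and assuming a hypothetical helper named first_line: rather than handing the child an open file through a global, the parent passes the file's path in args and the child opens the file itself. A path is a plain string, so it reaches the child intact whether the child was forked or started from scratch.

```python
import os
import tempfile
from multiprocessing import Process

def first_line(path):
    # Hypothetical helper: open the file inside whichever process calls it.
    with open(path) as f:
        return f.readline().rstrip('\n')

def child(path):
    # Everything the child needs arrives as an argument, not as a global.
    print(first_line(path))

if __name__ == '__main__':
    # Create a small scratch file so that the example is self-contained.
    with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
        tmp.write('hello from the parent\n')
    p = Process(target=child, args=(tmp.name,))
    p.start()
    p.join()
    os.remove(tmp.name)
```

Passing the path instead of the open file object is a deliberate choice here: arguments have to be pickled when the child is a fresh interpreter, and open file objects cannot be pickled.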

But I do wish that the multiprocessing module provided support for testing this condition more rigorously under Linux. In particular, I wish that there were some way of turning the simple forking logic off, of saying, “Yes, I know that Linux will let you create a child process very simply using fork(2), but for my sanity would you please create the child process from scratch like you do under Windows so that I can test whether my code accidentally depends on residual state from the parent process that I did not see that I was using?” I looked at the multiprocessing "forking.py" module to see whether I could turn on the Windows-style process spawning even from inside of Linux, but the mechanism is chosen by a bare module-level check of "sys.platform", and if I overwrite that variable with 'win32' the code then dies when it tries to import "msvcrt", which is available only under Windows.

There is, thus, even in principle, no way that I can test my multiprocessing application under Linux which will give me any assurance that my child processes are not accidentally taking advantage of data structures and open connections left lying around by the parent process; only by actually moving over to Windows itself can I see how my child code really behaves on its own. I have created a feature request in the Python bug tracker to see whether this situation can be improved.
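That was the situation when this was written. Python 3.4 and later let the caller choose a start method, so the experiment can now be run on Linux itself. Here is a minimal sketch in Python 3 syntax, assuming a hypothetical helper named run_child and using os.devnull as a stand-in for whatever file the parent happens to have open:

```python
import multiprocessing
import os

f = None  # module-level global; the parent rebinds it before starting the child

def child():
    # Under "fork" this sees the parent's open file. Under "spawn" the module
    # is imported afresh in the child, so f is still None.
    print(f)

def run_child(method):
    # Hypothetical helper: run child() in a process created with the given
    # start method and report the child's exit code.
    ctx = multiprocessing.get_context(method)
    p = ctx.Process(target=child)
    p.start()
    p.join()
    return p.exitcode

if __name__ == '__main__':
    f = open(os.devnull)
    run_child('spawn')  # saved and run as a script, the child prints None
    f.close()
```

Calling run_child('fork') instead, on a platform that supports it, shows the inherited file object, which makes the difference between the two behaviors easy to see side by side.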

But even with this one inconvenience — which is troubling me much less, now that I at least understand why my application was behaving so differently under Windows — the multiprocessing module is still a huge leap forwards for Python programmers who need to run code in heavyweight processes with all of the isolation and safety that they provide. Thanks again to Jesse and the multiprocessing team!

Posted in Computing, Python | 7 Comments »

Sphinx + Mercurial = My favorite CMS

Though I write and maintain some of the content for our Python Atlanta web site, updates and additional content often come in from other users. For example, our Plone interest group — headed up by Christopher Johnson — has their own page on our web site. And the information about our book club is both written and regularly updated by Doug Hellmann.

How can a collaborative site like ours best be edited and updated? Well, I would like to report some modest initial success with an experimental approach: I now maintain the site as a Sphinx-powered documentation system stored in a BitBucket repository into which I pull changes made by my collaborators. The advantages are several.

  • The change management tools supported by traditional CMS systems, even at their best, seem somehow anemic when compared to the toolkit provided by a good DVCS like Mercurial. Where, for example, does even a capable CMS like Plone provide anything like Mercurial's “backout” or “blame” commands?
  • Markup as well-designed as reStructuredText is not only a lot of fun to use, but it also very cleanly separates content from design. Authors working in plain text tend to produce clean, readable content without the messy markup often associated with visual HTML editors, or, worse yet, the disaster that is Microsoft Word.
  • Staging — a feature I find essential, but which seems missing from many default CMS configurations — occurs automatically! Each author can see locally how the site will look with their changes, and after doing a pull I can review the site's appearance on my laptop before finally deploying the new content to the production site.

To top it all off, authors get to use their own editor-of-choice when making contributions, and we all get extra practice cloning and merging in my favorite DVCS. I am optimistic about this direction, but I will post again if we wind up hitting snags in the future. Finally, of course, feel free to clone our repository if you want to see how Sphinx looks when running a generic web site.

Posted in Computing, Document processing, Python | 8 Comments »

Ubuntu Python: raise an exception, import 190 modules

Imagine my surprise, while writing my first PEP 302 compliant import hook this afternoon, to carefully watch “sys.modules” for the results of my import but see it suddenly grow by nearly two hundred modules! What on earth had I done wrong? Some quick experiments revealed that my only sin was having the temerity to raise an exception. Let's try raising a simple NameError:

>>> import sys
>>> len(sys.modules)
35
>>> foo
...
NameError: name 'foo' is not defined
>>> len(sys.modules)
225

That's 190 extra modules — merely importing them takes around 60 ms on my laptop! Where are they all coming from? And how could an exception cause so many imports, including such illustrious modules as “email”, “mimetools”, and “xml”?

After reading Ubuntu's “sitecustomize.py” file and all of its consequences, the situation became clear. Their apport crash-reporting subsystem instruments Python with an exception hook that, when invoked, discovers that my system says “enabled=0” in my “/etc/default/apport” file and so it undertakes no special crash logging. But, on the way to loading the routine that performs this simple check, it performs two quite flagrant and unnecessary imports, pulling in both “apt” (which brings with it 83 modules) and “apport” (an additional 107 modules).
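The mechanism can be reproduced in miniature. The sketch below is not apport's code. It assumes a hypothetical hook named lazy_hook that imports a module only when an exception reaches it, with the standard library's colorsys standing in for the heavy imports, and a hypothetical helper named new_modules that reports what an action added to sys.modules.

```python
import sys

def new_modules(action):
    # Return the names that running action() added to sys.modules.
    before = set(sys.modules)
    action()
    return sorted(set(sys.modules) - before)

def lazy_hook(exc_type, exc_value, tb):
    # Hypothetical hook: the import runs only once an exception arrives.
    import colorsys  # stand-in for a heavy import such as "apt"
    sys.__excepthook__(exc_type, exc_value, tb)

def trigger():
    # Hand an exception to the hook, much as the interpreter would.
    try:
        foo  # raises NameError, since no such name exists
    except NameError:
        lazy_hook(*sys.exc_info())

# False here means that something has already replaced the default hook.
print(sys.excepthook is sys.__excepthook__)

sys.modules.pop('colorsys', None)  # make sure the stand-in is not loaded yet
added = new_modules(trigger)
print(added)  # includes 'colorsys'; other names may appear as well
```

The exact list depends on the interpreter, since printing a traceback can itself pull in modules on some Python versions.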

The solution? I have removed the “python-apport” package, along with the “ubuntuone-client” suite that depends on it. After the uninstall, exceptions are — wonderfully enough — not causing a single import of a new module! Now, finally, I can continue writing my import hook in peace.

Posted in Computing, Python | 35 Comments »

Opening tabs remotely in Google Chrome

Now that I use Google Chrome almost exclusively, I miss the fact that a running Firefox instance could be controlled from the command line so that Emacs could call for a new tab when I clicked on a URL. It would run a command something like this:

firefox -remote 'openURL(http://example.com/, new-tab)'

But after a few months of manually cutting and pasting URLs into Chrome — which wasn't actually that bad, since the address bar in Chrome is such a convenient and large target — I decided that I needed a real solution. After not finding anything like a -remote option, I discovered that Chrome can at least be run with a debugging port open:

google-chrome --remote-shell-port=9222

The protocol that Chrome speaks is primitive enough that it was quick work to implement a small client in Python. Rather than merely cutting and pasting its code here on my blog, or even being satisfied with making it available on BitBucket, I decided to place the code inside of a new Python package and make it generally available on PyPI as chrome_remote_shell.

Thanks to this simple package, a four-line program (not counting the shebang and comment) is now all that I need to ask Google Chrome to open a new tab:

#!/usr/bin/env python
# Name this file "google-chrome-open-url"
import sys
import chrome_remote_shell
shell = chrome_remote_shell.open()
shell.open_url(sys.argv[-1])

To teach Emacs to start using Google Chrome when I click on a link, I only needed to supply it with two new settings:

(setq browse-url-browser-function
      'browse-url-generic)
(setq browse-url-generic-program
      "google-chrome-open-url")  

And now everything works. I hope that these notes prove useful to someone else. Enjoy!

Posted in Computing, Emacs, Python | 11 Comments »

Leaving Python Magazine

It was with regret that I tendered my resignation yesterday as the Editor-in-Chief of Python Magazine. While the publisher will keep producing the magazine by distributing PDFs on the web site, the transition to the new format has dragged on long enough, for both myself and our customers, that I have run out of enthusiasm. My last responsibility will be to shepherd the February and March issues through the publishing process and safely on to the PDF readers of our subscribers.

I hope that the authors featured in the October issue will forgive me for not writing my usual blog post last year touting their achievements; I had just received the sad news that the publisher could no longer afford the rising costs of printing and shipping Python Magazine, and I did not want to further advertise the magazine until its fate was certain one way or the other.

I have by no means been a perfect editor. In particular, the publisher hoped that I would get the magazine — which was running eight weeks late — back on schedule. Instead, my bumbling first month as editor made the magazine an additional week late, and by the time I hit my stride in May it was another week behind. Although the schedule then stabilized at a steady ten weeks late, I never did manage to start reeling the fish back in. The only metric, I suppose, which I can really claim to my credit is that I oversaw a nineteen-fold increase in the number of em-dashes in the magazine — 247 appeared over the course of 2009, up from only 13 the year before!

I should express thanks to my co-workers: Arbi, Emanuela, and Cathleen are smart, helpful, and professional, and were patient with me as I learned the ropes. Doug Hellmann gave me ample training as he handed over the reins, and also supported the magazine later as an acquisitions editor. Several associate editors performed solid reviews of incoming articles. And, of course, the greatest privilege of being Editor-in-Chief was to help such a wide array of voices from the Python community find their way into print, from Steve Holden, the illustrious chair of the Python Software Foundation, to young Meran Campbell-Hood, an eleven-year-old from New Zealand who described using Python for the first time to process data for her science fair project.

Which reminds me: the authors from the October issue never got their moment in the spotlight! The article by Meran Campbell-Hood about her science fair project was the most fun to edit, but every single article was interesting and taught me something. Steve Holden interviewed James Tauber about the secrets of a successful Python start-up; Yusdi Santoso finished his two-part series on the Python program he wrote to produce the PDF for the beautiful EuroPython brochure last year; the original editor of Python Magazine, Brian Jones, returned to talk about why he now tends to choose Django for web projects rather than PHP; and Joe Amenta introduced his "3to2" project, which will help Python programmers support their old Python 2 users while still moving ahead with the transition to Python 3. Finally, Greg Newman explained how to turn Emacs into a powerful Python IDE, and Steve Holden and I rounded out the issue with our usual editorializing.

With many of the rest of you, I am eager to see the debut of the new Python Magazine web site. And I look forward to seeing everyone at PyCon 2010 in less than three weeks! While I will not have the joy that I did last year of walking the halls of PyCon as a newly-minted Editor-in-Chief, able to make dreams come true and grant the fame and fortune of being a published author, I will at least enjoy being a developer among developers in the best programming language community on Earth!

Posted in Python | 16 Comments »

The September 2009 issue of Python Magazine

The September issue of Python Magazine appeared on the web late last week and only now, as a new week has started, am I finally sitting down to announce it! The articles range from technically heavy development topics to high-level thoughts about the whole Python community, with plenty in between.

I have to say that our prettiest article this month is “Using Python to Create Beautiful Documents” by Yusdi Santoso, who shares the basic secrets to document generation that he learned when building the EuroPython 2009 brochure using a Python program. Traditional typesetting and computer typography were both interests of mine when I was growing up, so it was fun to read Yusdi's introduction to using ReportLab to generate PDF documents. I look forward to his follow-up article that we will soon be publishing, on the specific techniques that he used in creating the EuroPython booklet.

The other technical articles are an introduction to using SOAP in Python; a guide to displaying objects in a Mac OS X GUI created with PyObjC; an article introducing Python's own built-in Tkinter GUI toolkit; and a small excursion of my own that attempts to explain the popular “trick” (well, it really confused me the first time I saw it!) of defining a decorator using a pair of nested functions. I should confess that my own article contains what is probably this issue's biggest mistake, as pointed out quite promptly by alert reader Emanuel Woiski: in the code sample that is the whole crux of my example, I somehow managed to omit one of the most crucial lines, shown here in bold:

def log(function):
    def log_wrapper(*args):
        # Announce the call, then hand off to the wrapped function.
        print("called %s%s" % (
            function.__name__, tuple(args)
            ))
        return function(*args)
    return log_wrapper

I suppose I will now need remedial cut-and-paste training of some sort.
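For anyone who finds the pair of nested functions as confusing at first sight as described above, here is a minimal sketch of the decorator in use, in Python 3 syntax. The add function is a hypothetical example, and functools.wraps and **kwargs are additions that are not part of the listing above.

```python
import functools

def log(function):
    # The outer function runs once, when the decorator is applied.
    @functools.wraps(function)  # keep the wrapped function's name and docstring
    def log_wrapper(*args, **kwargs):
        # The inner function runs on every call to the decorated function.
        print("called %s%s" % (function.__name__, args))
        return function(*args, **kwargs)
    return log_wrapper

@log
def add(a, b):
    # Hypothetical function, here only to give the decorator something to wrap.
    return a + b

result = add(2, 3)  # prints: called add(2, 3)
```

Writing @log above the definition is shorthand for add = log(add), so the name add ends up bound to log_wrapper, which remembers the original function through its enclosing scope.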

Finally, the issue is rounded out by three articles that move back from Python coding and step out to wider vantage points. Justin Lilly provides an excellent guide to customizing your Vim setup so that it becomes a powerful Python integrated development environment. Steve Holden muses about why diversity is so difficult and reveals some of the recent goings-on surrounding the diversity statement that the Python Software Foundation has been working on. And my own editorial seeks to point any Python Magazine readers who do not yet have a strong connection with the wider community in the direction of greater engagement with the world of Python.

All in all, I think the issue is a nice mix of fact, experience, and opinion. Please consider subscribing if you would like to hear more about what people are doing with Python, and how. I enjoy reading it; so might you.

Posted in Computing, Document processing, Python | 3 Comments »

Google Earth and Middle-earth


A normal, rectangular map of Middle-earth, imported as a Google Earth overlay, is too narrow toward the north.

I wanted to measure distances in Tolkien's Middle-earth. While a flat map distorts such measurements, it occurred to me that Google Earth can correctly measure both lines and paths across the curved surface of the globe. I soon found excellent documentation for using image overlays with Google Earth, so I downloaded a map of Middle-earth and tried placing it on the globe.

Imagine my disappointment when I saw the result shown in the above image! At first I made the mistake of not holding down the Shift key when resizing the image in Google Earth; the Shift key is absolutely critical for the image to maintain its aspect ratio as you stretch it to the right dimensions. But even after learning this habit, it was painfully clear that the Middle-earth map's projection was different from that expected by Google Earth: the map is far too narrow at the top.

Obviously, it was time to pull out Python, my favorite programming language, and see whether the Python Imaging Library could help me make short work of converting a map from one projection to another.

(more...)

Posted in Computing, Python | 4 Comments »

Python at the 2009 Atlanta Linux Fest


My Python table at the Atlanta Linux Fest. You can also watch a short video of me demonstrating a depth-first search to some students who dropped by the table. (Thanks, Richard Davies, for the video!)

Running the Python table at the Atlanta Linux Fest this past weekend was a really incredible experience.

First, there was the great feeling that the pillars of the Python community were standing behind me as I stepped forward to represent my favorite programming language. It was Andrew Kuchling who noticed that exhibitor tables at the Fest were free for non-profits like the Python Software Foundation, and Steve Holden who forwarded me a heads-up since I live in Atlanta (the Fest had not yet made it on to my radar). The inimitable Aahz personally shipped me the promotional kit, including a huge “Python” banner and stacks of brochures, that he himself had just used at OSCON 2009. And, completing the loop, it was Andrew who followed up to ask if there were any last things that I needed, and sent me a pile of over one hundred Python stickers that wound up being very popular at the Fest. (I returned home with exactly one, which is sitting next to me on my desk as I type this!)

Here are some lessons that I learned from the experience:

(more...)

Posted in Computing, Plone, Python | 8 Comments »

GetPaid needs customizable forms

I would like some advice from Zope and Plone folks about how to create forms that are not only easy for other developers to specialize, but which allow several specializations to be composed together. While I have used zope.formlib and z3c.form before for simple tasks, I have not yet been able to tell whether they support these more advanced kinds of operations.

Some background: I am doing some work on GetPaid for Plone with the generous funding of Derek Richardson who, if his dreams had not carried him away from grad school at the end of the Spring semester, would have tackled this same work as part of the 2009 Google Summer of Code.

The current mechanisms that GetPaid provides for customizing its checkout process are very primitive, and my task is to improve them. That is why I have been thinking about customizing forms.

(more...)

Posted in Computing, Plone, Python, Zope | 7 Comments »