<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
     xmlns:atom="http://www.w3.org/2005/Atom"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:wfw="http://wellformedweb.org/CommentAPI/"
     >
  <channel>
    <title>Let’s Discuss the Matter Further</title>
    <link>http://rhodesmill.org/brandon</link>
    <description>Your Blog's short description</description>
    <pubDate>Thu, 19 Jan 2012 00:48:19 GMT</pubDate>
    <generator>Blogofile</generator>
    <sy:updatePeriod>hourly</sy:updatePeriod>
    <sy:updateFrequency>1</sy:updateFrequency>
    <item>
      <title>JavaScript Breaks Math</title>
      <link>http://rhodesmill.org/brandon/2012/js/</link>
      <pubDate>Sun, 15 Jan 2012 02:59:10 EST</pubDate>
      <category><![CDATA[python]]></category>
      <category><![CDATA[computing]]></category>
      <guid>http://rhodesmill.org/brandon/2012/js/</guid>
      <description>JavaScript Breaks Math</description>
      <content:encoded><![CDATA[<div class="document">
<p>Why do we Python programmers
stay so annoyed with JavaScript's broken <tt class="docutils literal">this</tt> keyword?
After all, every programming language has rough edges.
The problem with <tt class="docutils literal">this</tt> even turns out to be easy to work around
once you learn the knack.
So why does it feel like JavaScript has committed a fresh offense
every time we trip over&nbsp;it?</p>
<p>The answer, I suggest, is that JavaScript manages to disturb
very deep mental scaffolds
with the behavior of the <tt class="docutils literal">this</tt> keyword.
Not all programming language annoyances are created equal.
Some involve inconsistency within a language itself,
when a pattern set up by one feature is broken by another
(think of method names in the Python Standard Library).
More serious issues can involve hassles with a language's syntax
or poor behaviors within its type system.
But in this case JavaScript decided
that it would abandon a key property
of the system that lies <em>beneath</em> it —
that it would break the conventions that,
in fact, underlie all programming languages.</p>
<p>JavaScript decided that it would break <em>mathematics.</em></p>
<p>Let me explain by starting with a simple example.
You are already familiar with operator precedence,
and how multiplication binds more tightly than
addition when both operators appear in the same expression.
In the following expression,
<em>a</em> will first be multiplied with <em>b</em>,
then the result of that operation will be added to <em>c</em>.</p>
<div class="line-block">
<div class="line">(1)</div>
<div class="line"><em>n</em> = <em>a</em> × <em>b</em> + <em>c</em></div>
</div>
<p>There is, in other words,
a hidden intermediate result inside of this equation:
the result of the multiplication.
So&nbsp;(1) is, in fact, a shorthand
for writing this sequence of two separate binary operations:</p>
<div class="line-block">
<div class="line">(2)</div>
<div class="line"><em>x</em> = <em>a</em> × <em>b</em></div>
<div class="line"><em>n</em> = <em>x</em> + <em>c</em></div>
</div>
<p>Note that our ability to transform (1) into the pair of lines (2)
does not involve any special properties of the operators themselves.
This does <em>not</em> illustrate some special feature
of multiplication or addition,
like the Distributive Property!
Instead, we are working down at the lower and more fundamental level
of asking what a complex math expression even <em>means</em>.
So while we must use argument and proof
to learn that addition is commutative,
the operators and their precedence are simply
a matter of <em>definition</em> —
of what we decide it means when we string symbols together
to form an expression in the first place.</p>
<p>Now it turns out that the familiar programming language idiom
of calling a method in a language like Python or JavaScript
is quite precisely analogous to expression&nbsp;(1),
because it separates three symbols
with a pair of binary operators,
where the left operator binds most tightly:</p>


<div class="pygments_murphy"><pre><span class="p">(</span><span class="mi">3</span><span class="p">)</span>   <span class="n">n</span> <span class="o">=</span> <span class="n">a</span><span class="o">.</span><span class="n">b</span><span class="p">(</span><span class="n">c</span><span class="p">)</span>
</pre></div>



<p>Until a programmer really grasps what it means
for a language to have “first-class functions” —
functions that can themselves be manipulated as values —
it might be difficult to see that <tt class="docutils literal">a.b</tt>
makes quite good sense simply standing by itself.
It means “take the <tt class="docutils literal">a</tt> object,
look and see whether it has an attribute named <tt class="docutils literal">b</tt>,
and resolve the value of that attribute.”
And so <tt class="docutils literal">a.b</tt> works perfectly in front of <tt class="docutils literal">(c)</tt>
so long as the result of the attribute lookup
happens to return a callable.</p>
<p>So expression (3) can be decomposed like expression (1),
and in Python the following two steps are
exactly equivalent to statement (3) —
except, of course, for defining an extra local variable <tt class="docutils literal">fn</tt>:</p>


<div class="pygments_murphy"><pre><span class="p">(</span><span class="mi">4</span><span class="p">)</span>    <span class="n">fn</span> <span class="o">=</span> <span class="n">a</span><span class="o">.</span><span class="n">b</span>
       <span class="n">n</span> <span class="o">=</span> <span class="n">fn</span><span class="p">(</span><span class="n">c</span><span class="p">)</span>
</pre></div>



<p>This is, again, simply a property of how expressions work in math —
of the fact that you ought to be able to compute intermediate results
by pulling an expression apart into its constituent binary operations.
But, alas, JavaScript decided to break this property of expressions,
and makes extra invisible magic happen
when the two operators are used in combination —
magic that does not happen when they are split into separate steps.
Or, perhaps there is a more interesting way to think about it:</p>


<div class="pygments_murphy"><pre><span class="cm">/* In JavaScript this is NOT a pair of binary operations. It is a SINGLE</span>
<span class="cm">   ternary operator that, to sow confusion among programmers, happens to</span>
<span class="cm">   use the same symbols as two well-known binary operations. */</span>

<span class="nx">a</span><span class="p">.</span><span class="nx">b</span><span class="p">(</span><span class="nx">c</span><span class="p">);</span>

<span class="cm">/* This ternary operator is roughly equivalent to: */</span>

<span class="kd">var</span> <span class="nx">fn</span> <span class="o">=</span> <span class="nx">a</span><span class="p">.</span><span class="nx">b</span><span class="p">;</span>
<span class="kd">var</span> <span class="nx">old_this</span> <span class="o">=</span> <span class="k">this</span><span class="p">;</span>
<span class="k">this</span> <span class="o">=</span> <span class="nx">a</span><span class="p">;</span>
<span class="nx">fn</span><span class="p">(</span><span class="nx">c</span><span class="p">);</span>
<span class="k">this</span> <span class="o">=</span> <span class="nx">old_this</span><span class="p">;</span>
</pre></div>



<p>Younger programmers,
for whom <tt class="docutils literal">a.b(c)</tt> is simply a gesture,
may find our distaste for JavaScript's behavior inexplicable.
The problem is worst
for the experienced programmer or mathematician,
who — every time she types it —
remembers what the dot and parentheses really mean
as clean and separate operations,
but has to remember that their meanings change
when they appear in combination.
This semantic instability flouts a very long tradition
of defining math operators
so that expressions can be composed together
and broken down again
without changing their meaning.</p>
<p>And that, I think, is why it annoys us:
because from early grade school through college
we have learned that math expressions compose and decompose cleanly,
and JavaScript takes that symmetry away.</p>
<p>One last note for newer Python programmers reading this:
you might be suspecting that Python itself has some kind of magic
involved here, because how else could it remember later
whether you had pulled method <tt class="docutils literal">fn</tt>
off of the specific object <tt class="docutils literal">a</tt>
instead of off some other instance of that class?
The answer is that every lookup of an instance method
returns a new object, called a <em>bound method</em>,
that remembers the object on which the lookup took place.</p>
<pre class="doctest-block">
&gt;&gt;&gt; class C:
...     def __init__(self, n):
...         self.n = n
...     def __repr__(self):
...         return 'C%d' % self.n
...     def fn(self, m):
...         return self.n + m
...
&gt;&gt;&gt; a = C(100)
&gt;&gt;&gt; b = C(220)
&gt;&gt;&gt; a.fn
&lt;bound method C.fn of C100&gt;
&gt;&gt;&gt; b.fn
&lt;bound method C.fn of C220&gt;
&gt;&gt;&gt; b.fn(5)
225
</pre>
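<p>To connect the bound method back to statement&nbsp;(4),
here is a small sketch that reuses the shape of the class above.
It relies on the standard <tt class="docutils literal">__self__</tt>
and <tt class="docutils literal">__func__</tt> attributes of bound methods;
the names <tt class="docutils literal">combined</tt>
and <tt class="docutils literal">decomposed</tt>
are only labels chosen for this illustration.</p>

```python
class C:
    def __init__(self, n):
        self.n = n

    def fn(self, m):
        return self.n + m

a = C(100)

# Step one of (4): the attribute lookup alone produces a bound method.
fn = a.fn

# The bound method carries the instance with it ...
assert fn.__self__ is a
# ... together with the plain function defined on the class.
assert fn.__func__ is C.__dict__['fn']

# Step two of (4): calling the intermediate value gives the same answer
# as the combined expression a.fn(5).
combined = a.fn(5)
decomposed = fn(5)
print(combined, decomposed)
```

<p>Because the instance travels inside the intermediate value,
nothing extra needs to happen when the dot and the parentheses
appear together.</p>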
<p>What about your own least favorite language features,
whether in JavaScript, Python, or something else?
Are they all simply about scruples and inconvenience?
Or can you identify some deep-seated assumptions
of your own mental scaffolding
that keep ruining your experience with a specific language?
Let us know in the comments!</p>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>Why am I going to CodeMash?</title>
      <link>http://rhodesmill.org/brandon/2012/codemash/</link>
      <pubDate>Tue, 10 Jan 2012 22:40:55 EST</pubDate>
      <category><![CDATA[python]]></category>
      <category><![CDATA[computing]]></category>
      <guid>http://rhodesmill.org/brandon/2012/codemash/</guid>
      <description>Why am I going to CodeMash?</description>
      <content:encoded><![CDATA[<div class="document">
<p>By this time tomorrow I will doubtless be wearing swimming trunks —
in northern Ohio — <em>in winter</em> —
while listening to people rave about Java and C# and Ruby
and wondering what I have gotten myself into.
I&nbsp;will be at <a class="reference external" href="http://codemash.org/">CodeMash</a>,
a conference that was started by developers
who wanted an event that focused solely on the programmer
while including a wide range of languages and technologies.</p>
<div class="dropshadow alignright">
  <a href="http://www.flickr.com/photos/irisphotos/6668819779/" title="Kalahari Water Park in Sandusky, OH by iriskh, on Flickr"><img src="http://farm8.staticflickr.com/7149/6668819779_081cc16758_m.jpg" width="240" height="159" alt="Kalahari Water Park in Sandusky, OH"></a>
</div><p>It was lucky that I signed up
the moment that CodeMash 2012 registration opened back in October,
because the 1,200 conference tickets
sold out in only <strong>20 minutes</strong> —
no event in the Python community
had prepared me for that kind of demand!</p>
<p>I&nbsp;learned about the CodeMash conference from its co-founder
<a class="reference external" href="http://brianhprince.com/">Brian Prince</a>
when we were fellow speakers at <a class="reference external" href="http://pyohio.org/">PyOhio</a> last July.
So why did a Python programmer like me decide to attend?</p>
<ul class="simple">
<li>Even though the technologies that Brian chooses
are different from mine,
he is clearly animated by the same passion
for combining good code with good community.
If his co-founders are at all like him,
then I knew that a great event was in the works.</li>
<li>I want to learn another conference's culture.
For example, I was stunned to see strong suggestions
on the CodeMash mailing list
that attendees should <em>not</em> carry laptops —
instead, they recommend pencil, paper,
and something they call
<a class="reference external" href="http://www.dachisgroup.com/2011/12/the-sketchnote-revolution/">Sketchnotes</a>.
Since good teachers have a reflex
that makes them try to bring the whole audience along with them
as they make a point,
we could actually be slowing up PyCon speakers
when half the audience is face-down typing
and clearly a half step behind what is being said.</li>
</ul>
<div class="dropshadow alignright">
  <a href="http://www.flickr.com/photos/alan_barber/4277666916/" title="CodeMash 2010 by Alan.Barber, on Flickr"><img src="http://farm3.staticflickr.com/2713/4277666916_9fd2ec0fb6_m.jpg" width="180" height="240" alt="CodeMash 2010"></a>
</div><ul class="simple">
<li>I could predict that moving to
<a class="reference external" href="http://en.wikipedia.org/wiki/Bluffton,_Ohio">Bluffton, Ohio</a>,
as winter descended
would leave me with very few opportunities
to meet other developers.
After several weeks of being the only programmer I know,
a&nbsp;large regional conference
will let me bask in the company
of other people who understand what I do for a living.</li>
<li>It is too easy to judge other languages
by what I think are drawbacks in their design,
or by the poor code that most programmers produce
whatever their language.
I&nbsp;want to see what excites the real experts
who solve interesting problems using Java, Ruby, and C# —
people who use those languages to the hilt, and do a great job of&nbsp;it.</li>
<li>Being evangelized by smart people is interesting and humbling,
if you let down your guard and really listen.
And hearing Ruby and C# people explain the glories of their languages
will remind me of how I must sound
when I hold forth on the advantages of Python,
or <a class="reference external" href="http://www.gnu.org/software/emacs/">Emacs</a>,
or <a class="reference external" href="http://www.vibramfivefingers.com/">Vibram Fivefingers</a>.</li>
</ul>
<div class="dropshadow alignright">
   <a href="http://www.flickr.com/photos/alan_barber/4277668868/" title="CodeMash 2010 by Alan.Barber, on Flickr"><img src="http://farm5.staticflickr.com/4055/4277668868_002df3fc6e_m.jpg" width="240" height="180" alt="CodeMash 2010"></a>
</div><ul class="simple">
<li>Brian wants to make CodeMash more popular for Python programmers —
and a few luminaries like Bruce Eckel, Mike Pirnat, and Mark Ramm
are already on the schedule this year.
Next year I might offer to speak.
But first I wanted to show up and just listen,
figure out the vibe of the conference,
and learn more about what is happening
outside of the Python community.</li>
<li>Finally, it does sound like great fun:
eating, drinking, and being merry
at an <a class="reference external" href="http://www.kalahariresorts.com/oh/">indoor water park resort</a>
in the middle of an Ohio winter.
Kalahari, here I come!</li>
</ul>
<p>(Images are Creative Commons licensed from Flickr photographers
<a class="reference external" href="http://www.flickr.com/photos/irisphotos/">iriskh</a>
and
<a class="reference external" href="http://www.flickr.com/photos/alan_barber/">Alan.Barber</a>.)</p>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>Mounting WSGI Applications Under CherryPy</title>
      <link>http://rhodesmill.org/brandon/2011/wsgi-under-cherrypy/</link>
      <pubDate>Wed, 04 May 2011 23:00:28 EDT</pubDate>
      <category><![CDATA[python]]></category>
      <category><![CDATA[computing]]></category>
      <guid>http://rhodesmill.org/brandon/2011/wsgi-under-cherrypy/</guid>
      <description>Mounting WSGI Applications Under CherryPy</description>
      <content:encoded><![CDATA[<div class="document">
<p>Today I got stuck between a rock and a hard place —
or, more specifically, stuck between the assumptions
of Robert Brewer and those of Ian Bicking.
In case you ever try mounting a WSGI application
underneath a larger CherryPy application,
here is the story.</p>
<div class="section" id="simple-wsgi-grafting">
<h1>Simple WSGI grafting</h1>
<p>Robert Brewer's <a class="reference external" href="http://www.cherrypy.org/">CherryPy</a> is a Python web framework
of the controllers-and-methods variety.
CherryPy has a long, solid track record,
and is especially well-known
for shipping with a built-in production-quality web server.
The server is so good that it is sometimes used standalone,
without the actual CherryPy framework behind it,
to serve other Python web applications through their WSGI callable:</p>


<div class="pygments_murphy"><pre><span class="c"># Easy: putting a WSGI `app` behind the CherryPy HTTP server</span>

<span class="n">server</span> <span class="o">=</span> <span class="n">CherryPyWSGIServer</span><span class="p">((</span><span class="s">&#39;0.0.0.0&#39;</span><span class="p">,</span> <span class="mi">8001</span><span class="p">),</span> <span class="n">app</span><span class="p">,</span> <span class="n">numthreads</span><span class="o">=</span><span class="mi">30</span><span class="p">)</span>
<span class="n">server</span><span class="o">.</span><span class="n">start</span><span class="p">()</span>
</pre></div>



<p>Sometimes, however, it is nice to have the entire CherryPy
web framework running —
not merely its HTTP server —
in combination with an existing WSGI application.
This arrangement makes it easy to do things like
provide static resources
alongside more dynamic content generated in Python:</p>


<div class="pygments_murphy"><pre><span class="c"># More interesting: mounting `app` beneath a particular URL path</span>
<span class="c"># This works, but `app` gets no logging or error handling</span>

<span class="n">cherrypy</span><span class="o">.</span><span class="n">tree</span><span class="o">.</span><span class="n">graft</span><span class="p">(</span><span class="n">app</span><span class="p">,</span> <span class="s">&#39;/api&#39;</span><span class="p">)</span>
<span class="n">cherrypy</span><span class="o">.</span><span class="n">tree</span><span class="o">.</span><span class="n">mount</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="s">&#39;/static&#39;</span><span class="p">,</span> <span class="p">{</span><span class="s">&#39;/&#39;</span> <span class="p">:</span> <span class="p">{</span>
    <span class="s">&#39;tools.staticdir.dir&#39;</span><span class="p">:</span> <span class="n">static_root</span><span class="p">,</span>
    <span class="s">&#39;tools.staticdir.on&#39;</span><span class="p">:</span> <span class="bp">True</span><span class="p">,</span>
    <span class="p">}})</span>
<span class="n">cherrypy</span><span class="o">.</span><span class="n">engine</span><span class="o">.</span><span class="n">start</span><span class="p">()</span>
<span class="n">cherrypy</span><span class="o">.</span><span class="n">engine</span><span class="o">.</span><span class="n">block</span><span class="p">()</span>
</pre></div>



<p>Although this arrangement works,
I soon received some unpleasant surprises.
When an exception was thrown inside of <tt class="docutils literal">app</tt>
the server never returned a response to the browser —
no <tt class="docutils literal">500 Internal Server Error</tt>,
no pretty traceback in development mode;
just a closed connection.
And neither errors nor successful requests inside of <tt class="docutils literal">app</tt>
resulted in access log messages;
CherryPy was completely silent about them.</p>
<p>This made it necessary for me to adjust my mental model
for how CherryPy operates.</p>
<p>I had always thought of the CherryPy framework
as having great big arms that wrapped around
my entire set of active controllers and applications,
so that it could catch exceptions and log HTTP requests
regardless of where in my tree they originated.
Now, however, I was forced to recognize
that the CherryPy <tt class="docutils literal">try…except</tt> exception catcher
and its logging handlers
must only get involved
when invoking a controller inside of a real CherryPy app.
If an HTTP request was instead being handed off
to a WSGI application of my own devising,
then CherryPy took no further responsibility
for what happened —
I was on my own.</p>
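<p>To make concrete what "on my own" means here,
the following is a minimal sketch of the wrapper
a bare WSGI application would need
in order to turn an uncaught exception into a 500 response.
It uses only the Standard Library and is not CherryPy's own mechanism;
<tt class="docutils literal">catch_errors</tt>,
<tt class="docutils literal">broken_app</tt>,
and <tt class="docutils literal">fake_start_response</tt>
are hypothetical names invented for the example,
and for simplicity it does not call <tt class="docutils literal">close()</tt>
on the wrapped application's result.</p>

```python
import io
import sys
import traceback

def catch_errors(app):
    """Wrap a WSGI app so that an uncaught exception becomes a 500 response."""
    def wrapper(environ, start_response):
        try:
            # Materialize the body so errors raised during iteration are caught too.
            return list(app(environ, start_response))
        except Exception:
            # Report the traceback on the stream the server supplied.
            traceback.print_exc(file=environ['wsgi.errors'])
            body = b'Internal Server Error'
            headers = [('Content-Type', 'text/plain'),
                       ('Content-Length', str(len(body)))]
            # Passing exc_info lets the server replace headers already set.
            start_response('500 Internal Server Error', headers, sys.exc_info())
            return [body]
    return wrapper

# Hypothetical application and stand-ins for the server, for demonstration only.
def broken_app(environ, start_response):
    raise ValueError('boom')

statuses = []

def fake_start_response(status, headers, exc_info=None):
    statuses.append(status)

errors = io.StringIO()
result = catch_errors(broken_app)({'wsgi.errors': errors}, fake_start_response)
```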
</div>
<div class="section" id="finding-wsgi-components">
<h1>Finding WSGI components</h1>
<p>Well, okay, I was not <em>really</em> on my own —
thanks to the wonderful Python community,
I sit surrounded by the rich and vibrant WSGI ecosystem
of well-supported interchangeable parts.
And logging and exception handling are standard features
that everyone needs, right?</p>
<p>Alas, the reality turned out to be far more murky.
After scouting about for some applicable WSGI
<a class="reference external" href="http://wsgi.org/wsgi/Middleware_and_Utilities">middleware and utilities</a>,
I started to sympathize with Python newbies
who complain about getting lost in the vast sea of broken software.
My long experience in the Python community
means that I often already know the “right tool” for the right situation,
which shields me from remembering what a mess Python newcomers face
when searching for even a simple solution.</p>
<p>For example, the popular <tt class="docutils literal">flup</tt> package's documentation
promised that <tt class="docutils literal">middleware.error</tt> contained an application
for catching WSGI application errors.</p>


<div class="pygments_murphy"><pre>$ pip install flup
Downloading/unpacking flup...
$ python -c &#39;import flup.middleware&#39;
Traceback (most recent call last):
  File &quot;&lt;string&gt;&quot;, line 1, in &lt;module&gt;
ImportError: No module named middleware
</pre></div>



<p>Drat, that must not be released yet.
What about this logging module listed on the WSGI wiki?</p>


<div class="pygments_murphy"><pre>$ pip install wsgilog
Downloading/unpacking wsgilog...
ImportError: No module named ez_setup
</pre></div>



<p>Wow, it does not even install.
Well, what about Werkzeug?</p>
<p>Armin Ronacher's <a class="reference external" href="http://werkzeug.pocoo.org/">Werkzeug</a>
is renowned for its WSGI debugging middleware,
and it did actually install.
But when wrapped around my application,
it simply displayed a traceback of its <em>own</em> failure
to parse and display the error my application was encountering!</p>
<p>(If you want to know my guess as to the problem:
it appears that some of my Python code
is Unicode rather than plain ASCII.
To display it,
Werkzeug encodes it as UTF-8, prepends a BOM,
and passes it to the Standard Library's <cite>compiler.parse()</cite> function&nbsp;—
which then promptly explodes
because in Python&nbsp;2.7 the AST represents a BOM using a new node type,
304, which other Standard Library code is not yet prepared to accept.
I have
<a class="reference external" href="https://github.com/mitsuhiko/werkzeug/issues/51">opened an issue</a>
to see whether Armin thinks my guess makes sense
before I try reporting it in the Python bug tracker.)</p>
<p>And so I wound up using <a class="reference external" href="http://pythonpaste.org/">Python Paste</a>
which installs and works quite cleanly,
and which let me add both basic logging
and error catching using just a few lines of code:</p>


<div class="pygments_murphy"><pre><span class="c"># Transform bare `app` into one that logs and 500s on exceptions</span>

<span class="kn">from</span> <span class="nn">paste.exceptions.errormiddleware</span> <span class="kn">import</span> <span class="n">ErrorMiddleware</span>
<span class="kn">from</span> <span class="nn">paste.translogger</span> <span class="kn">import</span> <span class="n">TransLogger</span>

<span class="n">app</span> <span class="o">=</span> <span class="n">ErrorMiddleware</span><span class="p">(</span><span class="n">app</span><span class="p">,</span> <span class="n">debug</span><span class="o">=</span><span class="n">debug_flag</span><span class="p">)</span>
<span class="n">app</span> <span class="o">=</span> <span class="n">TransLogger</span><span class="p">(</span><span class="n">app</span><span class="p">,</span> <span class="n">setup_console_handler</span><span class="o">=</span><span class="n">debug_flag</span><span class="p">)</span>

<span class="c"># Now we proceed as before to build our CherryPy application.</span>

<span class="n">cherrypy</span><span class="o">.</span><span class="n">tree</span><span class="o">.</span><span class="n">graft</span><span class="p">(</span><span class="n">app</span><span class="p">,</span> <span class="s">&#39;/api&#39;</span><span class="p">)</span>
<span class="o">...</span>
</pre></div>



<p>So far, so good.</p>
</div>
<div class="section" id="the-rock-and-the-hard-place">
<h1>The rock and the hard place</h1>
<p>The Paste error handler let me diagnose and repair
my WSGI application in development mode.
When I started to switch things back over to production,
however, I received a surprise:
exceptions were always printed to <tt class="docutils literal">sys.stderr</tt>
even if I turned on every single option I could find,
in both CherryPy and Paste, for logging to actual files.</p>
<p>What was going on?</p>
<p>It turns out that I had run into a pair of hard-coded assumptions
that could not be solved by mere configuration.</p>
<p>In Ian Bicking's Paste project,
the traceback is directed to the <tt class="docutils literal">wsgi.errors</tt> file
provided in the WSGI environment:</p>


<div class="pygments_murphy"><pre><span class="c"># from paste/exceptions/errormiddleware.py</span>

<span class="k">class</span> <span class="nc">ErrorMiddleware</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
    <span class="o">...</span>
    <span class="k">def</span> <span class="nf">exception_handler</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">exc_info</span><span class="p">,</span> <span class="n">environ</span><span class="p">):</span>
        <span class="o">...</span>
        <span class="k">return</span> <span class="n">handle_exception</span><span class="p">(</span>
            <span class="n">exc_info</span><span class="p">,</span> <span class="n">environ</span><span class="p">[</span><span class="s">&#39;wsgi.errors&#39;</span><span class="p">],</span>
            <span class="o">...</span><span class="p">)</span>
</pre></div>



<p>The logic within <tt class="docutils literal">handle_exception()</tt> unfortunately insists
on sending at least a little text
to the stream provided as its second argument,
even if you have turned on some of its other kinds of logging
(like sending an email or writing to a log).</p>
<p>And the identity of that <tt class="docutils literal">wsgi.errors</tt> stream —
one of the few “live” objects inside of the WSGI environment,
whose dictionary values are mostly immutable objects like strings —
is hard-coded by Robert Brewer
inside of the module that invokes WSGI applications:</p>


<div class="pygments_murphy"><pre><span class="c"># from cherrypy/wsgiserver/__init__.py</span>

<span class="k">class</span> <span class="nc">WSGIGateway_10</span><span class="p">(</span><span class="n">WSGIGateway</span><span class="p">):</span>

    <span class="k">def</span> <span class="nf">get_environ</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="sd">&quot;&quot;&quot;Return a new environ dict targeting the given wsgi.version&quot;&quot;&quot;</span>
        <span class="o">...</span>
        <span class="n">env</span> <span class="o">=</span> <span class="p">{</span>
            <span class="o">...</span>
            <span class="s">&#39;wsgi.errors&#39;</span><span class="p">:</span> <span class="n">sys</span><span class="o">.</span><span class="n">stderr</span><span class="p">,</span>
            <span class="o">...</span>
            <span class="p">}</span>
        <span class="o">...</span>
        <span class="k">return</span> <span class="n">env</span>
</pre></div>



<p>His definition of WSGI 1.0, then, sets <tt class="docutils literal">wsgi.errors</tt>
without (so far as I can see) any hope of amendment or recourse.
Thus the rock and the hard place:
Robert insisted that the default stream be <tt class="docutils literal">stderr</tt>,
and Ian's logging module insisted that something be written there.</p>
</div>
<div class="section" id="cutting-the-gordian-knot">
<h1>Cutting the Gordian knot</h1>
<p>One of the great satisfactions of Python,
in the last analysis,
is that when you find yourself trapped in a situation like this
there are generally several ways to escape
and get back to more productive tasks,
like writing code of your own.</p>
<ul class="simple">
<li>An ugly possibility, always available as a last resort:
I could simply monkey-patch,
replacing one of the offending routines in Paste or CherryPy
with a slightly different version of my own.</li>
<li>I could update <tt class="docutils literal">cherrypy.wsgiserver.wsgi_gateways</tt>,
a global dictionary mapping versions of the WSGI protocol
to classes that implement them,
so it offers my own subclass of <tt class="docutils literal">WSGIGateway_10</tt> instead.</li>
<li>I could globally replace <tt class="docutils literal">sys.stderr</tt> when running as a daemon
so that errant error messages get written to a file,
and let Paste and CherryPy run without modification.</li>
</ul>
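<p>For reference, the third option might look roughly like the sketch below,
written for Python&nbsp;3 using only the Standard Library.
The log location is a hypothetical temporary path chosen for the example,
and any code that saved its own reference to the old stream
before the swap would not be affected.</p>

```python
import os
import sys
import tempfile

# Hypothetical log location, chosen only for this example.
log_path = os.path.join(tempfile.mkdtemp(), 'stderr.log')

original_stderr = sys.stderr
# Line-buffered append mode so each message reaches the file promptly.
sys.stderr = open(log_path, 'a', buffering=1)
try:
    # Anything that looks up sys.stderr from now on writes to the file.
    sys.stderr.write('traceback text from some library\n')
finally:
    sys.stderr.close()
    sys.stderr = original_stderr

with open(log_path) as f:
    captured = f.read()
```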
<p>But each of the above ideas
has the disadvantage of making me adjust something big and global
to fix a problem which, in my program, is small and specific.</p>
<p>At the moment, therefore, I have
added my own tiny piece of WSGI middleware
between Robert's class and Ian's code
which overwrites <tt class="docutils literal">wsgi.errors</tt> with something more appropriate:</p>


<div class="pygments_murphy"><pre><span class="c"># Adding three middlewares: error, logging, and my own</span>

<span class="n">app</span> <span class="o">=</span> <span class="n">ErrorMiddleware</span><span class="p">(</span><span class="n">app</span><span class="p">,</span> <span class="n">debug</span><span class="o">=</span><span class="n">debug_flag</span><span class="p">)</span>
<span class="n">app</span> <span class="o">=</span> <span class="n">TransLogger</span><span class="p">(</span><span class="n">app</span><span class="p">,</span> <span class="n">setup_console_handler</span><span class="o">=</span><span class="n">debug_flag</span><span class="p">)</span>

<span class="n">errlog</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s">&#39;http-tracebacks.log&#39;</span><span class="p">,</span> <span class="s">&#39;a&#39;</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>

<span class="k">def</span> <span class="nf">app2</span><span class="p">(</span><span class="n">environ</span><span class="p">,</span> <span class="n">start_response</span><span class="p">):</span>
    <span class="n">environ</span><span class="p">[</span><span class="s">&#39;wsgi.errors&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">errlog</span>
    <span class="k">return</span> <span class="n">app</span><span class="p">(</span><span class="n">environ</span><span class="p">,</span> <span class="n">start_response</span><span class="p">)</span>

<span class="n">cherrypy</span><span class="o">.</span><span class="n">tree</span><span class="o">.</span><span class="n">graft</span><span class="p">(</span><span class="n">app2</span><span class="p">,</span> <span class="s">&#39;/api&#39;</span><span class="p">)</span>
</pre></div>
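<p>The same idea can be packaged as a reusable wrapper
and exercised without any server at all.
The sketch below is one way to do that, not part of Paste or CherryPy:
<tt class="docutils literal">redirect_wsgi_errors</tt>
and <tt class="docutils literal">noisy_app</tt> are hypothetical names,
and in-memory streams stand in for
<tt class="docutils literal">sys.stderr</tt> and the log file.</p>

```python
import io

def redirect_wsgi_errors(app, stream):
    """Return a WSGI app that points environ['wsgi.errors'] at `stream`."""
    def wrapper(environ, start_response):
        environ['wsgi.errors'] = stream
        return app(environ, start_response)
    return wrapper

# Hypothetical inner application that reports a problem the way
# error-handling middleware would: by writing to wsgi.errors.
def noisy_app(environ, start_response):
    environ['wsgi.errors'].write('something went wrong\n')
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']

server_stream = io.StringIO()   # stands in for the server's sys.stderr
log_stream = io.StringIO()      # stands in for the open log file

wrapped = redirect_wsgi_errors(noisy_app, log_stream)
body = wrapped({'wsgi.errors': server_stream}, lambda status, headers: None)
```

<p>Note that Python&nbsp;3 does not allow unbuffered text files,
so when a real file is used there,
a line-buffered one opened with <tt class="docutils literal">buffering=1</tt>
is the nearest equivalent to the unbuffered file shown above.</p>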



<p>And my daemonized application is finally humming along
without the least desire to write to standard error!
To me, this is a great little example
of why a pluggable architecture like WSGI is so powerful
in a language like Python that makes it easy
to create and manipulate functions as first-class objects.</p>
<p>All of which leaves me with three thoughts.</p>
<p>First — looking at the install errors,
and how my attempt to use Werkzeug apparently revealed a bug
in Python's Standard Library itself —
I was painfully reminded of what a mess the Python ecosystem
must look like to those not familiar with its landscape.
If only we could communicate how rare experiences like this are,
once you develop a solid personal tool set
and learn your way around what works and what doesn't!</p>
<p>Second, I wish that CherryPy were willing to do logging
and exception handling for mounted WSGI applications.
I will have to ask Robert whether my approach here is even correct,
or whether there is some other way to call my own applications
without turning off so many features.</p>
<p>Finally, it occurs to me that instead of choosing Paste
and then spending far too long making it work,
I should have tried out the competing middleware components
that Chris McDonough has produced as part of his
<a class="reference external" href="http://repoze.org/repoze_components.html">Repoze project</a>.
I&nbsp;had not even thought of Repoze until writing this blog entry,
probably because of an unconscious assumption
that installing anything from the Zope world
would pull in a half-dozen dependencies.
But I just tried installing <tt class="docutils literal">repoze.errorlog</tt>
and it only requires a small package called <tt class="docutils literal">meld3</tt>
and, oddly enough, its competitor <tt class="docutils literal">paste</tt> itself!
I&nbsp;should try it out before closing this issue.</p>
<p>Anyway, I hope this write-up helps someone else
who needs to use WSGI middleware
to backfill the features that are normally provided
as part of a large Python web framework.
And, of course, I look forward to comments from the community
about how my approach here could have been more elegant!</p>
</div>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>Visible Indentation in Python Publishing</title>
      <link>http://rhodesmill.org/brandon/2011/visible-indentation/</link>
      <pubDate>Sun, 20 Feb 2011 23:00:28 EST</pubDate>
      <category><![CDATA[python]]></category>
      <category><![CDATA[document processing]]></category>
      <category><![CDATA[books]]></category>
      <category><![CDATA[computing]]></category>
      <guid>http://rhodesmill.org/brandon/?p=445</guid>
      <description>Visible Indentation in Python Publishing</description>
      <content:encoded><![CDATA[
<p>It suddenly occurred to me that I managed to write <a href="http://rhodesmill.org/brandon/2011/foundations-of-python-network-programming/">an entire blog post about my new book last month</a> without so much as mentioning that it represents a landmark, so far as I know, in Python publishing.</p>
<div class="dropshadow alignright">
  <a>
    <img src="http://rhodesmill.org/brandon/static/2011/chevron-sample.png"
         alt="Sample Python code with indentation marked"
         width="320" height="296" />
  </a>
</div>
<p>What was my big idea? That in printed Python code, indentation should be visible.</p>
<p>Any of you who have had to read many Python listings printed in books will immediately recognize the problem that I wanted solved. When a listing is long enough to run on to a second page, it is often less than clear whether the code there is continuing at the same level of indentation, or whether the page break has just happened to correspond to a point in the code where it dedented out to the previous level. Languages with braces or explicit “end” statements to close blocks never have to worry about this. But in Python — especially where a script or code snippet ends without ever returning to the outermost level of indentation — the last few lines of the script feel as though they are left hanging if they stand alone at the top of a new page of text.</p>
<p>Of course, there were several practical considerations that had to be settled. A symbol for indentation had to be chosen, for example. I selected the Unicode double-chevron because it is a character that is never valid in actual Python code. Then the publisher had to be convinced to try the experiment; it helped that I had the full support of my editor, <a href="http://www.liveandletwrite.com/">Laurin Becker</a>, who also prepared the layout people for the fact that these chevrons were <i>not</i> part of the code and would need to remain visually distinct from it. Finally — because I had no desire to insert and color each chevron by hand — I had to write an <a href="http://lxml.de/">lxml</a> script to insert the chevrons into my OpenOffice documents, then go back and ruefully remove by hand the chevrons that got nonsensically inserted into snippets of other languages like HTML.</p>
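<p>The core of the idea is simple enough to sketch on plain text. This is not the lxml script mentioned above, which worked on OpenOffice documents; it is a hypothetical stand-in that assumes four-space indentation and a plain string of source code, and the name <tt>mark_indentation</tt> is invented for the example.</p>

```python
CHEVRON = '\u00bb'  # the double chevron used as the indentation marker


def mark_indentation(source, width=4):
    """Replace each full indent level of `width` spaces with a chevron plus padding."""
    marked = []
    for line in source.splitlines():
        stripped = line.lstrip(' ')
        levels, remainder = divmod(len(line) - len(stripped), width)
        prefix = (CHEVRON + ' ' * (width - 1)) * levels + ' ' * remainder
        marked.append(prefix + stripped)
    return '\n'.join(marked)


sample = "def f(x):\n    if x:\n        return 1\n    return 0"
print(mark_indentation(sample))
```

<p>Padding each chevron out to the full indent width keeps the code in the same columns, so only the leading whitespace changes appearance.</p>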
<p>But now I want to hear what readers think! I have not yet seen a review that mentions whether the visible indentation helps, hurts, or is simply irrelevant to the Python listings. If you happen to have seen the new edition of <a href="http://www.amazon.com/gp/product/1430230037?ie=UTF8&tag=letsdisthemat-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=1430230037">Foundations of Python Network Programming</a><img src="http://www.assoc-amazon.com/e/ir?t=letsdisthemat-20&l=as2&o=1&a=1430230037" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />, let me know what you think. While creating the effect cost some time and effort, I will happily do it again in my next book if it turns out to have actually helped readers scan the program listings.</p>]]></content:encoded>
    </item>
    <item>
      <title>Grin and search it</title>
      <link>http://rhodesmill.org/brandon/2011/grin-and-search-it/</link>
      <pubDate>Wed, 26 Jan 2011 23:15:42 EST</pubDate>
      <category><![CDATA[python]]></category>
      <category><![CDATA[computing]]></category>
      <guid>http://rhodesmill.org/brandon/?p=415</guid>
      <description>Grin and search it</description>
      <content:encoded><![CDATA[
<p>
I want to admit how much I am enjoying the
<a href="http://pypi.python.org/pypi/grin">grin search tool</a>
that lets you use Python regular expressions
in recursive searches across your filesystem,
automatically ignoring directories like <tt>.hg</tt>
and files with extensions like <tt>.pyc</tt> and <tt>.jpg</tt>.
</p>
<p>
Over the years I had already built a stable full
of shell functions for doing <i>grep(1)</i> and <i>find(1)</i>
in various combinations,
and my functions even defaulted to using <i>pcregrep(1)</i> when available
because Perl-style REs (which have nearly the same format used by Python)
seem to require so many fewer backslashes for common cases.
It hardly seemed worth the effort to move to another tool —
and the <i>grin</i> output format seemed garish and offensive at first glance,
not at all like the austere output of traditional <i>grep</i>:
</p>
<pre>
$ pip install grin
$ cd Python-3.2b2
$ grin autoGIL
<span style="color: #080">./Doc/whatsnew/2.6.rst:</span>
 3167 :   :mod:`<span style="color: #990">autoGIL</span>`,
<span style="color: #080">./Misc/HISTORY:</span>
 5577 : - There's a new module called "<span style="color: #990">autoGIL</span>", which
</pre>
<p>
But I forced myself to try using it again a few weeks later,
and when actually doing real work I could not help but notice
that the results were far easier for my eyes to scan
than <i>grep</i> output.
This is an important point: so often, the question of whether
something <i>looks</i> like it will be easy to read is quite a different matter
from whether it is actually easy to read
the moment you stop looking <i>at</i> it
and start trying to look <i>through</i> it
to see your data!
Now I can clearly see where the output from one file ends
and the next begins — which traditionally is quite difficult
if you are looking through a series of files with very similar names.
Long file paths no longer push matching lines to the right,
splitting them across the right edge
of my standard 80-column terminal window.
In fact, one of my first objections to its layout
had been the line consumed by each stand-alone file name;
but in practice, I found, far more lines are gained
because of the matches that are not split across the right edge.
</p>
<p>
You can run <i>grin</i> against individual files or directories
by naming them on the command line;
adjust the set of files to which it pays attention;
and even revert its output to a traditional <tt>file:line</tt> format
if you want to pass the output to another program.
You can, in other words, ask it to behave more like traditional <i>grep</i>.
But its default behavior, I gradually realized
after running it several hundred times, is by far the most common
way that I had been using <i>grep</i> anyway —
to search the source files beneath my working directory for a pattern.
</p>
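<p>
For anyone curious what that default behavior amounts to,
here is a rough sketch using only the Standard Library.
It is emphatically not how <i>grin</i> itself is implemented:
the function name <tt>search</tt> and the two skip lists are assumptions
made for the illustration, and the real tool ignores a longer
and configurable set of directories and extensions.
</p>

```python
import os
import re

SKIP_DIRS = {'.hg', '.git', '.svn'}  # assumed list, for illustration only
SKIP_EXTS = {'.pyc', '.jpg'}         # likewise an assumption


def search(pattern, top='.'):
    """Yield (path, line_number, line) for every line under `top` matching `pattern`."""
    regex = re.compile(pattern)
    for dirpath, dirnames, filenames in os.walk(top):
        # Prune ignored directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if os.path.splitext(name)[1] in SKIP_EXTS:
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding='utf-8', errors='replace') as f:
                    for number, line in enumerate(f, 1):
                        if regex.search(line):
                            yield path, number, line.rstrip('\n')
            except OSError:
                continue  # unreadable file: skip it and keep going
```

<p>
Looping over <tt>search('autoGIL')</tt> from the top of a source tree
and printing each tuple gives a crude version of the listing shown earlier,
without the color or the grouping by file.
</p>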
<p>
The author, Robert Kern, has answered emails promptly
and was happy to accept a few patches.
I now have a shell script that I run whenever I set up a new Unix account
that builds a virtual environment,
includes its <tt>bin</tt> directory in my <tt>PATH</tt>,
and installs <i>grin</i> as one of the most important
Python tools that I need always at my fingertips.
Try it out!
</p>
]]></content:encoded>
    </item>
    <item>
      <title>Foundations of Python Network Programming: A Last Hurrah For Python 2</title>
      <link>http://rhodesmill.org/brandon/2011/foundations-of-python-network-programming/</link>
      <pubDate>Mon, 24 Jan 2011 15:03:19 EST</pubDate>
      <category><![CDATA[python]]></category>
      <category><![CDATA[books]]></category>
      <category><![CDATA[computing]]></category>
      <guid>http://rhodesmill.org/brandon/?p=391</guid>
      <description>Foundations of Python Network Programming: A Last Hurrah For Python 2</description>
      <content:encoded><![CDATA[
<div class="dropshadow alignright">
  <a>
    <img src="http://rhodesmill.org/brandon/static/2011/fopnp-cover280.jpg"
         alt="Book cover of Foundations of Python Network Programming"
         width="280" height="346" />
  </a>
</div>
<p>
My long labor is at last at an end: Apress has
<a href="http://www.amazon.com/gp/product/1430230037?ie=UTF8&tag=letsdisthemat-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=1430230037">published my rewrite</a><img src="http://www.assoc-amazon.com/e/ir?t=letsdisthemat-20&l=as2&o=1&a=1430230037" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />
of John Goerzen's
<i>Foundations of Python Network Programming</i> (2004)
and I have released the example programs
as both Python 2 and Python 3 code in
<a href="https://bitbucket.org/brandon/foundations-of-python-network-programming/">a Bitbucket repository</a>
and also as a zipfile on the Apress web site.
I can finally return to more casual writing —
this is my first blog post in many months! —
and can be more active in Python projects again.
</p>
<p>
I was amazed, as I studied the book's popular
<a href="http://www.amazon.com/gp/product/1590593715?ie=UTF8&tag=letsdisthemat-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=1590593715">first edition</a><img src="http://www.assoc-amazon.com/e/ir?t=letsdisthemat-20&l=as2&o=1&a=1590593715" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />,
at how far Python has come since 2004.
None of the modern Python web frameworks existed;
the book offered one chapter on CGI programming,
another on <a href="http://www.modpython.org/">mod_python</a>,
and never even mentioned Zope.
Python 2.3 was the most recent version of Python.
The book could cite FTP as one of the
“most widely used protocols on the Internet.”
Parsing HTML involved stepping through the markup with
<a href="http://docs.python.org/library/htmlparser.html">HTMLParser</a>
and the recommended approach to processing XML was the awkward
<a href="http://docs.python.org/library/xml.dom.html">xml.dom</a>.
JSON was not yet popular enough to even warrant mention.
The author recommended the dependable
<a href="http://docs.python.org/library/syslog.html">syslog</a> 
module over the newfangled
<a href="http://docs.python.org/library/logging.html">logging</a>
package in the Standard Library,
and remote administration tools like
<a href="http://www.lag.net/paramiko/">paramiko</a>
and <a href="http://docs.fabfile.org/0.9.3/">fabric</a>
either did not exist or were not yet on the radar.
</p>
<p>
Looking back, it appears that 2005 —
the year following the first edition's publication —
was a turning point for Python web technology in particular.
<a href="http://www.cherrypy.org/">CherryPy</a> saw
its first release in late 2004,
and over the next year we were introduced to
<a href="http://www.djangoproject.com/">Django</a>,
<a href="http://pylonshq.com/">Pylons</a>,
and <a href="http://turbogears.org/">Turbogears</a>,
all on the heels of the Ruby on Rails “framework” phenomenon.
And both of the web scraping technologies
covered in my new edition of the book,
<a href="http://www.crummy.com/software/BeautifulSoup/">BeautifulSoup</a>
and
<a href="http://codespeak.net/lxml/">lxml</a>,
also came out in 2005.
In the face of all these changes, the web chapters
received a full rewrite based on the latest technology.
</p>
<p>
I also rewrote the introductory chapters
so that UDP and TCP each get their own chapter,
focused on the protocol itself,
instead of leaving the information about each protocol split
between a client socket chapter and a server socket chapter.
I have replaced the separate chapters on forking, threading,
and event-driven servers with a single chapter
that takes an example network server,
rewrites that same server six different ways —
including Twisted and a raw polling event loop —
and then uses <a href="http://funkload.nuxeo.org/">funkload</a>
to study the resulting performance.
And even though an editor objected
that it lacked unity,
I insisted that a chapter be dedicated to
<a href="http://memcached.org/">memcached</a>,
message queues, and map-reduce,
since they are so important to scaling modern Internet services.
</p>
<p>
Apress had already slated the project for Python 2
based on the advice of a well-known Python programmer,
and I heartily supported his recommendation.
His wisdom is borne out by the result:
as you can see from
<a href="https://bitbucket.org/brandon/foundations-of-python-network-programming/src"
   >the <tt>README.txt</tt> that I included with the source code bundle</a>,
fully one-quarter (22/86) of the programs
will simply not run today under Python 3 because of missing dependencies.
I see this new <i>Foundations of Python Network Programming</i>
as the perfect companion for Python 2.7 —
which was released as I was writing —
since the book will bring you up to date
with the final features of the Python 2 series
and the third party libraries that support it.
</p>
<div class="alignright">
  <a>
    <img src="http://rhodesmill.org/brandon/static/2011/apress-type-size.png"
         alt="Type size comparison"
         width="280" height="283" />
  </a>
</div>
<p>
I enjoyed working with the Apress team;
<a href="http://www.liveandletwrite.com/">Laurin Becker</a> is a great
managing editor who was always on my side,
and <a href="http://www.michaelbernstein.com/">Michael Bernstein</a> as technical editor gave great feedback which made this an even better book.
There were only two annoyances that I encountered with the book's release.
First, as the accompanying image illustrates,
Apress used a much more compact layout than in the first edition.
This not only makes the book appear much shorter than its predecessor,
but also makes it (to my eyes) visually crowded and more tiring to read.
Second, Apress was not prompt about changing the project title
after their decision to target Python 2,
with the result that Amazon had “Python 3” in the title
all the way through the book's release in late December —
which earned the book 1-star reviews for the first few weeks!
</p>
<p>
I will, by the way, soon be blogging
about my experience rewriting the book's
example code to work under Python 3.
This should help two groups of people:
anyone porting a Python 2 network application to Python 3
will probably run into the same issues I did;
and people who are already experimenting with Python 3
will want to know how to apply the book's techniques
to the newer version of the language.
</p>
<p>
Feel free to submit questions about the book
or its example programs to the
<a href="https://bitbucket.org/brandon/foundations-of-python-network-programming"
   >Bitbucket source code repository</a>
where the code is stored.
The revision took far more effort than I had expected,
but I am very happy to have found a format
into which I could distill so much of my own —
and the Python community's —
network programming experience,
so that it will help programmers for years to come.
Enjoy!</p>]]></content:encoded>
    </item>
    <item>
      <title>CherryPy and running out of file descriptors during development</title>
      <link>http://rhodesmill.org/brandon/2010/cherrypy-and-running-out-of-file-descriptors-during-development/</link>
      <pubDate>Tue, 19 Oct 2010 22:45:44 EDT</pubDate>
      <category><![CDATA[python]]></category>
      <category><![CDATA[computing]]></category>
      <guid>http://rhodesmill.org/brandon/?p=378</guid>
      <description>CherryPy and running out of file descriptors during development</description>
      <content:encoded><![CDATA[
<p>
Well, that was interesting! A fellow developer complained that, following my recent introduction of <a href="http://www.zeromq.org/">ØMQ</a> into our project, he could only go through a few cycles of editing the source code, saving his changes, and watching <a href="http://www.cherrypy.org/">CherryPy</a> automatically restart before it would fail with:
</p>
<pre>
IOError: [Errno 24] Too many open files
</pre>
<p>
An <tt>lsof</tt> against the process confirmed that more files were open after every restart. My guess was that leftover references were keeping a few Python file objects alive from one CherryPy reload to the next, and I remembered having once used a neat utility for exploring the object heap. After some investigation I re-discovered that it was <a href="http://guppy-pe.sourceforge.net/">Heapy</a>. So I added a <tt>pdb</tt> breakpoint, did some investigating, and was puzzled to find that after each restart only four Python file objects existed — one for each log file in our application.
</p>
<p>
(I also tried using <a href="http://mg.pov.lt/objgraph/">objgraph</a>, which is far easier to use than Heapy, but it could not tell that there were file objects in memory at all.  I have no idea why.)
</p>
<p>
Well, this was a puzzle. How could the number of open file descriptors increase without bound when Python was clearly deleting all of the old file objects? The answer, once I finally tried reading the source code to the <tt>Autoreloader</tt> plug-in, was of course very simple: CherryPy performs each restart by doing an <tt>exec()</tt> to completely wipe out the process image and replace it with a new instance of the CherryPy application. Which is certainly a very thorough approach! Except for one thing: file descriptors in Unix are set, by default, to survive an <tt>exec()</tt> call, but the new instance of Python that spins up inside of the process does not know that they are there, so they never get closed. It appeared that suddenly calling ØMQ out of existence with an <tt>exec()</tt> also left a few sockets lying around.
</p>
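<p>
The flag in question can be inspected directly. The sketch below is Unix-only and assumes Python 3.4 or later, where newly created descriptors are non-inheritable by default; it therefore has to switch inheritance back on to reproduce the older behavior described here. The pipe is just a convenient way to obtain a descriptor for the demonstration.
</p>

```python
import fcntl
import os

read_fd, write_fd = os.pipe()

# Python 3.4 and later create descriptors with close-on-exec already set,
# so clear it here to mimic the older default that the post describes.
os.set_inheritable(read_fd, True)
before = os.get_inheritable(read_fd)   # True: would survive an exec()

# Setting close-on-exec asks the kernel to close the descriptor at exec().
flags = fcntl.fcntl(read_fd, fcntl.F_GETFD)
fcntl.fcntl(read_fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
after = os.get_inheritable(read_fd)    # False: closed at the next exec()

os.close(read_fd)
os.close(write_fd)
print(before, after)
```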
<p>
Several possible solutions came to mind. What if a more thorough effort was made to delete all Python objects before running the <tt>exec()</tt> call? That sounded daunting, though — it might take a lot of effort to march through all of the application object trees closing everything down. I could have focused my efforts just on finding the file objects, but that approach felt fragile; what would happen the next time one of our developers wrote a new module that opened a log file?
</p>
<p>
Monkey patching the <tt>open()</tt> built-in to create files with their close-on-exec flag already set is, of course, too terrible a solution even to contemplate.
</p>
<p>
In the end, the simplest solution seemed to be the creation of a little CherryPy plug-in that, as the very last shutdown action, would loop over all existing file descriptors and set their close-on-exec flag. Here is the plug-in, in case the pattern helps anyone else:
</p>

<div class="pygments_murphy syntax_highlight"><pre><span class="sd">&quot;&quot;&quot;Make sure file descriptors close when CherryPy exec&#39;s.&quot;&quot;&quot;</span><br/><br/><span class="kn">import</span> <span class="nn">os</span><br/><span class="kn">import</span> <span class="nn">sys</span><br/><span class="kn">from</span> <span class="nn">cherrypy.process.plugins</span> <span class="kn">import</span> <span class="n">SimplePlugin</span><br/><br/><span class="k">class</span> <span class="nc">CloexecPlugin</span><span class="p">(</span><span class="n">SimplePlugin</span><span class="p">):</span><br/>    <span class="k">def</span> <span class="nf">stop</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span><br/>        <span class="k">if</span> <span class="nb">hasattr</span><span class="p">(</span><span class="n">sys</span><span class="p">,</span> <span class="s">&#39;getwindowsversion&#39;</span><span class="p">):</span><br/>            <span class="k">return</span><br/>        <span class="kn">import</span> <span class="nn">fcntl</span>  <span class="c"># not available under Windows</span><br/>        <span class="nb">max</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">sysconf</span><span class="p">(</span><span class="s">&#39;SC_OPEN_MAX&#39;</span><span class="p">)</span> <span class="k">if</span> <span class="nb">hasattr</span><span class="p">(</span><span class="n">os</span><span class="p">,</span> <span class="s">&#39;sysconf&#39;</span><span class="p">)</span> <span class="k">else</span> <span class="mi">1024</span><br/>        <span class="k">for</span> <span class="n">fd</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="nb">max</span><span class="p">):</span>  <span class="c"># skip stdin/out/err</span><br/>            <span class="k">try</span><span class="p">:</span><br/>                <span class="n">flags</span> <span class="o">=</span> <span class="n">fcntl</span><span class="o">.</span><span class="n">fcntl</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n">fcntl</span><span class="o">.</span><span class="n">F_GETFD</span><span class="p">)</span><br/>            <span class="k">except</span> <span class="ne">IOError</span><span class="p">:</span><br/>                <span class="k">continue</span><br/>            <span class="n">fcntl</span><span class="o">.</span><span class="n">fcntl</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n">fcntl</span><span class="o">.</span><span class="n">F_SETFD</span><span class="p">,</span> <span class="n">flags</span> <span class="o">|</span> <span class="n">fcntl</span><span class="o">.</span><span class="n">FD_CLOEXEC</span><span class="p">)</span><br/><br/>    <span class="n">stop</span><span class="o">.</span><span class="n">priority</span> <span class="o">=</span> <span class="mi">99</span><br/></pre></div>


<p>
Of course, I suspect that this problem was happening all along, even before we added extra logging and then integrated ØMQ into our application. But back then, with maybe only one or two stray file descriptors surviving each restart, it would have taken five hundred or a thousand CherryPy restarts for the problem to be noticed — and, apparently, none of us developers ever reached that impressive total. Now we know to be careful!
</p>
]]></content:encoded>
    </item>
    <item>
      <title>Grok has book. Book good!</title>
      <link>http://rhodesmill.org/brandon/2010/grok-has-book-book-good/</link>
      <pubDate>Thu, 10 Jun 2010 22:32:52 EDT</pubDate>
      <category><![CDATA[python]]></category>
      <category><![CDATA[grok]]></category>
      <category><![CDATA[books]]></category>
      <category><![CDATA[zope]]></category>
      <category><![CDATA[computing]]></category>
      <guid>http://rhodesmill.org/brandon/?p=350</guid>
      <description>Grok has book. Book good!</description>
      <content:encoded><![CDATA[
<p>
  I little suspected the great chasm that lies
  between the simple act of agreeing to review a book,
  and the actual exercise of sitting down later to write the review.
  It feels quite pleasant, really,
  to jot off a positive reply to the publisher's polite question.
  One feels magnanimous for agreeing to help advance our civilization
  by reviewing a book about Python,
  and for helping out the publisher in what,
  after all, are such hard economic times.
  It is fun when the free copy arrives,
  crisp and smartly bound.
</p>
<p>
  But then, eventually, one has to write the actual review.
</p>
<div class="dropshadow alignright"> 
  <a href="http://www.amazon.com/gp/product/1847197485?ie=UTF8&tag=letsdisthemat-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=1847197485"><img border="0" src="http://rhodesmill.org/brandon/static/2010/grok-book.jpg"></a><img src="http://www.assoc-amazon.com/e/ir?t=letsdisthemat-20&l=as2&o=1&a=1847197485" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />
</div>
<p>
  And so,
  a full four months after that friendly email from Packt Publishing,
  it is time that I sit down
  and put together some thoughts
  about Carlos de la Guardia's first book,
  <a href="http://www.amazon.com/gp/product/1847197485?ie=UTF8&tag=letsdisthemat-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=1847197485"
     >Grok 1.0 Web Development</a>.
  Carlos is a long-time veteran of the Zope and Plone communities,
  and <a href="http://grok.zope.org/">Grok</a>,
  of course,
  is the web framework
  that places a simple and agile convention-driven engine
  atop the otherwise notoriously XML-ridden Zope application framework.
  Grok is an important project,
  because it packages the technology of Python's oldest
  and most experienced community of web developers
  in a way that makes it easy to extend and use.
</p>
<!--more-->
<p>
  First, one notices that the book does exhibit the kind of typos
  for which Packt are well known;
  but not so many, in this case, as to distract me from the text.
  The book tackles increasingly complicated topics
  in roughly the same order as every book in the web development genre:
  first installation, then views, models, forms, and so forth.
  But even here one notices an initial difference!
  Because this is the land of Zope,
  the search engine chapter appears quite early in the book —
  in Chapter 6, in fact.
  While some web frameworks can only support search
  through the installation of a special database extension,
  or perhaps through an entirely separate third-party application,
  Grok comes with search already integrated into its object database.
  This lets the concept be introduced early
  as a natural part of the framework,
  rather than being a difficult exercise in product integration
  best left for after the user has learned deployment.
</p>
<p>
  Not that third-party products are bad!
  In fact, I am happy to see that in many cases Carlos teaches his readers to use them.
  Barely twenty pages into the book,
  the new developer is being taught about using
  <a href="http://pypi.python.org/pypi/virtualenv">virtualenv</a>
  to maintain an installation of Python packages;
  and near the end,
  Carlos teaches deployment using the modern and sleek
  <a href="http://code.google.com/p/modwsgi/">mod_wsgi</a> module.
  Some books feel myopically focused on one software community;
  this one, instead, feels expansive,
  a place where best practices from all across the Python ecosystem meet.
</p>
<p>
  It is particularly notable
  that at one point Carlos steps outside of Zope entirely
  and, for an entire chapter, describes how Grok integrates support
  for relational databases through
  <a href="http://www.sqlalchemy.org/">SQLAlchemy</a>,
  Python's most outstanding ORM.
  This both reflects credit upon the flexibility of Grok —
  whose abstractions must be working extremely well
  to build atop an entirely different database model,
  one that is not even native to Zope —
  and evidences Carlos's determination
  to teach a very wide set of skills in his book.
  In fact, he even goes in the other direction at one point,
  and teaches the interested developer
  how to use the ZODB <i>without</i> Grok
  in case they want to access it from another application!
</p>
<p>
  The book tends to be quite clear in what it explains,
  but it sometimes feels as though a few more side trips
  in its examples would be welcome —
  it was not always clear to me whether a beginner,
  after plowing through an example,
  would understand what alternatives Carlos faced at each step,
  and why Carlos chose to build the solutions in the way he did.
</p>
<p>
  Though the book does cover testing,
  it is not test <i>driven</i> —
  testing is confined to its own chapter near the end of the book,
  rather than motivating the design of each sample application.
  A more interesting omission
  is that the book says nothing about web services,
  despite the fact that XML-RPC and REST are two things that Grok does particularly well!
  A chapter on building web APIs would be a great subject
  for a second edition of this book,
  especially if it then proceeded
  to show how easily Grok can support the Ajax design pattern.
</p>
<p>
  It did strike me as decidedly old-fashioned
  that Carlos repeats the weary canard
  that the Zope template language's awkwardness and verbosity
  are worth it,
  because it allows your templates to remain valid HTML,
  because your designers might, at any moment,
  want to open up and tweak a raw template in their browsers!
  This, of course, only works if your templates are each an entire page —
  if you fail to avail yourself, in other words, of <i>any</i> of the power
  of the macros, slots, and viewlets
  that Carlos goes on to describe later.
  And, in fact, many of his later templates
  are indeed mere un-styled HTML snippets
  that no designer in their right mind would view in isolation.
  All of which leads to the question of why
  <a href="http://genshi.edgewall.org/">Genshi</a>,
  another template language popular with Grok developers,
  does not warrant any mention in the book —
  especially since the book in so many other cases
  is careful to mention more widely deployed Python technologies
  that Grok is able to use.
</p>
<p>
  Finally, I can attest that a subject
  I never managed to learn when I was active with Grok —
  the tangle of concepts that surround
  viewlets, viewlet managers, layers, and skins —
  became quite clear as I sat down
  and read Chapter 8 in close detail.
  On the one topic which I could truly approach as a newcomer, therefore,
  I found Carlos's approach to be very direct and understandable.
</p>
<p>
  Grok has good online documentation,
  but unless you are already an experienced developer
  you will have difficulty putting the pieces together
  into a story that runs smoothly from model to view to deployment.
  If you need help getting the whole picture
  of how Grok apps fit together,
  you will certainly want this book.
  You can purchase it from either
  <a href="http://www.amazon.com/gp/product/1847197485?ie=UTF8&tag=letsdisthemat-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=1847197485"
     >Amazon</a>
  or directly from its publisher,
  <a href="http://www.packtpub.com/grok-1-0-web-development/book?utm_source=rhodesmill.org&utm_medium=bookrev&utm_content=blog&utm_campaign=mdb_002483"
     >Packt Publishing</a>.
</p>]]></content:encoded>
    </item>
    <item>
      <title>Python multiprocessing is different under Linux and Windows</title>
      <link>http://rhodesmill.org/brandon/2010/python-multiprocessing-linux-windows/</link>
      <pubDate>Fri, 14 May 2010 12:27:05 EDT</pubDate>
      <category><![CDATA[python]]></category>
      <category><![CDATA[computing]]></category>
      <guid>http://rhodesmill.org/brandon/?p=333</guid>
      <description>Python multiprocessing is different under Linux and Windows</description>
      <content:encoded><![CDATA[
<p>
One of the great recent advances in the Python Standard Library
is the addition of the
<a href="http://docs.python.org/library/multiprocessing.html"
   >multiprocessing</a> module,
maintained by <a href="http://jessenoller.com/">Jesse Noller</a>,
who has also blogged and written
about several other concurrency approaches for Python —
<a href="http://jessenoller.com/2009/01/29/a-gentle-overview-of-kamaelia-or-its-axon-stupid/"
   >Kamaelia</a>,
<a href="http://jessenoller.com/2009/01/31/circuits-event-driven-components/"
   >Circuits</a>,
and
<a href="http://jessenoller.com/2009/02/23/stackless-you-got-your-coroutines-in-my-subroutines/"
   >Stackless</a> Python.
</p>
<p>
I have wanted to try the multiprocessing module out for some time,
and now have a consulting project that will really benefit from multiple processes:
they will let our application run third-party plugins
without having to worry that any bugs or indiscretions which they commit
might damage or hang our main server,
which can remain safe in another process.
</p>
<p>
First, one can only stand in awe at the achievement —
and the amount of work —
that the multiprocessing module represents.
I cannot imagine the time that it would have taken our team
to figure out all of the differences between Linux and Windows
when it comes to processes, shared memory, and concurrency mechanisms.
In fact, the approach we are taking might not even have been feasible
under those circumstances.
By figuring out how to get locks, queues, and shared data structures
all working cleanly on such different architectures,
the multiprocessing authors
save Python programmers out on the street like me
from reinventing a dozen wheels
when we need to support multi-platform concurrency.
</p>
<p>
Well, <i>almost.</i>
</p>
<p>
There is one rather startling difference
which the multiprocessing module does <i>not</i> hide:
the fact that while every Windows process must spin up
independently of the parent process that created it,
Linux supports the <i>fork(2)</i> system call
that creates a child process already in possession
of exactly the same resources as its parent:
every data structure, open file, and database connection
that existed in the parent process
is still sitting there, open and ready to use, in the child.
Consider this small program:
</p>

<div class="pygments_murphy syntax_highlight"><pre><span class="kn">from</span> <span class="nn">multiprocessing</span> <span class="kn">import</span> <span class="n">Process</span><br/><span class="n">f</span> <span class="o">=</span> <span class="bp">None</span><br/><br/><span class="k">def</span> <span class="nf">child</span><span class="p">():</span><br/>    <span class="k">print</span> <span class="n">f</span><br/><br/><span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">&#39;__main__&#39;</span><span class="p">:</span><br/>    <span class="n">f</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s">&#39;mp.py&#39;</span><span class="p">,</span> <span class="s">&#39;r&#39;</span><span class="p">)</span>                                                      <br/>    <span class="n">p</span> <span class="o">=</span> <span class="n">Process</span><span class="p">(</span><span class="n">target</span><span class="o">=</span><span class="n">child</span><span class="p">)</span><br/>    <span class="n">p</span><span class="o">.</span><span class="n">start</span><span class="p">()</span><br/>    <span class="n">p</span><span class="o">.</span><span class="n">join</span><span class="p">()</span><br/></pre></div>


<p>
On Linux, the global <tt>f</tt> keeps its value in the child process;
the child has inherited the open file from its parent:
</p>
<pre>
$ python mp.py
&lt;open file 'mp.py', mode 'r' at 0xb7734ac8&gt;
</pre>
<p>
Under Windows, however, where the multiprocessing module
has to spawn a fresh copy of the Python interpreter
to which it gives special instructions
to just run the function <tt>child()</tt>,
the module is a clean slate without an open file inside:
</p>
<pre>
C:\Users\brandon\dev&gt;python mp.py
None
</pre>
<p>
Now, my complaint is not exactly
that the multiprocessing documentation is misleading on this point;
under its section on
<a href="http://docs.python.org/library/multiprocessing.html#programming-guidelines"
   >Programming guidelines</a>,
it makes it quite clear that:
</p>
<blockquote>
On Unix a child process can make use of a shared resource created in a
parent process using a global resource. However, it is better to pass
the object as an argument to the constructor for the child process.
</blockquote>
<p>
I have no quarrel with this advice;
if I am careful to pass everything the child needs
in its list of arguments,
then I can be sure that my code will work under both Linux and Windows.
</p>
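<p>
Here is a minimal sketch of that advice applied to the small program above.
The <tt>describe()</tt> helper is a hypothetical name introduced only for this illustration,
and passing the file name rather than the open file object is an assumption of the sketch:
arguments have to be pickled to reach a freshly spawned child on Windows,
and an open file cannot be pickled.
The child opens the file itself from the path it receives through <tt>args</tt>,
so it no longer depends on a global left behind by the parent.
</p>

```python
from multiprocessing import Process


def describe(path):
    """Open the file at path and return a short description of it.

    describe() is a hypothetical helper for this sketch,
    not part of the multiprocessing module.
    """
    with open(path, 'r') as f:
        return 'opened %s in mode %s' % (f.name, f.mode)


def child(path):
    # Everything the child needs arrives as an argument, so this works
    # whether the process was forked or started from scratch.
    print(describe(path))


if __name__ == '__main__':
    # Pass the file name to the child instead of sharing a global.
    p = Process(target=child, args=('mp.py',))
    p.start()
    p.join()
```

<p>
Written this way, the child should behave the same under Linux and Windows,
because nothing it uses comes from module-level state set inside the
<tt>__main__</tt> block.
</p>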
<p>
But I do wish that the multiprocessing module
provided some support for testing this condition
more rigorously under Linux.
In particular, I wish that there were some way of turning
the simple forking logic <i>off</i> —
of saying, “Yes, I know that Linux will let you create a child process
very simply using <i>fork(2)</i>, but for my sanity would you please
create the child process from scratch like you do under Windows so
that I can test whether my code accidentally depends on residual
state from the parent process that I did not see that I was using?”
I looked at the multiprocessing <tt>forking.py</tt> module
to see whether I could turn on the Windows-style process spawning
even from inside of Linux,
but the mechanism is chosen
by a bare module-level check of <tt>sys.platform</tt>,
and if I overwrite that variable with <tt>'win32'</tt>
the code then dies when it tries to import <tt>msvcrt</tt>,
which is available only under Windows.
</p>
<p>
There is, thus, even in principle, no way
that I can test my multiprocessing application under Linux
which will give me any assurance that my child processes
are not accidentally taking advantage of data structures
and open connections left lying around by the parent process;
only by actually moving over to Windows itself
can I see how my child code really behaves on its own.
I have <a href="http://bugs.python.org/issue8713">created a feature request</a>
in the Python bug tracker to see whether this situation can be improved.
</p>
<p>
But even with this one inconvenience —
which is troubling me much less, now that I at least understand
why my application was behaving so differently under Windows —
the multiprocessing module is still a huge leap forwards
for Python programmers who need to run code
in heavyweight processes with all of the isolation and safety
that they provide.
Thanks again to Jesse and the multiprocessing team!
</p>
]]></content:encoded>
    </item>
    <item>
      <title>Sphinx + Mercurial = My favorite CMS</title>
      <link>http://rhodesmill.org/brandon/2010/sphinx-mercurial-cms/</link>
      <pubDate>Sun, 21 Mar 2010 22:58:28 EDT</pubDate>
      <category><![CDATA[python]]></category>
      <category><![CDATA[document processing]]></category>
      <category><![CDATA[computing]]></category>
      <guid>http://rhodesmill.org/brandon/?p=318</guid>
      <description>Sphinx + Mercurial = My favorite CMS</description>
      <content:encoded><![CDATA[
<p>
Though I write and maintain some of the content for our <a href="http://pyatl.org/">Python Atlanta web site</a>, updates and additional content often come in from other users. For example, our Plone interest group — headed up by <a href="http://www.ifpeople.net/about/people/cjj">Christopher Johnson</a> — has their <a href="http://pyatl.org/plone">own page on our web site</a>. And the information about our <a href="http://pyatl.org/bookclub">book club</a> is both written and regularly updated by <a href="http://www.doughellmann.com/">Doug Hellmann</a>.
</p>
<div class="dropshadow alignright">
  <a href="http://pyatl.org/">
    <img src="http://rhodesmill.org/brandon/static/2010/pyatl-thumb.png"
         alt="Python Atlanta web site"
         width="240" height="135" />
  </a>
</div>
<p>
How can a collaborative site like ours best be edited and updated? Well, I would like to report some modest initial success with an experimental approach: I now maintain the site as a <a href="http://sphinx.pocoo.org/">Sphinx-powered</a> documentation system stored in a <a href="http://bitbucket.org/brandon/pyatl.org/">BitBucket repository</a> into which I pull changes made by my collaborators. The advantages are several.
</p>
<ul>
<li>The change management tools supported by traditional CMSes, even at their best, seem somehow anemic when compared to the toolkit provided by a good DVCS like <a href="http://mercurial.selenic.com/">Mercurial</a>. Where, for example, does even a capable CMS like Plone provide anything like Mercurial's “backout” or “blame” commands?
</li>
<li>Markup as well-designed as <a href="http://docutils.sourceforge.net/rst.html">reStructuredText</a> is not only a lot of fun to use, but it also very cleanly separates content from design. Authors working in plain text tend to produce clean, readable content without the messy markup often associated with visual HTML editors, or, worse yet, the disaster that is Microsoft Word.
</li>
<li>Staging — a feature I find essential, but which seems missing from many default CMS configurations — occurs automatically! Each author can see locally how the site will look with their changes, and after doing a pull I can review the site's appearance on my laptop before finally deploying the new content to the production site.
</li>
</ul>
<p>
To top it all off, authors get to use their own editor-of-choice when making contributions, and we all get extra practice cloning and merging in my favorite DVCS. I am optimistic about this direction, but I will post again if we wind up hitting snags in the future. Finally, of course, feel free to clone our repository if you want to see how Sphinx looks when running a generic web site.
</p>]]></content:encoded>
    </item>
  </channel>
</rss>

