Date: 1 August 2009
Tags: computing, python, web notes
Thanks to more than an hour of work today, I have a pretty list of a few dozen commands that make it easy for a WebFaction account holder to install the powerful lxml Python package for parsing HTML and XML under their hosting account. You can read Ian Bicking's wonderful blog post “lxml: an underappreciated web scraping library” for more information on why you want to be using lxml instead of any of its alternatives.
So, why do I say “drat”?
First, because I just tried out my instructions on another of my WebFaction accounts, and there the extra steps weren't even necessary; this other server of theirs already had lxml's dependencies installed! I suppose, had I been a bit more patient, that this support ticket that I glanced over this morning would have inspired me to ask WebFaction to install the libraries lxml needs on the server where I myself was working. But it felt like some sort of offense against symmetry to rely on something that WebFaction doesn't install everywhere, and I was perhaps just in too much of a hurry. Which, of course, cost more time in the end.
The other reason I say “drat” is that, now that I look at Ian's post again after all these months, I see that he has instructions for making the package install its own dratted copies of the system libraries it needs! Too bad that lxml's own installation instructions omit this crucial piece of information.
How typical, and how predictable. It turns out that I just needed to listen to Ian Bicking more carefully. How often we fail to do that, as individuals and as a Python community. Listen to Ian Bicking, everyone. Listen.
In the meantime, here are some successful and unsuccessful ways of installing lxml under your WebFaction account. Consider the following to be a set of choose-your-own-adventure scenarios.
If the WebFaction host your account lives on already has libxml and libxslt installed, then installation is simple:
    $ easy_install lxml
    Searching for lxml
    Reading http://pypi.python.org/simple/lxml/
    ...
    Finished processing dependencies for lxml
If your WebFaction host lacks libxml, but you listen to Ian Bicking and download the source code yourself, then your install will succeed:
    $ wget http://pypi.python.org/.../lxml-2.2.2.tar.gz
    $ tar xfz lxml-2.2.2.tar.gz
    $ cd lxml-2.2.2
    $ STATIC_DEPS=true python setup.py install
    ...
    Finished processing dependencies for lxml==2.2.2
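After an install like the one above, it can be reassuring to confirm which libxml2 and libxslt versions lxml actually ended up using. The following is only a sketch, written for a current Python 3 rather than the Python of 2009; the helper name `describe_lxml` is made up for illustration, while the version tuples it reads are attributes of `lxml.etree`.

```python
def describe_lxml():
    """Return a one-line summary of the lxml build, or a note that it is missing."""
    try:
        from lxml import etree
    except ImportError as exc:
        # lxml is absent, or its compiled extension failed to load.
        return "lxml is not importable: %s" % exc

    def dotted(version):
        # lxml exposes its versions as tuples of integers, e.g. (2, 7, 3).
        return ".".join(str(part) for part in version)

    return "lxml %s (libxml2 %s, libxslt %s)" % (
        dotted(etree.LXML_VERSION),
        dotted(etree.LIBXML_VERSION),
        dotted(etree.LIBXSLT_VERSION),
    )


if __name__ == "__main__":
    print(describe_lxml())
```

If the reported library versions match the tarballs that the static build fetched, rather than whatever the host happens to have installed, the build most likely picked up its own copies.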
If your WebFaction host lacks libxml and you listen to Ian Bicking, but you rely on easy_install to fetch the package, then your install will fail because it tries to build inside a temporary directory that, on WebFaction, you apparently cannot access:
    $ STATIC_DEPS=true easy_install lxml
    Searching for lxml
    Reading http://pypi.python.org/simple/lxml/
    ...
    Running "./configure --without-python --disable-dependency-tracking --disable-shared --prefix=/tmp/easy_install-81ufo5/lxml-2.2.2/build/tmp/libxml2" in build/tmp/libxml2-2.7.3
    error: Permission denied
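One possible explanation for the "Permission denied" above, which nothing here confirms, is that the host's temporary directory does not allow programs stored in it to be executed. The sketch below tests that guess on a POSIX system by writing a trivial shell script into the temporary directory and trying to run it. The function name `tempdir_allows_exec` is hypothetical, and the script assumes `/bin/sh` exists.

```python
import os
import stat
import subprocess
import tempfile


def tempdir_allows_exec(directory=None):
    """Return True if a script created in `directory` can be executed."""
    directory = directory or tempfile.gettempdir()
    fd, path = tempfile.mkstemp(suffix=".sh", dir=directory)
    try:
        # Write a script that does nothing and exits successfully.
        with os.fdopen(fd, "w") as script:
            script.write("#!/bin/sh\nexit 0\n")
        os.chmod(path, stat.S_IRWXU)
        try:
            return subprocess.call([path]) == 0
        except OSError:
            # A directory that forbids execution raises PermissionError,
            # which is a subclass of OSError.
            return False
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("temp dir allows exec:", tempdir_allows_exec())
```

A result of False would be consistent with the failure shown above. Building from an unpacked source tree in your home directory, as in the previous scenario, avoids the temporary directory altogether.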
If your WebFaction host lacks libxml, and you fail to listen to Ian Bicking, then you can at least install lxml and its dependencies manually using the following commands, as I worked out this morning. The trick is that instead of trying to tell setup.py where you have installed the libraries by using CC= at the beginning of the command line or something like that, you need to make sure that the special command xslt-config is on your path somewhere:
    $ cd ~
    $ mkdir usr
    $ mkdir usr/src
    $ cd usr/src
    $ wget ftp://xmlsoft.org/.../libxml2-2.7.3.tar.gz
    $ wget ftp://xmlsoft.org/.../libxslt-1.1.24.tar.gz
    $ tar xfz libxml2-2.7.3.tar.gz
    $ tar xfz libxslt-1.1.24.tar.gz
    $ cd libxml2-2.7.3
    $ ./configure --prefix ~/usr
    $ make install
    $ cd ..
    $ cd libxslt-1.1.24
    $ ./configure --prefix ~/usr
    $ make install
    $ cd ..
    $ PATH=$HOME/usr/bin:$PATH
    $ wget http://pypi.python.org/.../lxml-2.2.2.tar.gz
    $ tar xfz lxml-2.2.2.tar.gz
    $ cd lxml-2.2.2
    $ python setup.py install
    ...
    Finished processing dependencies for lxml==2.2.2
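Since the manual route depends on xslt-config being found on your path, it may help to check that before running setup.py. This is a minimal sketch in present-day Python 3; the helper names `find_config_tool` and `tool_version` are invented for illustration, and xml2-config is the equivalent script that libxml2 installs alongside its libraries.

```python
import shutil
import subprocess


def find_config_tool(name="xslt-config", path=None):
    """Return the full path to `name` if it is on the search path, else None."""
    return shutil.which(name, path=path)


def tool_version(executable):
    """Ask a *-config script for its version, returning None on failure."""
    try:
        output = subprocess.check_output([executable, "--version"])
    except (OSError, subprocess.CalledProcessError):
        return None
    return output.decode().strip()


if __name__ == "__main__":
    for name in ("xml2-config", "xslt-config"):
        location = find_config_tool(name)
        if location is None:
            print("%s: not found on PATH" % name)
        else:
            print("%s: %s (version %s)" % (name, location, tool_version(location)))
```

If the reported location is not under ~/usr/bin, the PATH assignment from the commands above probably has not taken effect in the current shell.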
But, as I mentioned, Ian's technique is faster. :-)