Unicode output from Zope 3

The Creative Commons license engine has gone through several iterations, the most recent being a Zope 3 / Grok application. This has actually been a great implementation for us1, but since the day it was deployed there’s been a warning in README.txt:

If you get a UnicodeDecodeError from the cc.engine (you'll see this if it's
running in the foreground) when you try to access the http://host:9080/license/
then it's likely that the install of python you are using is set to use ASCII
as it's default output.  You can change this to UTF-8 by creating the file
/usr/lib/python<version>/sitecustomize.py and adding these lines:
  import sys
  sys.setdefaultencoding("utf-8")

This always struck me as a bit inelegant—having to muck with something outside my application directory. After all, this belief that the application should be self-contained is the reason I use zc.buildout and share Jim’s belief in the evil of the system Python. Like a lot of inelegant things, though, it never rose quite to the level of annoyance needed to motivate me to do it right.

Today I was working on moving the license engine to a different server2 and ran into this problem again. I decided to dig in and see if I could track it down. In fact I did track down the initial problem: I was comparing an encoded byte string with a Unicode string without specifying an explicit codec to use for the decode. Unfortunately, once I fixed that I found it was turtles all the way down.
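
A minimal sketch of that first fix, written for Python 3 rather than the Python 2 the engine ran on. Python 2 attempts an implicit decode with the default codec (ASCII unless changed) whenever byte strings and Unicode strings are mixed; Python 3 never converts implicitly, so the sketch performs the ASCII decode by hand to show where it breaks. The helper `same_text` and the sample value are hypothetical and not taken from cc.engine.

```python
def same_text(raw, text, encoding="utf-8"):
    """Compare a byte string with a text string using an explicit codec."""
    return raw.decode(encoding) == text


# Hypothetical sample data: a non-ASCII string and its UTF-8 encoding.
title = "Atribuci\u00f3n"
encoded = title.encode("utf-8")

# Decoding with the ASCII codec (Python 2's stock default) fails on the
# first byte outside the 7-bit range.
try:
    encoded.decode("ascii")
    ascii_decode_failed = False
except UnicodeDecodeError:
    ascii_decode_failed = True

# Naming the codec makes the comparison independent of any system default.
matches = same_text(encoded, title)
```

Naming the codec at the point of comparison keeps the result the same no matter how the interpreter on a given server happens to be configured.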

Turns out the default Zope 3 page template machinery uses StringIO to collect the output. StringIO uses, uh, strings—strings with the default system encoding. Reading the module documentation, it would appear that mixing String and Unicode input in your StringIO will cause this sort of issue.
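
One way to keep such a buffer consistent is to decode every byte string at the boundary, before it reaches the buffer. The sketch below uses Python 3's `io.StringIO`, which accepts only text, so it illustrates the idea and is not the Zope 3 page template code; `write_text` and the sample fragments are hypothetical.

```python
import io


def write_text(buffer, chunk, encoding="utf-8"):
    """Write chunk to a text buffer, decoding byte strings explicitly."""
    if isinstance(chunk, bytes):
        chunk = chunk.decode(encoding)
    buffer.write(chunk)


# Hypothetical template output: text fragments around one UTF-8 byte fragment.
fragments = ["<p>", "Atribuci\u00f3n".encode("utf-8"), "</p>"]

output = io.StringIO()
for fragment in fragments:
    write_text(output, fragment)

rendered = output.getvalue()
```

With every fragment decoded on the way in, the collected output is text throughout and no default codec is ever consulted.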

Andres suggested marking my templates as UTF-8 XML using something like:

 <?xml version="1.0" encoding="UTF-8"?>

but even after doing this and fixing the resulting entity errors, there are still obviously some 8-bit strings leaking into the output. In conversations on IRC the question was then asked: “is there a reason you don’t want a reasonable system wide encoding if your locale can support it?”

I guess not3.

UPDATE Martijn has a tangentially related post which sheds some light on why Python does/should ship with ascii as the default codec. At least people smarter than me have problems with this sort of thing, too.


1 Yes, I may be a bit biased—I wrote the Zope3/Grok implementation. Of course, I wrote the previous implementation, too, and I can say without a doubt it was… “sub-optimal”.

2 We’re doing a lot of shuffling lately to complete a 32 to 64 bit conversion; see the CC Labs blog post for the harrowing details.

3 So the warning remains.

Ubuntu Netbook Remix on the Eee PC

Last year when Asus released the original Eee PC 7xx series, a colleague and I made a lunchtime trek to Central Computers down the street and each picked up a 701 with 4 GB SSD and Linux. The stock distribution is Xandros based. That’s great since Xandros is Debian based itself, but not so great since it seemed configured specifically to resemble Windows in many ways. Progress, right?

Shortly after purchasing my Eee I installed eeeXubuntu on it. This configuration actually worked pretty well. Combined with an additional 4 GB of storage in the form of an SD card, I carried the Eee with me as my sole computer for a week in Europe in January. Upon my return, however, the Eee saw less and less usage. In retrospect I’m not sure that the decline had anything to do with the Eee at all; all my non-work computing declined dramatically during the first half of the year. The small form factor of the Eee still called out for use, so I dabbled with it periodically. One weekend I tried installing a Sugar shell (successfully, for some definition of success, I guess). Another I tried updating my eeeXubuntu installation from 7.10 to 8.04, without success (disk space issues). When I saw Ubuntu Netbook Remix, I decided I wanted to try that on the Eee. The combination of a focused, single-window user interface and a specialized launcher seemed like a good fit for the space-constrained display.

Today I successfully installed Ubuntu 8.04 and the Netbook Remix on my Eee.

Ubuntu Netbook Remix

The steps were actually pretty straightforward:

  1. I installed Ubuntu 8.04 using a USB stick. When it came time to select tasks, I didn’t select anything to get a minimal installation.
  2. Added the Array.org repository and installed a kernel with Eee-specific customizations.
  3. Added the Netbook Remix repositories and fired up aptitude. At this point I just picked my way through the packages in the ubuntu-desktop task, picking those I wanted. In particular I omitted things related to Bluetooth or CD support (since I have hardware for neither).
  4. Installed the ume-launcher and other Netbook packages.

If these instructions seem a little thin it’s because I mostly just followed the instructions of others, both found in the excellent Eee User wiki.

  • Installing Netbook Remix

I’m heading to OSCON next week so I’m going to play with the installation this week to determine whether I can use it as my sole machine for that trip.

Readonly Attachments for Thunderbird 2

Readonly Attachments was last seen here two years ago, and I’ve just updated it for Thunderbird 2. It still does pretty much exactly what the last post describes, bugs and all.

Right now I’m just releasing a preview. At this point I’ve only tested it with Thunderbird 2.0.0.14 on Mac OS X 10.5.3; I’ll test with Linux tomorrow1 and if all goes well I’ll update addons.mozilla.org at that point.

UPDATE: (2008-07-13) Things seem to work fine on Linux, so we’re just waiting for it to clear the review queue at AMO.

1 Unfortunately I won’t be testing on Windows. My work laptop dual boots but frankly it’s so painful to load Windows these days that I can’t bring myself to do it. I’m trying to spend more of my spare time on things I enjoy, and testing for Windows just doesn’t pass that test. Sorry.

Technology Summit

Yesterday was the first ever Creative Commons Technology Summit, hosted at Google. There are my photos, and better ones taken by Joi.

Nerd Van

I drove the Nerd Van (myself, Asheesh and the interns) to Google.

I’m still recovering (and inflicting pain—CC board meeting today) and collecting feedback, but I think it was a really successful day. We learned some things we’ll do differently next time (yes, there will be a next time). Anyway, special recognition to the CC interns for live blogging the event and for generally doing anything asked of them. I feel like I should write more about the event, but I’m feeling pretty brain dead at the moment.

Avoiding git PTSD

In an attempt to prevent additional git (or maybe just git-svn?) induced PTSD, Asheesh kindly created a git phrasebook. If you, too, are a Subversion deserter and want to figure out how the whole branching thing works in git, this may be useful to you.

Someday I’ll write up my thoughts on distributed version control and “convention versus configuration”, which seem to overlap in this deployment. But not today.

Creative Commons Attribution-ShareAlike 3.0 United States