If you’re interested in converting maildir to mbox, you may want to check out my blog post with an updated, simplified version of this script.
UPDATE (6 June 2010): Thanks to everyone who pointed out the errors introduced when I imported this from my old wiki software. The curly quotes have been removed and I replaced "n" with "\n" in line 15.
UPDATE (3 Feb 2010): Corrected a bug in the code; the output file is sys.argv[-1], not sys.argv[1].
I recently moved a large (20,000 messages) Archive mail folder from my IMAP server to my local workstation. The goal was to reduce load on the server, and improve search performance. I use Thunderbird as my mail client, and so needed to convert from Maildir (used by the IMAP server) to mbox (used by Thunderbird). Python includes some libraries for manipulating mailboxes and messages, and so I was able to put together a short script for doing the conversion.
#!/usr/bin/python
# -*- coding: utf-8 -*-

import mailbox
import sys
import email

mdir = mailbox.Maildir(sys.argv[-2], email.message_from_file)
outfile = file(sys.argv[-1], 'w')

for mdir_msg in mdir:
    # parse the message:
    msg = email.message_from_string(str(mdir_msg))
    outfile.write(str(msg))
    outfile.write('\n')

outfile.close()
This script could be used as follows:
$ python mailconv.py Maildir output.mbox
To get the newly created mbox file into Thunderbird, it’s usually easiest to create a new local folder in Thunderbird, shut down the application, and replace that folder’s file in your profile directory with the new mbox file.
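The script above is Python 2 code (it relies on the `file()` builtin, which Python 3 removed). On Python 3, a minimal sketch of the same conversion could let the standard `mailbox` module write the mbox file itself instead of concatenating message strings. The function name `maildir_to_mbox` is an illustrative assumption, not part of the original script.

```python
import mailbox


def maildir_to_mbox(maildir_path, mbox_path):
    """Copy every message from a Maildir into an mbox file.

    Returns the number of messages copied. The function name and
    signature are hypothetical, chosen for this sketch.
    """
    # create=False: fail loudly if the Maildir path does not exist.
    src = mailbox.Maildir(maildir_path, create=False)
    # mailbox.mbox creates the output file if it is missing.
    dst = mailbox.mbox(mbox_path)
    count = 0
    try:
        for msg in src:
            # add() appends the message, including its "From " separator line.
            dst.add(msg)
            count += 1
    finally:
        # close() flushes pending writes to disk.
        dst.close()
        src.close()
    return count
```

Calling `maildir_to_mbox("Maildir", "output.mbox")` would mirror the command line shown earlier. Note that this appends to an existing mbox file rather than overwriting it.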
Doesn’t work for me:
[fons@betsy ~]$ python ./mailvonv.py Maildir output.mbox
Traceback (most recent call last):
  File "./mailvonv.py", line 6, in ?
    outfile = file(sys.argv[1], "w")
IOError: [Errno 21] Is a directory: 'Maildir'
Sorry about that; the code had a bug in the sys.argv indexing. I’ve updated the code to correct that issue.
still doesn’t work bro :(
  File "/home/baumi/bin/maildir2mbox.py", line 6
SyntaxError: Non-ASCII character '\xe2' in file /home/baumi/bin/maildir2mbox.py on line 6, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details
Chris: that’s because copying and pasting from the webpage brings in “special” quotes. Just manually retype 'w' and '\n'.
Many people reported issues with the curly quotes that seem to have crept in when I imported this post from my old wiki software to WordPress. I think it should work now, Chris.
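For anyone who has already pasted a copy containing curly quotes, a small helper along these lines could locate and repair them. This is only a sketch: the names `SMART_QUOTES`, `straighten_quotes`, and `non_ascii_lines` are made up for illustration, and the mapping covers just the quote characters that appear in the error reports in this thread.

```python
# Map typographic quote characters to their plain ASCII equivalents.
SMART_QUOTES = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u2032": "'",  # prime, sometimes substituted for a closing quote
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}
_TABLE = str.maketrans(SMART_QUOTES)


def straighten_quotes(source):
    """Return source with typographic quotes replaced by ASCII quotes."""
    return source.translate(_TABLE)


def non_ascii_lines(source):
    """Return the 1-based numbers of lines containing non-ASCII characters."""
    return [
        number
        for number, line in enumerate(source.splitlines(), 1)
        if any(ord(ch) > 127 for ch in line)
    ]
```

Running `non_ascii_lines` on the pasted text first shows which lines the interpreter would complain about; `straighten_quotes` then produces a version that can be saved back to disk.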
You have a bug in the code:
'n' should really be '\n'.
Also, can you get rid of the non-standard quotes?
They seem to bite people in the butt (like me and Chris) when copying and pasting the code.
Oh, and thanks for the script :) it worked for me after doing the modifications I mentioned.
I got the same error that Chris had. The Python source above contains the wrong single quote characters. After fixing them, it works well. Thanks for the useful code.
The problems in the script are due to the syntax highlighter used: single quotes got converted to “smart quotes” (e.g. the lines with 'w' and 'n'). When you fix those, the script works flawlessly.
Thanks!
It’s the quotes around 'w' and 'n' on lines 6 and 12. His blog software must be screwing up the single quotes.
Also, should line 12 be '\n' rather than 'n'?
Yup, line 12 should have been \n. Fixed.
Works like a treat, even better than some commercial programs like Emailchemy and Aid4Mail. Yes, OK, I have to go through each folder one at a time, but I think I can manage that!
Thanks!
Fantastic !! Have now run it all through my kmail directories and I can import into Thunderbird, no problems.
However, being on the lazy side, I put together a script to run through my current email directory and process all sub-directories. I noticed that all my mail files are always held in a /cur sub-folder, so this script does a find for “cur” sub-folders and processes the parent directory of each. It doesn’t attempt to replicate the input structure on the output; it just uses the parent directory name. I suppose this could cause duplicates or overwriting if you have sub-folders with the same name in different main folders, e.g. nov-2010/clients and dec-2010/clients.
#!/bin/bash
# created by: Chris Backhouse
# date: June 2010
# web: http://www.sudwebdesign.com
# Please feel free to copy and duplicate but would appreciate leaving the link alone. thanks
IN_DIR=$1
OUT_DIR=$2
if [ ! -d "$IN_DIR" ]; then
    echo "WARNING: Cannot find Input Directory - exiting"
    exit 1
fi
if [ ! -d "$OUT_DIR" ]; then
    mkdir "$OUT_DIR"
fi
# Loop through the input directory
# looking for /cur sub folders.
# Then take the name of the parent
find "$IN_DIR" -type d -name 'cur' | while read -r filename
do
    testfn=$(cd "$filename/.." && pwd)
    fn=$(basename "$testfn")
    if [ -d "$testfn" ]; then
        echo ">>convert : $OUT_DIR/$fn"
        python maildir2mbox.py "$testfn" "$OUT_DIR/$fn"
    fi
done
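The overwriting concern raised before the script (same-named sub-folders such as nov-2010/clients and dec-2010/clients) could be avoided by deriving each output name from the folder's path relative to the input directory. A rough Python sketch of that idea follows. The function names, the `_` separator, and the `.mbox` suffix are illustrative assumptions, and it treats any directory containing `cur`, `new`, and `tmp` as a Maildir, which is slightly stricter than the shell script's check for `cur` alone.

```python
import mailbox
import os

MAILDIR_PARTS = {"cur", "new", "tmp"}


def find_maildirs(root):
    """Yield each directory under root that looks like a Maildir."""
    for dirpath, dirnames, _ in os.walk(root):
        if MAILDIR_PARTS <= set(dirnames):
            yield dirpath
        # Don't descend into the message directories themselves.
        dirnames[:] = [d for d in dirnames if d not in MAILDIR_PARTS]


def unique_name(maildir, root):
    """Build an output file name from the path relative to root, so
    nov-2010/clients and dec-2010/clients get different names."""
    rel = os.path.relpath(maildir, root)
    if rel == ".":
        # root itself is a Maildir; fall back to its own name.
        rel = os.path.basename(os.path.abspath(root))
    return rel.replace(os.sep, "_") + ".mbox"


def convert_tree(in_dir, out_dir):
    """Convert every Maildir under in_dir to an mbox file in out_dir.

    Returns a dict mapping each Maildir path to the mbox file written.
    """
    os.makedirs(out_dir, exist_ok=True)
    converted = {}
    for maildir in find_maildirs(in_dir):
        target = os.path.join(out_dir, unique_name(maildir, in_dir))
        src = mailbox.Maildir(maildir, create=False)
        dst = mailbox.mbox(target)
        try:
            for msg in src:
                dst.add(msg)
        finally:
            dst.close()
            src.close()
        converted[maildir] = target
    return converted
```

With this naming scheme the two example folders would end up as `nov-2010_clients.mbox` and `dec-2010_clients.mbox` instead of both writing to `clients`.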