
6.5. Parsing UU Encoded Attachments

MIME was proposed in 1992, when email was already 21 years old. In the time before MIME, people wanted to send each other files via email, and so a system was created where binary files could be embedded straight into the message body. This system is called uuencoding, short for Unix-to-Unix encoding, and a message that uses it looks something like this:

hello bob,
here's the file you asked for!
begin 644 cat.txt

The first line of the embedded section has the format begin <permissions> [<filename>], with the permissions given in Unix octal notation (644, for example, lets the owner read and write the file and everyone else only read it). The (optional) filename can contain spaces and other unusual characters, so it always continues until the end of the line. The encoded data follows, and the embedded section ends with the word end on a line by itself.
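
To make the header layout concrete, here is a minimal sketch that splits a begin line into its parts. The function name parse_uu_begin_line( ) is hypothetical and not part of PHP or PEAR, and the sketch assumes the permissions are written as three or four octal digits.

```php
<?php
// Hypothetical helper (not part of PHP or PEAR): split a uuencode header
// line such as "begin 644 cat.txt" into its permissions and filename.
function parse_uu_begin_line($line)
{
    // Three or four octal digits, then (optionally) a space and a filename
    // that runs to the end of the line, spaces included.
    $pattern = '/^begin ([0-7]{3,4})(?: (.+))?$/';

    if (!preg_match($pattern, rtrim($line, "\r\n"), $m)) {
        return null; // not a header line
    }

    return array(
        'permissions' => octdec($m[1]),            // "644" -> integer 0644
        'filename'    => isset($m[2]) ? $m[2] : '', // filename is optional
    );
}

$header = parse_uu_begin_line("begin 644 my cat.txt\n");
printf("%o %s\n", $header['permissions'], $header['filename']);
```

Run as a script, this prints 644 my cat.txt, which shows that the space inside the filename survives because everything after the permissions is treated as the name.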

Mail::mimeDecode doesn't extract uuencoded attachments. This is curious because, at the time of this writing, the code exists inside the module but is undocumented and never called by the general decode( ) method. Extracting the attachments ourselves is fairly straightforward, though. The following regular expression should suffice:

// The "s" modifier lets "." match newlines, so (.*?) can span the encoded lines
$body = preg_replace_callback("!\nbegin ([0-7]{3}) ([^\n]+)\n(.*?)\nend\n!s",
                'extract_uuencoded_body', $body);

Our callback function extract_uuencoded_body( ) will be called for each matched block, and receives a single array of the matched strings: the whole block, followed by the permissions, filename, and contents. We can return a blank string to remove the block from the mail body, and use the captured strings to create the attachments as further mail bodies.
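
The callback step can be sketched as follows. This is only an illustration under some assumptions: the wrapper name strip_uuencoded_blocks( ) and the array keys are hypothetical, the closure syntax needs PHP 5.3 or later, and the pattern carries the s modifier so that the dot can match across the newlines inside the encoded data.

```php
<?php
// Hypothetical wrapper: strip uuencoded blocks from $body and return
// array(cleaned body, list of raw blocks). Names are illustrative only.
function strip_uuencoded_blocks($body)
{
    $blocks = array();

    // "s" modifier: "." also matches newlines, so (.*?) can cover a
    // payload that spans several lines.
    $pattern = "!\nbegin ([0-7]{3}) ([^\n]+)\n(.*?)\nend\n!s";

    $clean = preg_replace_callback(
        $pattern,
        function ($m) use (&$blocks) {
            // preg_replace_callback() passes one array of matches:
            // $m[1] = permissions, $m[2] = filename, $m[3] = encoded data.
            $blocks[] = array(
                'permissions' => $m[1],
                'filename'    => $m[2],
                'encoded'     => $m[3],
            );
            return ''; // remove the whole block from the mail body
        },
        $body
    );

    return array($clean, $blocks);
}

// Sample message; convert_uuencode() is used only to build a valid payload.
$body = "hello bob,\nhere's the file you asked for!\n"
      . "begin 644 cat.txt\n" . convert_uuencode("meow\n") . "end\n";

list($clean, $blocks) = strip_uuencoded_blocks($body);
```

After the call, $clean holds only the two lines of greeting text and $blocks holds one entry for cat.txt. Note that the match also consumes the newline before begin, so text that follows a block ends up directly after the preceding line; returning "\n" instead of a blank string is one way to keep the break.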

PHP 5 makes this very easy by giving us the built-in function convert_uudecode( ), which does exactly what you might suspect. We simply pass the attachment contents through convert_uudecode( ) and add a new chunk to the list of parsed chunks, exactly as if it were a regular MIME chunk. If you're using PHP 4, then you'll have to write the function yourself, although it's a reasonably trivial exercise. We can use the filename we were passed to create an attachment header, so that as far as our program is concerned, MIME and uuencoded attachments look the same. By hiding things like this in the mail parser, we nicely separate the parsing from the processing, and can modify each without worrying about the other. Again, interfaces between layers make our lives easier.
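
For readers without convert_uudecode( ), a hand-written decoder might look roughly like the sketch below. It assumes the traditional uuencode line layout (one length character, then groups of four characters that each carry three bytes). The names my_uudecode( ) and uu_block_to_chunk( ), and the chunk layout with its headers and body keys, are hypothetical stand-ins for whatever structure your own MIME parser produces.

```php
<?php
// Hypothetical fallback decoder for the lines between "begin" and "end".
function my_uudecode($encoded)
{
    $out = '';

    foreach (preg_split('/\r?\n/', $encoded) as $line) {
        if ($line === '') {
            continue;
        }

        // The first character gives the number of decoded bytes on the line.
        // Space and backtick both stand for zero, hence the "& 0x3f".
        $len = (ord($line[0]) - 32) & 0x3f;
        if ($len === 0) {
            break; // a zero-length line terminates the data
        }

        // Every 4 characters carry 3 bytes; pad in case trailing
        // padding characters were stripped in transit.
        $groups  = (int) ceil($len / 3);
        $chars   = str_pad(substr($line, 1), 4 * $groups, '`');
        $decoded = '';

        for ($g = 0; $g < $groups; $g++) {
            $i  = 4 * $g;
            $c0 = (ord($chars[$i])     - 32) & 0x3f;
            $c1 = (ord($chars[$i + 1]) - 32) & 0x3f;
            $c2 = (ord($chars[$i + 2]) - 32) & 0x3f;
            $c3 = (ord($chars[$i + 3]) - 32) & 0x3f;

            $decoded .= chr((($c0 << 2) | ($c1 >> 4)) & 0xff)
                      . chr((($c1 << 4) | ($c2 >> 2)) & 0xff)
                      . chr((($c2 << 6) | $c3) & 0xff);
        }

        // Drop the padding bytes from the final group of the line.
        $out .= substr($decoded, 0, $len);
    }

    return $out;
}

// Wrap a decoded block so it resembles a MIME attachment chunk.
// The array layout here is an assumption, not a PEAR structure.
function uu_block_to_chunk($filename, $encoded)
{
    $data = function_exists('convert_uudecode')
        ? convert_uudecode($encoded)
        : my_uudecode($encoded);

    // Escape backslashes and quotes so the quoted filename stays valid.
    $safe = str_replace(array('\\', '"'), array('\\\\', '\\"'), $filename);

    return array(
        'headers' => array(
            'content-type'        => 'application/octet-stream',
            'content-disposition' => 'attachment; filename="' . $safe . '"',
        ),
        'body' => $data,
    );
}
```

Because the chunk carries an ordinary content-disposition header, code further down the pipeline can treat it like any other attachment without knowing it arrived uuencoded.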

If you're using Perl, it's even easier. Before calling the parse( ) method, you simply need to enable the uuencoding extraction option:

