Re: [dev] Suckless ML archiver?

From: Scott Lawrence <bytbox_AT_gmail.com>
Date: Sat, 17 Mar 2012 16:06:42 -0400 (EDT)

On Sat, 17 Mar 2012, Anselm R Garbe wrote:

> On 17 March 2012 20:56, Scott Lawrence <bytbox_AT_gmail.com> wrote:
>> On Sat, 17 Mar 2012, Anselm R Garbe wrote:
>>> The mlmmj output format is a directory consisting of files (1-n) where
>>> each contains a single message in mbox format. The number (1-n) is
>>> incremented for each message. For instance the dev_AT_suckless.org
>>> mailing list directory contains 11359 message files as of now. You
>>> could extend your archiver to work on such a directory structure. Once
>>> done, I would give it a go on the dev_AT_suckless.org messages.
>>
>>
>> A single message in mbox format? Or a single message in RFC5322 format (as
>> typically found in mboxes)? Or single message in not-quite-standard format
>> (such as used by pipermail behind the scenes)?
>
> Sorry for the confusion, it is rfc5322 format.
>
>> If the former, a call to `cat` would suffice to "extend" my archiver.
>
> Ok, will give it a try.

Oh, if it's just RFC 5322, then a simple 'cat' won't do (slark expects an
actual mbox at the moment). I'll patch it to handle a sensible directory
layout in the next few days. (Sorry about being so slow to make improvements -
I'm somewhat overloaded for a few weeks.)

Other improvements needed (in case anybody wants to learn Go by patching the
go-mail library): handle multipart and the common message encodings, handle
HTML messages elegantly (sanitize but leave basic styling when available?),
and handle UTF-8 headers. (Actually, the HTML handling might be best done in
slark, not go-mail.)

-- 
Scott Lawrence
Linux jagadai 3.2.9-1-ARCH #1 SMP PREEMPT Thu Mar 1 09:31:13 CET 2012 x86_64 Intel(R) Core(TM)2 Duo CPU P8700 _AT_ 2.53GHz GenuineIntel GNU/Linux
Received on Sat Mar 17 2012 - 21:06:42 CET
