parsemail rewrite

---------

From: John Finlay (finlay@moeraki.com)
Date: Fri Apr 23 1999 - 08:04:34 CDT


Just to follow up on my previous message. My thinking is that the
strategy of parsemail is to extract each email message as a whole,
breaking it into header and body sections but not doing further
processing of the body until later. This is simple and straightforward.
The headers can then be decoded to extract the sorting info. Then the
headers of each message are canonicalized and the message is added to
the hash tables. At this point the determination of whether the message
is to be saved could be done based on the msgid and other criteria.
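
A minimal sketch of the split-then-index idea described above, written in
Python for brevity rather than in hypermail's C. The names split_message,
parse_headers and add_message are hypothetical and are not hypermail
functions. A plain dict stands in for the hash tables, and a repeated
header simply overwrites the earlier one.

```python
def split_message(raw):
    """Split one raw message into (header_text, body_text) at the first blank line."""
    head, _, body = raw.partition("\n\n")
    return head, body


def parse_headers(head):
    """Unfold continuation lines; return a dict keyed by lower-cased header name."""
    headers = {}
    name = None
    for line in head.split("\n"):
        if line[:1] in (" ", "\t") and name is not None:
            # Continuation of the previous header field.
            headers[name] += " " + line.strip()
        elif ":" in line:
            field, _, value = line.partition(":")
            if " " in field or "\t" in field:
                # Not a header field (e.g. an mbox "From " separator line).
                name = None
                continue
            name = field.lower()
            headers[name] = value.strip()
    return headers


def add_message(table, raw):
    """Index a message by Message-ID; the body is stored undecoded.

    Returns False when the message has no Message-ID or is a duplicate,
    which is one possible "should this be saved" criterion.
    """
    head, body = split_message(raw)
    headers = parse_headers(head)
    msgid = headers.get("message-id")
    if msgid is None or msgid in table:
        return False
    table[msgid] = {"headers": headers, "body": body}
    return True
```

The point of the sketch is only that the body stays an opaque string
until a later stage decides the message is worth decoding.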

The messages are added to a growable (in chunks of 512) array of
messages that are sorted using quicksort, if necessary (the addheader
function is deleted). The incremental case would employ a header file
which contains the cache of header info including the sort info which
is read in (or created on the fly if it does not exist) and used to populate
the message table before processing any new messages. In the incremental
case, if the header file is available a merge sort of the new messages
would be done.
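
A rough sketch of the full-sort and incremental-merge cases, again in
Python. The record layout (a dict with "date" and "msgid" keys) and the
function names are assumptions made for illustration. Python's sorted()
is used in place of quicksort, and a Python list grows on its own, so
the 512-entry chunking is not modelled here.

```python
import heapq


def sort_key(msg):
    """Sort by date, breaking ties on the message id."""
    return (msg["date"], msg["msgid"])


def build_index(messages):
    """Non-incremental case: sort the whole message table once."""
    return sorted(messages, key=sort_key)


def merge_incremental(cached, new_messages):
    """Incremental case: 'cached' is already sorted (as read from the
    header file), so only the new messages are sorted and then merged in."""
    new_sorted = sorted(new_messages, key=sort_key)
    return list(heapq.merge(cached, new_sorted, key=sort_key))
```

The merge costs time linear in the size of the cache plus the sort of the
new batch, which is the reason for keeping the sort info in the header
file.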

During the printing of the messages, the message decoding and MIME
decoding of the message bodies would be done (though this can be done
any time after it is decided to save the message).

I'm not sure I completely understand the handling of multipart messages,
especially when they are recursive (e.g. multipart/digest). Is it the
intention to handle these fully, i.e. multipart messages inside multipart
messages? The comments seem to recognize the problem but it's unclear to
me whether full recursion has been implemented.
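
To make "full recursion" concrete, here is a sketch using Python's
standard email package. It says nothing about what parsemail currently
does. The function name leaf_parts is hypothetical.

```python
from email import message_from_string


def leaf_parts(msg, depth=0):
    """Yield (depth, content_type) for every non-multipart part,
    descending into nested multipart containers to any depth."""
    if msg.is_multipart():
        for part in msg.get_payload():
            yield from leaf_parts(part, depth + 1)
    else:
        yield depth, msg.get_content_type()
```

Calling list(leaf_parts(message_from_string(raw))) on a message that has
a multipart part nested inside another multipart part visits the inner
leaves as well as the outer ones.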

John


---------

This archive was generated by hypermail 2.1.5.