From: jose.kahan@w3.org
Date: Fri Aug 06 1999 - 14:29:58 CDT
Hello Daniel and hypermail folks,
(I'm addressing this message to Daniel, as he seems to be the main
developer).
I also patched hypermail to provide the same attachment functionality. I
have just upgraded my code to integrate your patches and compared them with mine.
The result of my merge is browsable online at:
http://dev.w3.org/cgi-bin/cvsweb/hypermess/hypermail/
Here are some comments:
[att- file prefix]
Now that hypermail stores the attachments in a dedicated attachment
directory, prefixing the attachment names with "att-" is redundant.
Hence, I removed it from my code.
[emptydir]
This function empties the attachment directory.
I find it quite dangerous. If the user makes a mistake when
configuring hypermail, it may erase something that was never meant to be erased.
My solution to avoid using this function is given in the next point.
[unique names for attachments]
parse.c goes to a lot of trouble to find a unique name. Also, if
the mail headers don't provide a filename, it generates a temporary random
name. If you regenerate an archive, links that point to those random
names will be broken.
My solution is to always generate the same names in the same way. My
attachment names have the following format:
    nn-filename, if the mail headers give a filename
    nn-part,     otherwise
nn is a counter. I increment it each time I find a new attachment. I'm only
using two digits, as I doubt that we'll see more than 99 attachments
in one message (I could extend it to 3 digits, or to any number of digits, if
needed). I reset the counter when dealing with a new message.
As my attachment names are always the same, I don't need to call emptydir
anymore and this cuts a bit on the overhead of running hypermail (and removes
the risk of erasing something that you don't want).
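
To make the naming scheme concrete, here is a minimal sketch in C. The helper
name make_att_name and its signature are hypothetical (they are not taken from
parse.c), and the sketch assumes the filename has already been stripped of any
path separators.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a deterministic attachment name:
 *   "nn-filename" when the mail headers supplied a filename,
 *   "nn-part"     otherwise.
 * `counter` is the index of the attachment within the current message;
 * the caller resets it for each new message.
 * Returns 0 on success, -1 if the name does not fit in `buf`.
 * NOTE: hypothetical helper for illustration, not a hypermail function.
 */
int make_att_name(char *buf, size_t size, int counter, const char *filename)
{
    int n;

    assert(buf != NULL && size > 0);
    if (filename == NULL || filename[0] == '\0')
        filename = "part";
    /* %02d pads to two digits but still grows past 99 if ever needed. */
    n = snprintf(buf, size, "%02d-%s", counter, filename);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Because the name depends only on the attachment's position in the message and
on its declared filename, regenerating the archive yields the same names and
existing links keep working.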
[newdir variable name]
It'd be clearer if you called it att_dir (as that's what it's being used for :)
[creation of the newdir name]
It's being done systematically for each new message. For better
performance, I do it only when I'm sure we're dealing with an attachment.
This saves the calls that allocate and deallocate the memory for the name.
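
A minimal sketch of this lazy approach in C follows. The struct, the helper
name get_att_dir and the "attachments-NNNN" directory layout are all
assumptions made for illustration; they are not the names used in hypermail.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-message state; not a hypermail structure. */
struct msg_state {
    const char *archive_dir; /* base directory of the archive */
    int msgnum;              /* number of the current message */
    char *att_dir;           /* NULL until the first attachment is seen */
};

/*
 * Return the attachment directory name for the current message, building
 * it on first use only. Messages without attachments never pay for the
 * allocation. Returns NULL if memory cannot be allocated.
 */
const char *get_att_dir(struct msg_state *m)
{
    assert(m != NULL && m->archive_dir != NULL);
    if (m->att_dir == NULL) {
        /* Room for "/attachments-", the digits of an int, and the NUL. */
        size_t len = strlen(m->archive_dir) + 40;

        m->att_dir = malloc(len);
        if (m->att_dir == NULL)
            return NULL;
        snprintf(m->att_dir, len, "%s/attachments-%04d",
                 m->archive_dir, m->msgnum);
    }
    return m->att_dir;
}
```

The caller would free att_dir and reset it to NULL when moving on to the next
message.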
[minor C style point]
In some parts of the code, you're doing a:
    attachname[0]=0;
As attachname is a char array, it reads more clearly (the two are equivalent
in C, so this is not a functional bug) to write:
    attachname[0]='\0';
[html in-line attachments]
According to the HTML DTD, it's not valid to embed an HTML document
inside another one. To make it compliant, you have to remove the
<HTML><HEAD> and everything inside it. Of course, most browsers show it
correctly ... today! But tomorrow? As parsing the HTML is quite a pain,
my solution is to never put it in-line, and just add a link to the
document.
[Other bug fixes, improvements]
These are too numerous to cite here in detail:
* correct (according to me) handling of alternative parts
* handling of Content-Disposition, for inline or attachment text
* support for Content-Description, using it instead of the generic
"picture" or "stored" labels, and in ALT tags, whenever possible
* handling of message/rfc822 (there were some minor bugs concerning
a <PRE>). I'll mail the patch.
* fixed a bug when the attachment="something" was given as
attachment=something. It's now working. I'll mail the patch.
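
For the last item, the idea is simply to accept a parameter value in both its
quoted and its bare form. Here is a minimal sketch in C; get_param_value is a
hypothetical helper, not the code from my patch, and it deliberately ignores
backslash escapes inside quoted strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a header parameter value into `out`, accepting both the quoted
 * form (name="value") and the bare form (name=value).
 * `p` points just past the '=' sign.
 * Returns 0 on success, -1 if the value does not fit in `out` or if a
 * quoted value is never closed.
 * NOTE: hypothetical helper for illustration, not hypermail's parser.
 */
int get_param_value(const char *p, char *out, size_t size)
{
    size_t i = 0;
    int quoted;

    assert(p != NULL && out != NULL && size > 0);
    quoted = (*p == '"');
    if (quoted)
        p++;                              /* skip the opening quote */
    while (*p != '\0') {
        /* A quoted value ends at the closing quote; a bare value ends
           at a ';' or at whitespace. */
        if (quoted ? (*p == '"') : (strchr("; \t\r\n", *p) != NULL))
            break;
        if (i + 1 >= size)
            return -1;                    /* value too long for `out` */
        out[i++] = *p++;
    }
    out[i] = '\0';
    if (quoted && *p != '"')
        return -1;                        /* missing closing quote */
    return 0;
}
```

With this, both "something" (quoted) and something (bare) yield the same
value.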
I'm also enclosing a torture test from RFC 2049 (it substitutes RFC 1806),
which you can run with hypermail to see how it breaks down without the fixes
I did :)
So, the big question is: is it worth my contributing these patches?
It takes some of my time to make a diff between my version and yours. I'll
continue to make the effort if the patches are taken into account, or refused
with a reason. I don't like having two different hypermails, but I need to have
one running up at our site, so I'm forced to have my own cvs base to develop
it in a correct environment. I'm trying to commit often whenever possible,
describing the changes I'm making so that other people can profit from them.
Some time soon, I'll start doing more proprietary changes to the code, to
adapt it to our environment. I'd like to have at least the same code base
as you have at that moment to avoid diverging as long as possible.
My next work items are:
* Continue testing against MIME attachments
* Handling of MH mailboxes (one file per message) + use of a black list to
omit some of the messages when regenerating the archive
* Use of metadata in a separate file to give the characteristics of attachments
(mostly specific to the Apache server, and useful for avoiding adding
extensions to filenames)
* Adding CSS to the archives and modifying the html output to what we're using
today at W3C.
Again, all these modifications are available online at the URL given at the
beginning of this message (there's a big notice there too, stating that
it's not the official hypermail base).
Your comments are welcome.
-Jose