From: Craig A Summerhill (craig@cni.org)
Date: Wed May 05 1999 - 17:06:42 CDT
On Wed, 5 May 1999, Paul Haldane <paul.haldane@newcastle.ac.uk> wrote:
>
> On Wed, 5 May 1999, Daniel Stenberg <daniel.stenberg@sth.frontec.se> wrote:
> >
> > On Tue, 4 May 1999, Paul Haldane wrote:
> > >
> > > Any more thoughts on what (if anything) we should do with messages with
> > > duplicate message ids?
> > >
> > > My inclination is to stick with what we do now - don't try to add the
> > > duplicates to the web archive but put out a warning message so that a
> > > human can fix things by hand.
> >
> > I think we could start with trying to think of reasons why this happens in
> > the first place. How do you add several mails to the archive using the same
> > Message-ID? Does it ever actually occur that two different mails have the
> > same ID?
>
> It does (see my previous messages). It _shouldn't_ happen but (because of
> broken mail systems) it does. In a previous existence I did some work on
> loop avoidance in a mailing list manager. One of the techniques I used
> was suppressing messages with the same msgid. I soon found that there
> were some (a few) systems out there that don't generate unique msgids. We
> made the decision to recognise those messages and skip the check.
>
> We're talking about a small number of messages here (in the test mailboxes
> I'm using at the moment, perhaps 4-5 out of 1000) but obviously this
> depends on the MUAs in use by people sending the mail that hypermail is
> archiving.
Daniel, Paul, et al.

My personal preference would be to include messages with duplicate IDs
in the HTMLed archive. In my case, we are using a mailing list agent
which does a check for duplicates before the message ends up in the mbox
files which we are archiving. Personally, that check is plenty enough
for me. (The MLA we use employs a combination of a Message-Id: check and
an MD5 checksum on the body of the message, as well as a few other things
like the SMTP envelope address, to determine if a message is a duplicate.)

In my case, if a message gets through that check, I *want* it in my
archive even if it has a message id matching one already there.

If hypermail is going to do a duplicate check, I would prefer a switch
to turn the feature on and off.
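
To make the idea concrete, here is a rough sketch in Python of the kind of combined check and on/off switch described above. It is not the MLA's code and not hypermail's; the names `duplicate_key`, `Archive`, and `check_duplicates` are made up for illustration, and treating a message as a duplicate only when the Message-Id, the body MD5, and the envelope sender all match is an assumption about how such a combination might work.

```python
import hashlib
from email import message_from_string


def duplicate_key(raw_message, envelope_sender):
    """Build a key from Message-Id, an MD5 of the body, and the envelope sender.

    Hypothetical helper: the exact combination a real MLA uses is not known here.
    """
    msg = message_from_string(raw_message)
    msgid = (msg.get("Message-Id") or "").strip()
    body = msg.get_payload()
    if not isinstance(body, str):
        # Multipart messages return a list of parts; flatten crudely for the sketch.
        body = str(body)
    digest = hashlib.md5(body.encode("utf-8", "replace")).hexdigest()
    return (msgid, digest, envelope_sender.lower())


class Archive:
    """Toy archive with a switch to turn duplicate checking on or off."""

    def __init__(self, check_duplicates=True):
        self.check_duplicates = check_duplicates
        self.seen = set()
        self.messages = []

    def add(self, raw_message, envelope_sender):
        """Return True if the message was archived, False if it was skipped."""
        key = duplicate_key(raw_message, envelope_sender)
        if self.check_duplicates and key in self.seen:
            return False
        self.seen.add(key)
        self.messages.append(raw_message)
        return True
```

With a check like this, two different mails that merely share a broken Message-Id still get archived, because their body checksums differ. With `check_duplicates=False` everything is archived, which is the behaviour I would want when the MLA has already filtered the mbox.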
-- 
Craig A. Summerhill, Systems Coordinator and Program Officer
Coalition for Networked Information
21 Dupont Circle, N.W., Washington, D.C. 20036
Internet: craig@cni.org   AT&Tnet (202) 296-5098