XML indexer, experimental, patch+python script

---------

From: Bernhard Reiter (bernhard@uwm.edu)
Date: Tue Nov 30 1999 - 02:18:32 CST


Hello hypermailers,

I know I shouldn't have done this,
but I learned about Python's XML handling in the process.
This is my first attempt at XML file mangling, so bear with me.

I have long been talking and thinking about how to get monthly archives done,
and not only the archives themselves but the index files for them, too:

                The problem:
                ------------

Hypermail, in combination with the archive scripts, creates
a directory for each year and each month as mails come in.

        a) How do you make a top page,
        linking all the scattered index files?

        b) Subproblem: When a single mail comes in,
        do you really want to rebuild all of the index overview files?
        Of course not.

        c) What if I want my top index page to show the number of mails
        grouped by week or so? :) (Hi egroups.)

Solution: A python script strangles the problem.

Part 1: I patched hypermail so that it creates an archive overview file
        complying with the haof.dtd in each directory it operates in.

Oh, back to Part 0:
        I wrote a DTD for the Hypermail Archive Overview Format (haof).

Part 2a: I wrote a little Python module which creates an HTML snippet from this
        overview file and leaves it in the directory above, but only
        if the overview file exists and is newer than the snippet.
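
The "only if newer" rule in 2a is a plain timestamp comparison. Here is a
minimal sketch of it in present-day Python 3, not the attached module. The
<mail> element name, the snippet markup and both function names are
assumptions made for illustration; the real haof.dtd may look different.

```python
import os
import xml.etree.ElementTree as ET
from html import escape


def needs_rebuild(overview_path, snippet_path):
    """True if the overview file exists and is newer than the snippet."""
    if not os.path.exists(overview_path):
        return False
    if not os.path.exists(snippet_path):
        return True
    return os.path.getmtime(overview_path) > os.path.getmtime(snippet_path)


def build_snippet(overview_path, snippet_path):
    """Write an HTML snippet for one month directory, if it is out of date.

    Assumption: every mail is a <mail> element somewhere in the overview
    file. Returns True if the snippet was (re)written, False otherwise.
    """
    if not needs_rebuild(overview_path, snippet_path):
        return False
    root = ET.parse(overview_path).getroot()
    count = len(root.findall(".//mail"))
    # Use the name of the directory holding the overview as the link label.
    month_dir = os.path.dirname(os.path.abspath(overview_path))
    label = escape(os.path.basename(month_dir))
    with open(snippet_path, "w", encoding="utf-8") as out:
        out.write('<li><a href="%s/index.html">%s</a>: %d mails</li>\n'
                  % (label, label, count))
    return True
```

Calling build_snippet() twice in a row rewrites the snippet only the first
time, which is what keeps a single incoming mail from touching every month.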

Part 2b: I wrote another Python script which runs through a directory,
        checks each year and month, and runs the module from 2a once
        for each of them.
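
The walk in 2b could look roughly like the sketch below, again in present-day
Python 3 and not the attached script. It assumes four-digit year directories
with one subdirectory per month. The file names overview.xml and
<month>.snippet.html, and the `build` callback standing in for the 2a module,
are hypothetical.

```python
import os
import re

YEAR_RE = re.compile(r"^\d{4}$")  # assumption: year directories look like "1999"


def find_month_dirs(archive_root):
    """Yield (year, month, path) for every year/month directory, in order."""
    for year in sorted(os.listdir(archive_root)):
        year_path = os.path.join(archive_root, year)
        if not (YEAR_RE.match(year) and os.path.isdir(year_path)):
            continue
        for month in sorted(os.listdir(year_path)):
            month_path = os.path.join(year_path, month)
            if os.path.isdir(month_path):
                yield year, month, month_path


def update_archive(archive_root, build, overview_name="overview.xml"):
    """Call build(overview_path, snippet_path) once per month directory.

    `build` should return True when it rewrote the snippet. The snippet is
    placed in the year directory, i.e. the directory above the month.
    Returns the list of snippet paths that were rewritten.
    """
    rebuilt = []
    for year, month, month_path in find_month_dirs(archive_root):
        overview = os.path.join(month_path, overview_name)
        snippet = os.path.join(archive_root, year, month + ".snippet.html")
        if build(overview, snippet):
            rebuilt.append(snippet)
    return rebuilt
```

Because the freshness check lives in the per-month builder, the walker can
visit every directory on each run and still do almost no work.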

Results attached.

Left for the interested reader: Beautify the output.

Interesting research topics:
* Only the mail references are missing from the
  haof; with them, threading could be done at that level.
* Well, we could write this data into a little database. http://www.dbxml.org/ ?
  Or Postgres or MySQL-GPL?

Enjoy,
        Bernhard
PS: This contribution to hypermail shall be free software under the GPL.

---------

This archive was generated by hypermail 2.1.5.