Re: XHTMLized hypermail available for testing

---------

From: Jose Kahan (jose.kahan@w3.org)
Date: Wed Apr 02 2003 - 05:38:58 CST


Peter,

Thanks for your confirmation.

I did some additional tests with more mailboxes. The conversion
to XHTML has some side effects, because we're trying to produce
valid documents:

1) The <u> (underline) element is deprecated. We were only using it
   in the tables. I removed it.
   
2) winlatin1 charset. This is a bit more complex. I have some messages
   that declare an ISO-8859-1 charset but include characters that
   come from a Windows codepage. This makes them invalid. The
   solution here is to convert the winlatin1 characters into their
   Unicode equivalents when found. This will require a conversion
   table and examining every character, something akin to the
   entity and smart link conversions we already do. I've not yet
   written this code but hope to do it this afternoon (we already
   did it in Amaya).

   On the other hand, many browsers don't care about the charset
   and will open and display the message anyway, with the right
   glyphs. The problem arises when a browser parses the document
   through an XML parser: the parser will complain and stop when it
   reaches such a character.

   Well, there is a solution and it's compatible with all the browsers
   I know. So let's do the char translation here.
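
To make the translation in point 2 concrete, here is a minimal sketch
in Python (hypermail itself is written in C, and none of this is
hypermail code). The names WINLATIN1_TABLE and fix_winlatin1 are made
up for the illustration. It assumes the stray bytes are Windows-1252
characters in the 0x80-0x9F range and that we emit them as numeric
character references; escaping of <, > and & is left to the existing
entity conversion.

```python
# Conversion table for the range 0x80-0x9F, where ISO-8859-1 only has
# control characters but Windows-1252 defines printable ones.
WINLATIN1_TABLE = {}
for byte in range(0x80, 0xA0):
    try:
        WINLATIN1_TABLE[byte] = bytes([byte]).decode("cp1252")
    except UnicodeDecodeError:
        # A few bytes in this range are unassigned in Windows-1252.
        pass


def fix_winlatin1(raw: bytes) -> str:
    """Decode bytes labelled ISO-8859-1, translating Windows-1252 strays."""
    out = []
    for byte in raw:
        if byte in WINLATIN1_TABLE:
            # Numeric character reference for the Unicode equivalent.
            out.append("&#%d;" % ord(WINLATIN1_TABLE[byte]))
        elif 0x80 <= byte < 0xA0:
            # Unassigned byte: "?" is an arbitrary placeholder choice.
            out.append("?")
        else:
            # Every other ISO-8859-1 byte maps 1:1 to the same code point.
            out.append(chr(byte))
    return "".join(out)
```

For example, fix_winlatin1(b"\x93quoted\x94") returns
"&#8220;quoted&#8221;", that is, the Windows curly quotes become
references to U+201C and U+201D.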

3) Invalid HTML attachments or alternatives.
   Let me say it right away: the HTML markup produced by mail
   clients or web forms is rarely valid. Just including their
   content in-line produces invalid documents. Some cases are:

    The attachment defines a <head>, a DOCTYPE declaration or other
    things that can only occur once in a valid document and which are
    already defined by the markup that hypermail produced before
    including the body.

    The attachment is written in HTML and we can no longer include it
    in XHTML output because it has deprecated elements, or the tags
    are in upper case (XHTML requires them to be in lower case).

    The markup is invalid.

   There are several options here. I need your feedback and opinion
   to know which one is best:

   1) Stop displaying HTML attachments inline. Store them in the
      attachment directory and add a link to them. Add a new
      configuration option that lets the maintainer authorize inline
      display of HTML, even if it produces invalid XHTML.

   2) Give priority to the plain-text alternative rather than to
      the first alternative. We can add an option saying how to
      interpret the priority of alternatives.

   3) If the body of the message is only given in HTML, then store it
      in the attachment directory and add a link. Add an option saying
      whether we want to do this or not.
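
As a basis for discussion, option 1 could look roughly like the
following Python sketch (again, not hypermail code). It only looks for
the problem cases listed above: a DOCTYPE, once-only elements,
deprecated elements and upper-case tags. The names InlineChecker,
choose_display and inline_html_allowed are hypothetical, the
DEPRECATED set is an illustrative subset, and the check does not
detect markup that is invalid in other ways (unclosed tags, bad
nesting).

```python
from html.parser import HTMLParser

# Elements that may occur only once and that hypermail already emits.
ONCE_ONLY = {"html", "head", "title", "body"}
# Illustrative subset of elements deprecated in HTML 4.
DEPRECATED = {"u", "font", "center", "s", "strike", "basefont"}


class InlineChecker(HTMLParser):
    """Collect reasons why an HTML part cannot be inlined as valid XHTML."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_decl(self, decl):
        if decl.lower().startswith("doctype"):
            self.problems.append("DOCTYPE declaration")

    def handle_starttag(self, tag, attrs):
        # `tag` is already lower-cased; recover the name as written.
        raw = self.get_starttag_text() or ""
        raw_name = raw[1:1 + len(tag)]
        if tag in ONCE_ONLY:
            self.problems.append("once-only element <%s>" % tag)
        if tag in DEPRECATED:
            self.problems.append("deprecated element <%s>" % tag)
        if raw_name != raw_name.lower():
            self.problems.append("upper-case tag <%s>" % raw_name)


def choose_display(html_part, inline_html_allowed=False):
    """Return ("inline" or "link", list of problems found)."""
    checker = InlineChecker()
    checker.feed(html_part)
    checker.close()
    if checker.problems and not inline_html_allowed:
        # Store the part in the attachment directory and link to it.
        return "link", checker.problems
    return "inline", checker.problems
```

With this sketch, choose_display("<p>hello</p>") returns
("inline", []), while a part that starts with <HTML><BODY> is
reported as "link" unless the maintainer sets inline_html_allowed.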

I will wait until we have discussed the invalid HTML in messages
before coding a solution. This only affects us when we want to
produce valid XHTML (which I think should be our goal). And we
remain backwards compatible with HTML anyway.

Is it OK to go ahead with my commit for XHTML regardless of the
points I stated above?

Thanks for your feedback,

-jose


---------

This archive was generated by hypermail 2.1.7.