From: Jose Kahan (jose.kahan@w3.org)
Date: Wed Apr 02 2003 - 05:38:58 CST
Peter,
Thanks for your confirmation.
I did some additional tests with more mailboxes. The conversion
to XHTML has some side effects, because we're trying to produce
valid documents:
1) The <u> (underline) element is deprecated. We were only using it
in the tables. I removed it.
2) winlatin1 charset. This is a bit more complex. I have some messages
that declare an ISO-8859-1 charset. However, they include some
characters that come from a Windows codepage. This makes them
invalid. The solution here is to convert the winlatin1 characters
into Unicode ones, when found. This will require a conversion
table and examining all the characters. Something akin to the
entity and smart link conversions we do already. I've not yet
written this code but hope to do so this afternoon (we already did it
in Amaya).
On the other hand, many browsers don't care about the charset
and would open and display it anyway, with the correct glyphs. The
problem is when browsers parse the document through an XML parser.
The parser will complain and stop when reaching that character.
Well, there is a solution and it's compatible with all the browsers
I know. So let's do the char translation here.
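To make the idea in point 2 concrete, here is a minimal sketch in Python (not hypermail's C code, and not what was done in Amaya). It assumes the stray characters are Windows-1252 bytes in the 0x80-0x9F range, which ISO-8859-1 reserves for control codes. The function names are hypothetical, and Python's built-in cp1252 codec stands in for the conversion table.

```python
# Bytes 0x80-0x9F are control codes in ISO-8859-1 but printable characters
# (curly quotes, dashes, the euro sign, ...) in Windows-1252.

def winlatin1_to_unicode(raw: bytes) -> str:
    """Decode a body labelled ISO-8859-1 that may contain Windows-1252 bytes."""
    out = []
    for byte in raw:
        if 0x80 <= byte <= 0x9F:
            try:
                # The cp1252 codec plays the role of the conversion table.
                out.append(bytes([byte]).decode("cp1252"))
            except UnicodeDecodeError:
                # A few bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) are unassigned.
                out.append("\ufffd")
        else:
            # ISO-8859-1 maps one-to-one onto U+0000..U+00FF.
            out.append(chr(byte))
    return "".join(out)


def to_xhtml_text(raw: bytes) -> str:
    """Return ASCII-only text, with non-ASCII as numeric character references.

    Escaping of &, < and > is assumed to happen elsewhere.
    """
    text = winlatin1_to_unicode(raw)
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

With this, a body containing the byte 0x96 (a Windows en dash) comes out as the reference `&#8211;`, which an XML parser accepts regardless of the declared charset.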
3) Invalid HTML attachments or alternatives.
Let me say it right away. The HTML markup that's being produced by
mail clients or web forms is rarely valid. Just including their
content to show it in-line produces invalid documents. Some cases
are:
The attachment defines a <head>, a DOCTYPE or other things that can
only occur once in a valid document and which are already defined
by the markup that hypermail produced before including the body.
The attachment is written in HTML and we can't include it anymore
using XHTML because it has deprecated elements, or the tags are in
upper case (XHTML requires them to be in lower case).
The markup is invalid.
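As a rough illustration of the first two cases, the following Python sketch flags HTML that would be unsafe to include in-line. It is only a heuristic built on regular expressions, not a validator, so it does not catch the third case (markup that is simply invalid). The function name, the problem labels and the element sets are assumptions made for this example; the deprecated set lists the elements that HTML 4.01 marks as deprecated.

```python
import re

# Elements deprecated in HTML 4.01 (assumed list for this sketch).
DEPRECATED = {"applet", "basefont", "center", "dir", "font",
              "isindex", "menu", "s", "strike", "u"}
# Elements that may occur only once in a document.
ONCE_ONLY = {"html", "head", "title", "body"}
# Matches the name of every opening tag; closing tags are skipped.
TAG_RE = re.compile(r"<\s*([A-Za-z][A-Za-z0-9]*)")


def inline_problems(html: str) -> list[str]:
    """Return the reasons why `html` should not be shown in-line."""
    problems = []
    if re.search(r"<!DOCTYPE", html, re.IGNORECASE):
        problems.append("doctype")
    names = TAG_RE.findall(html)
    if any(name.lower() in ONCE_ONLY for name in names):
        problems.append("document-level element")
    if any(name != name.lower() for name in names):
        problems.append("upper-case tag")
    if any(name.lower() in DEPRECATED for name in names):
        problems.append("deprecated element")
    return problems
```

An empty result does not mean the fragment is valid XHTML, only that none of these particular problems was spotted.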
There are some solutions we can take here. I need your feedback
and opinion to know which one is best:
1) Stop doing the inline display of HTML attachments. Store them
in the attachment directory and add a link to them. Include a new
customization option so that the maintainer authorizes inline
display of HTML, even if it produces invalid XHTML.
2) Give more priority to the plaintext alternatives rather than
the first alternative. We can add an option saying how to
interpret the priority of alternatives.
3) If the body of the message is only given in HTML, then store it
in the attachment directory and add a link. Add an option saying
whether we want to do this or not.
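To help compare the three options, here is a small Python sketch of how they could combine into a single decision. Nothing here exists in hypermail: the `Part` type and the option names `prefer_plain` and `inline_html` are hypothetical stand-ins for the new customization options.

```python
from dataclasses import dataclass


@dataclass
class Part:
    content_type: str  # e.g. "text/plain" or "text/html"
    body: str


def choose_display(alternatives, prefer_plain=True, inline_html=False):
    """Pick one alternative and decide how to show it.

    Returns (action, part), where action is "inline" or "link".
    "link" means: store the part in the attachment directory and link to it.
    """
    if not alternatives:
        raise ValueError("message has no alternatives")
    # Current behaviour: take the first alternative.
    chosen = alternatives[0]
    if prefer_plain:
        # Option 2: give priority to a plaintext alternative when present.
        for part in alternatives:
            if part.content_type == "text/plain":
                chosen = part
                break
    # Options 1 and 3: HTML goes to the attachment directory unless the
    # maintainer explicitly authorizes inline display.
    if chosen.content_type == "text/html" and not inline_html:
        return ("link", chosen)
    return ("inline", chosen)
```

With the defaults, a message offering both HTML and plaintext is shown in-line as plaintext, and an HTML-only message is stored and linked.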
I will wait until we have discussed this invalid HTML in messages
before coding a solution. This only affects us when we want to show
valid XHTML (which I think should be our goal). And there is backwards
compatibility with HTML anyway.
Is it OK to go ahead with my commit for XHTML regardless of the
points I stated above?
Thanks for your feedback,
-jose