Proposed change: always leave the charset in generated HTML

---------

From: Jose Kahan (jose.kahan@w3.org)
Date: Thu Jan 24 2002 - 19:10:57 CST


Peter and other hypermail developers,

When hypermail converts a message whose charset is ISO-8859-1 or US-ASCII,
it drops the charset. All other charsets are preserved in an HTML META
tag.

Martin Duerst, leader of the Internationalization Activity at W3C, says that
we should always preserve it. The HTML 4.0 spec says we cannot
assume a charset for a document unless it is explicitly stated, either
in the HTTP headers or in an HTML META tag. The specification does not
say what should be done when the charset is missing.

In practice, when the charset is missing, the browser usually falls back to
a default charset taken from the user's preferences. We have a case where
browsers in the USA interpret a document as ISO-8859-1, while users in Japan
see the same document interpreted with another charset (and thus badly
rendered).

The fix is quite easy. I found an exception in parse.c that avoided
storing the charset for the two charsets above. I removed the
exception and it now works correctly... and users in Japan can now
see this document correctly too.
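
To illustrate the behaviour after the change, here is a minimal sketch in C.
It is not the actual parse.c code: the helper name format_charset_meta and
its signature are made up for this example, and it assumes the charset name
has already been validated as a plain token taken from the message headers.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not hypermail's actual code: writes an HTML META
 * tag declaring `charset` into `buf`. Returns the number of characters
 * written, or 0 if no charset is known or the buffer is too small.
 * Assumes `charset` was validated by the caller. */
static size_t format_charset_meta(char *buf, size_t buflen, const char *charset)
{
    int n;

    if (buf == NULL || buflen == 0)
        return 0;
    buf[0] = '\0';

    /* The behaviour described above would also have returned early here
     * for "ISO-8859-1" and "US-ASCII". This sketch has no such exception:
     * every known charset is declared explicitly. */
    if (charset == NULL || charset[0] == '\0')
        return 0;

    n = snprintf(buf, buflen,
                 "<meta http-equiv=\"Content-Type\" "
                 "content=\"text/html; charset=%s\">\n", charset);
    if (n < 0 || (size_t)n >= buflen) {
        /* Output was truncated or failed: emit nothing rather than a
         * broken tag. */
        buf[0] = '\0';
        return 0;
    }
    return (size_t)n;
}
```

With this sketch, calling format_charset_meta(buf, sizeof buf, "ISO-8859-1")
fills buf with a META tag naming ISO-8859-1, the same as it would for any
other charset, so a browser does not have to guess.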

Any problems with my committing this patch?

Thanks,

-jose


---------

This archive was generated by hypermail 2.1.5.