[OT] Re: seeking webguru advice on html character encoding
- Subject: [OT] Re: seeking webguru advice on html character encoding
- From: Helmut Fuchs <email@hidden>
- Date: Thu, 20 Dec 2001 20:15:57 +0100
At 13:20 -0500 on 20.12.2001, Arthur J Knapp wrote:
>> ... I'm avoiding some of the obvious
>> pratfalls (none of this &#150;/&#151; for en/em dashes crap as in
>> DreamWeaver, for example) but I'm lost elsewhere:
> I'm sorry that you consider #150 and #151 to be "crap", but as a
> website production guy, it's tough to find an alternative that
> works in both Mac and Windows, Netscape and Microsoft, going back
> to at least the level 3 browsers. Especially for much of the work
> that we do, where "book" content is being transferred to the web.
> It's hard to explain to a company like Prentice Hall that there is
> no "formal, correct, legitimate" way to produce an en or em dash
> that most browsers will render correctly.
How about trying &#8211; and &#8212;? These are the correct numeric
entities to use nowadays, and they have been supported for quite a
while now. Or just use binary ISO 8859 encoding.
[see
http://www.w3.org/TR/1998/REC-html40-19980424/charset.html#h-5.2.2 ]
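
For pages that already contain references like &#150;, the switch to
the Unicode numbers can be automated. What follows is only a minimal
Python sketch: it assumes the old references were meant as windows-1252
byte values (the thread never names the code page), and fix_legacy_refs
is a hypothetical helper name, not part of any library.

```python
import re

# Any decimal numeric character reference, e.g. &#150; or &#8211;
NUMERIC_REF = re.compile(r"&#(\d+);")


def fix_legacy_refs(html_text: str) -> str:
    """Rewrite &#128;..&#159; as the Unicode reference for the
    windows-1252 character at that byte value (an assumption about
    what the page author meant). Other references are left alone."""

    def repl(match: re.Match) -> str:
        number = int(match.group(1))
        if not 128 <= number <= 159:
            return match.group(0)
        try:
            char = bytes([number]).decode("cp1252")
        except UnicodeDecodeError:
            # A few byte values in this range are unassigned in
            # windows-1252; keep the original reference untouched.
            return match.group(0)
        return f"&#{ord(char)};"

    return NUMERIC_REF.sub(repl, html_text)


print(fix_legacy_refs("1990&#150;2001 &#151; revised"))
```

Run over "1990&#150;2001", the helper produces "1990&#8211;2001". It
only touches decimal references; hexadecimal ones and raw bytes in the
file would need separate handling.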
The fact is that one can perfectly legally encode documents in BINARY
ISO 8859, but numeric entities always refer to Unicode (ISO 10646),
and the character assignments differ in the range from 128 to 159.
That constructs like &#150; work at all is actually pretty annoying,
because people continue to use them. It's an exception to the encoding
rules of HTML 4. These exceptions (which exist in abundance) just serve
to make browsers the bloated and unstable beasts they are. Every stupid
workaround has to be carried around for a gazillion years. And
everyone wonders why standards like CSS 2 take an eternity to finally
be implemented. Why? All the browser project teams' workload seems to
be spent on workarounds and auto-correct-bad-HTML "features".
Sorry, but &#150; IS crap...
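
To make the mismatch in the 128 to 159 range concrete, here is a small
Python check. It is a sketch under one assumption: that the byte value
150 comes from windows-1252, which is where it stands for an en dash.
The variable names are illustrative only.

```python
import html
import unicodedata

# In Unicode / ISO 10646, code point 150 is a C1 control character,
# so a strict reading of &#150; does not yield a dash at all.
c1_category = unicodedata.category(chr(150))

# In windows-1252 (assumed here), byte 150 is the en dash, whose
# Unicode code point is 8211.
cp1252_code_point = ord(bytes([150]).decode("cp1252"))

# Python's html.unescape follows the browser-style workaround and
# maps the legacy reference to the same character as the correct one.
legacy_result = html.unescape("&#150;")
modern_result = html.unescape("&#8211;")

print(c1_category)                     # Cc
print(cp1252_code_point)               # 8211
print(legacy_result == modern_result)  # True
```

That html.unescape still special-cases these values is a small
illustration of how long such workarounds get carried around.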
Regards,
Helmut
P.S. Harsh reaction caused by having to hack around bad "XML"
encoding YET AGAIN. Will they ever learn?