Re: Write Unicode characters to HTML
- Subject: Re: Write Unicode characters to HTML
- From: "Mark J. Reed" <email@hidden>
- Date: Tue, 10 May 2005 12:25:23 -0400
On 5/10/05, Emile Schwarz <email@hidden> wrote:
I second what Mark wrote below.
I have a simple modification:
&#1488;
 ^
Mark forgot the '#' character in the decimal NCR (Numeric Character References).
Whups! Thanks for the correction.
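
For illustration only, here is a minimal Python sketch of the decimal NCR discussed above. The thread does not use Python, and the helper name to_decimal_ncr is hypothetical. It relies on the standard "xmlcharrefreplace" error handler and on html.unescape.

```python
import html


def to_decimal_ncr(text: str) -> str:
    """Replace every non-ASCII character with a decimal NCR such as &#1488;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


alef = "\u05d0"  # HEBREW LETTER ALEF, code point 1488 in decimal
ncr = to_decimal_ncr(alef)

print(ncr)                          # &#1488;
print(html.unescape(ncr) == alef)   # True
print(html.unescape("&1488;"))      # &1488;  (no '#', so it is not a reference)
```

Without the '#', the text is not a numeric character reference, so it is left as literal text.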
Cheers,
Emile
BTW: Alef was not unknown to me (thanks to the Sunday morning French TV),
but I did not remember that it was the "a" letter in Hebrew.
Yup. The letter alef is a
direct descendant of the very first "a" letter, although it doesn't
represent the "a" sound, since the Semitic alphabets originally had
letters only for consonants. It was the Greeks who introduced the use
of letters for vowels when they borrowed the alphabet from the Semites
(changing the name alef to alfa in the process). In Hebrew, the alef
represents a glottal stop.
> So if those bytes are then treated as if they were Mac Roman instead of
And that happens because you use a Macintosh; the result will not be the
same on another platform (Windows, for example).
This is why IMHO Unicode and Encodings were created.
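
A rough Python sketch of the misreading described above. The quoted sentence is cut off, so the assumption here (not stated in the thread) is that the bytes are really UTF-8 and the sample word is made up.

```python
# Assumption: the bytes on the wire are UTF-8; the quoted text does not say.
original = "caf\u00e9"                   # "café"
wire_bytes = original.encode("utf-8")    # b'caf\xc3\xa9'

# The same bytes, interpreted three different ways.
as_mac_roman = wire_bytes.decode("mac_roman")
as_cp1252 = wire_bytes.decode("cp1252")
as_utf8 = wire_bytes.decode("utf-8")

print(as_mac_roman)  # caf√©
print(as_cp1252)     # cafÃ©
print(as_utf8)       # café
```

The bytes never change; only the table used to interpret them does, which is why the declared charset has to match the encoding actually used.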
> Content-Type: text/plain; charset="iso-8859-1"
The above seems to be MacRoman encoding.
Yeah. For encoding Western European languages (including
English), Windows machines tend to use Windows Code Page 1252, while
Macintoshes use MacRoman, even though neither of those is the same as
ISO-8859-1 ("Latin 1"), which is what the International Organization
for Standardization (ISO) says you should use for those languages if
you must use a single-byte encoding rather than Unicode.
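
To make the difference concrete, this small Python sketch prints the byte each of the three encodings assigns to two sample characters. The characters and the helper name encoded_byte are chosen here for illustration; the codec names are the ones Python's standard library uses.

```python
def encoded_byte(ch, codec):
    """Return the hex byte for ch in codec, or None if codec cannot encode it."""
    try:
        return ch.encode(codec).hex()
    except UnicodeEncodeError:
        return None


CODECS = ("latin-1", "cp1252", "mac_roman")

e_acute = {c: encoded_byte("\u00e9", c) for c in CODECS}     # é
left_quote = {c: encoded_byte("\u201c", c) for c in CODECS}  # left curly quote

print(e_acute)     # {'latin-1': 'e9', 'cp1252': 'e9', 'mac_roman': '8e'}
print(left_quote)  # {'latin-1': None, 'cp1252': '93', 'mac_roman': 'd2'}
```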
In particular, a web page encoded in Mac Roman is likely to show its
non-ASCII characters as garbage on any non-Macintosh browser.
Non-Windows support for Windows 1252 is somewhat more widespread, but
it's still a bad idea. For a web page, you should use
UTF-8-encoded Unicode (or, for maximum compatibility, ISO-8859-1 if it
contains all the characters you want to use).
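
One way to sketch that advice in Python: try ISO-8859-1 first and fall back to UTF-8 when the text does not fit. The function names pick_charset and render_page are hypothetical, and the body is assumed to be already HTML-safe (no escaping is done here).

```python
def pick_charset(text: str) -> str:
    """Return ISO-8859-1 if every character fits in it, otherwise UTF-8."""
    try:
        text.encode("iso-8859-1")
        return "ISO-8859-1"
    except UnicodeEncodeError:
        return "UTF-8"


def render_page(body: str) -> bytes:
    """Build a tiny page whose declared charset matches its actual bytes."""
    charset = pick_charset(body)
    page = (
        "<html><head>"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">'
        "</head><body>" + body + "</body></html>"
    )
    return page.encode(charset)


print(pick_charset("caf\u00e9"))  # ISO-8859-1
print(pick_charset("\u05d0"))     # UTF-8
```

The point of the design is that the declaration and the encoding step use the same charset value, so they cannot drift apart.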
MacRoman predates ISO 8859, so Apple can be forgiven the
incompatibility. :) Windows 1252 does not predate it, but it's
also mostly compatible with it: from the point of view of printable
characters, it's a superset, having sacrificed some of the non-printing
ISO-2022 "control" characters in the 0x80-0x9F range (which are used
for code switching among the ISO family of character sets).
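
The superset claim can be checked with a short Python sketch that compares how every byte value decodes under the two encodings. It assumes Python's own cp1252 and latin-1 tables, and decode_or_none is a hypothetical helper.

```python
def decode_or_none(byte_value, codec):
    """Decode a single byte, returning None if the codec leaves it unassigned."""
    try:
        return bytes([byte_value]).decode(codec)
    except UnicodeDecodeError:
        return None


# Byte values that do not decode to the same character in both encodings.
differing = [
    b for b in range(256)
    if decode_or_none(b, "latin-1") != decode_or_none(b, "cp1252")
]
# Of those, the ones that Python's cp1252 table leaves unassigned.
unassigned = [b for b in differing if decode_or_none(b, "cp1252") is None]

print(hex(differing[0]), hex(differing[-1]), len(differing))  # 0x80 0x9f 32
print([hex(b) for b in unassigned])
```

Every difference falls in 0x80-0x9F, where ISO-8859-1 has control codes and Windows 1252 has printable characters such as curly quotes.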
--
Mark J. Reed <email@hidden>