Re: Write Unicode characters to HTML

  • Subject: Re: Write Unicode characters to HTML
  • From: "Mark J. Reed" <email@hidden>
  • Date: Tue, 10 May 2005 12:25:23 -0400



On 5/10/05, Emile Schwarz <email@hidden> wrote:
> I second what Mark wrote below.
>
> I have a simple modification:
>
> &#1488;
>  ^
>
> Mark forgot the '#' character in the decimal NCR (Numeric Character Reference).

Whups!  Thanks for the correction.
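
For anyone following along, here is a minimal sketch of producing and reading back the corrected decimal NCR. Python is used purely for illustration (the thread itself concerns AppleScript), and the helper name to_ncr is hypothetical, not something from the original posts.

```python
import html

def to_ncr(text):
    """Replace each non-ASCII character with a decimal NCR such as &#1488;."""
    # "xmlcharrefreplace" is a standard Python error handler that emits
    # &#NNNN; for every character the target encoding cannot represent.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

alef = "\u05d0"  # HEBREW LETTER ALEF, decimal code point 1488

print(to_ncr(alef))                      # &#1488;
print(html.unescape("&#1488;") == alef)  # True
```

Because the result is pure ASCII, it should survive whichever single-byte encoding the page ends up being served in.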

> Cheers,
>
> Emile
>
> BTW: Alef was not unknown to me, but I did not remember that it was the
> "a" letter in Hebrew (thanks to the Sunday morning French TV).

Yup. The letter alef is a direct descendant of the very first "a" letter, although it doesn't represent the "a" sound, since vowels don't have letters in Semitic languages. It was the Greeks who introduced the use of letters for vowels when they borrowed the alphabet from the Semites (changing the name alef to alfa in the process). In Hebrew, the alef represents a glottal stop.


> > So if those bytes are then treated as if they were Mac Roman instead of
> And the above is because you use a Macintosh; it will not be the same if you
> use another platform (Windows, for example).
>
> This is why, IMHO, Unicode and encodings were created.
>
> > Content-Type: text/plain; charset="iso-8859-1"
> The above seems to be MacRoman encoding.


Yeah. For encoding Western European languages (including English), Windows machines tend to use Windows Code Page 1252, while Macintoshes use MacRoman. Neither of those is the same as ISO-8859-1 ("Latin 1"), which is what the International Organization for Standardization (ISO) says you should use for those languages if you must use a single-byte encoding rather than Unicode.
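
To make the difference concrete, here is a small sketch (Python, chosen only for illustration) that encodes one accented letter under each of the encodings named above, then shows what happens when UTF-8 bytes are misread as Mac Roman or Windows 1252. The variable names are illustrative only.

```python
text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# The same character becomes different bytes depending on the encoding.
for codec in ("mac_roman", "cp1252", "iso-8859-1", "utf-8"):
    print(codec, text.encode(codec).hex())

# The reverse problem: correct UTF-8 bytes decoded with the wrong table.
utf8_bytes = text.encode("utf-8")
as_mac = utf8_bytes.decode("mac_roman")  # misread as Mac Roman
as_win = utf8_bytes.decode("cp1252")     # misread as Windows 1252
print(repr(as_mac), repr(as_win))
```

Mac Roman puts this letter at byte 0x8E while the other two single-byte encodings put it at 0xE9, which is why a page that works on one platform can show the wrong character on another.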

In particular, the non-ASCII characters in a web page encoded in Mac Roman are likely to display incorrectly in any non-Macintosh browser. Non-Windows support for Windows 1252 is somewhat more widespread, but it's still a bad idea. For a web page, you should use UTF-8-encoded Unicode (or, for maximum compatibility, ISO-8859-1 if it contains all the characters you want to use).
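
One way to apply that advice, sketched in Python under stated assumptions: the function name build_page and the bare-bones page skeleton are hypothetical, and this variant takes the "maximum compatibility" route by trying ISO-8859-1 first and falling back to UTF-8 only when a character does not fit.

```python
import html

def build_page(body_text):
    """Return (charset, page_bytes) for a minimal HTML page."""
    try:
        # Succeeds only if every character exists in ISO-8859-1.
        body_text.encode("iso-8859-1")
        charset = "iso-8859-1"
    except UnicodeEncodeError:
        charset = "utf-8"
    page = (
        "<html><head>"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">'
        "</head><body>" + html.escape(body_text) + "</body></html>"
    )
    # Declare the charset in the page and encode the bytes the same way.
    return charset, page.encode(charset)

print(build_page("caf\u00e9")[0])  # iso-8859-1
print(build_page("\u05d0")[0])     # utf-8
```

The important part is that the declared charset and the actual byte encoding agree; always choosing UTF-8 would be an equally reasonable simplification.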

MacRoman predates ISO 8859, so Apple can be forgiven the incompatibility. :) Windows 1252 does not predate it, but it's also mostly compatible with it: from the point of view of printable characters, it's a superset, having sacrificed some of the non-printing ISO-2022 "control" characters in the 0x80-0x9F range (which are used for code switching among the ISO family of character sets).
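
The superset claim can be checked mechanically. The sketch below (Python, for illustration only) decodes every byte value under both encodings and collects the ones that disagree. Python's cp1252 codec leaves a few bytes undefined, so errors="replace" is used to keep the loop from raising; that choice is an assumption of this sketch.

```python
import unicodedata

differing = []
for b in range(256):
    raw = bytes([b])
    latin1 = raw.decode("iso-8859-1")
    # Undefined cp1252 bytes become U+FFFD instead of raising an error.
    cp1252 = raw.decode("cp1252", errors="replace")
    if latin1 != cp1252:
        differing.append(b)

print(hex(differing[0]), hex(differing[-1]), len(differing))

# Byte 0x80: a control character in ISO-8859-1, the euro sign in cp1252.
print(unicodedata.category(b"\x80".decode("iso-8859-1")))  # Cc
print(b"\x80".decode("cp1252"))                            # €
```

Every disagreement falls in the 0x80 to 0x9F block, so text that uses only the printable ISO-8859-1 characters decodes identically under both encodings.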

--
Mark J. Reed <email@hidden>
Applescript-users mailing list      (email@hidden)