Re: Writing to file as UTF8 with BOM ?

On 10/26/06, Emmanuel <email@hidden> wrote:
> And the BOM would provide a handy way to tell a UTF8 from an ASCII.
> It's rather unfortunate that the UTF8 BOM was not widely adopted,
> because reading a UTF8 as ASCII is a bad experience which happens
> rather frequently, I think.

Reading UTF-8 as ASCII, when properly done, gets you a lot of question marks. Reading it as an 8-bit encoding (Latin-1 or Latin-9 or Windows-1252 or MacRoman or...) is a lot more troublesome since there's no such thing as an invalid byte value in those encodings. So every byte is interpreted as some character, and you get a lot more gobbledygook.
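
To make that difference concrete, here is a minimal Python sketch. The sample word "café" is my own choice, not something from the thread. Note that Python's "replace" error handler substitutes U+FFFD (the replacement character) where many programs would show a plain question mark.

```python
# "é" (U+00E9) is stored as two bytes in UTF-8: 0xC3 0xA9.
data = "café".encode("utf-8")

# ASCII has no meaning for bytes above 127, so a careful decoder
# flags each one. Here every bad byte becomes U+FFFD.
as_ascii = data.decode("ascii", errors="replace")

# Latin-1 assigns a character to every byte value, so nothing is
# flagged and the two bytes silently turn into two wrong characters.
as_latin1 = data.decode("latin-1")

print(as_ascii)   # "caf" followed by two replacement characters
print(as_latin1)  # cafÃ©
```

The ASCII result at least tells you something went wrong; the Latin-1 result looks like ordinary text that just happens to be garbage.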

For instance, the « and » and ¬ characters probably show up in
AppleScript, and therefore on this mailing list, more than any other
non-ASCII characters. Assuming this message is going out in UTF-8 ...

well, here, let me just make certain of that: ☺

... those characters are each encoded as a two-byte sequence: « is
(194,171), » is (194,187), and ¬ is (194,172).  They happen to fall in
the range where the UTF-8 encoding consists of a single extra byte (194)
in front of the correct Latin-1 byte value, so if your mail client
assumes Latin-1 (hi, Yahoo!) you just get an Â in front of the desired
character - ugly but still readable.  If, instead, it interprets them
as MacRoman, they come out as a ¬ in front of ´, ª, and ¨,
respectively, with no visible clue to their actual identity...
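
The byte values and both misreadings can be checked with a short Python sketch. The helper name misread is hypothetical, and I am assuming Python's built-in "latin-1" and "mac_roman" codecs match what a mail client would use.

```python
def misread(ch, codec):
    """Encode ch as UTF-8, then decode the bytes with the wrong codec."""
    return ch.encode("utf-8").decode(codec)

for ch in ["«", "»", "¬"]:
    raw = tuple(ch.encode("utf-8"))  # the two UTF-8 byte values
    print(ch, raw, misread(ch, "latin-1"), misread(ch, "mac_roman"))
```

Each line prints the character, its two UTF-8 byte values, the Latin-1 misreading (an Â followed by the right character), and the MacRoman misreading (a ¬ followed by an unrelated one).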

--
Mark J. Reed <email@hidden>
AppleScript-Users mailing list      (email@hidden)
Archives: http://lists.apple.com/mailman//archives/applescript-users