Re: As Text Work-Around Broken
- Subject: Re: As Text Work-Around Broken
- From: Christopher Nebel <email@hidden>
- Date: Mon, 16 Jan 2006 13:37:29 -0800
On Jan 14, 2006, at 11:26 AM, has wrote:
> One comment about that page: it incorrectly states that an AS
> string consists of "Mac-Roman" characters; AS strings actually use
> the user's primary encoding, as determined from their International
> system preferences. For most US, western European and Antipodean
> users this will be MacRoman, but will often be different for folks
> in other parts of the world.
You know, I thought that too, but then I read a little closer and
realized that it's mostly correct. In fact, it defines its own term
"Mac-encoded" to mean "text data in your primary encoding". It does,
however, subtly assume that the primary encoding is MacRoman by
referring to un-encodable characters as "non-Roman".
The only other problem (encoding-wise) is in its definition of the
"string" contents, where it says
"The string class basically stores one byte ([0..255]) per character.
The 128 first values are rendered according to the ASCII standard ...
The 128 larger values are rendered using a macintosh encoding, the
one that goes with the first language listed in your International
preference pane."
In fact, the "string" class stores data encoded using the primary
encoding (which is indeed determined by the first language listed in
your International preference pane; that bit is fine). Some
encodings are one byte per character, some mix one- and two-byte
characters (MacJapanese, for instance), and I don't know whether any
purely multi-byte encodings are allowed these days.
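
To make the byte-count point concrete, here is a minimal Python sketch (not AppleScript, and not from the page under discussion). It assumes Python's "shift_jis" codec is a close enough stand-in for MacJapanese, which Python does not ship; encoded_length is a hypothetical helper name.

```python
# Sketch: the same number of characters can occupy different numbers of
# bytes depending on the encoding in use.
# Assumption: "shift_jis" stands in for MacJapanese (not bundled with Python).

def encoded_length(text: str, encoding: str) -> int:
    """Return how many bytes `text` occupies when stored in `encoding`."""
    return len(text.encode(encoding))

samples = [
    ("cafe", "mac_roman"),    # pure ASCII: one byte per character
    ("café", "mac_roman"),    # accented letter still fits in one byte
    ("A日本", "shift_jis"),    # ASCII letter = 1 byte, each kanji = 2 bytes
]

for text, encoding in samples:
    print(f"{text!r} in {encoding}: "
          f"{len(text)} characters, {encoded_length(text, encoding)} bytes")
```

In the last sample, three characters take five bytes, so "one byte per character" only holds when the primary encoding happens to be a single-byte one.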
The trick is that most of them are supersets of ASCII: bytes 0
through 127 mean the same thing everywhere. (Well, almost
everywhere. As the page points out, MacJapanese is not a strict
superset -- 0x5C is a yen sign, not a backslash.) Some older Mac
encodings are completely different, such as MacArabic, but those
aren't supported these days except for import and export; the system
uses Unicode instead.
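
One rough way to check the "bytes 0 through 127 mean the same thing" property is to decode each of those bytes and compare against ASCII. The Python sketch below does that for a few codecs Python happens to ship; is_ascii_compatible is a hypothetical helper, the codec list is only an illustration, and "shift_jis" is again an assumed stand-in for MacJapanese whose result may not match Apple's own tables.

```python
# Sketch: test whether an encoding agrees with ASCII on bytes 0 through 127.
# The helper name and the list of codecs are illustrative assumptions.

def is_ascii_compatible(encoding: str) -> bool:
    """True if every byte 0..127 decodes to the same code point as in ASCII."""
    for value in range(128):
        try:
            decoded = bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            # The byte is not valid on its own in this encoding.
            return False
        if decoded != chr(value):
            return False
    return True

for name in ["mac_roman", "mac_cyrillic", "mac_greek", "shift_jis"]:
    print(name, is_ascii_compatible(name))
```

This checks only Python's mapping tables, so it says nothing definitive about how the system's own converters treat a byte such as 0x5C.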
--Chris Nebel
AppleScript and Automator Engineering