Re: Unicode 'as string' = unicode?
- Subject: Re: Unicode 'as string' = unicode?
- From: John Delacour <email@hidden>
- Date: Thu, 5 Sep 2002 13:21:15 +0100
At 1:28 am -0700 5/9/02, Christopher Nebel wrote:
> An interesting kink in testing this sort of thing is that Script
> Editor can't really display Unicode. What it's doing is coercing it
> into styled text and then displaying that. Since most characters
> most people are likely to type (e.g., any Roman, Cyrillic, classical
> Greek, or CJK) are representable in "styled text", this works. Try
> some of the weirder characters, though, and it'll fall down.
This is a typical American statement. If it ain't US-ASCII it's
"garbage" or "weird" and is probably a threat to national security.
Not only is it impossible to represent classical Greek in styled
text, it is impossible to represent modern Greek. The unaccented
Greek letters are "represented" with glyphs from the incomplete Greek
set included in the Japanese characters, unless the characters (e.g.
pi, capital Omega) happen to be in the Mac set.
The script below opens a file containing the modern Greek words "to'
gra'mma" (the letter), where o' and a' are those letters with the
tonos accent.
You will see that in the TextEdit file that opens, the string is
properly represented in Unicode (unless, as a good American, you have
resisted the temptation to install any suspicious fonts) but that the
string in the result window is true garbage.
set f to "" & (path to temporary items) & "togramma.txt"
open for access file f with write permission
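-- UTF-16 big-endian bytes in decimal: BOM (254 255), then one byte pair per character; 0 32 is the space
repeat with i in words of "254 255 3 196 3 204 0 32 3 179 3 193 3 172 3 188 3 188 3 177"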
write (ASCII character i) to file f
end repeat
close access file f
tell application "Finder" to open file f
activate
read file f as Unicode text
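
As a cross-check outside AppleScript, the byte values the script writes can be decoded directly. The following is a minimal Python sketch and not part of the original script. It assumes the word separator is meant to be a space (the byte pair 0 32, U+0020) and that the file is big-endian UTF-16 with a leading byte-order mark (254 255). The names byte_values, raw, text and code_points are illustrative only.

```python
# Decimal byte values: BOM (254 255) followed by one big-endian byte pair per character.
# Assumption: the separator between the two words is a space, i.e. the pair "0 32".
byte_values = "254 255 3 196 3 204 0 32 3 179 3 193 3 172 3 188 3 188 3 177"

# Turn the decimal strings into raw bytes.
raw = bytes(int(v) for v in byte_values.split())

# The "utf-16" codec reads the BOM, selects big-endian, and drops the BOM from the result.
text = raw.decode("utf-16")

# List the code point of each decoded character.
code_points = [f"U+{ord(ch):04X}" for ch in text]

print(text)
print(" ".join(code_points))
```

Every letter falls in the Greek and Coptic block (U+0370 to U+03FF), including the two tonos-accented vowels U+03CC and U+03AC, so a display path that handles Unicode should show the two words intact.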
JD