Re: Producing Unicode-only characters
- Subject: Re: Producing Unicode-only characters
- From: "Nigel Garvey" <email@hidden>
- Date: Wed, 26 Oct 2005 22:41:18 +0100
Mark J. Reed wrote on Wed, 26 Oct 2005 10:58:06 -0400:
>On 10/26/05, Nigel Garvey <email@hidden> wrote:
>>
>> Mmm. That's true. But then _all_ Unicode numbers are less than 65536 to
>> AppleScript:
>
>
>Not true. It's just that it uses UTF-16, which means that numbers higher
>than 65535 have to be encoded using surrogate pairs.
>... but in UTF-16, the character U+28CCA is encoded as the two code
>units D863 + DCCA:
>
>count «data utxtD863DCCA»
>--> 1 -- one character, UTF-16 encoded.
Aha, right. Thanks for that information! (Jaguar displays two question
marks for that Unicode character, but returns a count of one. Tiger, as
you probably know, displays just one question mark.)
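For anyone who wants to check the arithmetic behind that pair, here is a minimal sketch. The handler name utf16Units is made up for illustration and is not part of any script in this thread. It assumes the input is a valid code point (not itself in the surrogate range D800-DFFF) and simply splits the 20-bit offset above 65536 into its top and bottom ten bits:

```applescript
-- Hypothetical helper (illustration only): returns the UTF-16 unit(s)
-- for one Unicode code point as a list of unsigned integers.
on utf16Units(n)
	if n < 65536 then return {n} -- fits in a single 16-bit unit
	set v to n - 65536 -- 20-bit offset above the 16-bit range
	-- 55296 is 0xD800 (first high surrogate), 56320 is 0xDC00 (first low surrogate)
	return {55296 + v div 1024, 56320 + v mod 1024}
end utf16Units

utf16Units(167114) -- 167114 is U+28CCA
--> {55395, 56522}, i.e. D863 and DCCA
```

Since 65536 is a multiple of 1024, (n - 65536) mod 1024 and n mod 1024 give the same result, so either form can be used for the low unit.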
Based on that, here's an update of the "temporary file" scripts (if kai
hasn't done one already). I think I've got the maths right...:
on unicodeText(l) -- l is a list of integers (Unicode code points)
  set fref to (open for access file ((path to temporary items as Unicode text) & "utxt scratch.txt") with write permission)
  try
    set eof fref to 0 -- empty the scratch file
    repeat with i from 1 to (count l)
      set n to item i of l
      if (n < 65536) then
        -- fits in a single 16-bit unit
        write n as small integer to fref
      else
        -- high surrogate (from 55296 = 0xD800), then low surrogate (from 56320 = 0xDC00)
        write ((n - 65536) div 1024 + 55296) as small integer to fref
        write (n mod 1024 + 56320) as small integer to fref
      end if
    end repeat
    set u to (read fref as Unicode text from 1)
  on error msg
    display dialog msg
  end try
  close access fref
  return u
end unicodeText
on unicodeNumbers(u) -- u is some Unicode text
  set fref to (open for access file ((path to temporary items as Unicode text) & "utxt scratch.txt") with write permission)
  try
    set eof fref to 0 -- empty the scratch file
    write u to fref
    -- read the text back as signed 16-bit integers, one per UTF-16 unit
    set l to (read fref as small integer from 1) as list
  end try
  close access fref
  set len to (count l)
  repeat with i from 1 to len
    set n to item i of l
    if (n is not missing value) then
      set n to (65536 + n) mod 65536 -- make the signed value unsigned
      if (n div 1024 is 54) and (i < len) then -- high surrogate (D800-DBFF)
        -- no "mod 65536" needed here: only a negative item can give 55 below
        set n2 to (65536 + (item (i + 1) of l))
        if (n2 div 1024 is 55) then -- low surrogate (DC00-DFFF)
          set n to n mod 1024 * 1024 + 65536 + n2 mod 1024
          set item (i + 1) of l to missing value -- mark the low surrogate as used
        end if
      end if
      set item i of l to n
    end if
  end repeat
  return l's integers -- drops the missing values
end unicodeNumbers
«data utxt0020D863DCCA000D0041» as Unicode text
unicodeNumbers(result)
--unicodeText(result)
--unicodeNumbers(result)
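The decoding step can also be tried on its own, without the scratch file. This is only a sketch: unsigned16 and codePointFromPair are hypothetical names introduced here, and the two negative numbers are what D863 and DCCA look like when read as signed 16-bit "small integer" values.

```applescript
-- Hypothetical helper: map a signed 16-bit value to the range 0..65535.
on unsigned16(n)
	return (n + 65536) mod 65536
end unsigned16

-- Hypothetical helper: combine one high/low surrogate pair
-- (both already unsigned) into a single code point.
on codePointFromPair(highUnit, lowUnit)
	-- the low ten bits of each unit carry the payload
	return (highUnit mod 1024) * 1024 + (lowUnit mod 1024) + 65536
end codePointFromPair

codePointFromPair(unsigned16(-10141), unsigned16(-9014))
--> 167114, i.e. U+28CCA
```

The sketch does not check that the two units really are a high and a low surrogate; a real handler would test that first (div 1024 is 54 and 55 respectively), as unicodeNumbers does.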
NG