Re: Producing Unicode-only characters

  • Subject: Re: Producing Unicode-only characters
  • From: "Mark J. Reed" <email@hidden>
  • Date: Wed, 26 Oct 2005 10:55:10 -0400

On 10/26/05, Emmanuel <email@hidden> wrote:
> At 11:01 PM +0100 10/25/05, Nigel Garvey wrote:
> >   on unicodeCharacter(n)
> >     run script "«data utxt" & {n div 4096, n mod 4096 div 256, n mod 256 div 16, n mod 16} & "»"
> >   end unicodeCharacter
> > etc
>
> Nigel, I read your post quickly (and then Paul's modification
> following Dave's remark), maybe I'm missing something: doesn't that
> work (and your other handlers, too) only for characters with a code
> lower than 65536?

It appears that the underlying encoding used for unicode text is UTF-16, which means that the literal form of, e.g., U+10000 (the Linear B syllable "a") has to be constructed as a surrogate pair: «data utxtD800DC00».
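As a worked example of that arithmetic (an illustration only; the choice of U+1D11E, MUSICAL SYMBOL G CLEF, is an assumption and does not come from the thread), the standard UTF-16 rule subtracts 65536 (0x10000) from the scalar value, then adds the upper 10 bits of the result to 0xD800 and the lower 10 bits to 0xDC00:

```applescript
-- Illustrative sketch: surrogate pair for U+1D11E (decimal 119070).
set scalarValue to 119070
set excess to scalarValue - 65536 -- 53534, the offset above U+FFFF
set highSurrogate to 55296 + (excess div 1024) -- 55296 is 0xD800; result 55348 (0xD834)
set lowSurrogate to 56320 + (excess mod 1024) -- 56320 is 0xDC00; result 56606 (0xDD1E)
return {highSurrogate, lowSurrogate}
```

For U+10000 itself the excess is 0, so the pair is simply D800 DC00, matching the literal above.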

Here's my solution. I'm not sure how much of this I could have done without rolling my own:
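-- Convert a non-negative integer to an uppercase hex string,
-- zero-padded on the left to at least four digits.
on hex(n)
    local digitList, hexString
    set digitList to {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D", "E", "F"}
    set hexString to ""
    -- Peel off the low-order hex digit on each pass and prepend it.
    repeat while n > 0
        set hexString to (item (n mod 16 + 1) of digitList) & hexString
        set n to n div 16
    end repeat
    -- Pad to four digits, the width of one UTF-16 code unit.
    repeat while length of hexString < 4
        set hexString to "0" & hexString
    end repeat
    return hexString
end hex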

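-- Return the character for a Unicode scalar value as unicode text.
on unicodeCharacter(scalarValue)
    local scriptString, excess, highSurrogate, lowSurrogate
    set scriptString to ""
    if (scalarValue > 65535) then
        -- Above U+FFFF: encode as a UTF-16 surrogate pair.
        set excess to scalarValue - 65536
        set highSurrogate to excess div 1024 + 55296 -- 55296 = 0xD800
        set lowSurrogate to excess mod 1024 + 56320 -- 56320 = 0xDC00
        set scriptString to hex(highSurrogate) & hex(lowSurrogate)
    else
        -- U+0000 through U+FFFF: a single 16-bit code unit.
        set scriptString to hex(scalarValue)
    end if
    -- Build a «data utxt...» literal and evaluate it.
    set scriptString to "«data utxt" & scriptString & "» as unicode text"
    return (run script scriptString)
end unicodeCharacter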
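
One way to sanity-check the surrogate arithmetic is to run it in reverse. The sketch below is not part of the solution above: scalarFromSurrogates is a hypothetical helper name, and it assumes its arguments are already a valid high/low surrogate pair.

```applescript
-- Hypothetical helper (an assumption, not from the original post):
-- recombine a UTF-16 surrogate pair into the scalar value it encodes.
on scalarFromSurrogates(highSurrogate, lowSurrogate)
    -- Undo the 0xD800 and 0xDC00 offsets, then restore the 0x10000 base.
    return (highSurrogate - 55296) * 1024 + (lowSurrogate - 56320) + 65536
end scalarFromSurrogates

-- D800 DC00 should map back to U+10000 (decimal 65536).
scalarFromSurrogates(55296, 56320)
```

If the value that comes back equals the scalar value you started from, the high and low surrogates were computed consistently.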

--
Mark J. Reed <email@hidden>