Re: Producing Unicode-only characters

  • Subject: Re: Producing Unicode-only characters
  • From: "Nigel Garvey" <email@hidden>
  • Date: Wed, 26 Oct 2005 22:41:18 +0100

Mark J. Reed wrote on Wed, 26 Oct 2005 10:58:06 -0400:

>On 10/26/05, Nigel Garvey <email@hidden> wrote:
>>
>> Mmm. That's true. But then _all_ Unicode numbers are less than 65536 to
>> AppleScript:
>
>
>Not true. It's just that it uses UTF-16, which means that numbers higher
>than 65535 have to be encoded using surrogate pairs.


>... but in UTF-16, the character U+28CCA is encoded as the two code
>units D863 + DCCA:
>
>count «data utxtD863DCCA»
>--> 1 -- one character, UTF-16 encoded.

Aha, right. Thanks for that information! (Jaguar displays two question
marks for that Unicode character, but returns a count of one. Tiger, as
you probably know, displays just one question mark.)

Based on that, here's an update of the "temporary file" scripts (if kai
hasn't done one already). I think I've got the maths right:

  on unicodeText(l) -- l is a list of integers (Unicode code points)
    set u to missing value -- returned as-is if the write or read below fails
    set fref to (open for access file ((path to temporary items as Unicode text) & "utxt scratch.txt") with write permission)
    try
      set eof fref to 0
      repeat with i from 1 to (count l)
        set n to item i of l
        if (n < 65536) then
          -- Fits in a single 16-bit unit.
          write n as small integer to fref
        else
          -- Needs a surrogate pair: high unit from 55296 (D800), low unit from 56320 (DC00).
          write ((n - 65536) div 1024 + 55296) as small integer to fref
          write (n mod 1024 + 56320) as small integer to fref
        end if
      end repeat
      set u to (read fref as Unicode text from 1)
    on error msg
      display dialog msg
    end try
    close access fref

    return u
  end unicodeText

  on unicodeNumbers(u) -- u is some Unicode text
    set fref to (open for access file ((path to temporary items as Unicode text) & "utxt scratch.txt") with write permission)
    try
      set eof fref to 0
      write u to fref
      set l to (read fref as small integer from 1) as list
    end try
    close access fref

    set len to (count l)
    repeat with i from 1 to len
      set n to item i of l
      -- missing value marks a low surrogate already folded into the previous item.
      if (n is not missing value) then
        -- Small integers are signed, so map them back to 0 thru 65535.
        set n to (65536 + n) mod 65536
        if (n div 1024 is 54) and (i < len) then -- high surrogate, D800 thru DBFF
          -- No mod needed here: a low surrogate always reads as a negative small integer.
          set n2 to (65536 + (item (i + 1) of l))
          if (n2 div 1024 is 55) then -- low surrogate, DC00 thru DFFF
            set n to n mod 1024 * 1024 + 65536 + n2 mod 1024
            set item (i + 1) of l to missing value
          end if
        end if
        set item i of l to n
      end if
    end repeat

    return l's integers
  end unicodeNumbers

  «data utxt0020D863DCCA000D0041» as Unicode text
  unicodeNumbers(result)
  --unicodeText(result)
  --unicodeNumbers(result)
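
As a cross-check on the maths, here is a minimal sketch of the same surrogate-pair arithmetic in Python, so it can be checked independently of AppleScript and the temporary file. The function names to_utf16_units and from_utf16_units are made up for this illustration; they are not part of the scripts above. It assumes the 16-bit units may arrive as signed values, as they do when read as small integers.

```python
def to_utf16_units(code_points):
    """Turn a list of Unicode code points into 16-bit UTF-16 code units."""
    units = []
    for n in code_points:
        if n < 0x10000:
            # Fits in a single 16-bit unit.
            units.append(n)
        else:
            # Split the 20-bit offset into a high and a low surrogate.
            offset = n - 0x10000
            units.append(0xD800 + (offset >> 10))
            units.append(0xDC00 + (offset & 0x3FF))
    return units


def from_utf16_units(units):
    """Turn 16-bit UTF-16 code units (signed or unsigned) into code points."""
    code_points = []
    i = 0
    while i < len(units):
        n = units[i] & 0xFFFF  # map signed 16-bit values to 0..65535
        if 0xD800 <= n <= 0xDBFF and i + 1 < len(units):
            n2 = units[i + 1] & 0xFFFF
            if 0xDC00 <= n2 <= 0xDFFF:
                # Recombine the pair and skip the low surrogate.
                n = 0x10000 + ((n - 0xD800) << 10) + (n2 - 0xDC00)
                i += 1
        code_points.append(n)
        i += 1
    return code_points


# The units from the test data above: 0020 D863 DCCA 000D 0041
sample = [0x0020, 0xD863, 0xDCCA, 0x000D, 0x0041]
print(from_utf16_units(sample))
print([format(x, "04X") for x in to_utf16_units([0x28CCA])])
```

An unpaired surrogate is passed through unchanged here, which matches what unicodeNumbers does when a high surrogate is not followed by a low one.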


NG
