Re: Unicode Character in File Name
- Subject: Re: Unicode Character in File Name
- From: Jon Pugh <email@hidden>
- Date: Tue, 7 Apr 2009 09:04:29 -0700
While not a canonical source, I think I know the answers:
At 9:29 AM -0600 4/7/09, Doug McNutt wrote:
>Unicode has the concept of a "code point" or, to avoid confusion, codepoint. A codepoint was initially an unsigned 16 bit value but it has been extended to support up to 32 bits even though there are very few graphemes that use the extension at this time. In any case the concept of codepoint is well defined by those who claim to be the standards committee for unicode.
>
>1) What is the official AppleScript term that describes a code point? The word "file" is used in AppleScript to declare the following lexical item to be an alias. What is the similar lexical item that declares something to be a codepoint? Is it "id", "character id", or something else?
An AppleScript "character" is a codepoint. The "character id" is the numeric value of that codepoint.
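
A small sketch of that distinction, assuming AppleScript 2.0 or later (where text is Unicode throughout). The variable names are only for illustration:

```applescript
-- Run outside any application tell block so AppleScript itself resolves the terms.

-- From character to number: the "id" of a one-character text is its codepoint.
set theCode to id of "A" -- 65

-- From number to character: "character id N" builds the text for codepoint N.
set theChar to character id 65 -- "A"

-- Round trip: the two forms are inverses of each other.
set roundTrip to character id (id of "A") -- "A"
```
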
>2) What is the same term for use while talking to application Finder? It appears that in some cases "character id" as sent to Finder is a command rather than a variable type. Lexical item "id" sounds an awful lot like a window id as used in Terminal.app.
The Finder uses "id" to refer to files, and I suspect it gets confused in some cases. In every instance, the references "id of character foo" and "character id N" are object specifier data structures sent from the script to the Finder for parsing. AppleScript parses them properly; the Finder, in some cases, does not.
>3) "set mypoint to codepoint 1234" or "set mypoint to 1234 as codepoint" would make sense to an expert in unicode. Exactly what command lines would do that in AppleScript? Various posters have suggested a bunch of things but I still don't understand which one is politically correct.
Use "character id 1234". Don't send it to the Finder.
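
One way to keep it away from the Finder, sketched under the assumption that building the character before the tell block is enough to sidestep the problem. The file name here is hypothetical and the script only tests for existence, so it changes nothing on disk:

```applescript
-- Resolve the codepoint first, while AppleScript is still the one parsing the script.
set myPoint to character id 1234

-- Build the complete name as ordinary text (hypothetical name, for illustration only).
set theName to "Sample " & myPoint & ".txt"

tell application "Finder"
	-- Only the finished string reaches the Finder; it never sees "character id".
	set isThere to exists file theName of desktop
end tell

isThere
```

If the character does have to be produced inside the tell block, the same idea applies: compute it in a handler or a variable defined outside the block and pass the resulting text in.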
>4) Does AppleScript support 32 bit codepoints? Finder?
I expect that AppleScript, because of the layers below it, ultimately uses the ICU libraries, and that would determine whether or not it allows 32-bit codepoints. I suspect it knows how to cope.
Once again, this entire thread seems to spin around the Finder's confusion over "id" which it expects to be file ids, not character ids. I believe this is a bug in the Finder's object specifier coding.
Jon
AppleScript-Users mailing list (email@hidden)
Archives: http://lists.apple.com/archives/applescript-users