Re: Unicode search [was Re: the Holy Grail of AppleScript lists]
- Subject: Re: Unicode search [was Re: the Holy Grail of AppleScript lists]
- From: John Delacour <email@hidden>
- Date: Fri, 21 Mar 2003 12:16:06 +0000
- Mac-eudora-version: 6.0a11
At 11:49 am +0100 21/3/03, Emmanuel wrote:
>> But UTF-8 is a transformation of Unicode using a special algorithm,
>> just like binhex or uuencode. Every character in Unicode proper
>> consists of two bytes (or 4 in the case of UTF-32) and the length
>> is not variable.
>
> That's often true, but it is not a rule and it is sometimes false.
> Not all glyphs are coded into 2 bytes under UTF-16. As you certainly
> noticed, UTF-16 allows for displaying many more than 32,767
> characters.

By definition UTF-16 is two bytes. 256 * 256 = 65536, so that's the
limit. In practice there are fewer code points assigned than that.

> Of course, the most common characters are coded into 2 bytes under UTF-16.

All of them. Give me an example of a character in UTF-16 that is not
two bytes.
JD
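
The question of how many bytes a given character takes in UTF-16 can be
checked directly by encoding it and counting. What follows is only a
minimal sketch in Python, not anything from the thread: the helper name
utf16_byte_length and the sample characters are chosen purely for
illustration, and the "utf-16-be" codec is used so that no byte-order
mark is added to the count.

```python
def utf16_byte_length(ch: str) -> int:
    """Return how many bytes one character occupies in UTF-16 (big-endian, no BOM)."""
    return len(ch.encode("utf-16-be"))


# Hypothetical sample set: three characters from the Basic Multilingual Plane
# and one code point above U+FFFF (MUSICAL SYMBOL G CLEF, U+1D11E).
samples = ["A", "\u00e9", "\u4e2d", "\U0001D11E"]

for ch in samples:
    print(f"U+{ord(ch):04X}: {utf16_byte_length(ch)} bytes")
```

On a standard Python 3 interpreter the first three samples report 2 bytes
and U+1D11E reports 4, because UTF-16 encodes code points above U+FFFF as
a surrogate pair of two 16-bit units.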
_______________________________________________
applescript-users mailing list | email@hidden
Help/Unsubscribe/Archives:
http://www.lists.apple.com/mailman/listinfo/applescript-users
Do not post admin requests to the list. They will be ignored.