Re: Unicode search
- Subject: Re: Unicode search
- From: Helmut Fuchs <email@hidden>
- Date: Fri, 21 Mar 2003 15:38:04 +0100
At 14:09 Uhr +0000 21.03.2003, John Delacour wrote:
>> And this link says, that the Unicode standard allows for 21 bits
>> to encode characters: <http://www.unicode.org/faq/utf_bom.html#9>.
>> 21 bits clearly don't fit into two bytes.
>
> That link says nothing of the kind. It says:
> "both Unicode and ISO 10646 have policies in place that formally
> limit even the UTF-32 encoding form to the integer range that can be
> expressed with UTF-16 (or 21 significant bits)."
You are kidding, aren't you? This sentence says two things:
1. That Unicode and ISO 10646 have policies in place that limit
UTF-32 to the range expressible in UTF-16.
2. That the range of characters expressible with UTF-16 takes 21
significant bits.
And could you please elaborate on how, for example, the "CJK Unified
Ideographs Extension B" block in the range U+20000-U+2A6DF from The
Unicode Standard 3.1 could possibly be expressed with a single UTF-16
16-bit unit?
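
To make the point concrete, here is a minimal Python sketch (my own
illustration, not something from the FAQ). The helper name
utf16_unit_count is made up for this example; it relies only on
Python's built-in "utf-16-be" codec to count how many 16-bit units a
single code point needs.

```python
# Illustrative sketch only; utf16_unit_count is a hypothetical helper name.
MAX_CODE_POINT = 0x10FFFF  # highest code point reachable through UTF-16


def utf16_unit_count(code_point: int) -> int:
    """Return how many 16-bit units UTF-16 needs for one code point."""
    encoded = chr(code_point).encode("utf-16-be")  # big-endian, no BOM
    return len(encoded) // 2


# Number of significant bits needed for the top of the range.
print(MAX_CODE_POINT.bit_length())
# A character inside the first 64K values.
print(utf16_unit_count(0x4E00))
# First and last code points of CJK Unified Ideographs Extension B.
print(utf16_unit_count(0x20000), utf16_unit_count(0x2A6DF))
```

The first value printed is the bit count for the top of the range; the
Extension B code points each come out as two 16-bit units, while
U+4E00 needs only one.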
The entry "What is UTF-16" in the UTF and BOM FAQ
<http://www.unicode.org/faq/utf_bom.html#6> explains:
UTF-16 allows access to 63K characters as single Unicode 16-bit
units. It can access an additional 1M characters by a mechanism
known as surrogate pairs. Two ranges of Unicode code values are
reserved for the high (first) and low (second) values of these
pairs. Highs are from 0xD800 to 0xDBFF, and lows from 0xDC00 to
0xDFFF. In Unicode 3.0, there are no assigned surrogate pairs. Since
the most common characters have already been encoded in the first
64K values, the characters requiring surrogate pairs will be
relatively rare (see below).
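
The surrogate mechanism described in that FAQ entry can be sketched in
a few lines of Python. This is only an illustration under my own
assumptions: the function names to_surrogate_pair and
from_surrogate_pair are invented for the example, and the constants
are the high and low ranges quoted above.

```python
from typing import Tuple

# Range starts quoted in the FAQ entry above.
HIGH_START = 0xD800   # highs run from 0xD800 to 0xDBFF
LOW_START = 0xDC00    # lows run from 0xDC00 to 0xDFFF
FIRST_SUPPLEMENTARY = 0x10000  # first code point beyond the 16-bit range


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into (high, low) 16-bit units."""
    if not FIRST_SUPPLEMENTARY <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - FIRST_SUPPLEMENTARY  # a 20-bit value
    high = HIGH_START + (offset >> 10)         # upper 10 bits
    low = LOW_START + (offset & 0x3FF)         # lower 10 bits
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Recombine a (high, low) pair into the single code point it encodes."""
    return FIRST_SUPPLEMENTARY + ((high - HIGH_START) << 10) + (low - LOW_START)


high, low = to_surrogate_pair(0x20000)
print(hex(high), hex(low))
print(hex(from_surrogate_pair(high, low)))
```

Each half of the pair carries 10 bits, so the pairs cover 1024 x 1024 =
1,048,576 code points, which is the "additional 1M characters" the FAQ
mentions.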
At 14:09 Uhr +0000 21.03.2003, John Delacour wrote:
> And a pair of UTF-16 characters is two characters.
True. If it's a character, it's a character. But two 16-bit units in
UTF-16 CAN be ONE character. How else would you explain the FAQ entry
"Why are some people opposed to UTF-16?"
<http://www.unicode.org/faq/utf_bom.html#8>
Best regards,
Helmut