Re: does NSTextField always use UTF8 encoding
- Subject: Re: does NSTextField always use UTF8 encoding
- From: Alastair Houghton <email@hidden>
- Date: Wed, 18 Jun 2008 17:38:09 +0100
On 18 Jun 2008, at 07:25, Andrew Farmer wrote:
> NSStrings are encoding-independent. They represent strings, not
> sequences of bytes.
Not *entirely*. The docs are a little sloppy on this, unfortunately,
both for Cocoa and Core Foundation; in both cases they talk about
"Unicode characters" and suggest that these may be 16 bits in size.
There was a point in the past when Unicode (as opposed to ISO 10646,
which was later harmonised with it, if I've got my history right) was
indeed a 16-bit-per-"character" encoding, which is probably why the
docs read the way they do, but it isn't true today, now that code
points run up to U+10FFFF, so it's best not to think of it that way.
Perhaps more accurately, NSString is a sequence of UTF-16 code units,
which is not the same thing at all (in fact, the word "character" is
generally one to avoid because it's often unclear what you mean when
you use it).
In particular, -characterAtIndex: can return either half of a
surrogate pair. For example, if you have a string containing a non-BMP
code point like MUSICAL SYMBOL G CLEF (U+1D11E), which is encoded as
D834 DD1E according to Character Palette, you might get 0xD834 or
0xDD1E, but you won't ever get 0x1D11E. Nor is that the only trap for
the unwary; you can also get various kinds of Unicode control codes, as
well as several kinds of combining characters (of which the most common
group is probably accents).
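
To make the surrogate pair arithmetic concrete, here is a minimal sketch
in Python rather than Cocoa (Python strings index by code point, so the
UTF-16 code units have to be produced explicitly). The helper names
utf16_code_units and combine_surrogates are made up for this
illustration; they are not part of any Apple API.

```python
def utf16_code_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-be")  # big-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def combine_surrogates(high, low):
    """Recombine a high/low surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF
units = utf16_code_units(clef)

# One code point, but two UTF-16 code units: an index-based accessor
# working in code units can only ever hand back one of these halves.
print(len(clef), [hex(u) for u in units])
print(hex(combine_surrogates(*units)))
```

The two values in units are what a code-unit accessor would hand back
one at a time; only by recombining them do you recover U+1D11E.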
The String Programming Guide does warn about this to some extent:
"If you need to access string objects character-by-character, you must
understand the Unicode character encoding—specifically, issues related
to composed character sequences."
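
As a rough illustration of why composed character sequences matter,
here is a short Python sketch (again not Cocoa code). The function
rough_clusters is a hypothetical helper that simply attaches combining
marks to the preceding base character; it is a deliberate
simplification and not a full implementation of Unicode text
segmentation. In Cocoa, the comparable tool is NSString's
-rangeOfComposedCharacterSequenceAtIndex:.

```python
import unicodedata

precomposed = "\u00E9"  # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

# Both render the same, but they differ in length and compare unequal
# until they are normalised to the same form.
print(len(precomposed), len(decomposed))
print(unicodedata.normalize("NFC", decomposed) == precomposed)


def rough_clusters(s):
    """Group each base character with the combining marks that follow it.

    Simplified: only looks at the canonical combining class, so it does
    not handle every case that real grapheme segmentation covers.
    """
    clusters = []
    for ch in s:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch  # attach the mark to the previous base
        else:
            clusters.append(ch)
    return clusters


print(rough_clusters("cafe\u0301"))
```

Stepping through the decomposed string one element at a time would
split the accent from its "e", which is exactly the situation the guide
is warning about.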
Anyway, this is often not a big deal, but in some applications it can
be, so it's worth bearing in mind.
Kind regards,
Alastair.
--
http://alastairs-place.net