Re: characters in cocoa
- Subject: Re: characters in cocoa
- From: "Clark Cox" <email@hidden>
- Date: Mon, 10 Sep 2007 08:21:05 -0700
On 9/10/07, Uli Kusterer <email@hidden> wrote:
> On 10.09.2007, at 13:38, Gerriet M. Denkmann wrote:
> > One might add that -[NSString length], which the documentation says
> > "Returns the number of Unicode characters in the receiver." does
> > nothing like this, but returns the number of shorts used with
> > NSUnicodeStringEncoding (aka UTF-16).
> > For example: [[NSString stringWithUTF8String: "𐐀" ] length] = 2
> > (if someone cannot handle Unicode (like the mail digest software at
> > Apple) : this is a DESERET CAPITAL LETTER LONG I) - although the
> > string clearly contains one character.
> >
> > And one should also note that "characterAtIndex:" does not do what
> > the name indicates, but returns the short at the index in UTF-16.
> >
> > getCharacters: "Returns by reference the characters from the
> > receiver." - the documentation really should mention in which
> > encoding these characters will be copied.
> >
> > Maybe the documentation could be slightly improved: it is confusing
> > if it says "character" when it means "unsigned short int in a
> > specific (but unspecified) encoding".
>
> Well, it *is* a character: It is the UTF16 character that would be
> at the specified index in the normalized form, I guess...?
Ah, but UTF-16 code units are not characters; the term "UTF-16
character" is meaningless. For the BMP, there *is* a one-to-one
correspondence between UTF-16 code units and Unicode code points, but
this is not true in the general case. Outside of the BMP, it takes two
UTF-16 code units to represent a single Unicode code point.
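
To make the code unit / code point distinction concrete, here is a minimal sketch in plain C (which also compiles as Objective-C). The function names utf16_encode and utf16_count_code_points are made up for this illustration and are not part of Foundation. The sketch only shows the surrogate-pair arithmetic defined by UTF-16; it makes no claim about how NSString is implemented internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
   Writes 1 or 2 code units to out and returns how many were written,
   or 0 if cp is a surrogate or lies beyond U+10FFFF. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        /* BMP: one code unit, numerically equal to the code point. */
        out[0] = (uint16_t)cp;
        return 1;
    }
    /* Outside the BMP: split the 20-bit offset into a surrogate pair. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low surrogate */
    return 2;
}

/* Hypothetical helper: count code points in a buffer of n UTF-16 code
   units. A high surrogate immediately followed by a low surrogate counts
   once; an unpaired surrogate is counted as one item. */
size_t utf16_count_code_points(const uint16_t *units, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        int is_high = units[i] >= 0xD800 && units[i] <= 0xDBFF;
        if (is_high && i + 1 < n &&
            units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF)
            i++; /* skip the low half of the pair */
        count++;
    }
    return count;
}
```

For U+10400 (DESERET CAPITAL LETTER LONG I), utf16_encode produces the two code units 0xD801 0xDC00, which is why -length reports 2 for that string, while utf16_count_code_points over those two units gives 1.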
--
Clark S. Cox III
email@hidden