Re: characters in cocoa
- Subject: Re: characters in cocoa
- From: "Gerriet M. Denkmann" <email@hidden>
- Date: Mon, 10 Sep 2007 17:04:32 +0200
On 10 Sep 2007, at 15:36, Uli Kusterer wrote:
On 10.09.2007, at 13:38, Gerriet M. Denkmann wrote:
One might add that -[NSString length], which the documentation
says "Returns the number of Unicode characters in the receiver."
does nothing of the kind, but returns the number of 16-bit units
(shorts) used with NSUnicodeStringEncoding (i.e. UTF-16).
For example: [[NSString stringWithUTF8String: "𐐀"] length] = 2,
although the string clearly contains one character. (For anyone
whose software cannot handle Unicode, like the mail digest software
at Apple: this is DESERET CAPITAL LETTER LONG I, U+10400.)
And one should also note that "characterAtIndex:" does not do what
the name indicates, but returns the short at that index in UTF-16.
getCharacters: "Returns by reference the characters from the
receiver." - the documentation really should mention in which
encoding these characters will be copied.
Maybe the documentation could be slightly improved: it is
confusing if it says "character" when it means "unsigned short int
in a specific (but unspecified) encoding".
Well, it *is* a character: It is the UTF-16 character that would be
at the specified index in the normalized form, I guess...?
Well, when I hear "character" I think of something like what the
LayoutManager calls a "glyph": something that can be seen, as opposed
to the bits representing such a "character" in some encoding.
So I would say that the strings "𐐀" and "∂" both have one
character, even if one is encoded in UTF-16 as 2 shorts and the
other as 1 short.
Maybe this is the reason for my confusion.
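
To make the distinction concrete, here is a minimal sketch in plain C
(so it can also be compiled as Objective-C) that counts code points
in a buffer of UTF-16 units, such as one filled by getCharacters:.
The function name count_code_points is my own invention, not a Cocoa
API, and the sketch assumes that an unpaired surrogate should simply
count as one code point.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Cocoa: counts Unicode code points
   in a buffer of UTF-16 code units (16-bit "shorts").
   A well-formed surrogate pair counts as one code point; an unpaired
   surrogate is counted as one code point on its own (an assumption). */
size_t count_code_points(const uint16_t *units, size_t length)
{
    size_t count = 0;
    size_t i = 0;

    while (i < length) {
        uint16_t u = units[i];
        int is_high = (u >= 0xD800 && u <= 0xDBFF);

        if (is_high && i + 1 < length &&
            units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
            i += 2;   /* high + low surrogate: one code point, two units */
        } else {
            i += 1;   /* BMP code point (or a lone surrogate): one unit */
        }
        count++;
    }
    return count;
}
```

Given the two units 0xD801 0xDC00 (the UTF-16 form of U+10400), this
returns 1 where -[NSString length] reports 2; given the single unit
0x2202 ("∂"), both agree on 1. Note that this counts code points,
which is still not quite "what can be seen": a base letter followed
by a combining accent is two code points but one visible character.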
Kind regards,
Gerriet.