Re: How to count composed characters in NSString?
- Subject: Re: How to count composed characters in NSString?
- From: Peter Edberg <email@hidden>
- Date: Sun, 28 Sep 2008 15:05:36 -0700
On Sep 28, 2008, at 12:02 PM, email@hidden wrote:
Date: Sun, 28 Sep 2008 20:17:26 +0200
From: David Niemeijer <email@hidden>
Subject: Re: How to count composed characters in NSString?
To: Cocoa-Dev List <email@hidden>
Michael,
On 28 sep 2008, at 14:41, Michael Gardner wrote:
Upon further investigation, I may be wrong. I based my assertion
upon Apple's NSString documentation ("Returns the number of Unicode
characters in the receiver"), and upon some quick tests I ran. But
this reply made me look into the issue in greater depth.
I re-did my tests more thoroughly, and it does appear that -length
returns the number of 16-bit words (code units), not the number of
Unicode characters (code points), in the string. If this is true, I
would call it a bug either in the code or in the documentation,
which David should submit to Apple.
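
To make the code unit versus code point distinction concrete, here is a minimal portable C sketch. It is not taken from the thread: the function name count_code_points_utf16 is hypothetical, and it assumes the text is already available as a plain array of UTF-16 code units. It treats a high surrogate followed by a low surrogate as one code point and does nothing about combining marks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count Unicode code points in a buffer of `len`
   UTF-16 code units. A high surrogate (0xD800-0xDBFF) immediately
   followed by a low surrogate (0xDC00-0xDFFF) is counted once.
   Unpaired surrogates are counted as one code point each. */
size_t count_code_points_utf16(const uint16_t *units, size_t len)
{
    size_t count = 0;
    size_t i = 0;

    while (i < len) {
        int is_high = units[i] >= 0xD800 && units[i] <= 0xDBFF;
        int next_is_low = (i + 1 < len) &&
                          units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF;

        /* Skip two code units for a valid surrogate pair, one otherwise. */
        i += (is_high && next_is_low) ? 2 : 1;
        count++;
    }
    return count;
}
```

For the five code units 0x0041, 0xD834, 0xDD1E, 0x0065, 0x0301 ("A", U+1D11E as a surrogate pair, "e", combining acute accent) this returns 4, one less than the number of code units. It still counts "e" and its combining accent separately, so it does not answer the composed-character question.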
I think the docs are clear. In the discussion section for "length" it
says: "The number returned includes the individual characters of
composed character sequences, so you cannot use this method to
determine if a string will be visible when printed or how long it will
appear."
I did file a bug (ID 6253075) as you suggested, because I think there
should be a simple API for this.
I apologize for the apparent misinformation in my previous, hasty
reply.
Well, I made an error too. I suggested that on 10.5 the
CFStringTokenizer could be used, but only now noticed that it only
supports larger units (words and up). Thus there is no easy API to
count the number of characters in a way that surrogate pairs or other
"long" Unicode characters are treated as a single character.
David,
Check out CFStringGetRangeOfComposedCharactersAtIndex. It finds the
kinds of text boundaries that I think you are interested in. You would
just need to iterate over the string calling this for each iteration
to find the next boundary.
-Peter Edberg, Apple
_______________________________________________
Cocoa-dev mailing list (email@hidden)