Re: How to count composed characters in NSString?

  • Subject: Re: How to count composed characters in NSString?
  • From: David Niemeijer <email@hidden>
  • Date: Sun, 28 Sep 2008 20:17:26 +0200

Michael,

On 28 sep 2008, at 14:41, Michael Gardner wrote:
> Upon further investigation, I may be wrong. I based my assertion upon Apple's NSString documentation ("Returns the number of Unicode characters in the receiver"), and upon some quick tests I ran. But this reply made me look into the issue in greater depth.
>
> I redid my tests more thoroughly, and it does appear that -length returns the number of 16-bit words (code units), not the number of Unicode characters (code points), in the string. If this is true, I would call it a bug either in the code or in the documentation, which David should submit to Apple.

I think the docs are clear. In the discussion section for "length" it says: "The number returned includes the individual characters of composed character sequences, so you cannot use this method to determine if a string will be visible when printed or how long it will appear."


I did file a bug (ID 6253075) as you suggested, because I think there should be a simple API for this.

> I apologize for the apparent misinformation in my previous, hasty reply.

Well, I made an error too. I suggested that on 10.5 the CFStringTokenizer could be used, but only now noticed that it only supports larger units (words and up). Thus there is no easy API to count the number of characters in a way that treats surrogate pairs or other "long" Unicode characters as a single character.
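To make the three different "lengths" under discussion concrete, here is a minimal sketch in Python (not Cocoa). The function names are hypothetical, and the last function is only an approximation: it treats a base character plus any following combining marks as one unit, which is an assumption of this sketch and not a full implementation of Unicode grapheme cluster rules.

```python
import unicodedata


def utf16_length(text):
    """Number of 16-bit code units, which is what the thread says -length reports."""
    return len(text.encode("utf-16-le")) // 2


def code_point_count(text):
    """Number of Unicode code points; a surrogate pair counts once."""
    return len(text)


def approx_composed_count(text):
    """Rough count of user-visible characters (assumption: every code point
    that is not a combining mark starts a new character)."""
    return sum(1 for ch in text if not unicodedata.category(ch).startswith("M"))


# "e" + COMBINING ACUTE ACCENT, followed by MUSICAL SYMBOL G CLEF (U+1D11E),
# which lies outside the BMP and needs a surrogate pair in UTF-16.
sample = "e\u0301\U0001D11E"
print(utf16_length(sample))           # 4
print(code_point_count(sample))       # 3
print(approx_composed_count(sample))  # 2
```

A user looking at this sample sees two characters, while the code unit count is twice that.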


> In the meantime, David, perhaps you can find a library that can work with UTF-8 strings. What are you using the length values for?

I need to be able to display the number of characters to the user in a way that makes sense to them. If they see 3 characters, I should report 3. I also need to cut off certain input at a given number of "real" characters, and the results should not only make sense for a language like English, where each 16-bit unit equals a single character.
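The cut-off requirement can be sketched the same way. The following Python example (again not Cocoa, with a hypothetical function name) keeps at most a given number of user-visible characters and never separates a base character from its combining marks. It relies on the same simplifying assumption that a character is a non-mark code point plus any marks that follow it.

```python
import unicodedata


def truncate_composed(text, limit):
    """Keep at most `limit` characters, where a character is a non-mark code
    point together with the combining marks that follow it (an approximation)."""
    kept = []
    count = 0
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        # A non-mark starts a new character; a mark at the very start of the
        # string is treated as a character of its own.
        if not is_mark or count == 0:
            if count == limit:
                break
            count += 1
        kept.append(ch)
    return "".join(kept)


# "é" and "à" written as base letter + combining mark, then "b".
sample = "e\u0301a\u0300b"
print(len(truncate_composed(sample, 2)))  # 4 code points, 2 visible characters
```

Cutting the same string after two code units instead would return "e" plus its accent and drop the second character entirely.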


Using some kind of UTF-8 library may be possible, but that would require constantly converting between UTF-16 and UTF-8, which is not efficient for a program that has to do a lot of these kinds of calculations.

david.