Re: How to count composed characters in NSString?


  • Subject: Re: How to count composed characters in NSString?
  • From: David Niemeijer <email@hidden>
  • Date: Tue, 30 Sep 2008 09:20:53 +0200

Hi Peter,

On Sep 30, 2008, at 7:58 AM, Peter Edberg wrote:
> CFStringGetRangeOfComposedCharactersAtIndex and -[NSString rangeOfComposedCharacterSequenceAtIndex:] are the modern replacements for UCFindTextBreak with kUCTextBreakClusterMask, and indeed they are now closer to the original intent of kUCTextBreakClusterMask than the current implementation of kUCTextBreakClusterMask is (since UCFindTextBreak was converted to follow Unicode/ICU default text segmentation rules).
>
> The modern functions treat all of the following as a cluster:
> - A surrogate pair (of course, since it is a single character);
> - A base character followed by a sequence of combining marks (whether or not this is something that would be composed under NFC);
> - A Hangul syllable expressed as a sequence of conjoining jamo;
> - An Indic consonant cluster such as consonant + virama + consonant + vowel matra. It is this last kind of cluster that is no longer treated as a single entity by UCFindTextBreak with kUCTextBreakClusterMask.

Ok, understood. This looks good. Based on the discussion, I have updated my bug report 6253075. I think a "convenience" method that returns the cluster count would be very useful: it would probably be faster than a hand-rolled counter built on repeated calls to rangeOfComposedCharacterSequenceAtIndex:, and, simply by being available, it would reduce some of the confusion I sense on this list about the most appropriate way to count "characters". There would then be "length" to count the UTF-16 units and a "numberOfCharacters" to count the clusters, which come closest to the human conception of characters.


Thanks,

david.
_______________________________________________

Cocoa-dev mailing list (email@hidden)
