Re: rangeOfString behaves wierd
  • Subject: Re: rangeOfString behaves wierd
  • From: "Stephen J. Butler" <email@hidden>
  • Date: Mon, 09 Dec 2013 03:53:35 -0600

Would converting each string to NFD (decomposedStringWithCanonicalMapping)
before searching be an acceptable workaround in this case?
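
For anyone who wants to see what canonical decomposition does to these two strings without going through Foundation, here is a minimal Python sketch using the standard unicodedata module. It is not the Cocoa code path: the helper name nfd and the variable names are made up for illustration, and whether NSString's search gives the same answer after decomposedStringWithCanonicalMapping is an assumption that still needs checking in Xcode.

```python
import unicodedata

# Strings from the thread, written with escapes so copy+paste cannot
# silently replace the compatibility ideograph U+FA0A.
haystack = "\u89c1=\ufa0a\u898b"   # U+89C1, "=", U+FA0A, U+898B
needle = "\u89c1\u2260\u898b"      # U+89C1, U+2260 (not equal to), U+898B


def nfd(s):
    """Return s in Unicode Normalization Form D (canonical decomposition)."""
    return unicodedata.normalize("NFD", s)


def code_points(s):
    """List the code points of s as U+XXXX strings."""
    return [f"U+{ord(c):04X}" for c in s]


print("haystack NFD:", code_points(nfd(haystack)))
print("needle   NFD:", code_points(nfd(needle)))

# Plain code-point substring test on the decomposed forms.
found = nfd(needle) in nfd(haystack)
print("needle found after NFD:", found)
```

With the Unicode data shipped with current Python versions, U+FA0A decomposes to U+898B and U+2260 decomposes to U+003D followed by U+0338, so after NFD the needle is four code points long and a plain code-point search does not find it in the haystack.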


On Mon, Dec 9, 2013 at 3:43 AM, Stephen J. Butler <email@hidden> wrote:

> OK, you are right. Copy+paste didn't preserve the compatibility character.
> It does look like a bug of sorts, or at least something a Unicode expert
> should explain.
>
>
> On Mon, Dec 9, 2013 at 3:20 AM, Gerriet M. Denkmann <email@hidden> wrote:
>
>>
>> On 9 Dec 2013, at 16:00, Stephen J. Butler <email@hidden> wrote:
>>
>> > I don't get the same result. 10.9.0, Xcode 5.0.2. I created an empty
>> > command line utility, copied the code, and I get NSNotFound.
>> >
>> > 2013-12-09 02:50:19.822 Test[73850:303] main "见≠見" (3 shorts) occurs in "见=見見" (4 shorts) at {9223372036854775807, 0}
>>
>> Copying might invoke another bug.
>> Better check the characters, like:
>>
>> - (void)printString: (NSString *)line
>> {
>>         NSLog(@"%s \"%@\" has characters:", __FUNCTION__, line);
>>
>>         [ line enumerateSubstringsInRange: NSMakeRange( 0, [ line length ] )
>>                 options: NSStringEnumerationByComposedCharacterSequences
>>                 usingBlock: ^(NSString *currChar, NSRange currCharRange, NSRange enclosingRange, BOOL *stop)
>>                 {
>>                         (void)enclosingRange;
>>                         (void)stop;
>>
>>                         //      UTF-32 in host byte order: each 4-byte unit is one code point
>>                         #ifdef __LITTLE_ENDIAN__
>>                                 NSStringEncoding encoding = NSUTF32LittleEndianStringEncoding;
>>                         #else
>>                                 NSStringEncoding encoding = NSUTF32BigEndianStringEncoding;
>>                         #endif
>>                         NSData *data = [ currChar dataUsingEncoding: encoding ];
>>
>>                         NSUInteger nbrBytes = [ data length ];
>>                         NSUInteger nbrChars = nbrBytes / sizeof(unsigned int);
>>
>>                         if ( nbrChars * sizeof(unsigned int) != nbrBytes )      //      error
>>                         {
>>                                 NSLog(@"%s Error: strange nbr of bytes %lu", __FUNCTION__, (unsigned long)nbrBytes);
>>                                 return;
>>                         }
>>
>>                         unsigned int codePoint[nbrChars];
>>                         [ data getBytes: codePoint length: nbrBytes ];
>>
>>                         NSMutableString *s = [ NSMutableString stringWithFormat: @"%@ = ", NSStringFromRange(currCharRange) ];
>>                         for( NSUInteger i = 0; i < nbrChars; i++ )
>>                         {
>>                                 [ s appendFormat: @"%#06x ", codePoint[i] ];
>>                         }
>>
>>                         [ s appendFormat: @"= \"%@\"", currChar ];
>>
>>                         fprintf(stderr, "%s\n", [ s UTF8String ]);
>>                 }
>>         ];
>> }
>>
>> and check for:
>> "见=見見" has characters:
>> {0, 1} = 0x89c1 = "见"
>> {1, 1} = 0x003d = "="
>> {2, 1} = 0xfa0a = "見"
>> {3, 1} = 0x898b = "見"
>> "见≠見" has characters:
>> {0, 1} = 0x89c1 = "见"
>> {1, 1} = 0x2260 = "≠"
>> {2, 1} = 0x898b = "見"
>>
>> >
>> > On Mon, Dec 9, 2013 at 2:43 AM, Gerriet M. Denkmann <email@hidden> wrote:
>> >
>> > On 9 Dec 2013, at 15:05, Quincey Morris <email@hidden> wrote:
>> >
>> > > On Dec 8, 2013, at 23:46 , Gerriet M. Denkmann <email@hidden> wrote:
>> > >
>> > >> NSString *b = @"见≠見";                //      0x89c1  0x2260  0x898b
>> > >
>> > > So what are the results with:
>> > >
>> > >> NSString *b = @"见";
>> > >> NSString *b = @"≠";
>> > >> NSString *b = @"見";
>> > > ?
>> > >
>> > >  Does specifying an explicit locale make any difference?
>> >
>> > Explicitly specifying en_US (probably the best tested and debugged
>> > locale) makes no difference.
>> >
>>
>>
>