Re: How to convert a UTF-8 byte offset into an NSString character offset?
- Subject: Re: How to convert a UTF-8 byte offset into an NSString character offset?
- From: "Stephen J. Butler" <email@hidden>
- Date: Tue, 06 May 2014 00:19:07 -0500
What's your next step after converting the UTF-8 range to a UTF-16 range? If
it's just going to be -[NSString substringWithRange:], then I'd strongly
suggest calling -[NSString initWithBytes:length:encoding:] directly on that
byte range of the UTF-8 buffer instead. At least profile it and see what the
penalty is. You've already paid for the UTF-16 to UTF-8 conversion once, and
it's not clear that converting the substring back to UTF-16 costs much more
than the range conversion would. Profiling would show.
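
If the range conversion does turn out to be worth doing, a hand-rolled scan
is short. The following is only a rough sketch in plain C, not anything from
Foundation or CoreFoundation: the function name is made up for illustration,
and it assumes the buffer is valid UTF-8 and that the byte offset falls on a
character boundary. It counts UTF-16 code units, not Unicode code points,
because that is what NSString indexes by, so any 4-byte UTF-8 sequence
counts as two (a surrogate pair).

```c
#include <stddef.h>

/* Hypothetical helper (not a Foundation or CoreFoundation API).
 * Returns how many UTF-16 code units encode the first `byte_offset`
 * bytes of `utf8`.
 * Assumptions: the buffer is valid UTF-8, and `byte_offset` lies on a
 * character boundary and does not exceed the buffer length. */
size_t utf16_offset_for_utf8_offset(const unsigned char *utf8,
                                    size_t byte_offset)
{
    size_t units = 0;
    for (size_t i = 0; i < byte_offset; i++) {
        unsigned char b = utf8[i];
        if ((b & 0xC0) == 0x80) {
            continue;   /* continuation byte: its lead byte was already counted */
        }
        /* Lead bytes 0xF0 and above start a 4-byte sequence, i.e. a code
         * point outside the BMP, which UTF-16 stores as a surrogate pair. */
        units += (b >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

Calling it once for the start of a byte range and once for the end gives the
location and length of the matching NSRange. Each call scans from the start
of the buffer, so for many tokens in one string it would be cheaper to walk
the buffer once and convert the offsets in increasing order.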
On Mon, May 5, 2014 at 2:06 PM, Jens Alfke <email@hidden> wrote:
> How can I map a byte offset in a UTF-8 string back to the corresponding
> character offset in the NSString it came from?
>
> I’m writing an Objective-C wrapper around a C text-tokenizer API that
> takes a UTF-8 string as input, and as part of its output returns byte
> ranges of words that it found. So my API takes an NSString, converts it to
> UTF-8, passes that to the C API, and then gets these byte offsets that it
> needs to convert into character offsets in the NSString.
>
> I’ve looked through both the NSString and CFString APIs and didn’t see
> anything relevant to this. I know UTF-8 isn’t rocket science and I could
> pretty easily write my own function to scan through it counting characters,
> but I suspect I’d run into the differences between Unicode characters and
> the UTF-16 code units that NSString actually considers “characters”. I’d
> much rather let CF do this for me in an internally-consistent way.
>
> —Jens
> _______________________________________________
>
> Cocoa-dev mailing list (email@hidden)
>
> Please do not post admin requests or moderator comments to the list.
> Contact the moderators at cocoa-dev-admins(at)lists.apple.com