Re: How to convert a UTF-8 byte offset into an NSString character offset?
- Subject: Re: How to convert a UTF-8 byte offset into an NSString character offset?
- From: Ken Thomases <email@hidden>
- Date: Tue, 06 May 2014 20:56:35 -0500
On May 5, 2014, at 2:06 PM, Jens Alfke wrote:
> How can I map a byte offset in a UTF-8 string back to the corresponding character offset in the NSString it came from?
I don't think there's a great way.
You can do the reverse: map a character offset (really a UTF-16 code unit offset) to a UTF-8 byte offset using CFStringGetBytes(). You'd pass kCFStringEncodingUTF8 as the encoding, a range that starts at 0 and whose length is the index you want to map, and NULL for the buffer. It will fill in *usedBufLen with the length in bytes that the conversion would require, which is the UTF-8 offset of that index.
You could build the reverse map by doing that repeatedly for each character index, but that would be expensive. You'd also have to tolerate failure in case a given character index can't be converted (if it references half of a surrogate pair, for example).
So, I suspect that your best bet will be to do the conversion to UTF-8 yourself and build the index map as you go.
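To make that concrete, here is a rough sketch in plain C of building the map in a single pass. It only computes UTF-8 lengths rather than emitting the bytes, and it assumes you have already copied the string's UTF-16 code units into an array (for example with -getCharacters:range:). The function name build_utf8_to_utf16_map and the choice to count an unpaired surrogate as 3 bytes (the size of U+FFFD) are my own assumptions, not anything Foundation provides, so check that policy against whatever actually produced your UTF-8.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not a Foundation or CoreFoundation API.
 *
 * Returns a malloc'd table where table[b] is the UTF-16 code unit index
 * at which the character containing UTF-8 byte offset b begins.
 * table[*out_bytes] is set to n, so the end-of-string offset also maps.
 * The caller must free() the result. Returns NULL on allocation failure.
 *
 * Assumption: an unpaired surrogate is counted as 3 bytes, as if it had
 * been replaced by U+FFFD. A real converter may behave differently. */
size_t *build_utf8_to_utf16_map(const uint16_t *units, size_t n,
                                size_t *out_bytes)
{
    /* Worst case is 3 UTF-8 bytes per UTF-16 code unit
     * (a surrogate pair is 2 units but only 4 bytes). */
    size_t *map = malloc((3 * n + 1) * sizeof *map);
    if (map == NULL) {
        return NULL;
    }

    size_t bytes = 0; /* UTF-8 bytes accounted for so far */
    size_t i = 0;     /* current UTF-16 code unit index */

    while (i < n) {
        uint16_t u = units[i];
        size_t unit_count = 1;
        size_t byte_count;

        if (u < 0x80) {
            byte_count = 1;
        } else if (u < 0x800) {
            byte_count = 2;
        } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < n &&
                   units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
            /* Valid surrogate pair: one code point above U+FFFF. */
            unit_count = 2;
            byte_count = 4;
        } else {
            /* Other BMP code units, including unpaired surrogates. */
            byte_count = 3;
        }

        /* Every byte of this character maps back to its first unit. */
        for (size_t k = 0; k < byte_count; k++) {
            map[bytes++] = i;
        }
        i += unit_count;
    }

    map[bytes] = n;
    *out_bytes = bytes;
    return map;
}
```

The table costs one size_t per UTF-8 byte, which buys constant-time lookups afterwards. If memory matters more than lookup speed, you could instead record a checkpoint every few hundred bytes and scan forward from the nearest one.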
Regards,
Ken