How to convert a UTF-8 byte offset into an NSString character offset?

  • Subject: How to convert a UTF-8 byte offset into an NSString character offset?
  • From: Jens Alfke <email@hidden>
  • Date: Mon, 05 May 2014 12:06:14 -0700

How can I map a byte offset in a UTF-8 string back to the corresponding character offset in the NSString it came from?

I’m writing an Objective-C wrapper around a C text-tokenizer API that takes a UTF-8 string as input, and as part of its output returns byte ranges of words that it found. So my API takes an NSString, converts it to UTF-8, passes that to the C API, and then gets these byte offsets that it needs to convert into character offsets in the NSString.

I’ve looked through both the NSString and CFString APIs and didn’t see anything relevant to this. I know UTF-8 isn’t rocket science and I could pretty easily write my own function to scan through it counting characters, but I suspect I’d run into the differences between Unicode code points and the UTF-16 code units that NSString actually considers “characters”. I’d much rather let CF do this for me in an internally consistent way.
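For reference, the hand-rolled scan mentioned above could look roughly like the C sketch below. It is not a CF or Foundation API; the function name utf16_offset_for_utf8_offset is hypothetical, and the sketch assumes the input is valid UTF-8 and that the byte offset falls on a character boundary. It counts UTF-16 code units rather than code points, since that is the unit NSString indexes by: each UTF-8 sequence counts as one unit, except 4-byte sequences, which become a surrogate pair and count as two.

```c
#include <stddef.h>

/* Hypothetical helper (not part of CF or Foundation).
 * Returns the number of UTF-16 code units encoded by the first
 * byte_offset bytes of utf8. Assumes valid UTF-8 and an offset that
 * lies on a character boundary. */
size_t utf16_offset_for_utf8_offset(const char *utf8, size_t byte_offset)
{
    size_t units = 0;

    for (size_t i = 0; i < byte_offset; i++) {
        unsigned char b = (unsigned char)utf8[i];

        /* Continuation bytes (10xxxxxx) belong to a sequence whose
         * lead byte was already counted. */
        if ((b & 0xC0) == 0x80)
            continue;

        /* A lead byte of 0xF0 or above starts a 4-byte sequence, i.e. a
         * code point above U+FFFF, which UTF-16 stores as two units. */
        units += (b >= 0xF0) ? 2 : 1;
    }

    return units;
}
```

Given the UTF-8 buffer handed to the tokenizer and a byte range it returns, calling this once for the start of the range and once for the end would give the location and, by subtraction, the length of the matching NSRange. Each call rescans from the start of the buffer, so for many tokens it would be cheaper to carry the running count forward from the previous token.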

—Jens