Re: Using Flex/Lex in a Cocoa project
- Subject: Re: Using Flex/Lex in a Cocoa project
- From: Ricky Sharp <email@hidden>
- Date: Tue, 19 Aug 2008 15:54:24 -0500
On Aug 18, 2008, at 10:57 PM, Michael Ash wrote:
> Note that depending on what kind of results you want, even if all of
> your data is within the BMP, this *still* won't save you.
>
> As a really basic example, consider a simple, obvious character like
> é. (That's an e with an acute accent on it if you're having unicode
> trouble in your e-mail client.) That can be represented as two
> separate unicode code points, a plain old ASCII e followed by a
> combining accent mark. If you should happen to split the string on the
> accent mark, such that the e goes into the first half and the
> combining accent mark goes into the second half, you get a really
> unintuitive result. What appears to the user to be a single character
> gets suddenly blown in two. Worse, if you happen to insert a string in
> the middle, you could end up applying that acute accent to some
> *other* letter instead.

Sorry, failed to mention that our UTF-16BE data was also normalized to
pre-composed Unicode. So this case was handled.
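
For anyone following along, here is a minimal sketch of the split problem and of what normalizing to pre-composed form (NFC) buys you. It uses Python's standard unicodedata module purely for illustration; the sample word and the variable names are made up for this example and are not taken from our code.

```python
import unicodedata

# "cafe" + U+0301 COMBINING ACUTE ACCENT: five code points, shown as four characters.
decomposed = "cafe\u0301"

# Splitting at index 4 strands the accent at the start of the second half.
head, tail = decomposed[:4], decomposed[4:]

# Inserting at the same index puts the accent on the inserted letter instead.
inserted = decomposed[:4] + "s" + decomposed[4:]

# NFC folds 'e' + U+0301 into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# In UTF-16BE the pre-composed é is one 16-bit unit (bytes 00 E9).
utf16be = composed.encode("utf-16-be")
```

Keep in mind that NFC only helps where a pre-composed code point exists. Some base-plus-mark combinations have none, so the result can still contain combining marks.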
You mentioned Korean (which I have yet to play around with), but for
another grand ol' time, try Arabic. You get into something called
"positional variants". But alas, that's outside the scope of this list.
I think the moral of the story here is that when working with Unicode
data, it's best to normalize such data and then ensure APIs operating
on the data are Unicode savvy.
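
As a rough sketch of what "Unicode savvy" can mean for a split operation: the hypothetical helper below (safe_split is a name invented for this example) nudges the split point forward past any combining marks so they stay with their base letter. It is written in Python for illustration and only looks at canonical combining classes. It does not implement full grapheme cluster rules, so Hangul jamo sequences and similar cases would need a proper text segmentation API.

```python
import unicodedata

def safe_split(text, index):
    """Split text at index without separating a base character
    from the combining marks that follow it."""
    # Move the cut forward while the next code point is a combining mark
    # (a nonzero canonical combining class).
    while index < len(text) and unicodedata.combining(text[index]) != 0:
        index += 1
    return text[:index], text[index:]

# Normalize first, then split. 'q' + U+0301 has no pre-composed form,
# so the combining mark survives NFC and the helper still matters.
text = unicodedata.normalize("NFC", "q\u0301uiz")
first, rest = safe_split(text, 1)
```

In Cocoa, NSString's rangeOfComposedCharacterSequenceAtIndex: serves a similar purpose, so a helper like this should not be needed there.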
Thankfully, as you've pointed out, the NSString etc. APIs shield folks
from many of the gory details.
___________________________________________________________
Ricky A. Sharp mailto:email@hidden
Instant Interactive(tm) http://www.instantinteractive.com
_______________________________________________
Cocoa-dev mailing list (email@hidden)