Re: Unicode mapping of strings: worth the effort?
- Subject: Re: Unicode mapping of strings: worth the effort?
- From: Dan Sugalski <email@hidden>
- Date: Mon, 21 Jul 2003 13:45:51 -0400
At 6:23 PM +0100 7/21/03, Pete French wrote:
Basically it comes down to you needing to define what 'the same' means for
your application. What you probably want to do is to use the decomposed
compatible forms of all the strings. What I do is hold all my strings in this
form internally, and I only make the conversion on anything which the user
types in.
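
As a rough illustration of the approach described above, here is a minimal sketch in Python rather than Cocoa, using the standard `unicodedata` module. The helper name `to_internal` is hypothetical, and "decomposed compatible form" is assumed to mean Unicode normalization form NFKD (on the Cocoa side, NSString's `decomposedStringWithCompatibilityMapping` performs the same kind of conversion).

```python
import unicodedata


def to_internal(text: str) -> str:
    """Hypothetical helper: convert incoming text to NFKD
    (compatibility decomposition) before storing or comparing it."""
    return unicodedata.normalize("NFKD", text)


typed = "caf\u00e9"     # user input with a precomposed e-acute (U+00E9)
stored = "cafe\u0301"   # same word as 'e' followed by a combining acute (U+0301)

# The raw strings differ code point by code point...
print(typed == stored)
# ...but compare equal once both are in the decomposed form.
print(to_internal(typed) == to_internal(stored))
```

Only text arriving from outside needs the conversion; strings already held internally stay in the one form, so plain equality is enough from then on.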
Using the decomposed form won't lose anything.
This isn't quite true, though close enough for most people. The one
possible downside to going decomposed universally (though a minor
downside) is that if you read in a file of Unicode data that isn't
decomposed then write it back out after decomposing, the two files
will look different. Sizes and content (when treated as binary data)
will be different. Not a big deal, but it can throw off some of the
text processing tools that are Unicode-naive.
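
To make the size difference concrete, here is a small sketch in Python (standard `unicodedata` module). It assumes the data is stored as UTF-8 and that "decomposing" means NFKD; the sample word is just an illustration.

```python
import unicodedata

# Text as it might arrive from a file: both accented letters are precomposed.
original = "r\u00e9sum\u00e9"
original_bytes = original.encode("utf-8")

# The same text after decomposing, as it would be written back out.
rewritten = unicodedata.normalize("NFKD", original)
rewritten_bytes = rewritten.encode("utf-8")

print(len(original_bytes))                # 8
print(len(rewritten_bytes))               # 10
print(original_bytes == rewritten_bytes)  # False
```

Each precomposed letter occupies two bytes in UTF-8, while its decomposed form is a one-byte base letter plus a two-byte combining mark, so a byte-for-byte comparison of the two files fails even though the text renders the same.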
If that's not an issue, go decomposed. It makes life ever so much
easier as long as you're careful with what you do with the text.
(Which you really have to be anyway, as it's possible to have a fully
composed Unicode text stream that *still* has combining characters in
it.)
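
A short sketch of that last point, again in Python with the standard `unicodedata` module. The helper `has_combining` is hypothetical, and "fully composed" is assumed to mean normalization form NFC. The letter q with a tilde has no precomposed code point, so composing leaves the combining mark in place.

```python
import unicodedata


def has_combining(text: str) -> bool:
    """Hypothetical helper: True if any code point is a combining mark
    (non-zero canonical combining class)."""
    return any(unicodedata.combining(ch) != 0 for ch in text)


# 'e' + combining acute composes into the single code point U+00E9.
e_acute = unicodedata.normalize("NFC", "e\u0301")
print(len(e_acute), has_combining(e_acute))   # 1 False

# 'q' + combining tilde has nothing to compose into, so it stays as two.
q_tilde = unicodedata.normalize("NFC", "q\u0303")
print(len(q_tilde), has_combining(q_tilde))   # 2 True
```

Code that walks text one code point at a time has to cope with combining marks whichever form the text is stored in.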
--
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
email@hidden have teddy bears and even
teddy bears get drunk