NSString Unicode canonical decompositions?
- Subject: NSString Unicode canonical decompositions?
- From: Fritz Anderson <email@hidden>
- Date: Thu, 28 Mar 2002 20:43:27 -0600
I wonder if there is a built-in or readily accessible method for
converting an NSString to its canonical decomposition? I know HFS+
stores file names in a decomposed Unicode form, which suggests that this
wheel has already been invented, but a search of the documentation didn't
turn anything up for me.
But we all know how doc searches work -- you can't find anything before
it's embarrassing.
I'm writing a tutorial that touches on turning NSStrings into
decompositions, and I'd feel silly going on at length over something
that's already in the can.
(For those of you scoring at home (and if you are scoring at home, why
are you fooling around with the computer?), Unicode encodes a great many
combinations of accents and Latin characters -- LATIN CAPITAL
LETTER U WITH DIAERESIS AND MACRON (U+01D5) -- but also encodes the base
characters and accents separately -- LATIN CAPITAL LETTER U (U+0055),
COMBINING DIAERESIS (U+0308), COMBINING MACRON (U+0304). It specifies that
the first code I mentioned is canonically equivalent to the other three, in
that order. Add the half-composed sequence, LATIN CAPITAL LETTER U WITH
DIAERESIS (U+00DC) followed by COMBINING MACRON, and that makes three ways
to represent the same character. (U WITH MACRON followed by COMBINING
DIAERESIS is not a fourth: the order of the marks matters, and that
sequence is the separate character U WITH MACRON AND DIAERESIS, U+1E7A.)
All three can be knocked down to the fully decomposed form, which is the
canonical decomposition. That form is easier to work with for many
purposes, such as comparing strings.)
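
To make the parenthetical concrete, here is a minimal sketch of that equivalence. It is not Cocoa: it assumes Python and its standard unicodedata module, and the function name canonical_decomposition is made up for the illustration. It only shows that the different spellings of the character collapse to one decomposed sequence.

```python
import unicodedata

# Three spellings of the same character.
composed = "\u01d5"            # LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON
half_composed = "\u00dc\u0304" # U WITH DIAERESIS + COMBINING MACRON
decomposed = "U\u0308\u0304"   # U + COMBINING DIAERESIS + COMBINING MACRON


def canonical_decomposition(s):
    """Return s in Unicode Normalization Form D (canonical decomposition).

    Hypothetical helper name; the work is done by unicodedata.normalize.
    """
    return unicodedata.normalize("NFD", s)


if __name__ == "__main__":
    for s in (composed, half_composed, decomposed):
        nfd = canonical_decomposition(s)
        # List the code points and character names of the decomposed form.
        print([f"U+{ord(c):04X} {unicodedata.name(c)}" for c in nfd])
```

The spellings compare unequal as raw strings but equal once each has been decomposed, which is the sense in which the canonical form is easier to work with.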
-- F