Re: [Q] UTF-8 stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding weirdness
- Subject: Re: [Q] UTF-8 stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding weirdness
- From: JongAm Park <email@hidden>
- Date: Wed, 18 Jun 2008 14:33:05 -0700
I hadn't even thought about normalization. Wow, it is quite
complicated.
I tried the four methods,
-precomposedStringWith[Canonical/Compatibility]Mapping and
-decomposedStringWith[Canonical/Compatibility]Mapping.
The result was that -[NSString UTF8String] returns the "precomposed" version
That's not quite accurate. Any given string will be in precomposed or
decomposed form (or it might not be normalized to either form, and
have a mix). Whatever form that string is in, -UTF8String will
maintain it. So, -UTF8String doesn't necessarily return "precomposed"
form, it just so happens that the string you got was already in
precomposed form.
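
To illustrate the point above outside of Cocoa, here is a minimal sketch
in Python (an analogue, not the NSString API). It assumes Python's
standard unicodedata module and uses "é" as a sample character. Encoding
to UTF-8 does not normalize anything: the bytes simply follow whichever
form the string was already in.

```python
import unicodedata

# "é" as a single code point (precomposed, NFC).
precomposed = "\u00e9"
# The same character as "e" + U+0301 COMBINING ACUTE ACCENT (decomposed, NFD).
decomposed = unicodedata.normalize("NFD", precomposed)

# UTF-8 encoding preserves whatever form the string is in.
utf8_precomposed = precomposed.encode("utf-8")
utf8_decomposed = decomposed.encode("utf-8")

print(utf8_precomposed)  # b'\xc3\xa9'
print(utf8_decomposed)   # b'e\xcc\x81'
```

The two byte sequences differ even though both strings display
identically, which is the same effect seen when -UTF8String is called on
strings in different normalization forms.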
You are right. It depends on the original string. NSString is quite
smart...
, while the one used in the FCP was "decomposed".
The low-level file-system APIs on Mac OS X use what Apple calls
"file-system representation", which is mostly decomposed (NFD) with
some specific exceptions. So, any time you obtain a file name from
the file-system -- by enumerating a directory or from an NSOpenPanel,
for example -- it's likely to be mostly decomposed. This is true even
if the name originally used to create the file was passed in
precomposed form.
If you want the string in a specific normalization form for some
reason, you need to transform it using the above methods. Don't rely
on "file-system representation" being in any particular form. You can
compare strings without regard for normalization form using one of the
-compare:... methods and _not_ specifying NSLiteralSearch. Note that
isEqual: and isEqualToString: _do_ specify NSLiteralSearch (or the
equivalent) and so can report NO for two strings which display
identically.
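
The difference between a literal comparison and a normalization-insensitive
one can be sketched in Python (again an analogue, not the Cocoa API). The
helper name "equivalent" is hypothetical, and the sketch only approximates
the canonical-equivalence part of what -compare: does; it does not
reproduce its ordering or locale behaviour.

```python
import unicodedata

def equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare after normalizing both strings to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

composed = "caf\u00e9"                                # precomposed "é"
decomposed = unicodedata.normalize("NFD", composed)   # "e" + U+0301

# Code point by code point, roughly what a literal comparison sees.
literal_equal = composed == decomposed
# Normalization-insensitive comparison.
canonical_equal = equivalent(composed, decomposed)

print(literal_equal)    # False
print(canonical_equal)  # True
```

Normalizing both sides to the same form before comparing is also the way
to get a string into one specific form when something downstream requires
it.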
Cheers,
Ken
I tested with the compare: method. It reported "Same" when a
decomposed string was compared with a precomposed string.
So, when handling Unicode, it is safer to use the compare: method
instead of isEqual:.
(NSString even provides localized string comparison. I'm
impressed!)
Thank you for the good information. Although I have used NSString, I
didn't know what those methods really meant. Now my eyes are open!