Re: Finder Info
- Subject: Re: Finder Info
- From: Ken Thomases <email@hidden>
- Date: Mon, 27 Aug 2012 13:50:50 -0500
On Aug 27, 2012, at 12:43 PM, David Duncan wrote:
> On Aug 27, 2012, at 9:50 AM, Sean McBride <email@hidden> wrote:
>
>> On Sat, 25 Aug 2012 17:58:39 +0200, Uli Kusterer said:
>>
>>>>> const UInt8 *cpath = (const UInt8 *)[path
>>> cStringUsingEncoding:NSUTF8StringEncoding];
>>>>
>>>> -UTF8String is shorter.
>>>
>>> Both of these are wrong, though. You should *always* use
>>> -fileSystemRepresentation when you need a C-string representation of a
>>> path. Otherwise you might get decomposed characters that don't match the
>>> actual way the characters are stored on disk, and will create a second
>>> file with an almost-indistinguishable name.
>>
>> Could you provide an example filename where UTF8String and fileSystemRepresentation give something different? I'd like to run it through QA...
>
>
> The primary difference is that the fileSystemRepresentation uses a particular form of Unicode composition (if I recall correctly, Normalization Form D, the canonical decomposed form). As such, if you had a UTF-8 encoded Ä (Latin Capital Letter A with Diaeresis) vs A¨ (Latin Capital Letter A followed by Combining Diaeresis) then you could get different results (and I believe the former would be an invalid file name).
>
> Note: this is primarily from memory so this exact example may or may not fail, but should give you a framework for finding ones that do.
For reference, from <https://developer.apple.com/library/mac/documentation/MacOSX/Conceptual/BPInternational/Articles/FileEncodings.html#//apple_ref/doc/uid/20002137-SW1>:
> All BSD system functions expect their string parameters to be in UTF-8 encoding and nothing else. Code that calls BSD system routines should ensure that the contents of all const *char parameters are in canonical UTF-8 encoding. In a canonical UTF-8 string, all decomposable characters are decomposed; for example, é (0x00E9) is represented as e (0x0065) + ´ (0x0301). To put things into a canonical UTF-8 encoding, use the “file-system representation” interfaces defined in Cocoa (including Core Foundation).
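To make the quoted example concrete, here is a minimal sketch in plain C of the two UTF-8 byte sequences for é. The names kPrecomposed, kDecomposed and same_bytes are mine, purely for illustration, and this only shows the raw bytes; it does not call any Cocoa or Core Foundation API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* UTF-8 bytes for "é" in the two forms the quoted documentation describes.
   Both names are hypothetical, chosen for this illustration only. */
static const char kPrecomposed[] = "\xC3\xA9";   /* U+00E9 as one code point */
static const char kDecomposed[]  = "e\xCC\x81";  /* U+0065 followed by U+0301 */

/* Returns 1 if the two C strings are byte-for-byte identical, else 0. */
static int same_bytes(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

The precomposed form is two bytes and the decomposed form is three, so same_bytes(kPrecomposed, kDecomposed) returns 0 even though both render as the same character.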
Also, of note, from <http://developer.apple.com/library/mac/qa/qa1173/_index.html#//apple_ref/doc/uid/DTS10001705-CH1-SECCOMPATIBILITYNOTES>, which is targeted at those implementing file systems for Mac OS X:
> In theory the techniques described above can cause compatibility problems for applications. For example, if an application creates a file using a precomposed name and then iterates through the directory looking for that file using a simple binary string comparison, it won't find the file. In practice this is rarely a problem.
If your code does this sort of thing, you should use -[NSString compare:] (or -compare:options:... without the NSLiteralSearch option) to compare strings. Don't use -isEqual: or -isEqualToString: because those do a literal comparison, which treats precomposed and decomposed forms of the same character as unequal.
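As a rough sketch of the failure mode that Q&A describes, here it is in plain C. The directory listing is a hard-coded stand-in (I'm assuming the volume hands names back decomposed), and kListing, kRequested and find_binary are hypothetical names, not anything from a real API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical directory listing, assuming the file system returns names in
   decomposed form: "cafe" followed by U+0301, then ".txt". */
static const char *const kListing[] = { "notes.txt", "cafe\xCC\x81.txt" };
static const size_t kListingCount = sizeof kListing / sizeof kListing[0];

/* The precomposed name the application used when it created the file. */
static const char kRequested[] = "caf\xC3\xA9.txt";

/* Naive lookup with a binary string comparison.
   Returns the index of the matching entry, or -1 if none matches. */
static int find_binary(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < kListingCount; i++) {
        if (strcmp(kListing[i], name) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

Here find_binary(kRequested) returns -1 because the bytes differ, which is exactly the case where a comparison that respects canonical equivalence is needed instead.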
Regards,
Ken