Re: Unicode filenames with Apple File System and UIManagedDocument
- Subject: Re: Unicode filenames with Apple File System and UIManagedDocument
- From: email@hidden
- Date: Thu, 30 Mar 2017 08:17:27 -0400
> On Mar 23, 2017, at 8:01 PM, email@hidden wrote:
>
>
>> On Mar 23, 2017, at 12:24 PM, David Duncan <email@hidden> wrote:
>>
>> I just want to remind everyone I’m *not* a file systems engineer – I’m just trying to help Dave (and anyone else caught in this) make sure their app can find their files.
>>
>>> On Mar 23, 2017, at 1:53 AM, Alastair Houghton <email@hidden> wrote:
>>>
>>> On 22 Mar 2017, at 18:00, David Duncan <email@hidden> wrote:
>>>>
>>>> So there was another explanation posted on the bug that I’m not certain you got, but which I think may explain.
>>>>
>>>> Basically the concept is that since APFS doesn’t normalize file names, if you store file names in some other storage (say in your preferences) then what could happen is this:
>>>>
>>>> 10.2: File is saved with a file name handed to the file system in NFC form. File system converts the file name to NFD. You store it as NFC.
>>>> 10.3: File system is converted to APFS, and the file name is NFD. You try to look up the file as NFC, and it fails.
>>>
>>> This is going to cause problems, though, when things migrate from HFS+ to APFS, because the HFS normalisation *isn’t* a standard one. In particular, it certainly *isn’t* NFD for the current version of Unicode.
>>
>> Yes, that is the crux of Dave’s issue – HFS+ => APFS only translated the file names (from UTF-16 to UTF-8), it did not re-normalize them.
>>
>>> The only obvious solution for that would be to have the HFS+ to APFS migration tool *re-normalise* the filenames (maybe it does?), but that’s bound to break things in the (presumably quite common) case where the filename stored in e.g. a plist was originally obtained from the filesystem.
>>
>> Arguably there is no way for the file system converter to know how it should renormalize file names. This is akin to case sensitive vs case insensitive file systems. If you ran a converter from a case insensitive file system to a case sensitive one, you could preserve the capitalization during the conversion, but file lookups that used the wrong case would fail after the conversion. But the converter can’t know you want to look up “foo” via “FOO” or “Foo” to do any kind of normalization. The difference here is that for the most part unicode normalization is invisible to the developer.
>>
>>>
>>> Kind regards,
>>>
>>> Alastair.
>>>
>>> --
>>> http://alastairs-place.net
>>>
>>
>> --
>> David Duncan
>
>
> I appreciate the help you (and everyone else) have given. I should be able to add an option to rescan what files are there. And I'll make time this weekend to submit a DTS incident and see what answer I can get and share it here. I do suspect I won't be the only one bitten by this.
>
> Thanks,
> Dave Reed
I received a reply from DTS with lots of information and references. Their first suggestion was to store, in the plist file that holds my list of "courses", both the name the user enters and a separate filename that can't have the issue (I assume they mean an ASCII name such as a GUID). The other suggestion was to iterate over the files and apply the same decomposition to both the filename and the user-entered string to find the match. That's what I did over the weekend, and the update is now available in the App Store.
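For anyone else caught by this, here is a rough sketch of the "apply the same decomposition to both sides" idea. It is written in Python only to keep it short and is not the code that shipped in the app. The function names (canonical, match_name, find_on_disk) are made up for illustration, and it assumes standard NFD is an acceptable common form for the comparison. In Cocoa the analogous step would presumably use NSString's decomposedStringWithCanonicalMapping on both strings.

```python
import os
import unicodedata
from typing import Iterable, Optional


def canonical(name: str) -> str:
    """Return an NFD-decomposed copy of name, used only for comparison."""
    return unicodedata.normalize("NFD", name)


def match_name(entries: Iterable[str], wanted: str) -> Optional[str]:
    """Return the entry that is canonically equivalent to wanted.

    The entry is returned exactly as it was given (that is, as stored on
    disk), so the caller can open the file with the name the file system
    actually knows about.
    """
    target = canonical(wanted)
    for entry in entries:
        # Normalize both sides the same way before comparing.
        if canonical(entry) == target:
            return entry
    return None


def find_on_disk(directory: str, wanted: str) -> Optional[str]:
    """Hypothetical helper: scan a directory for a match to a stored name."""
    return match_name(os.listdir(directory), wanted)
```

The name that comes back is the one to pass to the file system, and it is probably also worth writing it back to the plist so later lookups match directly.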
Dave Reed
_______________________________________________
Cocoa-dev mailing list (email@hidden)