Re: Unicode filenames with Apple File System and UIManagedDocument
- Subject: Re: Unicode filenames with Apple File System and UIManagedDocument
- From: Alastair Houghton <email@hidden>
- Date: Thu, 23 Mar 2017 18:19:11 +0000
On 23 Mar 2017, at 17:57, Ed Wynne <email@hidden> wrote:
>
>> Shouldn’t the VFS layer actually be doing this? It is part of its whole raison d’être, no? Just have -[NSURL fileSystemRepresentation] normalize things according to the correct Unicode rules, and let the VFS layer translate that to HFS+’s normalization style when dealing with HFS+.
>
> Yes, this.
>
> Having the conversion only available up in the Cocoa layer is an incredibly poor choice. It effectively means nothing at the BSD layer will be able to properly normalize file names. Having it at the VFS layer is the most sane option, even with the problems that causes.
It can’t really take place at the VFS layer, because the appropriate normalisation is filesystem-specific: some filesystems don’t normalise, others do, and the exact rules differ.
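To make the underlying problem concrete, here is a minimal sketch (in Python, purely for illustration) showing that the same visible name has different byte sequences depending on the normalisation form. The filename is made up. Note that HFS+ uses its own variant of decomposed form, so the standard NFD used here is only an approximation of what that driver does.

```python
import unicodedata

# Hypothetical filename containing a precomposed e-acute (U+00E9).
name = "caf\u00e9.txt"

nfc = unicodedata.normalize("NFC", name)  # composed: U+00E9
nfd = unicodedata.normalize("NFD", name)  # decomposed: 'e' + U+0301

# The two strings render identically but are not equal, and a filesystem
# that compares names byte-for-byte treats them as different files.
print(nfc == nfd)
print(nfc.encode("utf-8"))
print(nfd.encode("utf-8"))
```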
It *could* take place in the filesystem driver, as happens currently for HFS+. The problem with that is that while your software will work fine on HFS+, it might break if given a different filesystem to run on, which is kind of what this thread is all about, no? (And we already had similar problems with case-sensitive HFS+, which tends to break certain big brand-name applications.) I have to say I’m generally in favour of APFS normalising Unicode names, but I can understand that there are reasons the APFS team might have decided not to (it’s really up to them to elucidate what those reasons were).
This is a rather horrible area of filesystem work, made worse by the fact that many historic filesystems don’t even bother storing what character encoding was used. Indeed, on such systems it’s even possible that users will use different encodings in different directories (:-()
Clearly, encoding detailed knowledge of appropriate normalisation on a per-filesystem basis in end-user applications is not a sensible approach here. Apple suggesting that we normalise filenames before passing them to the BSD layer wouldn’t be the end of the world, but it might result in some applications not being able to cope with some otherwise valid filenames because the name on disk differs from the chosen normalisation.
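A small sketch of that failure mode, assuming a filesystem that stores and compares names byte-for-byte. The `on_disk` set and `app_lookup` function are hypothetical stand-ins, not a real API: a file was created elsewhere with a decomposed name, and the application dutifully normalises to NFC before looking it up.

```python
import unicodedata

# Hypothetical stand-in for a directory on a filesystem that does no
# normalisation: the stored name happens to be in decomposed form.
on_disk = {unicodedata.normalize("NFD", "r\u00e9sum\u00e9.txt").encode("utf-8")}


def app_lookup(name: str) -> bool:
    """Normalise to NFC first, then do an exact byte-for-byte lookup."""
    key = unicodedata.normalize("NFC", name).encode("utf-8")
    return key in on_disk


# The file exists, but the normalised name no longer matches the bytes
# on disk, so the application cannot find it.
print(app_lookup("r\u00e9sum\u00e9.txt"))
```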
Another option might be to add some flags to the BSD open() API (for instance, O_UNICODE and O_CASEFOLD) that cause it to use a Unicode-aware comparison routine inside the filesystem implementation. The idea is that it will open a file with the exact name passed if it exists or, if that file doesn’t exist, enumerate the containing directory looking for one that matches. Sadly, this enumeration would need to be recursive, repeating the lookup for each parent directory in the path (since a directory name might have the same problem). The Foundation framework could then use the new flags to obtain reasonable behaviour.
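For illustration only, the lookup described above can be approximated in user space. This is a rough sketch, not how a kernel implementation would look: O_UNICODE and O_CASEFOLD do not exist, the `resolve` and `_key` names are invented here, POSIX-style paths are assumed, and the choice of NFD as the comparison form is arbitrary. It is also not atomic, so it is open to races that an in-kernel version could avoid.

```python
import os
import unicodedata


def _key(name: str, casefold: bool) -> str:
    """Comparison key: NFD-normalised, optionally case-folded."""
    if casefold:
        # Fold case on the normalised form; folding can denormalise,
        # so the result is normalised again below.
        name = unicodedata.normalize("NFD", name).casefold()
    return unicodedata.normalize("NFD", name)


def resolve(path: str, casefold: bool = False) -> str:
    """Return an on-disk path equivalent to `path`, matched per component.

    Each component is tried exactly first; only if that fails is the
    containing directory enumerated for an equivalent name.
    Raises FileNotFoundError if some component has no match.
    """
    path = os.path.normpath(path)
    current = os.sep if os.path.isabs(path) else os.curdir
    for part in (p for p in path.split(os.sep) if p):
        exact = os.path.join(current, part)
        if os.path.lexists(exact):
            current = exact
            continue
        wanted = _key(part, casefold)
        for candidate in os.listdir(current):
            if _key(candidate, casefold) == wanted:
                current = os.path.join(current, candidate)
                break
        else:
            raise FileNotFoundError(path)
    return current
```

A caller would then pass the result to the ordinary open call, for example `open(resolve(user_path))`. Note that the cost of a miss is a full directory enumeration per path component, which is part of why this is unattractive outside the filesystem implementation.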
Kind regards,
Alastair.
--
http://alastairs-place.net