Re: Unicode filenames with Apple File System and UIManagedDocument
- Subject: Re: Unicode filenames with Apple File System and UIManagedDocument
- From: email@hidden
- Date: Wed, 22 Mar 2017 07:15:12 -0400
> On Mar 22, 2017, at 5:05 AM, Alastair Houghton <email@hidden> wrote:
>
> On 21 Mar 2017, at 20:49, Quincey Morris <email@hidden> wrote:
>>
>> On Mar 20, 2017, at 14:23 , email@hidden wrote:
>>>
>>> "iOS HFS Normalized UNICODE names , APFS now treats all file[ name]s as a bag of bytes on iOS . We are requesting that Applications developers call the correct Normalization routines to make sure the file name contains the correct representation."
>>
>> I’ve been letting this simmer for a couple of days now, and I’ve come to the conclusion that it’s — sincere apologies to the unnamed Apple engineer who wrote it — as dumb as dirt.
>>
>> — It’s not a “bag of bytes”, because bags of stuff are generally understood as unordered sets, and I doubt that’s what’s intended. It has to be a sequence of bytes.
>
> In the context of filesystems (and specifically filenames), the phrases “bag of bytes” and “bunch of bytes” have a fairly specific meaning. The point is that the filesystem doesn’t inspect the bytes it’s given, and doesn’t care what they represent (about the only exception is that it probably doesn’t support embedded NULs). It isn’t suggesting that the names are treated as an unordered set of bytes (that’d just be silly). It’s just expressing the fact that the filesystem doesn’t care what they are - it may compare them, and if it does so, it will use binary ordering (not some other collation sequence) and won’t worry about things like case or encoding at all.
>
>> — It’s not just a string, it has to be a string in a known encoding. Otherwise, how could you ever mount an external drive on a different computer? The encoding has to be pre-specified for APFS, or it has to be stored in metadata on each volume.
>
> Agreed, that’s where the “bunch of bytes” approach falls down.
>
>> — It’s not just going to be a string of known encoding, it’s going to be Unicode. That’s going to be true even if the fact is specified in volume metadata and it’s theoretically possible to create APFS volumes with non-Unicode file names. Anything other than Unicode would, at this point, be a crime against humanity.
>
> If I’d designed APFS, it probably would use Unicode names (and it’d store the version of Unicode it used in the filesystem header, to avoid having to hard-code it).
>
> But I didn’t design it - Dominic Giampaolo and his team did - and we still don’t have that much information about how APFS works. I’m sure they had their reasons for whatever decision they’ve made here.
>
>> Is *that* the bottom line? I doubt it. I don’t believe the above quoted statement can be correct. I could believe that normalization is being moved out of the file system code, but it would have to be moved to (e.g.) the Cocoa frameworks, still “downstream” of the file-handling APIs. It can’t go upstream of the public APIs without breaking an API contract that has existed for the 16+ years since OS X 10.0.
>
> This is a tricky area. The problem with what we have at the moment (-fileSystemRepresentation) is that it *assumes* HFS+ semantics. That isn’t always going to be correct for existing non-HFS+ filesystems, let alone in the future. Of course, if you’re using the NSURL or NSString methods, rather than calling the BSD or C library APIs yourself, this is all hidden from you anyway (you certainly shouldn’t, IMO, be required to do anything unusual at Cocoa level - the Foundation framework should just make this all work, rather in the same way it presently does for numerous other things).
>
> It’s also complicated by the fact that, unlike on DOS or Windows, UNIX-like systems use a unified filesystem - that is, other filesystems are joined on at mount points. Thus you could have a name like
>
> /Volumes/Foo/Bar/Baz/Blam
>
> where (say) both Foo and Baz are mount points, and the rules about filenames could differ markedly, at least in principle; that is, /Volumes/Foo would have to conform to HFS+ (or APFS) rules, Bar/Baz to whatever rules govern the filesystem mounted at Foo, and Blam to whatever rules govern the filesystem mounted at Baz. And remember, not every filesystem will be using a well known encoding - macOS already has code to add and remove percent escapes (I kid you not) for this very reason.
>
> I’d like to hear what Dominic has to say (at least what he *can* say) about this, since he’s likely in a position to shed some light on it - or at least to take on board that we’re worrying about it. At the very least it’d be nice to see some more detail about APFS published somewhere *soon*...
>
> Kind regards,
>
> Alastair.
>
> --
> http://alastairs-place.net
I think NSURL should take care of this so that developers don’t need to worry about it, but that doesn’t appear to be the case. At this point I just want to know what the correct thing to do is. Maybe NSURL does handle it (which would mean there was a bug in the APFS conversion), but I can’t tell for certain.
I’ve uploaded several different versions to TestFlight for the person to try. With the original version of my app and with each of those versions, the user can open files with Arabic names that were created on iOS 10.3, but cannot open files that were created on iOS 10.2 unless the files are renamed to English. So either NSURL takes care of normalization and there was a bug in the APFS conversion, or we do need to do something additional to the NSStrings we pass to the NSURL methods. I realize this isn’t an official support channel, but it would be really nice to hear from Apple. I’ll probably use one of my DTS incidents to ask when I have time to submit the request sometime this week, and I’ll certainly report back if I get a definitive answer.
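For anyone following along, here is a minimal sketch of why normalization matters once the file system compares names byte for byte. It is written in Python purely for illustration and is not Apple's implementation. The helper names (utf8_name, same_name) are hypothetical, and it uses standard NFC/NFD, whereas HFS+ uses its own variant of decomposition. In Foundation, the nearest counterparts are NSString's precomposedStringWithCanonicalMapping and decomposedStringWithCanonicalMapping.

```python
import unicodedata


def utf8_name(name: str, form: str) -> bytes:
    """Return the UTF-8 bytes of `name` after normalizing to `form` (e.g. "NFC" or "NFD")."""
    return unicodedata.normalize(form, name).encode("utf-8")


def same_name(a: str, b: str) -> bool:
    """Compare two names after normalizing both to the same form (NFC here)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# U+0622 (ARABIC LETTER ALEF WITH MADDA ABOVE) as a single precomposed code point...
precomposed = "\u0622"
# ...and the canonically equivalent decomposed sequence: ALEF + combining MADDAH ABOVE.
decomposed = "\u0627\u0653"

# A file system that treats names as raw bytes sees two different names.
print(precomposed.encode("utf-8") == decomposed.encode("utf-8"))

# Normalizing both sides to one form before comparing makes them match.
print(same_name(precomposed, decomposed))
```

If the names created on iOS 10.2 were stored decomposed and the app now looks them up with a precomposed string (or the other way round), a byte-for-byte lookup would miss in exactly this way. Whether that is what actually happens in my case is still an assumption.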
Thanks,
Dave Reed