Re: Unicode filenames with Apple File System and UIManagedDocument
- Subject: Re: Unicode filenames with Apple File System and UIManagedDocument
- From: email@hidden
- Date: Mon, 20 Mar 2017 17:23:48 -0400
I received a polite reply to my bug report stating:
"iOS HFS Normalized UNICODE names , APFS now treats all files as a bag of bytes on iOS . We are requesting that Applications developers call the correct Normalization routines to make sure the file name contains the correct representation."
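To make the "bag of bytes" point concrete: the same visible name can be encoded as different byte sequences depending on its Unicode normalization form. The sketch below is illustrative only; NameBytes and NamesDifferAsBytes are hypothetical helpers (not Apple API), and comparing UTF-8 bytes is an assumption about what a file system that does no normalization effectively does.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the raw UTF-8 bytes of a name, which is what a
// file system that does no normalization would end up comparing.
static NSData *NameBytes(NSString *name) {
    return [name dataUsingEncoding:NSUTF8StringEncoding];
}

// Returns YES when two names have different UTF-8 byte sequences, even if
// they look identical on screen.
static BOOL NamesDifferAsBytes(NSString *a, NSString *b) {
    return ![NameBytes(a) isEqualToData:NameBytes(b)];
}
```

For example, NamesDifferAsBytes(@"\u0625", @"\u0627\u0655") is YES: the first string is the precomposed ALEF WITH HAMZA BELOW and the second is its canonical decomposition, the pair of forms discussed in the quoted messages below.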
I'm having trouble finding canonical documentation from Apple stating what to do, but from looking through the NSURL documentation, I think the correct replacement for:
NSURL *url = [[self courseDirectory] URLByAppendingPathComponent:name];
(where [self courseDirectory] is the URL of a directory in the Documents folder, with an English name created by the app, and "name" is an NSString that comes from the user, with just basic sanitizing to replace "/" with "-"; note that this is iOS)
is:
NSURL *url = [NSURL fileURLWithFileSystemRepresentation:[name fileSystemRepresentation] isDirectory:YES relativeToURL:[self courseDirectory]];
Can anyone confirm this is correct?
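Another possible reading of "call the correct Normalization routines" is to put the name into a single Unicode normalization form before it is appended to the directory URL. The following is a minimal sketch, not a confirmed answer: CourseURL is a hypothetical helper, and choosing canonical decomposition (decomposedStringWithCanonicalMapping) is an assumption, since the reply quoted above does not say which form is expected. Whether NSURL already normalizes on the caller's behalf is not settled in this thread.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: build the document URL from an already-sanitized,
// user-supplied name, normalizing the name first so that the same visible
// name always yields the same path component.
// Assumption: canonical decomposition is used here; the important part is
// applying one form consistently everywhere a name becomes a path.
static NSURL *CourseURL(NSURL *courseDirectory, NSString *name) {
    NSString *normalized = name.decomposedStringWithCanonicalMapping;
    return [courseDirectory URLByAppendingPathComponent:normalized
                                            isDirectory:YES];
}
```

With this, a name typed in precomposed form and the same name in decomposed form should produce the same URL, which is the property that matters if the file system no longer normalizes names itself.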
Thanks,
Dave
> On Mar 12, 2017, at 5:00 PM, David Reed <email@hidden> wrote:
>
>
> Hi Uli,
>
> The code to create the URL was using:
>
> NSURL *url = [[self courseDirectory] URLByAppendingPathComponent:name];
>
> where [self courseDirectory] is the URL of a directory (with an English name created by the app) in the Documents folder. The variable "name" is an NSString that comes from the user (with just basic sanitizing to replace "/" with "-"). Note: this is iOS.
>
> So I wasn't using UTF8String or fileSystemRepresentation.
>
> Someone claimed I should be using fileSystemRepresentation and someone else said it shouldn't matter. If anyone has the definitive answer as to what I should change that to, I'm happy to use it (although it may be too late now).
>
> Thanks,
> Dave Reed
>
>
>> On Mar 12, 2017, at 8:25 AM, Uli Kusterer <email@hidden> wrote:
>>
>> I can't find the start of this thread, but this sounds a lot like you were using -UTF8String instead of -fileSystemRepresentation to save out your file names. That's the main difference between those two calls: -fileSystemRepresentation decomposes UTF8 the way HFS+ does, so should never adopt newer decompositions, and will instead guarantee the same string will decompose the same way — as long as you don't forget to use it somewhere.
>>
>> Of course, if you are using command line tools, they might not be properly normalizing the file names.
>>
>> Apologies if this was already covered in the lost beginning of this thread.
>>
>> Cheers,
>> -- Uli Kusterer
>> "The Witnesses of TeachText are everywhere..."
>> http://www.zathras.de
>>
>>> On 8 Mar 2017, at 22:35, Peter Edberg <email@hidden> wrote:
>>>
>>>
>>>> On Mar 8, 2017, at 12:00 PM, email@hidden wrote:
>>>>
>>>> Message: 1
>>>> Date: Tue, 07 Mar 2017 15:03:41 -0500
>>>> From: email@hidden
>>>> To: Alastair Houghton <email@hidden>, David Duncan
>>>> <email@hidden>
>>>> Cc: cocoa-dev list <email@hidden>
>>>> Subject: Re: Unicode filenames with Apple File System and
>>>> UIManagedDocument
>>>>
>>>>
>>>> ....
>>>> My app has the option to zip up the directories UIManagedDocument creates and email it (so users can back up their data or share it with others). The person sent it to me. Below is what I did in the Terminal so you can see what happens when I try to unzip it. If this doesn’t come through on the email list with the characters looking correct, I can screenshot it.
>>>>
>>>> This is one of the data files that was created on iOS 10.2 and then won’t open now on an iOS 10.3 device. It appears the directory name and zip file name do not match and it won’t unzip correctly. It does create a directory but the directory is empty instead of containing the StoreContent and persistentStore files. The zip file is 34KB so it may or may not actually have the data in it.
>>>>
>>>> $ ls
>>>> إعلام.zip
>>>
>>>
>>> It is probably worth noting that the first Arabic character in the above filename (i.e. the one that appears on the right, adjacent to the period) has a canonical decomposition, as per this line from UnicodeData.txt (http://www.unicode.org/Public/9.0.0/ucd/UnicodeData.txt):
>>> 0625;ARABIC LETTER ALEF WITH HAMZA BELOW;Lo;0;AL;0627 0655;...
>>>
>>> That is, in some cases this character 0625 (UTF8: D8 A5) will be converted to the sequence 0627 0655 (UTF8: D8 A7 D9 95).
>>>
>>> This decomposition was introduced in Unicode 3.0. If there are processes that use decomposition according to Unicode 9 versus Unicode 2.x, or processes that don't decompose versus ones that do, then the filename bytes will be different.
>>>
>>> - Peter E
>>>
>>>
>>>
>>> _______________________________________________
>>>
>>> Cocoa-dev mailing list (email@hidden)
>>>
>>> Please do not post admin requests or moderator comments to the list.
>>> Contact the moderators at cocoa-dev-admins(at)lists.apple.com
>>>
>>> Help/Unsubscribe/Update your Subscription:
>>>
>>> This email sent to email@hidden