Re: Unicode filenames with Apple File System and UIManagedDocument
- Subject: Re: Unicode filenames with Apple File System and UIManagedDocument
- From: Peter Edberg <email@hidden>
- Date: Wed, 08 Mar 2017 14:56:20 -0800
> On Mar 8, 2017, at 1:44 PM, David Reed <email@hidden> wrote:
>
>
>> On Mar 8, 2017, at 4:35 PM, Peter Edberg <email@hidden> wrote:
>>
>>
>>> On Mar 8, 2017, at 12:00 PM, email@hidden wrote:
>>>
>>> Message: 1
>>> Date: Tue, 07 Mar 2017 15:03:41 -0500
>>> From: email@hidden
>>> To: Alastair Houghton <email@hidden>, David Duncan
>>> <email@hidden>
>>> Cc: cocoa-dev list <email@hidden>
>>> Subject: Re: Unicode filenames with Apple File System and
>>> UIManagedDocument
>>>
>>>
>>> ....
>>> My app has the option to zip up the directories UIManagedDocument creates and email it (so users can back up their data or share it with others). The person sent it to me. Below is what I did in the Terminal so you can see what happens when I try to unzip it. If this doesn’t come through on the email list with the characters looking correct, I can screenshot it.
>>>
>>> This is one of the data files that was created on iOS 10.2 and then won’t open now on an iOS 10.3 device. It appears the directory name and zip file name do not match and it won’t unzip correctly. It does create a directory but the directory is empty instead of containing the StoreContent and persistentStore files. The zip file is 34KB so it may or may not actually have the data in it.
>>>
>>> $ ls
>>> إعلام.zip
>>
>>
>> It is probably worth noting that the first Arabic character in the above filename (i.e. the one that appears on the right, adjacent to the period) has a canonical decomposition, as per this line from UnicodeData.txt (http://www.unicode.org/Public/9.0.0/ucd/UnicodeData.txt):
>> 0625;ARABIC LETTER ALEF WITH HAMZA BELOW;Lo;0;AL;0627 0655;...
>>
>> That is, a process that applies canonical decomposition will convert this character U+0625 (UTF-8: D8 A5) to the sequence U+0627 U+0655 (UTF-8: D8 A7 D9 95).
>>
>> This decomposition was introduced in Unicode 3.0. If there are processes that use decomposition according to Unicode 9 versus Unicode 2.x, or processes that don't decompose versus ones that do, then the filename bytes will be different.
>>
>> - Peter E
>
> Thanks Peter.
>
> I am going to try to find time in the next few days to file a bug report. I'll obviously include this information. Is there anything else you think I should include?
Nothing else leaps out at me other than the usuals (system version, exact steps to repro, etc.).
- Peter E
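
For anyone who wants to see the byte difference described above, here is a minimal sketch in Python. It uses the standard unicodedata module as a stand-in for whatever normalization a given file system or unzip tool applies; it does not model APFS or HFS+ behavior, and the helper names (utf8_hex, same_name_ignoring_normalization) are made up for illustration.

```python
import unicodedata

# The filename from the thread, spelled with escapes so the code points are explicit.
# U+0625 is ARABIC LETTER ALEF WITH HAMZA BELOW.
composed_name = "\u0625\u0639\u0644\u0627\u0645.zip"

# NFD applies canonical decomposition: U+0625 becomes U+0627 U+0655.
decomposed_name = unicodedata.normalize("NFD", composed_name)


def utf8_hex(text):
    """Return the UTF-8 bytes of text as space-separated hex (hypothetical helper)."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8"))


def same_name_ignoring_normalization(a, b):
    """Compare two names after normalizing both to NFD (hypothetical helper)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    print(utf8_hex(composed_name[0]))       # bytes of U+0625 alone
    print(utf8_hex(decomposed_name[:2]))    # bytes of U+0627 U+0655
    # A byte-for-byte comparison treats the two spellings as different names,
    # while a normalization-aware comparison treats them as the same.
    print(composed_name == decomposed_name)
    print(same_name_ignoring_normalization(composed_name, decomposed_name))
```

A tool that compares the two spellings byte for byte sees two different names, which would fit a directory name and a zip entry name that fail to match. Whether that is what actually happens on iOS 10.3 is for the bug report to establish.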
_______________________________________________
Cocoa-dev mailing list (email@hidden)