Re: archiving report
- Subject: Re: archiving report
- From: "Gerriet M. Denkmann" <email@hidden>
- Date: Thu, 28 Feb 2013 13:46:55 +0700
On 28 Feb 2013, at 02:28, Tony Parker <email@hidden> wrote:
> On Feb 26, 2013, at 10:56 AM, Gerriet M. Denkmann <email@hidden> wrote:
>>
>> On 27 Feb 2013, at 01:00, Gwynne Raskind <email@hidden> wrote:
>>
>>>> 2. NSKeyedArchiver seems to be ok.
>>>> But it does create unnecessary data. E.g. in the case of an array containing identical objects, like:
>>>> NSArray *a = @[ @"a", @"a", ...., @"a"];
>>>> With 1 000 000 items it creates 10 000 395 bytes - my version creates only 1 000 332 bytes
>>>> and the output is still readable by NSKeyedUnarchiver.
>>>
>>> Are you sure this is happening? NSKeyedArchiver is documented as doing deduplication of objects. If this is true, it's definitely a bug and there is no reason Apple wouldn't want it fixed.
>>
>> Just try it yourself:
>> #define NBR 1000000
>> NSMutableArray *m = [ NSMutableArray array ];
>> for ( NSUInteger i = 0; i < NBR; i++ ) [ m addObject: @"a" ];
>> NSData *dataKeyed = [ NSKeyedArchiver archivedDataWithRootObject: m ];
>> NSLog(@"%s NSKeyedArchiver created %lu bytes", __FUNCTION__, (unsigned long)[dataKeyed length]);
>> Then change NBR to 1000001 and compare.
>>
>
> Out of curiosity, what do you expect to happen if your string is @"ab" or something even longer, but repeated 1 million times? Your test implies that the answer is 2,000,000 but in fact the answer is that it only grows one more byte.
True.
But what happens when you increase the number of items in the array from 1 000 000 to 1 000 001?
Answer: the output of NSKeyedArchiver grows by 10 bytes.
There is just one StringThing like 0x61 'a' at index 0x9 (arbitrary number). Not a million of them. Good.
But then: for every item in the array there is one ObjectReference. They are all identical, like 0x80 0x09, and each references the thing at index 9, which is the StringThing "a" (actually referencing $objects[9], which I assume contains the index 9 of our StringThing).
These 1 000 000 ObjectReference things have the indices 10 ... 1000010.
Then there is an ArrayThing like 0xaf 12 00 0f 42 40 followed by 1 000 000 indices, which are all 0x00 00 00 0a (= 10), referencing the first ObjectReference at index 0xa, which points to the StringThing "a".
At the end there is a table with the offsets for all 1000011 things.
Note: There are 999 999 unused and useless ObjectReferences (each 2 bytes) (at indices 11 ... 1000010).
And there are 999 999 unused and useless offsets (each 4 bytes) for these.
Also: because of the useless 999 999 things, all 1 000 000 references in the array have to be 4 bytes long (otherwise 1 byte would be enough).
See bug "NSKeyedArchiver creates bloated archives". The tracking number for this issue is Bug ID# 13303422.
And please note also that after removing these useless bytes the archive is still readable by the current NSKeyedUnarchiver.
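The 10 bytes per extra item can be re-derived from the three per-item costs listed above. What follows is only a back-of-the-envelope sketch in plain C (which also compiles as Objective-C), not a description of how NSKeyedArchiver is implemented. The function names are made up for illustration, and it assumes that reference and offset widths come in steps of 1, 2, 4 or 8 bytes, which matches the sizes observed above.

```c
#include <stdint.h>

/* Smallest width (1, 2, 4 or 8 bytes) able to hold max_value.
   Assumption: widths only come in these steps. */
unsigned bytes_needed(uint64_t max_value) {
    if (max_value <= 0xFFu) return 1;
    if (max_value <= 0xFFFFu) return 2;
    if (max_value <= 0xFFFFFFFFu) return 4;
    return 8;
}

/* Marginal cost of one more array item in the layout described above:
   one ObjectReference + one offset-table entry + one array slot. */
unsigned bloated_bytes_per_item(uint64_t object_count, uint64_t file_size) {
    /* marker byte + 1-byte index, the size reported in this thread */
    const unsigned object_reference_bytes = 2;
    uint64_t max_index = object_count ? object_count - 1 : 0;
    return object_reference_bytes      /* the per-item ObjectReference */
         + bytes_needed(file_size)     /* its entry in the offset table */
         + bytes_needed(max_index);    /* its slot in the ArrayThing */
}

/* Marginal cost if every array slot pointed at one shared ObjectReference:
   only the array slot itself remains. */
unsigned compact_bytes_per_item(uint64_t object_count) {
    uint64_t max_index = object_count ? object_count - 1 : 0;
    return bytes_needed(max_index);
}
```

With the figures from this thread, bloated_bytes_per_item(1000011, 10000395) gives 2 + 4 + 4 = 10. For an archive of about a dozen objects (an assumed count), compact_bytes_per_item gives 1, which fits the roughly 1 000 000 bytes of the trimmed archive.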
>>
>> I have filed the $null bug. It came back as a duplicate with a very low ID number. Meaning: this bug has been known to Apple for several years. Still no fix.
>
> Thank you for your bug reports. Yes, we do get them and we do listen to them.
Listening is nice. Acting would be kind of better though.
Kind regards,
Gerriet.
_______________________________________________
Cocoa-dev mailing list (email@hidden)