Re: archiving report
- Subject: Re: archiving report
- From: Jens Alfke <email@hidden>
- Date: Wed, 27 Feb 2013 12:08:01 -0800
On Feb 27, 2013, at 11:28 AM, Tony Parker <email@hidden> wrote:
> Out of curiosity, what do you expect to happen if your string is @"ab" or something even longer, but repeated 1 million times? Your test implies that the answer is 2,000,000, but in fact the archive only grows by one more byte. The string is being de-duplicated, but there is overhead associated with each object in the archive. The amount seems egregious for an object that is so small (a string with one character), but real-world archives are rarely 1-character strings repeated 1 million times. Could the overhead be improved? Probably, but there are many tradeoffs to make.
Also, this kind of repetition collapses down to nearly nothing when compressed with any general-purpose data-compression algorithm, such as the DEFLATE used by ZIP and gzip.
Most popular data formats have a lot of strictly-not-necessary repetition in them (view source on any web page and count how many times the string “div” or “href” appears!), but if size is an issue it’s generally a better idea to pipe them through compression, as HTTP can, rather than go to a lot of trouble in the codec to eliminate redundancy.
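A rough way to check this for yourself is sketched below, using Python's standard zlib module (DEFLATE). The `record` bytes are a made-up stand-in for per-object overhead, not the actual keyed-archive layout, so take the numbers as illustrative only:

```python
import zlib

# Hypothetical stand-in for one small archived object plus its overhead.
# This is NOT the real NSKeyedArchiver encoding, just repetitive bytes.
record = b"<string>a</string>"
payload = record * 1_000_000

# Compress with DEFLATE at the highest compression level.
compressed = zlib.compress(payload, 9)
ratio = len(compressed) / len(payload)

print(f"raw:        {len(payload):,} bytes")
print(f"compressed: {len(compressed):,} bytes")
print(f"ratio:      {ratio:.4%}")
```

With input this repetitive the compressed size should come out well under 1% of the original. Real archives with varied content will not shrink nearly as much.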
—Jens
_______________________________________________
Cocoa-dev mailing list (email@hidden)