Re: Fast hash of NSData?
- Subject: Re: Fast hash of NSData?
- From: Marcel Weiher <email@hidden>
- Date: Mon, 02 Dec 2013 16:03:56 +0000
On Dec 2, 2013, at 15:16 , Scott Ribe <email@hidden> wrote:
> On Dec 2, 2013, at 7:57 AM, Marcel Weiher <email@hidden> wrote:
>
>> Then you can twiddle the hash to get you a good compromise of speed vs. collisions.
>
> You want to optimize the hash further? Only hash the first 1MB.
Yup, that’s one of the things I meant by twiddling the hash; you could probably use even less than 1MB, and maybe fold in more image metadata, a thumbnail, ….
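
Something along these lines (untested sketch; the MD5 call is just a stand-in, any cheap digest or even a simple FNV over the prefix would do):

#import <Foundation/Foundation.h>
#import <CommonCrypto/CommonDigest.h>

// Untested sketch: hash only the first 1MB of the data, trading some
// collision resistance for speed on large files.
static NSUInteger PrefixHash(NSData *data)
{
    NSUInteger prefixLength = MIN(data.length, (NSUInteger)(1024 * 1024));
    unsigned char digest[CC_MD5_DIGEST_LENGTH];
    CC_MD5(data.bytes, (CC_LONG)prefixLength, digest);

    // Fold the leading digest bytes into an NSUInteger so the result
    // can be used directly as a -hash value.
    NSUInteger result = 0;
    memcpy(&result, digest, MIN(sizeof(result), (size_t)CC_MD5_DIGEST_LENGTH));
    return result;
}

Feeding the total length (or other cheap metadata) into the digest as well would catch files that share a 1MB prefix but differ later.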
> One note: we're all saying that identity checks will be rare, so the amortized cost is very low. Well. That's true if the user is actually adding different files. The amortized cost is not so low if you have users who keep adding the same files over & over ;-)
Very true. My mental model was that duplicates would get rejected on addition, or at least be discarded once detected, which would mean that you do the comparison only once for actual duplicates. That may be inaccurate, but if it’s true then the comparison is not that much more expensive than a hash that hashes over the entire data (it could actually be cheaper if the hash function is expensive to compute).
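
To make the “reject on addition” idea concrete, here is an untested sketch reusing the PrefixHash() helper from above (the FileStore class and its bucket layout are made up purely for illustration):

@interface FileStore : NSObject
// Returns NO (and stores nothing) if an identical file was already added.
- (BOOL)addData:(NSData *)data;
@end

@implementation FileStore
{
    NSMutableDictionary *_buckets;   // prefix hash -> NSMutableArray of NSData
}

- (id)init
{
    if ((self = [super init])) {
        _buckets = [NSMutableDictionary dictionary];
    }
    return self;
}

- (BOOL)addData:(NSData *)data
{
    NSNumber *key = @(PrefixHash(data));
    NSMutableArray *bucket = _buckets[key];
    if (!bucket) {
        bucket = [NSMutableArray array];
        _buckets[key] = bucket;
    }
    // The full byte-by-byte comparison only runs when the cheap prefix
    // hash collides; a genuine duplicate is rejected right here, so it
    // never has to be compared again later.
    for (NSData *existing in bucket) {
        if ([existing isEqualToData:data]) {
            return NO;
        }
    }
    [bucket addObject:data];
    return YES;
}

@end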
Marcel