Re: Fast hash of NSData?
- Subject: Re: Fast hash of NSData?
- From: Marcel Weiher <email@hidden>
- Date: Mon, 02 Dec 2013 14:57:37 +0000
On Dec 1, 2013, at 15:36 , Graham Cox <email@hidden> wrote:
> Scanning my entire hard drive (excluding hidden files), which took several hours, sure I had plenty of collisions - but absolutely no false ones - they all turned out to be genuine duplicates of existing files. This is using the FNV-1a 64-bit hash + length approach.
>
> I’m thinking this is good enough, really. The odds of a particular user having two different image files that collide, and happening to add those exact images at once to our app must be astronomically low. Talk me out of it :)
IIRC, you were worried about the cost of a full compare. According to these data, the amortized cost of a full compare is effectively zero if you only do it when you get a collision. So do the full compare on a collision in order not to lose data. Then you can tune the hash to get a good compromise of speed vs. collisions. Mike Abdullah’s suggestion of file size as a first check seems ideal to me (I’ve been using that technique with string lookups to very good effect; files would work much better). I wouldn’t use a straight hash table, but a slightly more sophisticated data structure using multiple comparison levels.
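To make the levels concrete, here is a minimal sketch in plain C, under stated assumptions rather than anyone's actual code from this thread. It checks length first, then a 64-bit FNV-1a hash (standard offset basis and prime), and only then does a byte-for-byte compare. The names blob_t, blob_make and blob_is_duplicate are made up for illustration, and the data is assumed to be in memory already (for example, the bytes behind an NSData).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 64-bit FNV-1a over a byte buffer, using the standard
   offset basis and prime for the 64-bit variant. */
static uint64_t fnv1a64(const void *bytes, size_t len)
{
    const unsigned char *p = bytes;
    uint64_t h = 0xcbf29ce484222325ULL;   /* offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];                        /* xor first ... */
        h *= 0x100000001b3ULL;            /* ... then multiply (FNV-1a order) */
    }
    return h;
}

/* Hypothetical record: borrowed bytes plus cached length and hash. */
typedef struct {
    const void *bytes;
    size_t      len;
    uint64_t    hash;
} blob_t;

/* Computes the hash once so later comparisons can reuse it. */
static blob_t blob_make(const void *bytes, size_t len)
{
    blob_t b = { bytes, len, fnv1a64(bytes, len) };
    return b;
}

/* Multi-level comparison: cheapest test first, full compare last. */
static bool blob_is_duplicate(const blob_t *a, const blob_t *b)
{
    if (a->len != b->len)   return false;  /* level 1: size */
    if (a->hash != b->hash) return false;  /* level 2: hash */
    /* level 3: only reached on a length + hash match, so a hash
       collision can never be mistaken for a duplicate. */
    return memcmp(a->bytes, b->bytes, a->len) == 0;
}
```

The memcmp only runs for pairs that already agree on length and hash, which is why the final check adds so little to the total cost while ruling out false duplicates.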
On Dec 1, 2013, at 18:52 , Kyle Sluder <email@hidden> wrote:
> But as a matter of principle, it’s negligent to knowingly design a system that will silently drop user data in normal operation. There are plenty of times you can make a reasonable argument for “that’s good enough,” but as far as I’m concerned, preserving user data is never one of them.
Seconded, thirded, … Especially for a performance optimization when the effective performance cost of doing the final check is zero.
Marcel
_______________________________________________
Cocoa-dev mailing list (email@hidden)