Re: Fast hash of NSData?
Re: Fast hash of NSData?
- Subject: Re: Fast hash of NSData?
- From: Jens Alfke <email@hidden>
- Date: Fri, 29 Nov 2013 18:40:38 -0800
On Nov 29, 2013, at 6:26 PM, Scott Ribe <email@hidden> wrote:
> I recently (last week, no kidding) investigated this question, and went with murmur hash.
Murmur is an excellent 32-bit hash function, but it’s not suitable for Graham’s use case where a collision would result in data loss.
>> In this scheme, if there is a hash collision, you lose user data. That should be a non-starter. You *must* do a full bytewise comparison in case of collision.
> a 1 in 2^64, or 2^128, or 2^256 chance, so no, it is just fine.
It’s worse than that, actually, because of what’s known as the “birthday paradox”[1]. In a group of 23 people, there’s a 50% chance that two of them have the same birthday. Similarly, as the number of objects you’re hashing grows, the chance of a collision grows much more rapidly, because it’s proportional to the number of _pairs_ of objects.
Generally with a 160-bit hash function the odds of a collision for any reasonably-sized data set are negligible, as long as the hash function is reasonably random. (The trust in the randomness of SHA-1 is eroding, though.)
—Jens
[1] http://en.wikipedia.org/wiki/Birthday_paradox
[2] http://en.wikipedia.org/wiki/Birthday_attack
_______________________________________________
Cocoa-dev mailing list (email@hidden)
Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden