Re: Fast hash of NSData?
- Subject: Re: Fast hash of NSData?
- From: Maxthon Chan <email@hidden>
- Date: Mon, 02 Dec 2013 23:23:21 +0800
If you are indexing files, maybe a shorter hash (CRC32?), a sparse database file and mmap(2) can be your friends.
You can bucket files by their CRC32 values (very fast to compute), hence the sparse database file that needs to be mmap(2)'d. When CRC32 values collide (somewhat likely), SHA-512 (slower, but needed less often) is used to identify files within the same bucket. A collision in both hashes is very unlikely, so simply enumerating the few entries left in the bucket can resolve it.
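
Roughly what I have in mind, as a minimal sketch in Python for brevity rather than Cocoa. The class name TwoTierIndex and the cheap_hash parameter are made up for illustration, and it keeps everything in an in-memory dict instead of an mmap(2)'d sparse file. A real index would store paths or offsets rather than the data itself.

```python
import hashlib
import zlib
from collections import defaultdict


class TwoTierIndex:
    """Bucket blobs by a cheap hash; use SHA-512 only inside occupied buckets."""

    def __init__(self, cheap_hash=zlib.crc32):
        # cheap_hash is a hypothetical hook so the bucketing hash can be swapped.
        self._cheap_hash = cheap_hash
        # bucket key -> list of [data, sha512 digest or None]
        self._buckets = defaultdict(list)

    def add(self, data: bytes) -> bool:
        """Return True if data is new, False if identical content is indexed."""
        bucket = self._buckets[self._cheap_hash(data)]
        if not bucket:
            # First entry in this bucket: no SHA-512 needed yet.
            bucket.append([data, None])
            return True
        # Cheap hash collided (or this is a duplicate): pay for SHA-512 now.
        digest = hashlib.sha512(data).digest()
        for entry in bucket:
            if entry[1] is None:
                # Lazily fill in the digest of an entry stored earlier.
                entry[1] = hashlib.sha512(entry[0]).digest()
            # The byte comparison covers the (very unlikely) dual collision.
            if entry[1] == digest and entry[0] == data:
                return False
        bucket.append([data, digest])
        return True
```

Note that adding the same content twice also takes the SHA-512 path, since the bucket is already occupied.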
On Dec 2, 2013, at 23:16, Scott Ribe <email@hidden> wrote:
> On Dec 2, 2013, at 7:57 AM, Marcel Weiher <email@hidden> wrote:
>
>> Then you can twiddle the hash to get you a good compromise of speed vs. collisions.
>
> You want to optimize the hash further? Only hash the first 1MB. Except for odd cases, like two multi-image TIFFs sharing the exact same first n images (maybe the second one was created by appending to the first), that will still very rarely yield collisions--then compare collisions for identity.
>
> One note: we're all saying that identity checks will be rare, so the amortized cost is very low. Well. That's true if the user is actually adding different files. The amortized cost is not so low if you have users who keep adding the same files over & over ;-)
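
A rough sketch of the prefix idea quoted above, again in Python for brevity: hash only the first 1MB, and fall back to a full byte-by-byte comparison when two prefix hashes match. The function names, the choice of SHA-512 for the prefix and the 64 KiB chunk size are assumptions for illustration, not anything from the thread.

```python
import hashlib

PREFIX_BYTES = 1024 * 1024  # the "first 1MB" suggested above


def prefix_digest(path, limit=PREFIX_BYTES):
    """Hash at most `limit` leading bytes of the file at `path`."""
    with open(path, "rb") as f:
        return hashlib.sha512(f.read(limit)).digest()


def same_content(path_a, path_b, chunk=1 << 16):
    """Identity check: compare two files block by block until one differs."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(chunk)
            block_b = b.read(chunk)
            if block_a != block_b:
                return False
            if not block_a:
                # Both files ended at the same point with no difference.
                return True
```

Only files whose prefix_digest values match need to go through same_content, which is the expensive step when the same file keeps being added.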
>
> --
> Scott Ribe
> email@hidden
> http://www.elevated-dev.com/
> (303) 722-0567 voice
>
> _______________________________________________
>
> Cocoa-dev mailing list (email@hidden)
>
> Please do not post admin requests or moderator comments to the list.
> Contact the moderators at cocoa-dev-admins(at)lists.apple.com
>
> Help/Unsubscribe/Update your Subscription:
>
> This email sent to email@hidden