Re: Fast hash of NSData?

  • Subject: Re: Fast hash of NSData?
  • From: Maxthon Chan <email@hidden>
  • Date: Mon, 02 Dec 2013 23:23:21 +0800

If you are indexing files, maybe a shorter hash (CRC32?), a sparse database file and mmap(2) can be your friends.

You can bucket files by their CRC32 values, which are very fast to compute (hence the sparse database file, which needs to be mmap(2)'d). When CRC32 values collide (somewhat likely), SHA-512 (slower, but needed less often) is used to identify files within the same bucket. A collision on both hashes is very unlikely, so a simple enumeration of the bucket can resolve that case.
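
A rough sketch of the two-level lookup described above, in Python for brevity rather than Cocoa. It is only an illustration under some assumptions: the names `Entry` and `TwoLevelIndex` are made up, the buckets live in an in-memory dict instead of an mmap(2)'d sparse file, and file contents are held as bytes so the last-resort comparison stays short.

```python
import hashlib
import zlib
from collections import defaultdict


class Entry:
    """One indexed blob (hypothetical helper, not from the thread)."""

    def __init__(self, name, data):
        self.name = name
        self.data = data   # kept in memory only to keep the sketch short
        self._sha = None   # SHA-512 is computed lazily, on first use

    def sha512(self):
        if self._sha is None:
            self._sha = hashlib.sha512(self.data).digest()
        return self._sha


class TwoLevelIndex:
    """Buckets blobs by CRC32; falls back to SHA-512 inside a bucket."""

    def __init__(self):
        # CRC32 value -> list of entries sharing that value.
        # A dict stands in for the sparse, mmap'd database file.
        self.buckets = defaultdict(list)

    def add(self, name, data):
        """Store the blob; return the name of an identical earlier blob, or None."""
        new = Entry(name, data)
        bucket = self.buckets[zlib.crc32(data)]
        for old in bucket:
            # Only reached when the CRC32 bucket is already occupied.
            # The final byte comparison enumerates the (very unlikely)
            # case where CRC32 and SHA-512 both collide.
            if old.sha512() == new.sha512() and old.data == new.data:
                return old.name
        bucket.append(new)
        return None


index = TwoLevelIndex()
index.add("a.txt", b"hello")           # new bucket, SHA-512 never computed
index.add("b.txt", b"world")           # different content, stored as new
print(index.add("c.txt", b"hello"))    # same bucket as a.txt, SHA-512 now computed
```

As long as every file lands in its own bucket, only the CRC32 is ever computed; the SHA-512 cost is paid only for files that share a bucket.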

On Dec 2, 2013, at 23:16, Scott Ribe <email@hidden> wrote:

> On Dec 2, 2013, at 7:57 AM, Marcel Weiher <email@hidden> wrote:
>
>> Then you can twiddle the hash to get you a good compromise of speed vs. collisions.
>
> You want to optimize the hash further? Only hash the first 1MB. Except for odd cases, like two multi-image TIFFs sharing the exact same first n images (maybe the second one was created by appending to the first), that will still very rarely yield collisions--then compare collisions for identity.
>
> One note: we're all saying that identity checks will be rare, so the amortized cost is very low. Well. That's true if the user is actually adding different files. The amortized cost is not so low if you have users who keep adding the same files over & over ;-)
>
> --
> Scott Ribe
> email@hidden
> http://www.elevated-dev.com/
> (303) 722-0567 voice
>
>
>
>
>
> _______________________________________________
>
> Cocoa-dev mailing list (email@hidden)
>
> Please do not post admin requests or moderator comments to the list.
> Contact the moderators at cocoa-dev-admins(at)lists.apple.com
>
> Help/Unsubscribe/Update your Subscription:
>
> This email sent to email@hidden




References: 
 >Re: Fast hash of NSData? (From: Graham Cox <email@hidden>)
 >Re: Fast hash of NSData? (From: Marcel Weiher <email@hidden>)
 >Re: Fast hash of NSData? (From: Scott Ribe <email@hidden>)
