Re: Fast hash of NSData?
- Subject: Re: Fast hash of NSData?
- From: Graham Cox <email@hidden>
- Date: Sat, 30 Nov 2013 11:47:08 +0100
On 30 Nov 2013, at 4:29 am, Gary L. Wade <email@hidden> wrote:
> If it is possible to compare the attributes of your file besides length, such as modified date, created date, inode, etc., then you might be able to further catch the cases where the length is the same but the data changed is large enough that you would rather delay any full-file access whether by full-byte compare or computing a hash.
It’s possible, but not that easy, and likely to be slow in itself - grabbing that sort of metadata from the file system means hitting the disk, possibly multiple times. Right now, the relevant code gets handed an NSData object, which is presumably memory-mapped; at that point I don’t have a reference to the original file, only the lump of data in it.

Of course, computing the hash also requires reads from disk, so there’s no way to really win except by doing something totally different, which is what Kyle was saying. But altering the architecture to do that is more work than I’m prepared to take on right now - I’m just looking for a bit of a performance boost, not a major rewrite and all that implies. Doing that is sure to introduce more bugs than it fixes, at least initially.
I suppose what would be useful is some way to compute the true probability of a collision, given that a) the data is image data (of any format), b) it is compared by length before hashing, and c) the hash is a 64-bit FNV-1a. Other than writing a test case and throwing thousands of images at it, I wouldn’t really know how to go about calculating that.
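For concreteness, here is a minimal sketch of the scheme being discussed - a 64-bit FNV-1a over a Data buffer, with a cheap length check before hashing - plus the standard birthday-bound approximation for the collision probability. Function names are mine for illustration, not from the thread; the FNV constants are the published 64-bit offset basis and prime.

```swift
import Foundation

// 64-bit FNV-1a over the bytes of a Data buffer.
// Constants are the standard 64-bit FNV offset basis and FNV prime.
func fnv1a64(_ data: Data) -> UInt64 {
    var hash: UInt64 = 0xcbf29ce484222325      // FNV-1a 64-bit offset basis
    for byte in data {
        hash ^= UInt64(byte)
        hash = hash &* 0x100000001b3           // FNV 64-bit prime, wrapping multiply
    }
    return hash
}

// The comparison scheme from the mail: compare lengths first (cheap),
// and only hash when the lengths match. A matching hash still only
// means "probably equal" - hence the collision question.
func probablySameData(_ a: Data, _ b: Data) -> Bool {
    guard a.count == b.count else { return false }
    return fnv1a64(a) == fnv1a64(b)
}

// Rough birthday-bound estimate (a generic approximation, not specific
// to image data): for n items hashed into 64 bits, assuming the hash
// behaves uniformly, P(some collision) ≈ n^2 / 2^65.
func collisionProbability(items n: Double) -> Double {
    return n * n / pow(2.0, 65)
}
```

The birthday bound assumes uniformly distributed hash values, which FNV-1a only approximates on real image data, and the length pre-check partitions the inputs into buckets, which can only lower the probability further. For, say, 100,000 images, n²/2⁶⁵ is on the order of 10⁻¹⁰, so in practice a 64-bit hash after a length check is very unlikely to collide.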
—Graham
_______________________________________________
Cocoa-dev mailing list (email@hidden)
Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden