Re: Fast hash of NSData?
- Subject: Re: Fast hash of NSData?
- From: "Gary L. Wade" <email@hidden>
- Date: Fri, 29 Nov 2013 19:29:57 -0800
If you can compare file attributes besides length, such as modification date, creation date, inode, etc., then you may be able to catch more of the cases where the length is the same but the data has changed. That matters most when the data is large enough that you would rather put off any full-file access, whether a byte-for-byte compare or computing a hash.
If you can rely on file attributes, then I would use the new file only if the significant ones change. Assuming we're talking about graphics files, I doubt Photoshop (as an example app) will let a user invert all colors, a task that could theoretically keep the same length, save that file, and then subvert the OS so the file keeps the same modification and creation dates as before. If a user does that (since Photoshop would not, it would have to be deliberate), then they're purposely trying to fool your app, or they're OCD and/or trying to prevent a quantum entanglement with a parallel universe (or it's an installer that fixes up time stamps for some esoteric reason).
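To make the attribute check concrete, here is a minimal sketch in plain C using POSIX stat(), which is available on OS X. The names file_stamp, stamp_read, and stamp_equal are made up for illustration, and which attributes count as "significant" is an assumption you would tune for your own app.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical record of the cheap attributes worth remembering per file. */
typedef struct {
    off_t  size;   /* same idea as -[NSData length], without reading the data */
    time_t mtime;  /* last modification time, in whole seconds */
    ino_t  inode;  /* changes if the file was replaced rather than rewritten */
    dev_t  device; /* an inode number is only meaningful within one device */
} file_stamp;

/* Fill *out from the file at path. Returns false if stat() fails. */
bool stamp_read(const char *path, file_stamp *out)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return false;
    out->size   = st.st_size;
    out->mtime  = st.st_mtime;
    out->inode  = st.st_ino;
    out->device = st.st_dev;
    return true;
}

/* True when none of the recorded attributes differ. This is only a hint
   that the contents are unchanged, not a proof. */
bool stamp_equal(const file_stamp *a, const file_stamp *b)
{
    return a->size == b->size &&
           a->mtime == b->mtime &&
           a->inode == b->inode &&
           a->device == b->device;
}
```

You would keep the stamp from the last time you processed the file and go on to a byte compare or a hash only when stamp_equal() returns false.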
If you don't have access to any important file attributes, and your data is awfully big, you could try a preliminary compare of a smaller, randomly chosen set of blocks. On the other hand, since you already have local access to both files, you might measure the time it takes to do a full byte compare of memory-mapped forms of the data against the time it takes to calculate the hash. Depending on vector functions, registers, and word-compare size (128 bits at a time?), the byte compare might ascertain equality more quickly. Hashes for large-file equality are a whole lot more necessary when the latency of file access is orders of magnitude larger than local file access (consider across-network full-copying of files, something backup and sync products want to be fast at).
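Here is a rough sketch in C of the two ideas above: a sampled pre-check followed by a full compare of the memory-mapped files, using POSIX mmap() and memcmp(). The function names, the 4096-byte block size, and the 8 samples are arbitrary choices for illustration, and rand() is used only for brevity (its range may be too small to reach every offset in a very large file).

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical pre-check: compare `samples` randomly placed blocks of
   `block` bytes. Returns 0 as soon as one differs; 1 only means "no
   difference seen", which does NOT prove the buffers are equal. */
int sampled_blocks_equal(const unsigned char *a, const unsigned char *b,
                         size_t len, size_t block, int samples)
{
    if (len <= block)
        return memcmp(a, b, len) == 0;
    for (int i = 0; i < samples; i++) {
        size_t off = (size_t)rand() % (len - block + 1);
        if (memcmp(a + off, b + off, block) != 0)
            return 0;
    }
    return 1;
}

/* Returns 1 if the files have identical contents, 0 if not, -1 on error. */
int mapped_files_equal(const char *path_a, const char *path_b)
{
    int result = -1;
    struct stat sa, sb;
    void *a = MAP_FAILED, *b = MAP_FAILED;
    size_t len = 0;
    int fd_a = open(path_a, O_RDONLY);
    int fd_b = open(path_b, O_RDONLY);

    if (fd_a < 0 || fd_b < 0) goto done;
    if (fstat(fd_a, &sa) != 0 || fstat(fd_b, &sb) != 0) goto done;
    if (sa.st_size != sb.st_size) { result = 0; goto done; } /* length check */
    if (sa.st_size == 0) { result = 1; goto done; } /* cannot map 0 bytes */

    len = (size_t)sa.st_size;
    a = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd_a, 0);
    b = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd_b, 0);
    if (a == MAP_FAILED || b == MAP_FAILED) goto done;

    if (!sampled_blocks_equal(a, b, len, 4096, 8))
        result = 0;                       /* cheap early exit */
    else
        result = memcmp(a, b, len) == 0;  /* full compare settles it */

done:
    if (a != MAP_FAILED) munmap(a, len);
    if (b != MAP_FAILED) munmap(b, len);
    if (fd_a >= 0) close(fd_a);
    if (fd_b >= 0) close(fd_b);
    return result;
}
```

Timing mapped_files_equal() against your hash routine on representative files is the only way to know which wins on your hardware. The sampled pre-check pays off mainly when differing files tend to differ in many places.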
--
Gary L. Wade (Sent from my iPhone)
http://www.garywade.com/
> On Nov 29, 2013, at 1:49 PM, Kyle Sluder <email@hidden> wrote:
>
> You also have another, damn-quick "hash key" that takes zero disk access to compute: -[NSData length].
_______________________________________________
Cocoa-dev mailing list (email@hidden)