Re: Best way of identifying duplicate files in Cocoa
- Subject: Re: Best way of identifying duplicate files in Cocoa
- From: Frank Reiff <email@hidden>
- Date: Wed, 21 Nov 2007 15:32:44 +0100
Hi Bill,
Thanks for the code example. It always seems that this type of thing is
better done in a scripting language. It's something about the economy
of the language. I've been using Ruby for around a year now and it
really rocks for this type of stuff.
MD5 is clearly the right solution for comparing large numbers of
files against each other. I'm not sure I'll need such a heavy-duty
solution, but you never know: once you add a feature, people find
ways of testing it to the limit.
What's more, I don't want to keep much of anything in memory, so the
tree would need to be built inside a Core Data persistent store, which
might negate any performance gains.
I haven't started implementing the feature yet, but I'll be sure to
have a look at your code before embarking on it, and I'll see how much
of a performance bottleneck it turns out to be.
Best regards,
Frank
On 21 Nov 2007, at 10:55, Bill Bumgarner wrote:
On Nov 21, 2007, at 1:33 AM, Jean-Daniel Dupas wrote:
To get an MD5 you have to read the file AND compute the digest. To
compare two files, you just have to read them. What is the
benefit of the MD5 in this case?
I can MD5 the first N bytes and build up a shallow tree of all files
that are (a) of the same size and (b) identical for the first
1024 bytes (as hashed by a checksum of said bytes).
From there, yes, calculating an md5 of full file contents is a
complete waste of time when comparing whole files. Given the
relative infrequency of files that are identical within the first
1K, the silliness of those particular lines of code was never
identified as a performance bottleneck. ;)
b.bum
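
For readers of the archive, the staged approach described in the quoted message (group by size, then by a hash of the first 1024 bytes, then compare whole files directly) might look roughly like the sketch below. This is not the code from the thread, which is not reproduced here; it is written in Python purely for illustration, and the names find_duplicates, prefix_digest and PREFIX_BYTES are made up for this example. It also keeps everything in memory, so it ignores the persistent-store concern raised above.

```python
import hashlib
import os
from collections import defaultdict
from filecmp import cmp as same_contents

PREFIX_BYTES = 1024  # hypothetical name; the "first N bytes", with N = 1024


def prefix_digest(path, length=PREFIX_BYTES):
    """Return an MD5 digest of at most `length` leading bytes of the file."""
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read(length)).digest()


def find_duplicates(paths):
    """Return lists of paths whose contents are byte-for-byte identical."""
    # Stage 1: bucket by size, which needs only a stat() call per file.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    duplicates = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have a duplicate

        # Stage 2: within a size bucket, bucket by a hash of the prefix.
        by_prefix = defaultdict(list)
        for path in same_size:
            by_prefix[prefix_digest(path)].append(path)

        # Stage 3: confirm the few remaining candidates by reading and
        # comparing the files directly, with no full-file digest.
        for candidates in by_prefix.values():
            groups = []
            for path in candidates:
                for group in groups:
                    if same_contents(group[0], path, shallow=False):
                        group.append(path)
                        break
                else:
                    groups.append([path])
            duplicates.extend(g for g in groups if len(g) > 1)
    return duplicates
```

Calling find_duplicates() with a list of file paths returns one list per set of identical files. Only files that already share a size and a 1K prefix are ever read in full, which is why the last stage rarely dominates the running time.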
_______________________________________________
Cocoa-dev mailing list (email@hidden)
Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden