Mailing Lists: Apple Mailing Lists


Re: Comparing two images



Scott Ellsworth <email@hidden> wrote:

A friend just asked how one would scan approximately 40k images for duplicates. Anyone able to recommend a Java toolkit for same that works with reasonable speed on the Mac?


This is a terribly complicated problem to do right, but I wanted to try an "off the cuff" fast solution first. Something perhaps a bit more clever than a straight checksum of the image data, like a color histogram. This would let us decide whether something smart that actually analyzes the images is a good idea.

I assume any duplicates might be at different resolutions, or saved with different compression settings? If the duplicates were byte-for-byte identical then a simple checksum and file compare would be sufficient. Heck, you probably wouldn't even need to calculate a checksum: just see if any files have the same length and then compare those that do.
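As a rough sketch of that length-then-compare idea (the class and method names here are made up for illustration, and Files.mismatch assumes Java 12 or later), something like this would do:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not from any toolkit: finds byte-for-byte identical files.
final class ExactDuplicates {

    /** Returns pairs of paths whose contents are identical. */
    static List<Path[]> find(List<Path> files) throws IOException {
        // Bucket files by length; files of different length cannot be identical.
        Map<Long, List<Path>> bySize = new HashMap<>();
        for (Path p : files) {
            bySize.computeIfAbsent(Files.size(p), k -> new ArrayList<>()).add(p);
        }

        List<Path[]> pairs = new ArrayList<>();
        for (List<Path> group : bySize.values()) {
            // Only files that share a length are compared byte by byte.
            for (int i = 0; i < group.size(); i++) {
                for (int j = i + 1; j < group.size(); j++) {
                    // Files.mismatch returns -1 when the contents are equal.
                    if (Files.mismatch(group.get(i), group.get(j)) == -1L) {
                        pairs.add(new Path[] { group.get(i), group.get(j) });
                    }
                }
            }
        }
        return pairs;
    }
}
```

With 40k files most length buckets will probably hold a single file, so very few full compares should be needed. This only catches exact copies, of course, not re-encoded ones.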


If the images might be re-encoded and you can't do a straight compare, then you'll have to decode each image into memory and do some processing on the individual pixel values. It doesn't have to be particularly clever. Maybe create a 256-bin histogram for each of the R, G and B channels separately, normalise it by the pixel count (to allow for different resolution versions of the same image), then calculate the mean and standard deviation of each channel. That would give you 6 numbers (3 channels x 2 statistics) to store for each image.
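Here is a minimal sketch of that six-number "visual checksum". It reads "mean and standard deviation" as the statistics of the pixel values described by each normalised histogram, which is my interpretation. The class name, method names and the tolerance parameter are all hypothetical.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper: per-channel histogram statistics as a cheap image signature.
final class HistogramSignature {

    /** Returns {meanR, sdR, meanG, sdG, meanB, sdB}. */
    static double[] of(BufferedImage img) {
        // One 256-bin histogram per channel (0 = red, 1 = green, 2 = blue).
        long[][] hist = new long[3][256];
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                hist[0][(rgb >> 16) & 0xFF]++;
                hist[1][(rgb >> 8) & 0xFF]++;
                hist[2][rgb & 0xFF]++;
            }
        }

        double total = (double) img.getWidth() * img.getHeight();
        double[] sig = new double[6];
        for (int c = 0; c < 3; c++) {
            double mean = 0, meanSq = 0;
            for (int v = 0; v < 256; v++) {
                // Dividing by the pixel count normalises away the resolution.
                double p = hist[c][v] / total;
                mean += v * p;
                meanSq += (double) v * v * p;
            }
            sig[2 * c] = mean;
            // Clamp at zero to guard against tiny negative rounding errors.
            sig[2 * c + 1] = Math.sqrt(Math.max(0.0, meanSq - mean * mean));
        }
        return sig;
    }

    /** True if every component differs by at most the given tolerance. */
    static boolean similar(double[] a, double[] b, double tolerance) {
        for (int i = 0; i < a.length; i++) {
            if (Math.abs(a[i] - b[i]) > tolerance) return false;
        }
        return true;
    }
}
```

Images can be loaded with javax.imageio.ImageIO.read(File). What tolerance works in practice would have to be found by experiment on real data; two quite different images can also share a signature, so a match here only nominates candidates.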

Once you have candidates for equivalence based on this visual checksum and want to compare two images with each other, you'd have to rescale them so they're the same size (perhaps scale them both down to thumbnail size) and compare them pixel by pixel. Sum the squares of the differences, say, then compare the total with some threshold.
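The pixel-by-pixel step might look roughly like this. The 32x32 thumbnail size, the bilinear scaling and all the names are assumptions for the sake of the sketch, and squashing both images to a square ignores aspect ratio.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper: compares two images via same-size thumbnails.
final class ThumbnailCompare {

    static final int SIZE = 32; // assumed thumbnail edge length in pixels

    /** Scales the image to SIZE x SIZE, ignoring aspect ratio. */
    static BufferedImage thumbnail(BufferedImage src) {
        BufferedImage out = new BufferedImage(SIZE, SIZE, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, SIZE, SIZE, null);
        g.dispose();
        return out;
    }

    /** Sum over all thumbnail pixels and channels of the squared difference. */
    static long sumSquaredDifference(BufferedImage a, BufferedImage b) {
        BufferedImage ta = thumbnail(a);
        BufferedImage tb = thumbnail(b);
        long sum = 0;
        for (int y = 0; y < SIZE; y++) {
            for (int x = 0; x < SIZE; x++) {
                int pa = ta.getRGB(x, y);
                int pb = tb.getRGB(x, y);
                // Shifts 0, 8 and 16 pick out blue, green and red in turn.
                for (int shift = 0; shift <= 16; shift += 8) {
                    int d = ((pa >> shift) & 0xFF) - ((pb >> shift) & 0xFF);
                    sum += (long) d * d;
                }
            }
        }
        return sum;
    }

    /** True if the summed squared difference is within the given threshold. */
    static boolean looksSame(BufferedImage a, BufferedImage b, long threshold) {
        return sumSquaredDifference(a, b) <= threshold;
    }
}
```

Because both thumbnails are a fixed size, the raw sum can be compared against a fixed threshold. A sensible threshold depends on how much compression noise you want to tolerate, so it would need tuning against known duplicate pairs.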

Of course, it all depends on what it means for two images to be the "same". If one image is exactly the same as another image but has a different size, a different brightness level, gamma or contrast, a different colour cast, etc., is it the same or not?

-Rolf
--
Rolf Howarth, Square Box Systems Ltd, Stratford-upon-Avon UK.
Java-dev mailing list      (email@hidden)

Copyright © 2007 Apple Inc. All rights reserved.