Re: How to uniquely determine MD5-sum of a dict?
- Subject: Re: How to uniquely determine MD5-sum of a dict?
- From: "Michael Ash" <email@hidden>
- Date: Tue, 16 Sep 2008 15:30:13 -0400
On Tue, Sep 16, 2008 at 2:40 PM, Arthur C. <email@hidden> wrote:
> I have an NSDictionary that has to be written to disk, distributed and read in again.
> I would like to add an MD5 sum to the dictionary to make sure it has not been modified/corrupted on the way. That can be done by making NSData using NSArchiver and then passing it to MD5() from <openssl/md5.h>.
>
> But the order in which keys/values are stored in the dict is not fixed. I would like to know if there is a simple way to get a unique (reproducible) MD5-sum.
That's an interesting problem.
My first instinct would be to not hash the dictionary at all, but
rather to hash its representation on disk. In other words, when you
write it to disk, first convert it into NSData (trivial using
NSArchiver or NSPropertyListSerialization), then hash it, and then
store the NSData and the hash together. To verify, you first check the
data against the hash, and then you load it into an NSDictionary using
NSUnarchiver or NSPropertyListSerialization. This prevents you from
having problems where equivalent dictionaries get serialized
differently, because you're only ever comparing a single serialized
representation.
(Note that the keyed variants of the archivers, NSKeyedArchiver and
NSKeyedUnarchiver, are vastly preferred.)
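To make the hash-the-serialized-form idea concrete, here is a minimal
sketch. It assumes CommonCrypto's CC_MD5 in place of OpenSSL's MD5(), and
the helper names (MD5OfData, PackDictionary, UnpackDictionary) and the
"digest followed by payload" layout are made up for illustration, not an
established format. The archiver class methods used here are flagged as
deprecated by recent SDKs.

```objective-c
#import <Foundation/Foundation.h>
#import <CommonCrypto/CommonDigest.h>

// Hypothetical helper: MD5 digest of a blob, returned as 16 raw bytes.
static NSData *MD5OfData(NSData *data) {
    unsigned char digest[CC_MD5_DIGEST_LENGTH];
    CC_MD5([data bytes], (CC_LONG)[data length], digest);
    return [NSData dataWithBytes:digest length:sizeof(digest)];
}

// Serialize the dictionary once, then return digest + payload as one blob.
// The layout (16-byte digest first) is an assumption for this sketch.
static NSData *PackDictionary(NSDictionary *dict) {
    NSData *payload = [NSKeyedArchiver archivedDataWithRootObject:dict];
    NSMutableData *packed = [NSMutableData dataWithData:MD5OfData(payload)];
    [packed appendData:payload];
    return packed;
}

// Check the stored digest against the payload first; unarchive only if
// they match. Returns nil when the blob is too short or the digest differs.
static NSDictionary *UnpackDictionary(NSData *packed) {
    NSUInteger total = [packed length];
    if (total < CC_MD5_DIGEST_LENGTH) return nil;
    NSData *stored = [packed subdataWithRange:
        NSMakeRange(0, CC_MD5_DIGEST_LENGTH)];
    NSData *payload = [packed subdataWithRange:
        NSMakeRange(CC_MD5_DIGEST_LENGTH, total - CC_MD5_DIGEST_LENGTH)];
    if (![stored isEqualToData:MD5OfData(payload)]) return nil;
    return [NSKeyedUnarchiver unarchiveObjectWithData:payload];
}
```

Because the digest is only ever computed over the exact bytes that were
written, the ordering of keys inside the archive never matters.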
However, the question of hashing the dictionary directly in memory is
interesting. I'd suggest defining a recursive hash function that knows
how to compute the MD5 hash of each type of object it may encounter.
Something like this, in pseudocode:
function hash(dictionary)
    keys = [dictionary allKeys]
    keys = [keys sorted]
    objs = [dictionary objectsForKeys:keys notFoundMarker:@""] // never not found
    keyhash = hash(keys)
    objhash = hash(objs)
    return hash(keyhash + objhash)

function hash(array)
    hashes = ""
    for(obj in array)
        hashes += hash(obj)
    return hash(hashes)

function hash(string)
    return hash([string dataUsingEncoding:NSUTF8StringEncoding])

function hash(data)
    ...the usual MD5 function goes here...
And of course add more specializations as needed.
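For what it's worth, the pseudocode above might translate into
Objective-C roughly as follows. This is only a sketch under some
assumptions: it uses CommonCrypto's CC_MD5 as "the usual MD5 function",
it assumes dictionary keys are NSStrings (so compare: gives a stable
order), and the function names (HashData, HashArray, HashDictionary,
HashObject) are invented for illustration.

```objective-c
#import <Foundation/Foundation.h>
#import <CommonCrypto/CommonDigest.h>

static NSData *HashObject(id obj); // forward declaration for the recursion

// hash(data): plain MD5 over the bytes, returned as 16 raw bytes.
static NSData *HashData(NSData *data) {
    unsigned char digest[CC_MD5_DIGEST_LENGTH];
    CC_MD5([data bytes], (CC_LONG)[data length], digest);
    return [NSData dataWithBytes:digest length:sizeof(digest)];
}

// hash(array): concatenate the element hashes in order, then hash that.
static NSData *HashArray(NSArray *array) {
    NSMutableData *hashes = [NSMutableData data];
    for (id obj in array) {
        [hashes appendData:HashObject(obj)];
    }
    return HashData(hashes);
}

// hash(dictionary): sort the keys so the result does not depend on the
// dictionary's internal ordering. Assumes keys respond to compare:.
static NSData *HashDictionary(NSDictionary *dict) {
    NSArray *keys = [[dict allKeys]
        sortedArrayUsingSelector:@selector(compare:)];
    NSArray *objs = [dict objectsForKeys:keys notFoundMarker:@""];
    NSMutableData *combined = [NSMutableData dataWithData:HashArray(keys)];
    [combined appendData:HashArray(objs)];
    return HashData(combined);
}

// Dispatch on the object's class; raises for types with no specialization.
static NSData *HashObject(id obj) {
    if ([obj isKindOfClass:[NSDictionary class]]) return HashDictionary(obj);
    if ([obj isKindOfClass:[NSArray class]]) return HashArray(obj);
    if ([obj isKindOfClass:[NSString class]])
        return HashData([obj dataUsingEncoding:NSUTF8StringEncoding]);
    if ([obj isKindOfClass:[NSData class]]) return HashData(obj);
    [NSException raise:NSInvalidArgumentException
                format:@"No hash specialization for %@", [obj class]];
    return nil;
}
```

Two dictionaries with the same contents then produce the same digest
regardless of insertion order. One caveat of this sketch: nothing tags
the type, so an NSString and an NSData holding its UTF-8 bytes hash
identically.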
Mike