Re: Looking for help scanning entire drives
- Subject: Re: Looking for help scanning entire drives
- From: Andreas Grosam <email@hidden>
- Date: Wed, 23 Feb 2011 12:37:36 +0100
On Feb 23, 2011, at 7:08 AM, Laurent Daudelin wrote:
> I need to write an application that will scan entire drives and compare files between the 2 drives. I have already something working but in situations where there are a lot of files (hundreds of thousands), the memory consumption becomes a problem, leading to slow performance when virtual memory is used and, ultimately, sometimes to crashes in malloc.
>
> Of course, I could go with little chunks, comparing, but I need to present a window showing which copies of files are more recent on one drive and which ones are more recent on the other drive, so I need to keep a list of all the files on one drive that are more recent than their counterparts on the other drive, and vice versa. This preferably would have to be done in Cocoa, since I already have a working solution.
>
> Knowing that I have to support 10.5 but run under 10.6, what would be the best way to have a crack at this problem?
>
> All suggestions are welcome!
If there is no appropriate free or commercial tool which could solve your problem already, I would suggest the following:
1) Use an NSDirectoryEnumerator to iterate recursively through a specified directory on any volume.
2) For each file item,
3) skip it if you don't want to process it, or
4) retrieve the file name (URL), the modification date and whether it is a directory, then
4.1) process the file item.
Your "process file item" method may do the following:
1) determine the path of the corresponding file item on the second volume
2) if it exists, retrieve file attributes from it and compare modification date or do whatever you need to compare the files
3) save the result in a list only if there are differences.
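The three "process file item" steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the sample: the helper names counterpartPath and recordIfDifferent are hypothetical, both volumes are assumed to mirror each other's directory layout, root paths are assumed to have no trailing slash (as returned by -[NSURL path]), and memory management is manual as in the sample. It uses -attributesOfItemAtPath:error: and -fileModificationDate, which are available on 10.5.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: maps an item below sourceRoot to the same relative
// location below targetRoot. Both roots are expected without a trailing
// slash. Returns nil if itemPath is not located below sourceRoot.
static NSString* counterpartPath(NSString* itemPath,
                                 NSString* sourceRoot,
                                 NSString* targetRoot)
{
    NSArray* rootParts = [sourceRoot pathComponents];
    NSArray* itemParts = [itemPath pathComponents];
    NSUInteger rootCount = [rootParts count];
    if ([itemParts count] < rootCount) {
        return nil;
    }
    NSArray* prefix = [itemParts subarrayWithRange:NSMakeRange(0, rootCount)];
    if (![prefix isEqualToArray:rootParts]) {
        return nil;
    }
    NSRange tail = NSMakeRange(rootCount, [itemParts count] - rootCount);
    NSString* relative = [NSString pathWithComponents:[itemParts subarrayWithRange:tail]];
    return [targetRoot stringByAppendingPathComponent:relative];
}

// Hypothetical helper: compares the modification dates of an item and its
// counterpart and records a path only when the dates differ. Items without
// a counterpart (or without a readable date) are skipped.
static void recordIfDifferent(NSFileManager* fileManager,
                              NSString* itemPath,
                              NSString* sourceRoot,
                              NSString* targetRoot,
                              NSMutableArray* newerOnSource,
                              NSMutableArray* newerOnTarget)
{
    NSString* otherPath = counterpartPath(itemPath, sourceRoot, targetRoot);
    if (otherPath == nil) {
        return;
    }
    NSDate* sourceDate = [[fileManager attributesOfItemAtPath:itemPath error:NULL] fileModificationDate];
    NSDate* targetDate = [[fileManager attributesOfItemAtPath:otherPath error:NULL] fileModificationDate];
    if (sourceDate == nil || targetDate == nil) {
        return;
    }
    NSComparisonResult order = [sourceDate compare:targetDate];
    if (order == NSOrderedDescending) {
        [newerOnSource addObject:itemPath];    // copy on the first volume is newer
    } else if (order == NSOrderedAscending) {
        [newerOnTarget addObject:otherPath];   // copy on the second volume is newer
    }
    // NSOrderedSame: no difference, nothing is stored
}
```

Because only differing items are stored, the two arrays stay small when the volumes are mostly in sync, which is the point of step 3.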
The following sample (a Console / Foundation program) shows how you can create a directory enumerator, retrieve certain file attributes, and process each item using a block.
It doesn't show how to retrieve attributes from a second file, or how you can compare files, though.
The sample walks recursively through the home directory and stores the path of each file it finds in a mutable array. Note that the array can get quite large and that the application may consume quite a bit of memory. The sample takes care not to excessively populate the autorelease pool in its loop.
In a real application, you would add error handling and put the code into an NSOperation or use GCD to run it on a background thread. If speed is more of a concern, there are various ways to improve this, but that is out of the scope of this sample.
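As a rough illustration of the GCD option, the scan could be wrapped along the following lines. This is a sketch under assumptions, not part of the sample: performInBackground and its block types are hypothetical names, the code uses manual reference counting, and GCD requires 10.6. The work block runs on a global queue; its result is handed to the completion block on a queue chosen by the caller.

```objc
#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

// Hypothetical block types: the work block returns its result object,
// the completion block receives it.
typedef id (^BackgroundWorkBlock)(void);
typedef void (^BackgroundCompletionBlock)(id result);

// Hypothetical helper: runs `work` on a global concurrent queue and then
// delivers the result to `completion` on `completionQueue`.
// Assumption: completionQueue outlives the call (for example the main
// queue or a global queue).
static void performInBackground(BackgroundWorkBlock work,
                                dispatch_queue_t completionQueue,
                                BackgroundCompletionBlock completion)
{
    dispatch_queue_t workQueue =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_async(workQueue, ^{
        // Temporaries created by the work block go into this pool.
        NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
        // Retain the result so that it survives draining the pool.
        id result = [work() retain];
        [pool drain];
        dispatch_async(completionQueue, ^{
            completion(result);
            [result release];   // balances the retain above
        });
    });
}
```

In an application, the work block would typically run the directory scan and return the collected list, and the completion queue would be dispatch_get_main_queue() so that the window is updated on the main thread.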
#import <Foundation/Foundation.h>

// Note: use URLs when possible (the preferred way) instead of path names when referring to file items.
typedef void (^DirectoryEnumerationResultsBlock)(NSURL* fileOrDirectory,
                                                 NSString* fileName,
                                                 BOOL isDirectory,
                                                 NSDate* modificationDate);

void enumerateDirectory(NSURL* directory,
                        DirectoryEnumerationResultsBlock block)
{
    // allocate a local file manager (preferred way) instead of using [NSFileManager defaultManager] (old way):
    NSFileManager* fileManager = [[NSFileManager alloc] init];
    NSDirectoryEnumerator* dirEnumerator =
        [fileManager enumeratorAtURL:directory
          includingPropertiesForKeys:[NSArray arrayWithObjects:NSURLNameKey,
                                                               NSURLIsDirectoryKey,
                                                               NSURLContentModificationDateKey,
                                                               nil]
                             options:NSDirectoryEnumerationSkipsHiddenFiles
                        errorHandler:nil];
    NSAutoreleasePool* localPool = [[NSAutoreleasePool alloc] init];
    int i = 0;
    // Enumerate the dirEnumerator results, for each item (NSURL) the block will be called
    for (NSURL* fileItemUrl in dirEnumerator)
    {
        // From the url's resource dictionary retrieve the property values whose key has
        // been specified in the array passed to -enumeratorAtURL:includingPropertiesForKeys:options:errorHandler:
        // The variables start out as nil, so a failed lookup leaves a defined value.

        // Retrieve the file name, key NSURLNameKey:
        NSString* fileName = nil;
        [fileItemUrl getResourceValue:&fileName forKey:NSURLNameKey error:NULL];

        // Retrieve whether it is a directory, key NSURLIsDirectoryKey:
        NSNumber* isDirectory = nil;
        [fileItemUrl getResourceValue:&isDirectory forKey:NSURLIsDirectoryKey error:NULL];

        // Retrieve the modification date (if the volume supports this property):
        NSDate* modificationDate = nil;
        [fileItemUrl getResourceValue:&modificationDate forKey:NSURLContentModificationDateKey error:NULL];

        // call the block for each file item (as URL):
        block(fileItemUrl, fileName, [isDirectory boolValue], modificationDate);

        // After every 100 items, drain the pool:
        if (++i % 100 == 0) {
            [localPool drain];
            localPool = [[NSAutoreleasePool alloc] init];
        }
    }
    [localPool drain];
    [fileManager release];
}

int main (int argc, const char* argv[])
{
    NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
    NSMutableArray* fileList = [NSMutableArray arrayWithCapacity:10000];
    NSURL* homeDirectoryUrl = [NSURL fileURLWithPath:NSHomeDirectory() isDirectory:YES];
    enumerateDirectory(homeDirectoryUrl,
        ^(NSURL* fileOrDirectory, NSString* fileName, BOOL isDirectory, NSDate* modificationDate) {
            if (!isDirectory) {
                //NSLog(@"File: %@, modified: %@", [fileOrDirectory path], [modificationDate description]);
                [fileList addObject:[fileOrDirectory path]];
            }
        }
    );
    // -count returns an NSUInteger; cast it so the format specifier is correct on 32- and 64-bit:
    NSLog(@"Number of files in directory '%@': %lu", NSHomeDirectory(), (unsigned long)[fileList count]);
    [pool drain];
    return 0;
}
Regards
Andreas