
Re: Directory size function: problem with iteration



On 29/10/2005 07:54 PM, Lawrence Sanbourne wrote:

> On 10/29/05, Mike Kluev <email@hidden> wrote:
> 
>> Why do you think there is an advantage of using continuous memory and
>> vectors in this case (iterating directory structure)? I don't see the
>> need for random access here. Yes, arrays are typically considered
>> easier than linked lists, but there are cons as well.
> 
> Note that you're commenting on Laurence Harris's words, not mine.

Yes, I know.

>> WinAPI's FindFirst just allocates the whole directory structure in
>> one handle, making a "snapshot", and the subsequent FindNexts just
>> read from this handle. This has the advantage of an atomic
>> (consistent) result: you won't get an inconsistent result if the
>> directory changes while you are iterating over it. You may try doing
>> the same by allocating memory for all items and making a single
>> FSGetCatalogInfoBulk call, thus eliminating the need to reallocate
>> or to use linked-list structures. The problem with this, though, is
>> that the number of files in the directory might change after you've
>> called FSGetCatalogInfo on the parent (to learn its valence = the
>> number of sub-items in it) and before you call FSGetCatalogInfoBulk
>> to grab that many items.
> 
> Not all filesystems support valence, so that's a bit tough.

I think all do, as far as the result of FSGetCatalogInfo is concerned
when you ask for valence. Computing the valence could be slow, though,
on volumes that don't store it natively.

Still doable, though not pretty: allocate a big chunk of memory, say
1 GB, but don't fill it (so that it stays purely virtual), then call
FSGetCatalogInfoBulk asking for, say, 10M items (*). The call will
return all the items in the directory (provided there are no more
than 10M of them). Allocate another array sized for the number of
items actually returned, copy them over from the huge array, and then
free the huge array to release the reserved virtual address space.

(*) Hopefully FSGetCatalogInfoBulk doesn't fill the whole array with,
say, zeroes on initialization. If it does, this method doesn't work.
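
To make the reserve-then-trim idea concrete, here is a minimal sketch
in C. It is not Carbon code: it assumes POSIX mmap for the untouched
reservation, and the Item type, the BulkFetch callback and demo_fetch
are hypothetical stand-ins for whatever the real bulk call would
return. It only works if the fetch writes just the slots it fills,
which is exactly the caveat in (*).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MAP_ANON
#define MAP_ANON MAP_ANONYMOUS
#endif

/* Hypothetical item record; stands in for one entry of a bulk result. */
typedef struct {
    unsigned long long size;
} Item;

/* Hypothetical bulk fetch: writes at most `capacity` items into `out`,
   returns how many it wrote, and touches only the slots it fills. */
typedef size_t (*BulkFetch)(Item *out, size_t capacity, void *ctx);

/* Reserve room for `max_items`, let `fetch` fill the front of the region,
   copy the filled part into a right-sized heap array, then release the
   reservation. Returns 0 on success, -1 on failure. Caller frees *items. */
int fetch_all_items(BulkFetch fetch, void *ctx, size_t max_items,
                    Item **items, size_t *count)
{
    assert(fetch != NULL && items != NULL && count != NULL);
    if (max_items == 0 || max_items > SIZE_MAX / sizeof(Item))
        return -1;
    size_t reserve_bytes = max_items * sizeof(Item);

    /* Anonymous private mapping: pages typically get physical backing
       only when they are first written. */
    Item *huge = mmap(NULL, reserve_bytes, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANON, -1, 0);
    if (huge == MAP_FAILED)
        return -1;

    size_t n = fetch(huge, max_items, ctx);
    Item *exact = NULL;
    if (n > 0) {
        exact = malloc(n * sizeof(Item));
        if (exact == NULL) {
            munmap(huge, reserve_bytes);
            return -1;
        }
        memcpy(exact, huge, n * sizeof(Item));
    }
    munmap(huge, reserve_bytes);   /* give back the reserved address space */

    *items = exact;
    *count = n;
    return 0;
}

/* Illustration only: pretends the directory holds *(size_t *)ctx items. */
size_t demo_fetch(Item *out, size_t capacity, void *ctx)
{
    size_t n = *(size_t *)ctx;
    if (n > capacity)
        n = capacity;
    for (size_t i = 0; i < n; i++)
        out[i].size = i;
    return n;
}
```

A caller would pass its own fetch function and an upper bound, then
free the returned array when done. If the fetch returns exactly
max_items, the directory may hold more, and the caller has to retry
with a bigger bound.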

> I'm unconcerned about getting inconsistent results if the directory
> changes while iterating over it. For my application, the directories
> are so small that this is very unlikely.

Beware, then, that if the directory changes even in a minor way while
you are iterating over it (e.g. an empty file is added or deleted),
you may skip some items or encounter them twice. Those items could be
large (a big file, or a folder with thousands of files), making the
calculated size incorrect.
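
For comparison, the snapshot approach could be sketched like this in
C. This is an assumption-laden illustration using POSIX
opendir/readdir/stat rather than the File Manager, and the name
dir_size_snapshot is made up. It copies every entry name first and
only then asks for sizes, so the slow pass does not depend on the
live directory cursor. It narrows the window for skipped or doubled
items but does not make the result atomic, and a file that vanishes
between the two passes is simply not counted.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Sum the sizes of the regular files directly inside `path`
   (non-recursive). Pass 1 copies all entry names into memory;
   pass 2 stats the copies. Returns -1 on error. */
long long dir_size_snapshot(const char *path)
{
    assert(path != NULL);
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    char **names = NULL;
    size_t count = 0, cap = 0;
    int failed = 0;
    struct dirent *ent;

    /* Pass 1: take the snapshot of names as quickly as possible. */
    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        if (count == cap) {
            size_t new_cap = cap ? cap * 2 : 64;
            char **grown = realloc(names, new_cap * sizeof *names);
            if (grown == NULL) { failed = 1; break; }
            names = grown;
            cap = new_cap;
        }
        names[count] = strdup(ent->d_name);
        if (names[count] == NULL) { failed = 1; break; }
        count++;
    }
    closedir(dir);

    /* Pass 2: work from the in-memory copy, freeing names as we go. */
    long long total = 0;
    for (size_t i = 0; i < count; i++) {
        char full[4096];
        struct stat st;
        int len = snprintf(full, sizeof full, "%s/%s", path, names[i]);
        if (!failed && len > 0 && (size_t)len < sizeof full &&
            stat(full, &st) == 0 && S_ISREG(st.st_mode))
            total += (long long)st.st_size;
        free(names[i]);
    }
    free(names);
    return failed ? -1 : total;
}
```

Note that this version grows its name array with realloc, which is
the cost the single-bulk-call idea is meant to avoid. The point here
is only the split into a snapshot pass and a sizing pass.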

> And for larger directories,
> it just doesn't seem worth the extra memory footprint.

I was skeptical too, and then amazed when I saw that it works pretty
well this way in Windows.

Mike
