Re: Strange behavior iterating the directory tree of a volume

  • Subject: Re: Strange behavior iterating the directory tree of a volume
  • From: Quinn <email@hidden>
  • Date: Wed, 5 Mar 2008 22:10:19 +0000

At 10:35 -0500 12/2/08, Trainor, Chris wrote:
> I am working on some code to iterate over the directory tree of an entire volume.

Chris opened up a DTS tech support incident for this issue <sonr://Request/42522047>, and that gave me a chance to investigate it in depth. The results were quite surprising, so I thought I'd share them with the group.


o There are two fundamental ways to iterate a directory, <x-man-page://2/getdirentries> and <x-man-page://2/getdirentriesattr>. Most BSDish APIs are based on the former (for example, <x-man-page://3/readdir>, <x-man-page://3/fts>, <x-man-page://3/scandir>), while Mac-style APIs (for example, FSGetCatalogInfoBulk, FSCopyObjectSync, and the Finder) are based on the latter (if it's available).
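
For reference, here is a minimal sketch of the BSD-style path, using readdir (which the point above places on the getdirentries side). The helper name count_dir_entries is hypothetical, not something from Chris's code, and the sketch applies the ostrich algorithm: it makes no attempt to notice changes made while it runs.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: counts the entries in the directory at `path`,
   skipping "." and "..". Returns -1 on failure with errno set.
   It does NOT detect changes made to the directory during iteration. */
long count_dir_entries(const char *path)
{
    assert(path != NULL);

    DIR *dir = opendir(path);
    if (dir == NULL) {
        return -1;
    }

    long count = 0;
    int  saved_errno;
    for (;;) {
        /* readdir returns NULL both at end-of-directory and on error,
           so clear errno first to tell the two cases apart. */
        errno = 0;
        struct dirent *entry = readdir(dir);
        if (entry == NULL) {
            saved_errno = errno;
            break;
        }
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
            continue;
        }
        count++;
    }

    (void) closedir(dir);
    if (saved_errno != 0) {
        errno = saved_errno;
        return -1;
    }
    return count;
}
```

Calling count_dir_entries("/Test") would give the number of items the iteration happened to see, which may not match any single consistent state of the directory if another process is changing it at the same time.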

o The bulk of the heavy lifting for these routines is done by the VFS plug-in. This discussion focuses on HFS Plus. It's likely that the details will be very different for other volume formats.

o If you apply the ostrich algorithm to mutation-during-iteration (that is, you ignore the problem entirely), getdirentries is more resilient than getdirentriesattr because of a bug in getdirentriesattr <rdar://problem/5762961>.

o OTOH, getdirentriesattr gives you a way to /check/ for mutation-during-iteration by way of its newState parameter. Unfortunately this mechanism has a number of issues:

- Prior to Mac OS X 10.5, the newState value returned by getdirentriesattr was actually the modification date of the directory. This can cause problems, as described below.

- In Mac OS X 10.5 and later, this value is an in-memory generation counter. This avoids the problems with modification dates.

- However, Mac OS X 10.5.x still has a bug in getdirentriesattr <rdar://problem/5781876> that can cause mutations to go undetected.

o FSGetCatalogInfoBulk uses the newState result from getdirentriesattr to calculate its containerChanged result. Beyond the problems, both historical and current, inherited from getdirentriesattr, FSGetCatalogInfoBulk has other historical problems:

- Mac OS X 10.0 through 10.1.x does not even initialise containerChanged; you should not look at this value on those systems.

- Mac OS X 10.2 through 10.4.x always sets containerChanged to false; thus, the value is not helpful on those systems.

- Mac OS X 10.5 and later implement containerChanged properly (modulo the problems with getdirentriesattr of course).

o One suggested workaround for this problem is to latch the modification date of the directory before you start iterating it and to check that date when you're done. This doesn't work reliably because the modification date for a directory (on HFS Plus, per my earlier point) has a resolution of one second, regardless of the resolution used in the API to get it. Now consider the following:

  1. You get the modification date for the directory.
  2. You start iterating the directory.
  3. Some other process changes the directory.
  4. You complete iteration of the directory.
  5. You get the modification date of the directory.
  6. You compare the values from steps 1 and 5.

If all of these events happen in the same second, there's no way you can detect changes. Actually, it's worse than that. If events 1 through 3 happen in the same second, you'll miss the change regardless of how long it takes to iterate the entire directory.
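
To make the weakness concrete, here is a rough sketch of the latch-and-compare workaround using stat. The function names latch_mod_seconds and mod_seconds_changed are hypothetical, and the sketch assumes you only have whole-second modification times to work with, as is the case for HFS Plus directories.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: latches the modification time of `path`, in
   whole seconds, into *out. Returns 0 on success, -1 on failure. */
int latch_mod_seconds(const char *path, time_t *out)
{
    assert(path != NULL && out != NULL);

    struct stat sb;
    if (stat(path, &sb) != 0) {
        return -1;
    }
    *out = sb.st_mtime;     /* whole seconds only */
    return 0;
}

/* Returns true if a change was detected. A result of false does NOT
   prove the directory was unchanged: a change made in the same second
   as the first latch leaves both values identical. */
bool mod_seconds_changed(time_t before, time_t after)
{
    return before != after;
}
```

You would call latch_mod_seconds before and after iterating and pass both values to mod_seconds_changed. A true result is trustworthy; a false result is not, for the reason given above.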

o A better workaround is to use a <x-man-page://2/kqueue>. You can add the file descriptor you're using to iterate the directory to a kqueue and monitor that kqueue for changes. If you complete iteration without seeing any changes, you have an accurate snapshot of the directory's contents.

The following snippet shows this in practice:

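    /* Needs <assert.h>, <fcntl.h>, <time.h>, <unistd.h>, and <sys/event.h>. */
    static const struct timespec kZeroTimeout = { 0, 0 };
    struct kevent kev;
    int kq, fd, err;

    /* Create the kqueue that will collect change notifications. */
    kq = kqueue();
    assert(kq >= 0);

    /* Open the directory; this is the descriptor you iterate with. */
    fd = open("/Test", O_RDONLY);
    assert(fd >= 0);

    /* Register for NOTE_WRITE events on the directory. */
    EV_SET(&kev, fd, EVFILT_VNODE, EV_ADD, NOTE_WRITE, 0, 0);
    err = kevent(kq, &kev, 1, NULL, 0, NULL);
    assert(err >= 0);

    [... iterate the directory in the usual way ...]

    /* Poll with a zero timeout; a result of 1 means an event is pending. */
    err = kevent(kq, NULL, 0, &kev, 1, &kZeroTimeout);
    assert(err >= 0);
    if (err == 1) {
        [... directory was changed ...]
    }

    /* Remove the event before closing the descriptor. */
    EV_SET(&kev, fd, EVFILT_VNODE, EV_DELETE, NOTE_WRITE, 0, 0);
    err = kevent(kq, &kev, 1, NULL, 0, NULL);
    assert(err >= 0);

    err = close(fd);
    assert(err == 0);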

kqueue support was introduced in Mac OS X 10.3. IIRC it had some reliability problems on 10.3, but it's pretty solid on 10.4 and later.

o Another option is to use the FSEvents framework to watch for changes. This is a great choice if you're already watching for changes globally, for example, if you're developing backup software.

Share and Enjoy
--
Quinn "The Eskimo!"                    <http://www.apple.com/developer/>
Apple Developer Relations, Developer Technical Support, Core OS/Hardware
_______________________________________________
Filesystem-dev mailing list      (email@hidden)