Re: FSRef speed impacts and permission checks

  • Subject: Re: FSRef speed impacts and permission checks
  • From: Mark Day <email@hidden>
  • Date: Thu, 15 May 2008 10:25:27 -0700

On May 15, 2008, at 12:31 AM, Thomas Tempelmann wrote:

On 15.05.2008 at 2:29, "Mark Day" <email@hidden> wrote:

In current versions of Mac OS X (I think 10.4 and later), the volfs
file system was removed. Now, volfs-style pathnames are parsed by VFS
and turned into a full pathname inside the kernel (by walking up the
vnodes' parent pointers). Then it walks down the path as if the
caller had passed a full pathname, which lets us leverage the existing
permission checks. (As Dominic pointed out, there is caching to
reduce the cost of those permission checks.)
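The "walk up, then walk down" resolution described above can be pictured with a toy model. This is only a sketch, not kernel code: the `nodes` table, the IDs in it, and the helper names `build_full_path` and `resolve_path` are all hypothetical, and no real permission checks are performed.

```python
# Toy model: each node records only its parent's ID and its own name,
# loosely mirroring a vnode's parent pointer. All IDs and names are made up.
ROOT_ID = 2  # assumption: ID used for the root directory in this toy model

nodes = {
    2: (None, ""),           # root directory
    16: (2, "Users"),
    17: (16, "thomas"),
    42: (17, "notes.txt"),
}

def build_full_path(node_id, nodes):
    """Walk up the parent pointers, then join the names root-first."""
    parts = []
    current = node_id
    while current != ROOT_ID:
        parent_id, name = nodes[current]
        parts.append(name)
        current = parent_id
    return "/" + "/".join(reversed(parts))

def resolve_path(path, nodes):
    """Walk down from the root one component at a time.

    In a real system, this is where each directory's permissions
    would be checked; the toy model only does the lookup.
    """
    children = {(parent, name): nid for nid, (parent, name) in nodes.items()}
    current = ROOT_ID
    for component in path.strip("/").split("/"):
        if component:
            current = children[(current, component)]
    return current

path = build_full_path(42, nodes)
print(path, resolve_path(path, nodes))
```

The point of the round trip is that the downward walk is the ordinary full-pathname lookup, so whatever checks it already performs apply unchanged.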

Is there maybe an optimization that skips this whole permission checking if
the caller is root?

Being root usually means the kernel doesn't have to fetch or evaluate the permissions of an object. So being root probably speeds things up a little, especially compared to callers who belong to multiple groups, or when files/directories have ACLs. (I'm no kauth expert, so I don't know how much more efficient the permission check is likely to be for root.)


I wonder because when I run my FSCatalogSearch over all items on a volume
that contains 5 million items, it does that in less than 4 minutes. That
looks to me like a straight walk over the directory file without random
reads for parent permissions.


OTOH, when I scan the same volume recursively, even with the
FSGetCatalogInfoBulk function, scanning becomes much, much slower.

I understand that the catalog file is a B*Tree and that it's ordered mainly
by a dir's NodeID.


So, while the order of visited dirs is different between CatSearch and
recursive search, the latter should actually be faster because its parent
dirs have been visited rather recently, making them more likely to still be
cached, while with CatSearch, it's all pretty random, so the parent
dir entries are less likely to be cached.


Can you explain that?

There are a bunch of things going on that are different between CatalogSearch and a recursive walk.


First of all, CatalogSearch only has to check permissions on items that match your search criteria. A recursive walk has to check permissions for every object you visit. If the matches are a small fraction of the total items, that gives CatalogSearch less overhead. I wouldn't be surprised if CatalogSearch avoids building a path in order to check permissions (but again, I'm no kauth expert).

The recursive walk (using FSRefs) has to build and parse full paths for every item you visit. And it has to send you back enough data (from kernel to user space) for you to do matching in your app. CatalogSearch doesn't have to build/parse the paths (except perhaps for matches), and it has more efficient access to the fields you're matching against.
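To make the difference in permission-check counts concrete, here is a toy model. It assumes one check per visited item for the recursive walk and one check per match for the flat search; the tree layout, the function names, and the "one check" accounting are all hypothetical and say nothing about how kauth actually behaves.

```python
def iter_records(tree):
    """Yield every item name in the tree, in no particular directory order."""
    stack = [tree]
    while stack:
        name, children = stack.pop()
        yield name
        stack.extend(children)

def recursive_walk(tree, matches):
    """Visit every item; each visit costs one permission check in this model."""
    checks = 0
    found = []
    for name in iter_records(tree):
        checks += 1
        if matches(name):
            found.append(name)
    return sorted(found), checks

def flat_search(tree, matches):
    """Scan every record, but check permissions only on the ones that match."""
    checks = 0
    found = []
    for name in iter_records(tree):
        if matches(name):
            checks += 1
            found.append(name)
    return sorted(found), checks

# Each node is (name, list of children); all names are made up.
tree = ("root", [
    ("a.txt", []),
    ("docs", [("b.txt", []), ("c.png", [])]),
    ("d.png", []),
])

def is_text(name):
    return name.endswith(".txt")

print(recursive_walk(tree, is_text))
print(flat_search(tree, is_text))
```

Both functions find the same items; only the number of modeled checks differs, and the gap grows as the matches become a smaller fraction of the volume.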

But one of the big wins for CatalogSearch is more efficient disk reads. Yes, the B-tree is ordered by parent directory ID, then name. But just because two nodes are logically contiguous doesn't mean they're physically contiguous on disk (even if the entire tree were in one contiguous extent). CatalogSearch reads the B-tree sequentially as if it were an ordinary file; plus it does relatively large reads (currently 32KB at a time). Because these reads are sequential, they're usually very efficient for the drive (the data is very often already in the drive's track cache). A recursive walk reads individual B-tree nodes (typically 8KB) as needed; they're often not sequential, so they may require the drive to seek to a different track. Disks are *much* better at large, sequential reads than small random reads.
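A rough way to see the effect is to count read requests and seeks for the two access patterns. The sketch below is a simulation only: it touches no disk, the function names are hypothetical, and the fully shuffled layout is a deliberately pessimistic assumption (a real catalog B-tree is usually at least partly contiguous). The 8KB node size and 32KB read size are the figures quoted above.

```python
import random

NODE_SIZE = 8 * 1024     # typical B-tree node size
CHUNK_SIZE = 32 * 1024   # read size used for the sequential scan

def sequential_scan_cost(num_nodes):
    """Read the B-tree file front to back in CHUNK_SIZE pieces.

    Returns (read requests, seeks); a front-to-back scan never seeks.
    """
    total_bytes = num_nodes * NODE_SIZE
    reads = -(-total_bytes // CHUNK_SIZE)  # ceiling division
    return reads, 0

def node_by_node_cost(physical_order):
    """Read one node per request, in the order the walk visits them.

    physical_order[i] is the on-disk index of the i-th node visited.
    A seek is counted whenever the next node does not physically
    follow the previous one.
    """
    seeks = sum(
        1 for prev, cur in zip(physical_order, physical_order[1:])
        if cur != prev + 1
    )
    return len(physical_order), seeks

def shuffled_layout(num_nodes, seed=0):
    """Hypothetical worst case: logical order unrelated to physical order."""
    order = list(range(num_nodes))
    random.Random(seed).shuffle(order)
    return order

print("sequential scan:", sequential_scan_cost(1024))
print("node-by-node walk:", node_by_node_cost(shuffled_layout(1024)))
```

In this model the sequential scan issues a quarter as many requests and never seeks, while the node-by-node walk seeks on almost every request; on a rotating disk the seeks dominate the cost.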

Hope that helps,

-Mark
