Re: FSRef speed impacts and permission checks
- Subject: Re: FSRef speed impacts and permission checks
- From: Mark Day <email@hidden>
- Date: Thu, 15 May 2008 10:25:27 -0700
On May 15, 2008, at 12:31 AM, Thomas Tempelmann wrote:
> On 15.05.2008 at 2:29, "Mark Day" <email@hidden> wrote:
>> In current versions of Mac OS X (I think 10.4 and later), the volfs
>> file system was removed. Now, volfs-style pathnames are parsed by VFS
>> and turned into a full pathname inside the kernel (by walking up the
>> vnodes' parent pointers). Then it walks down the path as if the caller
>> had passed a full pathname, which lets us leverage the existing
>> permission checks. (As Dominic pointed out, there is caching to
>> reduce the cost of those permission checks.)
> Is there maybe an optimization that skips this whole permission
> checking if the caller is root?
Being root usually means the kernel doesn't have to fetch or evaluate
the permissions of an object. So being root probably speeds things up a
little, especially compared to callers who belong to multiple groups,
or when files/directories have ACLs. (I'm no kauth expert, so I don't
know how much more efficient the permission check is likely to be for
root.)
> I wonder because when I run my FSCatalogSearch over all items on a
> volume that contains 5 million items, it does that in less than 4
> minutes. That looks to me like a straight walk over the directory file
> without random reads for parent permissions.
>
> OTOH, when I scan the same volume recursively, even with the
> FSGetCatalogInfoBulk function, scanning becomes much, much slower.
>
> I understand that the catalog file is a B*-tree and that it's ordered
> mainly by a dir's NodeID. So, while the order of visited dirs is
> different between CatSearch and recursive search, the latter should
> actually be faster, because its parent dirs have been visited rather
> recently, making it more likely they're still cached, while with
> CatSearch it's all pretty random, so reading the parent dir entries is
> less likely to be cached as well.
>
> Can you explain that?
There are a bunch of things going on that are different between
CatalogSearch and a recursive walk.
First of all, CatalogSearch only has to check permissions on items
that match your search criteria. A recursive walk has to check
permissions for every object you visit. If the matches are a small
fraction of the total items, that gives CatalogSearch less overhead.
I wouldn't be surprised if CatalogSearch avoids building a path in
order to check permissions (but again, I'm no kauth expert).
The recursive walk (using FSRefs) has to build and parse full paths
for every item you visit. And it has to send you back enough data
(from kernel to user space) for you to do matching in your app.
CatalogSearch doesn't have to build/parse the paths (except perhaps
for matches), and it has more efficient access to the fields you're
matching against.
But one of the big wins for CatalogSearch is more efficient disk
reads. Yes, the B-tree is ordered by parent directory ID then name.
But just because two nodes are logically adjacent doesn't mean they're
physically contiguous on disk (even if the entire tree were in one
contiguous extent). CatalogSearch reads the B-tree sequentially as if
it were an ordinary file; plus it does relatively large reads
(currently 32KB at a time). These reads are sequential, so it's
usually very efficient for the drive (very often in the drive's track
cache). In a recursive walk, it will read individual B-tree nodes
(typically 8KB) as needed; they're often not sequential, so they may
require the drive to seek to a different track. Disks are *much*
better at large, sequential reads than small random reads.
Hope that helps,
-Mark
_______________________________________________
Filesystem-dev mailing list (email@hidden)