Re: FSRef speed impacts and permission checks
- Subject: Re: FSRef speed impacts and permission checks
- From: Mark Day <email@hidden>
- Date: Thu, 15 May 2008 10:25:27 -0700
On May 15, 2008, at 12:31 AM, Thomas Tempelmann wrote:
> On 15.05.2008 at 2:29, "Mark Day" <email@hidden> wrote:
>> In current versions of Mac OS X (I think 10.4 and later), the volfs
>> file system was removed. Now, volfs-style pathnames are parsed by VFS
>> and turned into a full pathname inside the kernel (by walking up the
>> vnodes' parent pointers). Then it walks down the path as if the caller
>> had passed a full pathname, which lets us leverage the existing
>> permission checks. (As Dominic pointed out, there is caching to
>> reduce the cost of those permission checks.)
> Is there maybe an optimization that skips this whole permission
> checking if the caller is root?
Being root usually means the kernel doesn't have to fetch or evaluate
the permissions of an object. So being root probably speeds things up a
little, especially compared to callers who belong to multiple groups,
or when files/directories have ACLs. (I'm no kauth expert, so I don't
know how much more efficient the permission check is likely to be for
root.)
> I wonder because when I run my FSCatalogSearch over all items on a
> volume that contains 5 million items, it does that in less than 4
> minutes. That looks to me like a straight walk over the directory file
> without random reads for parent permissions.
>
> OTOH, when I scan the same volume recursively, even with the
> FSGetCatalogInfoBulk function, scanning becomes much, much slower.
>
> I understand that the catalog file is a B*-tree and that it's ordered
> mainly by a dir's NodeID. So, while the order of visited dirs is
> different between CatSearch and recursive search, the latter should
> actually be faster, because its parent dirs have been visited rather
> recently, making it more likely they're still cached, while with
> CatSearch it's all pretty random, so reading the parent dir entries is
> less likely to be cached as well.
>
> Can you explain that?
There are a bunch of things going on that are different between
CatalogSearch and a recursive walk.
First of all, CatalogSearch only has to check permissions on items
that match your search criteria. A recursive walk has to check
permissions for every object you visit. If the matches are a small
fraction of the total items, that gives CatalogSearch less overhead.
I wouldn't be surprised if CatalogSearch avoids building a path in
order to check permissions (but again, I'm no kauth expert).
The recursive walk (using FSRefs) has to build and parse full paths
for every item you visit. And it has to send you back enough data
(from kernel to user space) for you to do matching in your app.
CatalogSearch doesn't have to build/parse the paths (except perhaps
for matches), and it has more efficient access to the fields you're
matching against.
But one of the big wins for CatalogSearch is more efficient disk
reads. Yes, the B-tree is ordered by parent directory ID then name.
But just because two nodes are logically adjacent doesn't mean they're
physically contiguous on disk (even if the entire tree were in one
contiguous extent). CatalogSearch reads the B-tree sequentially as if
it were an ordinary file; plus it does relatively large reads
(currently 32KB at a time). These reads are sequential, so it's
usually very efficient for the drive (very often in the drive's track
cache). In a recursive walk, it will read individual B-tree nodes
(typically 8KB) as needed; they're often not sequential, so they may
require the drive to seek to a different track. Disks are *much*
better at large, sequential reads than small random reads.
Hope that helps,
-Mark
_______________________________________________
Filesystem-dev mailing list (email@hidden)