Re: readdir vs. getdirentriesattr
- Subject: Re: readdir vs. getdirentriesattr
- From: Sean Farley <email@hidden>
- Date: Wed, 10 Dec 2014 18:43:33 -0800
Eric Tamura writes:
> It should be much faster.
>
> Also note that as of Yosemite, we have added a new API: getattrlistbulk(2), which is like getdirentriesattr(), but supported in VFS for all filesystems. getdirentriesattr() is now deprecated.
Aha, that is interesting and a good lead. Thanks :-)
> The main advantage of the bulk call is that we can return results in most cases without having to create a vnode in-kernel, which saves on I/O: HFS+ on-disk layout is such that all of the directory entries in a given directory are clustered together and we can get multiple directory entries from the same cached on-disk blocks.
Thanks a lot for the explanation. So, if I understand correctly,
directories with a large number of files will be sped up by using this
bulk call instead of calling lstat on each entry one by one.
But perhaps there is less benefit for a large number of directories
with one file each?
> How big are the directories in question? How many times are you calling this?
Since this is for the Mercurial project, the answer is: it depends on
the repository. For my tests, I ran this on a handful of repositories
(MacPorts and some others I had lying around). I could generate test
repositories of a particular shape (e.g. one root directory with lots
of files vs. lots of directories with one file each) if there is some
insight into what you'd like me to test specifically.
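
For concreteness, the two test shapes mentioned above could be generated
with something like the following. This is only a minimal sketch: the
helper names, file names, and counts are made up for illustration and do
not come from Mercurial or from the tests described here.

```python
import os


def make_wide_tree(root, n_files):
    """Create one directory holding n_files empty files (hypothetical helper)."""
    os.makedirs(root, exist_ok=True)
    for i in range(n_files):
        # Empty files are enough: only directory and attribute lookups matter here.
        open(os.path.join(root, "f%06d" % i), "w").close()


def make_deep_tree(root, n_dirs):
    """Create n_dirs sibling directories, each with exactly one empty file."""
    for i in range(n_dirs):
        d = os.path.join(root, "d%06d" % i)
        os.makedirs(d, exist_ok=True)
        open(os.path.join(d, "only_file"), "w").close()
```

Calling make_wide_tree("wide", 100000) and make_deep_tree("deep", 100000)
would give two trees with the same number of files but very different
numbers of directories, which is the variable in question.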
As for the number of times we call this: the answer is once per
directory. This code stems from the Linux ext4 world, where we call
lstat for each file in a directory and rely on the kernel to optimize
that.
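
The per-entry pattern described above looks roughly like this. It is a
sketch only, not Mercurial's actual code: the function names are
hypothetical, and the counter exists just to show how the number of
lstat calls grows with the number of entries rather than the number of
directories.

```python
import os
import stat


def stat_directory(path):
    """Return {name: os.stat_result}, issuing one lstat call per entry."""
    results = {}
    for name in os.listdir(path):
        results[name] = os.lstat(os.path.join(path, name))
    return results


def count_lstat_calls(root):
    """Walk root and return (directories visited, lstat calls issued)."""
    dirs = 0
    calls = 0
    pending = [root]
    while pending:
        current = pending.pop()
        dirs += 1  # one listing per directory
        for name, st in stat_directory(current).items():
            calls += 1  # one lstat per entry, file or directory
            if stat.S_ISDIR(st.st_mode):
                pending.append(os.path.join(current, name))
    return dirs, calls
```

A bulk call would replace the inner lstat loop with a single request per
directory, so the expected saving scales with the entries per directory.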
_______________________________________________
Filesystem-dev mailing list (email@hidden)