Re: readdir vs. getdirentriesattr
- Subject: Re: readdir vs. getdirentriesattr
- From: Thomas Tempelmann <email@hidden>
- Date: Mon, 22 Apr 2019 04:35:51 +0200
I'd like to add some info to a thread from 2015:
I recently worked on my file search tool (FAF) and wanted to make sure that I
use the best method to deep-scan directory contents.
I had expected that getattrlistbulk() would always be the best choice, but it
turns out that opendir/readdir perform much better in some cases, oddly (this
is about reading just the file names, no other attributes).
See my blog post: https://blog.tempel.org/2019/04/dir-read-performance.html
There's also a test project trying out the various methods.
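For orientation, the names-only opendir/readdir variant can be sketched
roughly as below. This is a minimal sketch and not the code from the test
project: count_names_readdir is a hypothetical helper that handles a single
directory level and looks at nothing but d_name.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from FAF or the test project): counts the
   entries of one directory by name only, skipping "." and "..".
   Returns -1 if the directory cannot be opened. */
long count_names_readdir(const char *path)
{
    assert(path != NULL);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* Only d_name is examined; there is no stat() call per entry. */
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
    }

    closedir(dir);
    return count;
}
```

A deep scan would additionally have to decide which entries are directories
and recurse into them; that part is left out here.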
Any comments, insights, clarifications and bug reports are most welcome.
Enjoy,
Thomas Tempelmann
> On 12. Jan 2015, at 17:33, Jim Luther <email@hidden> wrote:
>
> getattrlistbulk() works on all file systems. If the file system supports bulk
> enumeration natively, great! If it does not, then the kernel code takes care
> of it. In addition, getattrlistbulk() supports all non-volume attributes
> (getdirentriesattr() only supported a large subset).
>
> The API calling convention for getattrlistbulk() is slightly different than
> getdirentriesattr(); read the man page carefully. In particular:
>
> • ATTR_CMN_NAME and ATTR_CMN_RETURNED_ATTRS are required (requiring
> ATTR_CMN_NAME allowed us to get rid of the newState argument).
> • A new attribute, ATTR_CMN_ERROR, can be requested to detect error
> conditions for a specific directory entry.
> • The method for determining when enumeration is complete is different. You
> just keep calling getattrlistbulk() until 0 entries are returned.
>
> - Jim
>
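The calling convention described above might look roughly like this in C.
It is a sketch based on the getattrlistbulk(2) man page, not code from this
thread: count_names_bulk is a hypothetical helper, the 32 KB buffer size is
an arbitrary choice, and on systems other than macOS the function is stubbed
out to fail with ENOTSUP.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#ifdef __APPLE__
#include <sys/attr.h>
#endif

/* Hypothetical helper: counts the entries of one directory with
   getattrlistbulk(). Returns -1 on error (and always on non-macOS). */
long count_names_bulk(const char *path)
{
    assert(path != NULL);
#ifdef __APPLE__
    struct attrlist request;
    memset(&request, 0, sizeof request);
    request.bitmapcount = ATTR_BIT_MAP_COUNT;
    /* Both of these attributes are required by getattrlistbulk(). */
    request.commonattr = ATTR_CMN_RETURNED_ATTRS | ATTR_CMN_NAME;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    long count = 0;
    char buf[32 * 1024]; /* arbitrary size, an assumption of this sketch */
    for (;;) {
        int n = getattrlistbulk(fd, &request, buf, sizeof buf, 0);
        if (n < 0) {        /* error, errno is set */
            count = -1;
            break;
        }
        if (n == 0)         /* 0 entries returned: enumeration is complete */
            break;

        const char *entry = buf;
        for (int i = 0; i < n; i++) {
            /* Each entry starts with its own total length in bytes. */
            uint32_t length;
            memcpy(&length, entry, sizeof length);
            entry += length;
            count++;
        }
    }

    close(fd);
    return count;
#else
    (void)path;
    errno = ENOTSUP;        /* getattrlistbulk() is a macOS API */
    return -1;
#endif
}
```

Because every returned entry begins with its length, the loop can step from
one entry to the next without decoding the attributes in between. Reading
the name itself would mean following the attrreference_t that comes after
the returned-attributes set.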
>> On Jan 11, 2015, at 9:31 PM, James Bucanek <email@hidden> wrote:
>>
>> Eric,
>>
>> I would just like to clarify: the new getattrlistbulk() function works on
>> all filesystems. We don't have to check the volume's VOL_CAP_INT_READDIRATTR
>> capability before calling it, correct?
>>
>> James Bucanek
>>
>>> Eric Tamura December 10, 2014 at 5:57 PM
>>> It should be much faster.
>>>
>>> Also note that as of Yosemite, we have added a new API: getattrlistbulk(2),
>>> which is like getdirentriesattr(), but supported in VFS for all
>>> filesystems. getdirentriesattr() is now deprecated.
>>>
>>> The main advantage of the bulk call is that we can return results in most
>>> cases without having to create a vnode in-kernel, which saves on I/O: HFS+
>>> on-disk layout is such that all of the directory entries in a given
>>> directory are clustered together and we can get multiple directory entries
>>> from the same cached on-disk blocks.