Re: readdir vs. getdirentriesattr

Subject: Re: readdir vs. getdirentriesattr
From: Thomas Tempelmann <email@hidden>
Date: Mon, 22 Apr 2019 04:35:51 +0200

I'd like to add some info to a thread from 2015:

I recently worked on my file search tool (FAF) and wanted to make sure that I
use the best method to deep-scan directory contents.

I had expected that getattrlistbulk() would always be the best choice, but it
turns out that opendir/readdir perform much better in some cases, oddly (this
is about reading just the file names, no other attributes).

See my blog post: https://blog.tempel.org/2019/04/dir-read-performance.html

There's also a test project trying out the various methods.
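
For readers who just want to see the shape of the two approaches, here is a
minimal sketch. It is not the code from the test project. The function names
count_names_readdir and count_names_bulk and the 64 KB buffer size are my own
illustrative choices, and the getattrlistbulk() part is only compiled on
macOS. Both functions read nothing but the entry names and count them.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

#ifdef __APPLE__
#include <fcntl.h>
#include <stdint.h>
#include <sys/attr.h>
#include <unistd.h>
#endif

/* Returns 1 for the "." and ".." pseudo-entries, 0 otherwise. */
int is_dot_entry(const char *name)
{
    return strcmp(name, ".") == 0 || strcmp(name, "..") == 0;
}

/* Count the entries of a directory with opendir/readdir.
   Returns -1 if the directory cannot be opened. */
long count_names_readdir(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (!is_dot_entry(entry->d_name))
            count++;
    }
    closedir(dir);
    return count;
}

#ifdef __APPLE__
/* Count the entries of a directory with getattrlistbulk(), requesting
   only the name. Returns -1 on error. */
long count_names_bulk(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct attrlist request;
    memset(&request, 0, sizeof(request));
    request.bitmapcount = ATTR_BIT_MAP_COUNT;
    /* Both attributes are mandatory for getattrlistbulk(). */
    request.commonattr = ATTR_CMN_RETURNED_ATTRS | ATTR_CMN_NAME;

    char buffer[64 * 1024];   /* arbitrary size, chosen for illustration */
    long count = 0;

    for (;;) {
        int n = getattrlistbulk(fd, &request, buffer, sizeof(buffer), 0);
        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0)           /* 0 entries returned: enumeration is done */
            break;

        char *cursor = buffer;
        for (int i = 0; i < n; i++) {
            /* Each record: uint32_t length, attribute_set_t, then the
               requested attributes in bitmap order. */
            uint32_t length;
            memcpy(&length, cursor, sizeof(length));
            char *field = cursor + sizeof(length);

            attribute_set_t returned;
            memcpy(&returned, field, sizeof(returned));
            field += sizeof(returned);

            if (returned.commonattr & ATTR_CMN_NAME) {
                /* The name is stored out of line; the offset is relative
                   to the attrreference_t itself. */
                attrreference_t ref;
                memcpy(&ref, field, sizeof(ref));
                const char *name = field + ref.attr_dataoffset;
                if (!is_dot_entry(name))
                    count++;
            }
            cursor += length;
        }
    }
    close(fd);
    return count;
}
#endif
```

Calling both functions on the same directory (e.g. count_names_readdir("/tmp")
and, on macOS, count_names_bulk("/tmp")) should give the same count as long as
the directory does not change in between. To compare speed, time each one
separately over a large tree.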

Any comments, insights, clarifications and bug reports are most welcome.

Enjoy,
 Thomas Tempelmann


> On 12. Jan 2015, at 17:33, Jim Luther <email@hidden> wrote:
>
> getattrlistbulk() works on all file systems. If the file system supports bulk
> enumeration natively, great! If it does not, then the kernel code takes care
> of it. In addition, getattrlistbulk() supports all non-volume attributes
> (getdirentriesattr() only supported a large subset).
>
> The API calling convention for getattrlistbulk() is slightly different from
> that of getdirentriesattr(); read the man page carefully. In particular:
>
> • ATTR_CMN_NAME and ATTR_CMN_RETURNED_ATTRS are required (requiring
> ATTR_CMN_NAME allowed us to get rid of the newState argument).
> • A new attribute, ATTR_CMN_ERROR, can be requested to detect error
> conditions for a specific directory entry.
> • The method for determining when enumeration is complete is different. You
> just keep calling getattrlistbulk() until 0 entries are returned.
>
> - Jim
>
>> On Jan 11, 2015, at 9:31 PM, James Bucanek <email@hidden> wrote:
>>
>> Eric,
>>
>> I would just like to clarify: the new getattrlistbulk() function works on
>> all filesystems. We don't have to check the volume's VOL_CAP_INT_READDIRATTR
>> capability before calling it, correct?
>>
>> James Bucanek
>>
>>> On December 10, 2014, at 5:57 PM, Eric Tamura wrote:
>>> It should be much faster.
>>>
>>> Also note that as of Yosemite, we have added a new API: getattrlistbulk(2),
>>> which is like getdirentriesattr(), but supported in VFS for all
>>> filesystems. getdirentriesattr() is now deprecated.
>>>
>>> The main advantage of the bulk call is that we can return results in most
>>> cases without having to create a vnode in-kernel, which saves on I/O: HFS+
>>> on-disk layout is such that all of the directory entries in a given
>>> directory are clustered together and we can get multiple directory entries
>>> from the same cached on-disk blocks.

