Re: readdir vs. getdirentriesattr
- Subject: Re: readdir vs. getdirentriesattr
- From: Jim Luther <email@hidden>
- Date: Mon, 22 Apr 2019 14:56:22 -0700
I don’t really have time to look at the current fts implementation, but… it has
several options that affect performance (in particular, the FTS_NOCHDIR,
FTS_NOSTAT, FTS_NOSTAT_TYPE, and FTS_XDEV options). If you are trying to
compare fts to CFURLEnumerator (for example), use FTS_NOCHDIR and FTS_XDEV, but
don’t use FTS_NOSTAT and FTS_NOSTAT_TYPE.
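
As a rough sketch of that flag combination (not the code from the test project
in this thread): the helper name scan_tree_fts and its counters are
hypothetical, and FTS_PHYSICAL is an added assumption, since fts_open requires
either FTS_PHYSICAL or FTS_LOGICAL. FTS_NOSTAT is left out so that fts fills in
fts_statp for each entry.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fts.h>
#include <stdio.h>

/* Hypothetical helper: walks the tree under `root`, counting regular files
 * and summing their sizes. Returns 0 on success, -1 on failure. */
int scan_tree_fts(const char *root,
                  unsigned long long *file_count,
                  unsigned long long *total_bytes)
{
    assert(root != NULL && file_count != NULL && total_bytes != NULL);

    /* fts_open takes a NULL-terminated array of start paths. */
    char *const paths[] = { (char *)root, NULL };

    /* FTS_PHYSICAL: do not follow symlinks (assumption, see lead-in).
     * FTS_NOCHDIR:  do not chdir into each directory while walking.
     * FTS_XDEV:     stay on the volume that `root` lives on.
     * FTS_NOSTAT is deliberately not set, so fts_statp is populated. */
    FTS *fts = fts_open(paths, FTS_PHYSICAL | FTS_NOCHDIR | FTS_XDEV, NULL);
    if (fts == NULL) {
        perror("fts_open");
        return -1;
    }

    *file_count = 0;
    *total_bytes = 0;

    FTSENT *ent;
    while ((ent = fts_read(fts)) != NULL) {
        /* FTS_F marks a regular file; directories, symlinks, and error
         * entries are skipped in this sketch. */
        if (ent->fts_info == FTS_F) {
            (*file_count)++;
            *total_bytes += (unsigned long long)ent->fts_statp->st_size;
        }
    }

    return fts_close(fts);
}
```

Calling scan_tree_fts("/some/dir", &count, &bytes) fills in both counters.
Per-entry errors (FTS_DNR, FTS_NS, FTS_ERR) are ignored here; a real
comparison would probably want to report them.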
> On Apr 22, 2019, at 9:59 AM, Thomas Tempelmann <email@hidden> wrote:
>
> Jim,
> thanks for your comments.
>
> > If all you need is filenames and no other attributes, readdir is usually
> > faster than getattrlistbulk because it doesn't have to do as much work.
> > However, if you need additional attributes, getattrlistbulk is usually much
> > faster. Some of that extra work done by getattrlistbulk involves checking to
> > see what attributes were requested and packing the results into the result
> > buffer.
>
> What's interesting is that on HFS+, readdir is not faster in my tests, but on
> a recent and fast Mac (i.e. not on my MacPro 2010), it can be twice as fast
> as the others when scanning an APFS volume. I wonder why. Is the
> implementation for getattrlistbulk in the APFS driver inefficient compared to
> the one in HFS+? The source code for the APFS FS driver has still not been
> published, or has it?
>
> > You'll find that lstat is slightly faster than getattrlist (when getattrlist
> > is returning the same set of attributes) for the same reason. There's no
> > extra code needed in lstat to see what attributes were requested and pack
> > the results into the result buffer.
>
> It's also significantly faster than using NSURL's getResourceValue, even if
> the NSURL has already been created. That's probably due to all the
> objc overhead.
>
> > By the way, I haven't tested this but I would expect
> > enumeratorAtURL:includingPropertiesForKeys:options:errorHandler: (followed by
> > a "for (NSURL *fileURL in directoryEnumerator)" loop) to be slightly faster
> > than contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error:
> > because the URLs aren't retained in an NSArray. Using CFURLEnumerator may also
> > be slightly faster than NSFileManager's directory enumeration.
>
> Now, that's something I had not considered yet. Will try.
>
> > Using POSIX/BSD APIs will be the fastest, but that means you have to deal
> > with the different capabilities between file systems yourself (although
> > getattrlistbulk helps with that a lot).
>
> Most interesting, though:
>
> Today someone pointed out fts_read. So far, it has always beaten all other
> methods, especially if I also need extra attributes (e.g. file size).
>
> Can you give some more information about the fts implementation? Is this
> user-library-level or kernel code that's doing this? I had expected that
> this would only be a convenience userland function that uses readdir or
> similar BSD functions, but it appears to beat them all, suggesting this is
> optimized at a lower level.
>
>
> I have updated my test project accordingly (with the fts code) in case anyone
> likes to run their own tests:
>
> http://files.tempel.org/Various/DirScanner.zip
>
> Also, I am wondering if using concurrent threads will speed up scanning a dir
> tree on an SSD as well, by distributing each directory read to one thread (or
> dispatch queue). Will eventually try, but probably not soon. Gotta get my
> program out of the door first.
>
> Thomas
>
_______________________________________________
Filesystem-dev mailing list (email@hidden)