Re: readdir vs. getdirentriesattr
- Subject: Re: readdir vs. getdirentriesattr
- From: Jim Luther <email@hidden>
- Date: Mon, 29 Apr 2019 08:21:54 -0700
In contentsOfDirectoryAtURL, instead of "includingPropertiesForKeys:nil", use
"includingPropertiesForKeys:@[NSURLVolumeIdentifierKey]" (and add whatever
other property keys you know you'll need). The whole purpose of the
includingPropertiesForKeys argument is so the enumerator code can pre-fetch the
properties you need as efficiently as possible. The enumeration will be a bit
slower, but the entire operation of enumerating and getting the properties from
the URLs returned will be faster.
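
For illustration, here is a minimal sketch of the pre-fetching variant. The
directory (NSTemporaryDirectory()) and the variable names are placeholders of
mine, not part of the original code, and error handling is reduced to logging.

```objc
#import <Foundation/Foundation.h>

int main(int argc, const char *argv[]) {
    @autoreleasepool {
        NSFileManager *fileMgr = [NSFileManager defaultManager];
        // Placeholder directory for the sketch; substitute the one you scan.
        NSURL *parentURL = [NSURL fileURLWithPath:NSTemporaryDirectory()
                                      isDirectory:YES];

        // Name the keys up front so they can be fetched during enumeration.
        NSError *error = nil;
        NSArray<NSURL *> *contentURLs =
            [fileMgr contentsOfDirectoryAtURL:parentURL
                   includingPropertiesForKeys:@[NSURLVolumeIdentifierKey]
                                      options:0
                                        error:&error];
        if (contentURLs == nil) {
            NSLog(@"Could not read %@: %@", parentURL.path, error);
            return 1;
        }

        for (NSURL *url in contentURLs) {
            id volumeID = nil;
            // The value was requested above, so this should normally be
            // answered from the URL's cached resource values.
            if ([url getResourceValue:&volumeID
                               forKey:NSURLVolumeIdentifierKey
                                error:&error]) {
                NSLog(@"%@ -> %@", url.lastPathComponent, volumeID);
            }
        }
    }
    return 0;
}
```

Add any other keys you read inside the loop to the same array, so they are
requested together with the volume identifier.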
Also, use
-[NSFileManager enumeratorAtURL:includingPropertiesForKeys:options:errorHandler:]
instead of
-[NSFileManager contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error:]
unless you really need an NSArray of NSURLs. If your code is just processing
all of the URLs and has no need to keep them after processing, there's no
reason to add them to an array (which takes time and adds to peak memory
pressure). The enumerator also supports recursive enumeration (which stops at
device boundaries -- you'll see mount points but not their contents) so you
don't have to do that yourself.
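
A rough sketch of the enumerator-based version follows. The root directory, the
second property key (NSURLIsDirectoryKey), and the counting are placeholders I
added for illustration; only the enumerator call itself is what I'm
recommending.

```objc
#import <Foundation/Foundation.h>

int main(int argc, const char *argv[]) {
    @autoreleasepool {
        NSFileManager *fileMgr = [NSFileManager defaultManager];
        // Placeholder root for the sketch; substitute the tree you scan.
        NSURL *rootURL = [NSURL fileURLWithPath:NSTemporaryDirectory()
                                    isDirectory:YES];
        NSArray<NSURLResourceKey> *keys =
            @[NSURLVolumeIdentifierKey, NSURLIsDirectoryKey];

        // options:0 gives a deep (recursive) enumeration.
        NSDirectoryEnumerator<NSURL *> *enumerator =
            [fileMgr enumeratorAtURL:rootURL
          includingPropertiesForKeys:keys
                             options:0
                        errorHandler:^BOOL(NSURL *url, NSError *error) {
                            NSLog(@"Skipping %@: %@", url.path, error);
                            return YES; // keep enumerating after an error
                        }];

        NSUInteger itemCount = 0;
        for (NSURL *url in enumerator) {
            // Read the requested values in one call; no array of URLs is
            // built, each URL is processed as it is produced.
            NSDictionary<NSURLResourceKey, id> *values =
                [url resourceValuesForKeys:keys error:NULL];
            if (values != nil) {
                itemCount++;
            }
        }
        NSLog(@"Visited %lu items", (unsigned long)itemCount);
    }
    return 0;
}
```

If you need to skip a subtree, call -skipDescendants on the enumerator from
inside the loop. Pass NSDirectoryEnumerationSkipsSubdirectoryDescendants in
options for a shallow, single-directory scan.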
- Jim
> On Apr 29, 2019, at 8:01 AM, Thomas Tempelmann <email@hidden> wrote:
>
> Doing more performance tests for directory traversal, I ran into a performance
> issue with
> -[NSFileManager contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error:]:
>
> See this typical code for scanning a directory:
>
> NSArray *contentURLs = [fileMgr contentsOfDirectoryAtURL:parentURL
>     includingPropertiesForKeys:nil options:0 error:nil];
> for (NSURL *url in contentURLs) {
>     id value;
>     // No keys were pre-fetched above, so each call fetches the value itself.
>     [url getResourceValue:&value forKey:NSURLVolumeIdentifierKey error:nil];
>     // ... use value ...
> }
>
> I would have expected the call for fetching NSURLVolumeIdentifierKey to be
> rather fast because the upper file system layer should know which volume the
> item belongs to, since it has to know which FS driver to pass the calls to.
> I.e., asking for the volume ID should be much faster than fetching actual
> directory data such as the file size.
>
> However, it turns out that this is just as slow as getting actual data from
> the lower levels.
>
> Could it be that the call is not optimized for returning this information as
> early as possible, but instead passes the call down to the lowest level
> regardless of need?
>
> I mention this because it degrades the performance of a recursive directory
> scan significantly in my tests (on both APFS and HFS) - by more than 30%! The
> only thing even slower would be to call stat() instead (for getting the
> st_dev value).
>
> Is this worth having looked at? If so, should I report this via bugreporter
> (though, if I'm then asked to provide a system profiler report, it's not
> going anywhere)?
>
> Thomas
>
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Filesystem-dev mailing list (email@hidden)
> Help/Unsubscribe/Update your Subscription:
>
> This email sent to email@hidden