Re: readdir vs. getdirentriesattr
- Subject: Re: readdir vs. getdirentriesattr
- From: Thomas Tempelmann <email@hidden>
- Date: Mon, 29 Apr 2019 22:19:52 +0200
Jim,
> In contentsOfDirectoryAtURL, instead of "includingPropertiesForKeys:nil",
> use "includingPropertiesForKeys:@[NSURLVolumeIdentifierKey]" (and add
> whatever other property keys you know you'll need). The whole purpose of
> the includingPropertiesForKeys argument is so the enumerator code can
> pre-fetch the properties you need as efficiently as possible. The
> enumeration will be a bit slower, but the entire operation of enumerating
> and getting the properties from the URLs returned will be faster.
>
I know. That's the theory, but my benchmarking says it makes no difference
in this case. And that's quite logical, because the pre-caching is meant for
data that has to come from the lowest level, i.e. where the catalog data is
fetched. There it makes sense to combine multiple property requests into one,
just as getdirentriesattr is meant to be used. However, as I
explained, the volume ID is not stored in the catalog but at a higher level,
so pre-fetching it at the lowest level makes no difference, as it
requires no catalog access, right?
My performance tests always run twice in quick succession, so that in the
second run, due to caching, all data is ready and does not incur random
delays that would give imprecise measurements. Sure, this does not give me
the worst case, but it gives me the best-case results at least. And these
best-case results say: Scanning "/System" on my Mac without getting the
volume ID takes less than 3s, but getting it (with or without pre-fetching)
takes over 6s. That's TWICE as much time. With smaller directory trees
the difference is smaller, possibly because other caches are helping then.
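To make the "run twice, measure the second run" approach concrete, here is a
minimal C sketch. The names time_warm_run, workload_fn and count_calls are
made up for illustration and are not taken from my test app; count_calls is
only a stand-in for the real directory scan.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime under strict C modes */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: one complete scan of the directory tree. */
typedef void (*workload_fn)(void *ctx);

/* Runs the workload once untimed to warm the caches, then runs it again
   and returns the elapsed wall-clock time of that second run in seconds. */
double time_warm_run(workload_fn work, void *ctx)
{
    assert(work != NULL);
    struct timespec start, end;

    work(ctx);                              /* warm-up run, not measured */

    clock_gettime(CLOCK_MONOTONIC, &start);
    work(ctx);                              /* measured best-case run */
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Stand-in workload for illustration: only counts how often it was called. */
void count_calls(void *ctx)
{
    ++*(int *)ctx;
}
```

This only captures the warm-cache (best) case, as said above. A cold-cache
figure would need a separate measurement.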
I assume that when I re-run the scan, after having released all NSURLs from
the previous scan (even by restarting the test app), the framework creates
fresh NSURL objects, right? It's not as if there is only one
NSURL instance on the entire system per volume item, shared between all
processes, is it? The only caching, once I release an NSURL, is at
the volume block cache level, isn't it?
> Also, use
> -[enumeratorAtURL:includingPropertiesForKeys:options:errorHandler:] instead
> of -[contentsOfDirectoryAtURL:includingPropertiesForKeys:options:error:]
> unless you really need an NSArray of NSURLs. If your code is just
> processing all of the URLs and has no need to keep them after processing,
> there's no reason to add them to an array (which takes time and adds to
> peak memory pressure).
>
Thanks, that makes sense.
> -[enumeratorAtURL:includingPropertiesForKeys:options:errorHandler:] also
> supports recursive enumeration (which stops at device boundaries -- you'll
> see mount points but not their contents) so you don't have to do that
> yourself.
>
Is that based on fts_read? Because I found that this is much faster on
local volumes (not on network vols, though) than all other ways I've tried.
And it brings along the st_dev value without time penalty, unlike
contentsOfDirectoryAtURL.
Regardless, I'll give that a try.
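
For reference, this is roughly what I mean by getting st_dev "for free" from
fts_read: a minimal sketch, not my actual benchmark code. The names
walk_tree and walk_stats are invented for this example. It assumes that
fts(3) fills in fts_statp for every entry it could stat, and it skips the
contents of directories that live on a different device than the root,
which is similar to stopping at device boundaries.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fts.h>
#include <stddef.h>

/* Hypothetical result record for this sketch. */
struct walk_stats {
    long entries;       /* entries visited, including the root */
    long other_device;  /* entries whose st_dev differs from the root's */
};

/* Walks the tree at `root` with fts(3). Returns 0 on success, -1 if
   fts_open fails. The stat data, including st_dev, comes along with each
   entry, so no extra per-item call is needed to get the device ID. */
int walk_tree(const char *root, struct walk_stats *out)
{
    assert(root != NULL && out != NULL);
    char *paths[] = { (char *)root, NULL };
    out->entries = 0;
    out->other_device = 0;

    FTS *fts = fts_open(paths, FTS_PHYSICAL | FTS_NOCHDIR, NULL);
    if (fts == NULL)
        return -1;

    dev_t root_dev = 0;
    int have_root = 0;
    FTSENT *ent;
    while ((ent = fts_read(fts)) != NULL) {
        switch (ent->fts_info) {
        case FTS_DP:    /* post-order visit of a directory already counted */
        case FTS_DNR:   /* unreadable directory, already seen in pre-order */
        case FTS_ERR:   /* error entry */
        case FTS_NS:    /* no stat information available */
            continue;
        default:
            break;
        }
        if (!have_root) {
            root_dev = ent->fts_statp->st_dev;  /* first entry is the root */
            have_root = 1;
        }
        out->entries++;
        if (ent->fts_statp->st_dev != root_dev) {
            out->other_device++;
            if (ent->fts_info == FTS_D)
                fts_set(fts, ent, FTS_SKIP);    /* don't descend into mounts */
        }
    }
    fts_close(fts);
    return 0;
}
```

Passing FTS_XDEV to fts_open should give the same "don't descend into other
devices" behavior without the manual check. I spelled it out here only to
show where st_dev is read.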
--
Thomas Tempelmann, http://apps.tempel.org/
Follow me on Twitter: https://twitter.com/tempelorg
Read my programming blog: http://blog.tempel.org/