Re: Mystery Threads
- Subject: Re: Mystery Threads
- From: "Gerriet M. Denkmann" <email@hidden>
- Date: Sat, 01 Oct 2016 20:27:16 +0700
> On 1 Oct 2016, at 01:33, Quincey Morris <email@hidden> wrote:
>
> On Sep 30, 2016, at 02:57 , Gerriet M. Denkmann <email@hidden> wrote:
>
>> Any ideas where to look for a reason?
>
> The next step is probably to clarify the times between:
>
> a. Accumulated execution time — the amount of time your code actually spends executing in CPUs.
>
> b. Elapsed time in your process — the amount of time that’s accounted for by your process, whether executing or waiting.
>
> c. Elapsed time outside your process — the amount of time that’s accounted for by system code, also whether executing or waiting.
Time Profiler tells me that 99.9 % of the time is spent in Running and Blocked, each very roughly half of the time (or 2/3 to 1/3).
Running is almost 100 % in my function.
Blocked is split in roughly equal parts between my function, mach_msg_trap (from the run loop) and workq_kernreturn (from start_wqthread).
There are some minor variations between 8 and 20,000 iterations, but nothing that would explain a factor-of-8 difference.
My function reports the running time:
NSDate *start = [NSDate date];
…
dispatch_apply(…);
NSTimeInterval time = -start.timeIntervalSinceNow;
which shows the same factor of 8 between 8 and 20,000 iterations.
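For reference, a minimal, self-contained sketch of that timing setup (timeApply, workOnSlice, the QOS choice and the placeholder loop are stand-ins, not my actual code):

#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

// Hypothetical per-iteration work; the real operations are not shown here.
static void workOnSlice(uint8_t *slice, size_t length)
{
    for (size_t j = 0; j < length; j++) slice[j]++;   // placeholder
}

// Time one dispatch_apply over bigArray, split into nbrOfIterations disjoint slices.
static NSTimeInterval timeApply(uint8_t *bigArray, size_t arrayLength, size_t nbrOfIterations)
{
    size_t sliceLength = arrayLength / nbrOfIterations;
    NSDate *start = [NSDate date];
    dispatch_apply(nbrOfIterations,
                   dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0),
                   ^(size_t i) {
                       workOnSlice(bigArray + i * sliceLength, sliceLength);
                   });
    return -start.timeIntervalSinceNow;   // wall-clock time, as above
}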
>
> You can also play around with change isolation. Instead of changing two contextual conditions (the number of dispatch_apply calls, the number of iterations in a single block’s loop), change only one of them and observe the effect in Instruments.
Well, to compare, I want the following in all cases (independent of the number of iterations):
- The iterations have disjoint working ranges, and the union of the working ranges of all iterations covers the whole bigArray.
- The same number of operations (at the same indices) should be done on bigArray.
- The operations within each working range should be done randomly (well: at least not sequentially).
One thing is quite clear: each iteration of my function accesses its working range more or less randomly.
If one uses sequential access, a very different behaviour emerges.
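As an illustration of that access pattern (hypothetical names; my real code keeps the set of touched indices fixed across runs, this sketch only contrasts sequential with scattered touches within one slice):

#include <stdint.h>
#include <stdlib.h>   // arc4random_uniform

// Sequential: walk the slice front to back; consecutive touches share cache lines.
static void touchSequential(uint8_t *slice, size_t length)
{
    for (size_t j = 0; j < length; j++) slice[j]++;
}

// Scattered: the same number of operations, but at random indices,
// so consecutive touches rarely fall into the same cache line.
static void touchScattered(uint8_t *slice, size_t length)
{
    for (size_t n = 0; n < length; n++) {
        size_t j = arc4random_uniform((uint32_t)length);
        slice[j]++;
    }
}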
> You can also try out some other instruments speculatively. For example, is there a different pattern in the Allocations instrument, indicating that one form of your code is doing vast numbers of memory allocations for some (unknown) reason? Or is I/O being done, unexpectedly?
There are no allocations (except for one huge 400 MB malloc at the start).
There is no I/O.
Then I tried the System Trace instrument and learned:
2k iterations (180 msec):
the whole time zero-filling is being done (very rarely a page fault).
8 iterations (1500 msec):
for the first 100 msec there is zero-filling, then the 8 threads just keep slugging along.
There are far fewer context switches (almost none after the zero-filling has ceased).
But still, I cannot see any reason why this should take so much longer.
My hypothesis is: with a big number of iterations (each having a working range ≤ 500 KB) any 8 iterations running concurrently together use ≤ 4 MB, which might just fit into some cache.
With 8 iterations (each using a working range of 50 MB) there probably is a lot of cache reloading going on.
But I failed to see any proof of this hypothesis in Instruments.
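Spelling out the arithmetic behind that hypothesis (assuming equal-sized slices of the 400 MB bigArray, i.e. working range = 400 MB / number of iterations):
20,000 iterations → 20 KB per range; 8 concurrent blocks touch about 160 KB of hot data, easily cacheable.
8 iterations → 50 MB per range; 8 concurrent blocks touch the whole 400 MB, so scattered accesses will mostly miss the caches and go to main memory.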
Kind regards,
Gerriet.