Re: Semaphored tasks not scheduling efficiently

30 May 2008

Delivered-To: darwin-dev@lists.apple.com

On May 28, 2008, at 12:04 PM, darwin-dev-request@lists.apple.com wrote:

Russ,
 = Mike
I am optimizing a multithreaded app using Shark on an 8-core
machine.  There is a typical MPCreateSemaphore/MPSignalSemaphore
semaphore-based setup that hands out work units to a set of worker
threads, and it is working, but when you look carefully with Shark
you see that a fair portion of the time only 7 cores are busy.  As
the work units are being released, the cores start up, but under
close examination sometimes one will run for only 2 usec or less,
then stop in semaphore_timedwait_trap.  It then goes away for a long
time (multiples of the 10 ms OS thread time slice), and the Mach
kernel shows that there is an idle thread running.  Sometimes the
"hole" in utilization moves to a different core.  It winds up biting
twice: not only do you lose a core, but you also wind up with a
ragged, inefficient finish that wastes 6 or 7 cores.  All this
repeats, giving a swiss-cheese look to what should logically be a
very dense Shark system trace.  All the work eventually gets done,
but not as well as it should.

I saw some notes about some Mach semaphores that don't make the
target task immediately runnable.  I'm not sure if that's what's
happening here, but once the main thread goes to sleep after starting
the workers, the eighth runnable worker thread should surely start,
I'd think.

Can anyone give insight into the MP semaphore and scheduling
internals?  Thanks.

I'm not aware of anything that would cause the symptoms you're seeing
that wasn't a result of contention; in the case where you have a core
idle, what is your worker thread blocking on?

"semaphore_timedwait_trap" is the Mach trap that blocks a thread
waiting for a semaphore wakeup, so you are blocking "legitimately".

You might consider instrumenting your blocking points to see whether
you spend an unexpectedly long time blocked at any given point.  It
sounds as though you expect to be running 8 threads non-stop, so it
should be fairly straightforward to tell the difference between a
legitimate and non-legit block.
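
As a rough illustration of that kind of instrumentation, here is a
minimal sketch in C.  Everything in it (block_stats, timed_block,
sleep_wait) is a made-up name for illustration, not part of the MP or
Mach APIs, and it assumes a POSIX clock_gettime() is available; on a
system without one you would substitute whatever monotonic timer you
have.  The idea is only to route each blocking call through a wrapper
that records how long the thread was actually blocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical per-thread record of time spent blocked. */
typedef struct {
    uint64_t total_ns;  /* sum of all blocked intervals */
    uint64_t worst_ns;  /* longest single blocked interval */
    uint64_t waits;     /* number of blocking calls recorded */
} block_stats;

/* Monotonic clock reading in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Fold one blocked interval [start_ns, end_ns] into the stats. */
void block_stats_record(block_stats *s, uint64_t start_ns, uint64_t end_ns)
{
    assert(s != NULL && end_ns >= start_ns);
    uint64_t d = end_ns - start_ns;
    s->total_ns += d;
    s->waits++;
    if (d > s->worst_ns)
        s->worst_ns = d;
}

/* Run any blocking call via wait_fn and account for the time blocked. */
int timed_block(int (*wait_fn)(void *), void *arg, block_stats *s)
{
    uint64_t t0 = now_ns();
    int rc = wait_fn(arg);
    block_stats_record(s, t0, now_ns());
    return rc;
}

/* Stand-in for a real semaphore wait: sleeps *(long *)arg milliseconds. */
int sleep_wait(void *arg)
{
    long ms = *(long *)arg;
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    return nanosleep(&ts, NULL);
}
```

In a real worker you would pass a small function that performs your
semaphore wait in place of sleep_wait, keep one block_stats per
thread, and look at worst_ns after a run: waits in the tens of
milliseconds while work is known to be queued are the ones worth
chasing.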

You should also check to make sure you aren't triggering any
page-in/page-out activity, as "tens of ms" is consistent with
disk/network I/O timing.
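
One cheap way to look for paging from inside the process is to sample
the major page fault count before and after a run.  The sketch below
uses the POSIX getrusage() call; the helper names are hypothetical,
and the counter is process-wide, so it tells you whether faults
happened, not which thread took them.

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical helper: major page faults (those that needed I/O)
 * taken by this process so far. */
long major_faults_now(void)
{
    struct rusage ru;
    int rc = getrusage(RUSAGE_SELF, &ru);
    assert(rc == 0);
    (void)rc;
    return ru.ru_majflt;
}

/* Major faults taken since an earlier sample. */
long major_faults_since(long before)
{
    return major_faults_now() - before;
}
```

Take a sample before releasing the work units and call
major_faults_since() once they are all done; a non-zero result during
a run that showed holes is a hint that paging deserves a closer look.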

If you are handling multiple work units in a given run, it sounds like
you might have a race between the work provider and consumers leading
to a unit stuck in the queue.  In the case where you have a thread
asleep, have you checked the state of your work units?
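
To make the kind of race I mean concrete, here is a sketch of a
queue where it cannot happen, because the unit is published and the
consumer is woken under the same lock, and the consumer re-checks the
queue after every wakeup.  This is not your code and it does not use
the MP semaphore calls; it swaps in a pthread mutex and condition
variable, and all of the names (work_queue, wq_put, wq_get,
wq_pending) are invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 64  /* hypothetical fixed capacity for the sketch */

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    int    units[QUEUE_CAP];
    size_t head, count;
    int    closed;
} work_queue;

void wq_init(work_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    q->head = q->count = 0;
    q->closed = 0;
}

/* Producer: enqueue and wake one consumer while holding the lock.
 * Returns 1 on success, 0 if the queue is full or closed. */
int wq_put(work_queue *q, int unit)
{
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    if (!q->closed && q->count < QUEUE_CAP) {
        q->units[(q->head + q->count) % QUEUE_CAP] = unit;
        q->count++;
        ok = 1;
        pthread_cond_signal(&q->nonempty);
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Producer: no more work will arrive; wake every waiting consumer. */
void wq_close(work_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->closed = 1;
    pthread_cond_broadcast(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer: returns 1 with *unit set, or 0 once closed and drained. */
int wq_get(work_queue *q, int *unit)
{
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && !q->closed)  /* re-check after each wakeup */
        pthread_cond_wait(&q->nonempty, &q->lock);
    if (q->count > 0) {
        *unit = q->units[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Diagnostic: how many units are queued right now? */
size_t wq_pending(work_queue *q)
{
    pthread_mutex_lock(&q->lock);
    size_t n = q->count;
    pthread_mutex_unlock(&q->lock);
    return n;
}
```

The diagnostic is the useful part for your case: if something like
wq_pending() reports queued units at a moment when a worker is
asleep, the problem is in the hand-off rather than in the scheduler.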

One final note: in a loosely-affine system like Darwin, I would tend
to encourage a worker pool of N+(N/4) or so (where N is the number of
cores) to ensure maximal saturation in the face of sporadic but
protracted contention.  This does have a tendency to give you a spiky
tail on your saturation plot (since if contention is low you have a
chunk of work remaining at the end that runs at about 1/4 saturation),
so consider it no more than a starting point.  Note that increasing
the thread count into the 2-4N range can have useful effects on
saturation depending on the nature of your workload, so do try moving
both down and up.  Obviously if you expect no contention, N is a good
place to be.
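
For concreteness, the N+(N/4) starting point works out as below.  The
function names are made up, and reading the core count through
sysconf(_SC_NPROCESSORS_ONLN) is just one assumed way of getting N.

```c
#include <assert.h>
#include <unistd.h>

/* Starting-point pool size: N + N/4, using integer division. */
long suggested_pool_size(long ncores)
{
    assert(ncores > 0);
    return ncores + ncores / 4;
}

/* Number of cores currently online, falling back to 1 on failure. */
long online_cores(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}
```

On an 8-core machine suggested_pool_size(8) gives 10 workers.  Treat
that as the first data point only, and measure at N and at 2N or more
as well before settling on a number.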

Michael Smith
