> The behavior is that if the threads that share common data are
> running on the same physical cpu die then the performance is 3 to 4
> times greater than if they are split across two physical dies. I ran
> the test shutting off one die per physical cpu and showed the same
> results as if all cpus were active.
I've seen similar issues benchmarking 10GbE NICs, and I don't even
need pthreads. The scheduler tends to run the user-mode application
on one core, the interrupt handler kernel thread (IOKit "workloop") on
another, and the network stack (dlil) kernel thread on yet another. I
think I've actually seen worse behavior than you: on Tiger, if I
totally disable one CPU package on a dual-socket, dual-core Mac Pro
or Xserve, I see up to a 25% increase on some benchmarks, and up to a
few hundred percent on others.
I think the fundamental problem is that the scheduler doesn't have a
clue about CPU affinity, and Mac OS X lacks any APIs or command-line
interfaces that would let the app or admin give it a clue (as you can
on Linux, Solaris, etc.). A scheduler with some notion of CPU affinity
would be good; one that also let the user give it some hints would be
better.
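
For comparison, here is a rough sketch of the kind of hint I mean, as
it looks on Linux. It assumes glibc and the sched_setaffinity(2) /
sched_getaffinity(2) calls; the two helper names are made up for
illustration, and nothing equivalent exists on Mac OS X as far as I
know.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* glibc: exposes cpu_set_t and the CPU_* macros */
#endif
#include <sched.h>

/* Hypothetical helper: number of logical CPUs the calling thread is
 * currently allowed to run on, or -1 on error. */
int allowed_cpu_count(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)  /* pid 0 = caller */
        return -1;
    return CPU_COUNT(&set);
}

/* Hypothetical helper: restrict the calling thread to the
 * lowest-numbered CPU it is already allowed on.
 * Returns that CPU's index, or -1 on error. */
int pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed, one;
    int cpu;

    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;

    for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed)) {
            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            if (sched_setaffinity(0, sizeof(one), &one) != 0)
                return -1;
            return cpu;
        }
    }
    return -1;  /* empty mask; should not happen in practice */
}
```

A benchmark would call pin_to_first_allowed_cpu() before it starts
its I/O loop. The admin-side equivalent on Linux is taskset(1), e.g.
"taskset -c 0 ./benchmark" (the program name is just a placeholder).
Note that this only pins the user-mode thread; keeping the interrupt
and network stack work on the same package is a separate knob.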
Darwin-dev mailing list (email@hidden)