Did StackSpace or pthread_get_stackaddr_np change recently on 8-core machines?
Today's benchmark shows one test running 4x slower on eight
cores than on one core, rather than the expected 8x faster; the
problem is embarrassingly parallel and coded to take advantage
of that.
The time is spent waiting in __spin_lock under
pthread_get_stackaddr_np, called from StackSpace(). The innermost loop
is a genuinely recursive function that needs a large scratch space
whose size is determined at runtime. It is allocated via alloca
for speed, since most of the time it will fit on the stack, and alloca
should not have any contention between cores where malloc might.
Before calling alloca, however, the size is preflighted using
StackSpace, falling back on malloc if there's not enough room.
Alas, StackSpace is synchronizing all the cores and killing
performance. It did not use to do that.
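For context, the pattern in question looks roughly like this (a minimal
sketch: stack_space() is a stand-in stub for StackSpace(), and the 32 KB
safety margin and function names are mine, for illustration only):

```c
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

enum { STACK_MARGIN = 32 * 1024 };   /* illustrative safety margin */

/* Stand-in stub for StackSpace(); the real call queries the system. */
static size_t stack_space(void) {
    return 256 * 1024;
}

/* Returns 1 if the scratch buffer fell back to the heap, 0 if it
   fit on the stack. The real code does recursive work in between. */
int scratch_demo(size_t needed) {
    int on_heap = 0;
    unsigned char *scratch;
    if (needed + STACK_MARGIN < stack_space()) {
        scratch = alloca(needed);    /* fast path: no cross-core contention */
    } else {
        scratch = malloc(needed);    /* fallback: heap allocation */
        on_heap = 1;
    }
    memset(scratch, 0, needed);      /* ... use the buffer ... */
    if (on_heap)
        free(scratch);
    return on_heap;
}
```

The point of the preflight is that alloca itself cannot fail gracefully,
so the check has to happen before the allocation.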
Is there something I need to understand? Is that just a system bug?
Or is there a reason I shouldn't have expected to determine the
available stack space in a multi-threaded environment without
incurring any synchronization overhead? What shared resource am I
inadvertently serializing upon?
Do I need to avoid calling StackSpace? (I certainly avoid it now, but
this won't be shipping for a while, so if it's fixed in a system
update I can wait.)
If I need to avoid it, what is the recommended fast, non-serializing
way to preflight before calling alloca to ensure alloca won't blow
out the stack?
This function is recursive, so keeping a persistent static store
doesn't help, and performance is critical here, which is why alloca
was an ideal fit.
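In case it helps frame the question, here is the kind of lock-free
preflight I have in mind (a hedged sketch, not a confirmed fix: it
caches the thread's stack bounds once per thread in a thread-local and
then estimates the remaining space from the address of a local
variable; the names are placeholders, and the non-Darwin branch uses
glibc's pthread_getattr_np purely for illustration):

```c
#define _GNU_SOURCE              /* for pthread_getattr_np on glibc */
#include <pthread.h>
#include <stddef.h>

/* Lowest usable stack address for this thread, cached on first use. */
static _Thread_local char *tl_stack_limit = NULL;

static void cache_stack_limit(void) {
#if defined(__APPLE__)
    /* On Darwin, pthread_get_stackaddr_np returns the HIGH end. */
    pthread_t self = pthread_self();
    char *base  = (char *)pthread_get_stackaddr_np(self);
    size_t size = pthread_get_stacksize_np(self);
    tl_stack_limit = base - size;
#else
    /* glibc: pthread_getattr_np reports the LOW end plus the size. */
    pthread_attr_t attr;
    void *addr;
    size_t size;
    pthread_getattr_np(pthread_self(), &attr);
    pthread_attr_getstack(&attr, &addr, &size);
    pthread_attr_destroy(&attr);
    tl_stack_limit = (char *)addr;
#endif
}

/* Approximate bytes of stack left below the current frame.
   Any lock in the system lookup is taken once per thread, not
   once per call. */
static size_t stack_remaining(void) {
    char marker;                       /* lives near the current frame */
    if (tl_stack_limit == NULL)
        cache_stack_limit();
    return (size_t)(&marker - tl_stack_limit);
}
```

Since a thread's stack bounds don't move, caching them once per thread
seems safe; but I'd welcome correction if there's a reason the result
of pthread_get_stackaddr_np can change over a thread's lifetime.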
In general, this raises the concern that even after testing and
shipping a product, the addition of a lock in an obscure system call
via System Update could have extreme performance consequences. I'd
been spoiled for decades by regular clock speed bumps. I'd hope that,
at least with embarrassingly parallelizable problems, we can look
forward to years of increase in the number of cores providing similar
gains.
PerfOptimization-dev mailing list (email@hidden)