This intrigued me, so I decided to try to track it down, especially
since what I do see on Leopard is that the maximum number of threads
your demo app can create is 2559. When I looked into the kernel code,
I found that, unsurprisingly, there's a compile-time limit on the number
of threads per process, THREAD_MAX, which is 2560 in Leopard. When
this is hit, thread_create_internal fails and returns KERN_FAILURE,
which, as best I can verify, is what actually happens on my system;
this propagates all the way back to userspace, where it's translated to
EAGAIN by the pthread library.
OK, I don't have the sources here to look over, but that answers a different question for me; I was wondering if the machine type had something to do with how many threads you can spawn. My work machine is a dual-quad core machine, so I didn't know if that had something to do with it.
Hmmm... actually... can someone with a dual-quad core machine running Leopard give that test app a quick whirl? If different systems get different results, then something VERY interesting is going on.
I can reproduce this with 10.4.11 on my Quad G5. I cannot reproduce it with Leopard on the same machine; it behaves exactly as on my Intel machine under Leopard, which is to say it hits the 2560 limit and fails gracefully.
To track this down a bit, I took a System Trace of the app running from start to finish (and moved it to an Intel machine so I could analyse it in reasonable time... poor Quad's looking so old these days). That confirmed what is already apparent if you look at your output: although you're being told you're able to spawn 7k threads, only 2559 ever actually run. In fact, System Trace tells me that there are only ever 2560 threads in your app in total, including your main thread. Unfortunately, without DTrace I can't really poke into Tiger's kernel to see what's going on at runtime. I also can't be sure System Trace is accurate in this case, because threads are identified via pointers, which are reused; if, say, there's a bug where new threads are clobbering old ones, Shark isn't guaranteed to see it.
However, looking at the Libc source, I think I see the issue: in Tiger's pthread implementation, the pthread record (in userspace) is allocated and configured before checking whether the corresponding kernel thread can be created, and if that creation fails, the failure appears to go unnoticed because the necessary error checking isn't there. :(
It looks like the reason you nonetheless hit EAGAIN in the end is that it fails to preallocate the new thread's stack space. I'm not sure why that has the odd but consistent ~7k limit; I didn't investigate that.
On Leopard this is all rewritten, and it properly handles, in userspace, not being able to create a new kernel thread. So in summary, I'm more or less certain this is a Tiger issue, I doubt it's specific to the Quad (could be PPC-specific, since I didn't check Intel, but I doubt it), and it appears to be fixed in Leopard.
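If anyone wants to check their own system, here is a minimal sketch of the kind of test I have in mind. It is not your demo app, and the names (spawn_until_failure, struct spawn_result, worker) are made up for illustration. It counts how many pthread_create calls report success and, separately, how many threads actually get to run, which is the discrepancy described above. I haven't run it on Tiger; if the phantom-thread theory is right, the pthread_join calls there could well hang.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Gate that keeps every worker alive until the spawning loop is done,
 * plus a count of workers that really started running. */
static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_cond = PTHREAD_COND_INITIALIZER;
static int gate_open = 0;
static size_t started = 0;

static void *worker(void *arg) {
    (void)arg;
    pthread_mutex_lock(&gate_lock);
    started++;                      /* proof that this thread actually ran */
    while (!gate_open)
        pthread_cond_wait(&gate_cond, &gate_lock);
    pthread_mutex_unlock(&gate_lock);
    return NULL;
}

/* Hypothetical result record, for illustration only. */
struct spawn_result {
    size_t created;  /* pthread_create calls that returned 0 */
    size_t started;  /* threads that reached worker() */
    int error;       /* first non-zero pthread_create result, e.g. EAGAIN */
};

/* Create up to `cap` threads, stopping at the first failure. */
static struct spawn_result spawn_until_failure(size_t cap) {
    struct spawn_result r = {0, 0, 0};
    if (cap == 0)
        return r;

    pthread_t *tids = malloc(cap * sizeof *tids);
    if (tids == NULL) {
        r.error = ENOMEM;
        return r;
    }

    pthread_mutex_lock(&gate_lock);
    gate_open = 0;
    started = 0;
    pthread_mutex_unlock(&gate_lock);

    while (r.created < cap) {
        /* pthread_create returns the error number directly; it does not set errno. */
        int rc = pthread_create(&tids[r.created], NULL, worker, NULL);
        if (rc != 0) {
            r.error = rc;
            break;
        }
        r.created++;
    }

    /* Release the workers and wait for them to finish. */
    pthread_mutex_lock(&gate_lock);
    gate_open = 1;
    pthread_cond_broadcast(&gate_cond);
    pthread_mutex_unlock(&gate_lock);

    for (size_t i = 0; i < r.created; i++)
        pthread_join(tids[i], NULL);  /* assumption: may block if a thread never really existed */

    pthread_mutex_lock(&gate_lock);
    r.started = started;
    pthread_mutex_unlock(&gate_lock);

    free(tids);
    return r;
}
```

Calling spawn_until_failure(10000) from main and printing the three fields (with strerror on the error) should be enough. Going by the above, I'd expect created and started to agree on Leopard, with EAGAIN as the error, and to disagree on Tiger.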
Again, though, file a bug report and cc me the number, so I can have my sleuthing verified by those that actually do this stuff for their day job. :)
Wade