Delivered-To: darwin-dev@lists.apple.com

Here's the question currently on my mind. An explanation of why I'm asking follows.

Does pthread_cond_wait() release the mutex before waiting on the condition in a manner that could make it lose a broadcast? Can pthread_cond_broadcast() wake up only one thread when two threads are waiting on the condition variable, leaving the other one behind?

I've tried to follow the code of __pthread_cond_wait() in the Darwin source code, but I've been stumped by the fact that the mutex being passed in is not explicitly unlocked anywhere! The variable is checked to verify that it is indeed a mutex but is not otherwise used anywhere else in the code, making me wonder where and how it can possibly be released. I've gone down as far as wait_queue_assert_wait64_locked(). Is the unlock somehow implicit?

The problem arises in the following situation: I have 3 threads, and the goal is that a "central gate" blocks them until all 3 have finished their work. The code is roughly:

    int count = 0; // Global variable (really a class variable in the Gate implementation)

    pthread_mutex_lock(mutex);
    if( ++count == 3 )
    {
        count = 0;
        pthread_cond_broadcast(cond);
    }
    else
    {
        // Ignore spurious wake-ups in this example.
        // Even if spuriously woken up, it would not affect the issue.
        pthread_cond_wait(cond, mutex);
    }
    pthread_mutex_unlock(mutex);

The state I witness is that 2 threads successfully exit but the 3rd thread is blocked in pthread_cond_wait(). More mysteriously, count == 1. (Could this be due to some cache coherency issue? Or could the processor re-order memory writes? I've looked at the disassembly and the increment is fully after the mutex lock and before the cond broadcast, so it's not a compiler optimization issue.)

But the only way the other two threads can possibly have escaped this function is for the count to have reached 3, one thread being woken up by the broadcast, the other having done the broadcast. So the only situations I can see that would lead to the state I'm witnessing are:

1. pthread_cond_wait() releases the mutex before putting the thread on the wait queue, missing the broadcast.
2. pthread_cond_broadcast() wakes up only one thread.

I've seen other discussions saying that if a signal arrives on a thread sleeping on a condition variable, that thread could miss the broadcast. Is that accurate? It could explain the missing wake-up.

I'm using 10.5.1 on a dual-core iMac.

-- 
Pierre Baillargeon
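
For reference, the gate described above can be written as a small standalone piece of C. This is only a sketch under assumptions, not the code that shows the problem: gate_t, worker_arg_t, gate_wait(), worker() and run_gate_demo() are hypothetical names, and the generation counter and the loop around pthread_cond_wait() are additions that are not in the snippet above. The loop sends a waiter back to sleep after a spurious wake-up, and the generation counter lets the same gate be reused for several rounds.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define GATE_THREADS 3   /* number of threads the gate waits for, as in the post */

/* Hypothetical gate type; all names here are illustrative only. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int      count;       /* threads that have arrived in the current round */
    unsigned generation;  /* incremented every time the gate opens */
    int      passed;      /* total number of gate passages, for the demo */
} gate_t;

typedef struct {
    gate_t *gate;
    int     rounds;       /* how many times each thread goes through the gate */
} worker_arg_t;

/* Block until GATE_THREADS threads have called gate_wait() in this round. */
static void gate_wait(gate_t *g)
{
    int rc = pthread_mutex_lock(&g->mutex);
    assert(rc == 0);

    if (++g->count == GATE_THREADS) {
        /* Last thread to arrive: open the gate for everyone in this round. */
        g->count = 0;
        g->generation++;
        rc = pthread_cond_broadcast(&g->cond);
        assert(rc == 0);
    } else {
        /* Wait until the generation changes; a spurious wake-up re-waits. */
        unsigned my_generation = g->generation;
        while (my_generation == g->generation) {
            rc = pthread_cond_wait(&g->cond, &g->mutex);
            assert(rc == 0);
        }
    }

    g->passed++;          /* still protected by the mutex */
    rc = pthread_mutex_unlock(&g->mutex);
    assert(rc == 0);
    (void)rc;             /* silence unused warning when NDEBUG is defined */
}

static void *worker(void *p)
{
    const worker_arg_t *arg = p;
    for (int i = 0; i < arg->rounds; i++)
        gate_wait(arg->gate);
    return NULL;
}

/* Run GATE_THREADS threads through the gate `rounds` times each and
   return the total number of gate passages that completed. */
int run_gate_demo(int rounds)
{
    gate_t g;
    worker_arg_t arg = { &g, rounds };
    pthread_t threads[GATE_THREADS];

    pthread_mutex_init(&g.mutex, NULL);
    pthread_cond_init(&g.cond, NULL);
    g.count = 0;
    g.generation = 0;
    g.passed = 0;

    for (int i = 0; i < GATE_THREADS; i++)
        pthread_create(&threads[i], NULL, worker, &arg);
    for (int i = 0; i < GATE_THREADS; i++)
        pthread_join(threads[i], NULL);

    pthread_cond_destroy(&g.cond);
    pthread_mutex_destroy(&g.mutex);
    return g.passed;
}
```

Calling run_gate_demo(5) should return 15 if every thread gets through the gate in every round; a thread left behind would show up as a hang in pthread_join().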