Re: pthread_cond_wait() missing broadcast?
- Subject: Re: pthread_cond_wait() missing broadcast?
- From: Terry Lambert <email@hidden>
- Date: Mon, 7 Jan 2008 15:19:08 -0800
On Jan 7, 2008, at 11:49 AM, Pierre Baillargeon wrote:
I'm using 10.5.1 on a dual-core iMac.
Here's the question currently on my mind. An explanation of why I'm
asking follows:
Does pthread_cond_wait() release the mutex before waiting on the
condition in a manner that could make it lose a broadcast?
Or
Can pthread_cond_broadcast() only wake up one thread when two
threads are waiting on the condition variable, leaving the other one
behind?
I've tried to follow the code of __pthread_cond_wait() in the Darwin
source code, but I've been stumped by the fact that the mutex being
passed in is not explicitly unlocked anywhere! The variable is
checked to verify that it is indeed a mutex but is not otherwise
used anywhere else in the code, making me wonder where and how it
can possibly be released. I've gone down to
wait_queue_assert_wait64_locked(). Is the unlock somehow implicit?
The problem arises in the following situation: I have 3 threads and
the goal is that a "central gate" blocks them until all 3 have
finished their work. The code roughly is:
int count = 0; // Global variable (really a class variable in the
               // Gate implementation)
pthread_mutex_lock(mutex);
if( ++count == 3 ) {
    count = 0;
    pthread_cond_broadcast(cond);
}
else {
    // Ignore spurious wake-ups in this example.
    // Even a spurious wake-up would not affect the issue.
    pthread_cond_wait(cond, mutex);
}
pthread_mutex_unlock(mutex);
The state I witness is that 2 threads successfully exit but the 3rd
thread is blocked in the pthread_cond_wait(). More mysteriously,
count == 1. (Could this be due to a cache coherency issue? Or could
the processor reorder memory writes? I've looked at the disassembly
and the increment is fully after the mutex lock and before the cond
broadcast, so it's not a compiler optimization issue.)
But the only way the other two threads can possibly have escaped
this function is for the count to have reached 3, one thread
being woken up by the broadcast, the other having done the broadcast.
So the only situations I can see that would lead to the state I'm
witnessing are:
1. pthread_cond_wait() releases the mutex before putting the thread
on the wait queue, missing the broadcast.
2. pthread_cond_broadcast() wakes up only one thread.
I've seen other discussions saying that if a signal arrives on a
thread sleeping on a condition variable, that thread could miss the
broadcast. Is that correct? It could explain the missing wake-up.
Whatever mutex you were waiting on is acquired by the thread before it's
allowed to run, just as if you had explicitly attempted a lock.
<http://www.opengroup.org/onlinepubs/009695399/functions/pthread_cond_broadcast.html>
You are most likely going back to sleep attempting to acquire the mutex
in multiple threads, and the one thread unlocks, which unblocks the
remaining two, and then they race to sleep, and the other unlock
either doesn't happen, or misses its wakeup because of your order of
operations.
In general, calling:
pthread_cond_wait(cond, mutex);
implies that you intend to do something with the resource protected
by the mutex, which would normally prevent a race.
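One common way to tie the wait to state protected by the mutex is to wait on a predicate in a loop. The sketch below is an assumption about how the gate could be restructured, not the poster's actual Gate class: it adds a generation counter so that a thread which re-enters the gate early cannot be mistaken for part of the previous cycle, and so that spurious wake-ups are harmless. The names gate_t, gate_init, gate_wait and gate_demo are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical reusable gate; not the poster's actual implementation. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             expected;   /* threads that must arrive per cycle */
    int             count;      /* arrivals so far in the current cycle */
    unsigned long   generation; /* incremented each time the gate opens */
} gate_t;

static void gate_init(gate_t *g, int expected)
{
    pthread_mutex_init(&g->mutex, NULL);
    pthread_cond_init(&g->cond, NULL);
    g->expected = expected;
    g->count = 0;
    g->generation = 0;
}

static void gate_wait(gate_t *g)
{
    pthread_mutex_lock(&g->mutex);
    unsigned long my_generation = g->generation;
    if (++g->count == g->expected) {
        /* Last arrival: start a new cycle and release everyone. */
        g->count = 0;
        g->generation++;
        pthread_cond_broadcast(&g->cond);
    } else {
        /* Wait on the predicate, not on the wake-up itself. */
        while (my_generation == g->generation)
            pthread_cond_wait(&g->cond, &g->mutex);
    }
    pthread_mutex_unlock(&g->mutex);
}

typedef struct {
    gate_t *gate;
    int     rounds;
    int     passes;
} worker_arg_t;

static void *worker(void *p)
{
    worker_arg_t *w = p;
    for (int i = 0; i < w->rounds; i++) {
        gate_wait(w->gate);
        w->passes++;
    }
    return NULL;
}

/* Runs 3 threads through the gate `rounds` times; returns total passes. */
int gate_demo(int rounds)
{
    enum { NTHREADS = 3 };
    gate_t gate;
    pthread_t tid[NTHREADS];
    worker_arg_t args[NTHREADS];
    int total = 0;

    gate_init(&gate, NTHREADS);
    for (int i = 0; i < NTHREADS; i++) {
        args[i] = (worker_arg_t){ &gate, rounds, 0 };
        int rc = pthread_create(&tid[i], NULL, worker, &args[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NTHREADS; i++) {
        pthread_join(tid[i], NULL);
        total += args[i].passes;
    }
    pthread_cond_destroy(&gate.cond);
    pthread_mutex_destroy(&gate.mutex);
    return total;
}
```

If the gate works as intended, gate_demo(1000) returns 3000 because every thread gets through every cycle. Without the generation check, a fast thread that comes back around can increment count for the next cycle while slower threads from the previous cycle have not yet been scheduled, which is one plausible way to end up with a stuck waiter and count == 1.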
-- Terry