From: Roland Schwarz (roland.schwarz_at_[hidden])
Date: 2005-09-02 10:18:44
John Maddock wrote:
>I can also reduce the number of threads to about 10 and still get the
You can go as low as two.
>However, I can't see what the problem is: when the deadlock occurs all the
>threads are waiting for the writer condition variable (m_waiting_writers) to
>wake up one of the writers at
>Line 512. The member m_waking_writers is set to one, and as far as I can
>see that can only occur in
>read_write_mutex_impl<Mutex>::do_wake_writer(void) line 1425, which then
>must have notified the condition variable to wake up one thread. m_state
>must have been set to zero before all this happens so the woken thread
>should not loop and go back to sleep (Footnote, actually that appears not
>to be true, sometimes a thread is woken with m_state == -1 but that appears
>not to be the immediate cause of the problem). So.. I'm stumped at present.
When a thread releases its lock, one waiter on the condition is
woken via notify_one. m_state is set to 0, and m_num_waking_writers > 0.
Now it can happen (and it does happen) that another thread enters
do_write_lock _before_ the notified thread has actually woken up. It will
see an m_state of 0, and this is bad, since m_num_waking_writers > 0.
It is bad because obtaining the lock (which is granted because
m_state == 0) is in essence a kind of wakeup. But the code does not
account for this and does not correct m_num_waking_writers.
Hence do_wake_writer will never again try to notify_one any waiters.
This leads to deadlock.
What is left: who is actually being woken up then? Yup, obviously
the original waiting writer receives a spurious wakeup, sees that
m_state is -1 again, and keeps on waiting.
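
To make the sequence above easier to follow, here is a small
single-threaded model of the bookkeeping. It is only a sketch: the
struct, its member names and the one-line correction in try_fast_lock
are made up for illustration, and are not the actual Boost code or the
bugfix I posted. It just replays the interleaving step by step and
counts how many notifications get issued.

```cpp
#include <cassert>

// Hypothetical, simplified model of the writer bookkeeping.
// Names are illustrative and do not match read_write_mutex_impl.
struct WriterModel {
    int state = 0;            // 0 = free, -1 = held by a writer
    int waiting_writers = 0;  // writers blocked on the condition
    int waking_writers = 0;   // notifications issued but not yet consumed
    int notifications = 0;    // number of notify_one calls made so far
    bool fixed;               // apply the sketched correction or not

    explicit WriterModel(bool apply_fix) : fixed(apply_fix) {}

    // A writer finds the lock held and goes to sleep on the condition.
    void block_writer() { ++waiting_writers; }

    // Plays the role of do_wake_writer: notify only if no wakeup is pending.
    void wake_writer() {
        if (waiting_writers > 0 && waking_writers == 0) {
            ++waking_writers;
            ++notifications;
        }
    }

    // The current holder releases the lock.
    void unlock() {
        state = 0;
        wake_writer();
    }

    // A newly arriving writer takes the lock if m_state is 0.
    bool try_fast_lock() {
        if (state != 0) return false;
        state = -1;
        // Assumed correction: taking the lock consumes any pending wakeup.
        if (fixed) waking_writers = 0;
        return true;
    }

    // The notified writer finally gets to run.
    bool woken_writer_runs() {
        if (state != 0) return false;  // lock was taken meanwhile, wait again
        state = -1;
        --waiting_writers;
        --waking_writers;
        return true;
    }
};

// Replays the problematic interleaving and returns the notification count.
int run_scenario(bool apply_fix) {
    WriterModel m(apply_fix);
    m.try_fast_lock();      // writer A takes the free lock
    m.block_writer();       // writer B arrives, finds it held, sleeps
    m.unlock();             // A releases: state = 0, B is notified
    m.try_fast_lock();      // writer C barges in before B has run
    m.woken_writer_runs();  // B wakes, sees state == -1, waits again
    m.unlock();             // C releases: is B notified a second time?
    return m.notifications;
}
```

Without the correction, run_scenario(false) returns 1: the second
unlock finds waking_writers still at 1 and skips the notify, so B
sleeps forever. With it, run_scenario(true) returns 2, because the
stale count was cleared when C took the lock.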
My already posted bugfix solves this, but I am not yet sure about
the other do_*_lock operations. Are they susceptible to this bug too?