From: Pinho Romulo (romulo.pinho_at_[hidden])
Date: 2008-04-09 05:03:55
Many thanks for the quick response.
I was quite confident, however, that because the application is well behaved, the problem was not the race condition you mentioned. Nevertheless, adding the mutex protection in the right places shed light on the problem, and I discovered it was located in the class's constructor. I had completely overlooked the member initialization order: the threads (Active Objects) were initialized, and often activated, before the mutexes that eventually synchronize them were fully initialized. Since the condition wait is executed at the very beginning of the thread, the mutex might not be ready yet, and that was the race condition.
After changing the initialization order, everything went well, even without the mutex-protection around mExec2. I do agree, however, that adding the mutex-protection is the best option and I have done so.
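For anyone who finds this thread later, here is a minimal sketch of the initialization-order issue described above. It is not the original code: the class and member names are hypothetical, and it uses the C++11 standard library (std::thread, std::mutex) as a stand-in for the Boost.Thread classes. The point it illustrates is that C++ constructs non-static members in the order they are declared in the class, not in the order they appear in the constructor's initializer list, so a thread member declared before the mutex it uses can start running before that mutex exists.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical active object; names are illustrative, not from the original code.
class ActiveObject {
public:
    // The initializer list follows declaration order; the thread starts last.
    ActiveObject() : counter_(0), thread_(&ActiveObject::run, this) {}

    // Join on destruction so a still-running thread never outlives the object.
    ~ActiveObject() { if (thread_.joinable()) thread_.join(); }

    // Wait for the thread to finish and return what it wrote. Call once only.
    int join_and_get() {
        assert(thread_.joinable());
        thread_.join();
        std::lock_guard<std::mutex> lock(mutex_);
        return counter_;
    }

private:
    // The thread touches the mutex immediately, as in the case described above.
    void run() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++counter_;
    }

    // Members are constructed top to bottom. Declaring thread_ above mutex_
    // would let run() lock a mutex that has not been constructed yet.
    std::mutex mutex_;
    int counter_;
    std::thread thread_;  // declared last: everything it uses already exists
};

// Small driver: construct the object, wait for its thread, return the counter.
int run_active_object() {
    ActiveObject obj;
    return obj.join_and_get();
}
```

Declaring the thread member last is only one way to get the ordering right; starting the thread explicitly at the end of the constructor body would serve the same purpose.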
Thanks again for the assistance.
Quoting Pinho Romulo <romulo.pinho_at_[hidden]>:
> I've been facing deadlock situations using boost 1.33.1 under
> Windows and Linux. The problem occurs when
> boost::condition::notify_one() sometimes does not release the
> corresponding waiting thread. A second thread that depends on this
> release eventually becomes locked. The reason for this post is that
> I could not reproduce this behaviour using Windows threads with VC++
> 6.0 nor using pthreads in Suse 10.3 with g++ 4.2.1.
Your problem is that the waiting threads are checking mExec2 to see
whether or not to proceed, as part of the predicate loop around the
condition wait, but the updates to mExec2 in Exec2() and Release2()
are not done under protection of the mutex. You therefore have a race
condition, and it is possible that the waiting thread is not actually
waiting in the condition variable when notify is called, or it does
not see the changed value when it does wake, and waits again.
You don't need to hold a mutex during the call to notify_one(), but
you *do* need to hold a mutex to update the shared data. In this case
you need to be holding a lock on mMutex1 when you update mExec2.
There is no guarantee that this would work with POSIX condition
variables, so you are lucky if it does.
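The advice above about updating mExec2 under mMutex1 can be sketched roughly as follows. This is an illustration under assumptions, not the code from the original post: the Gate type and its member names are hypothetical, and std::condition_variable from C++11 stands in for boost::condition. The flag is written while the mutex is held, the waiter re-checks it in a loop, and notify_one() is called after the lock has been released.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical type; "mutex" plays the role of mMutex1 and "exec" of mExec2.
struct Gate {
    std::mutex mutex;
    std::condition_variable cond;
    bool exec = false;

    // Waiting side: check the flag in a loop while holding the mutex.
    void wait_for_exec() {
        std::unique_lock<std::mutex> lock(mutex);
        while (!exec) {
            cond.wait(lock);  // releases the mutex while blocked, reacquires on wake
        }
        assert(exec);  // the predicate holds and the lock is still owned here
    }

    // Notifying side: the write to the flag happens under the mutex.
    void set_exec() {
        {
            std::lock_guard<std::mutex> lock(mutex);
            exec = true;
        }
        cond.notify_one();  // the lock does not need to be held for the notify
    }
};

// Small driver: one waiter thread, released by the calling thread.
int run_gate_demo() {
    Gate gate;
    int result = 0;
    std::thread waiter([&] {
        gate.wait_for_exec();
        result = 42;
    });
    gate.set_exec();
    waiter.join();  // join makes the write to result visible here
    return result;
}
```

Because the flag is only written with the mutex held, the waiter either sees it already set before it blocks, or is inside wait() when the notification arrives; the window in which a notification could be missed is closed.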
> I was in doubt if I should direct this post to a users list. I just
> could not find one specific for threads. My apologies in any case.
The main boost users list would have been the most appropriate place,
but it's not out of place here.
--
Anthony Williams | Just Software Solutions Ltd
Custom Software Development | http://www.justsoftwaresolutions.co.uk
Registered in England, Company Number 5478976.
Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL
_______________________________________________
threads-devel mailing list
threads-devel_at_[hidden]
http://lists.boost.org/mailman/listinfo.cgi/threads-devel