Boost logo

Boost :

Subject: Re: [boost] [thread 1.48] Multiple interrupt/timed_join leads to deadlock
From: Vicente J. Botet Escriba (vicente.botet_at_[hidden])
Date: 2012-12-07 13:03:58


Le 07/12/12 01:23, Gaetano Mendola a écrit :
> On 06/12/2012 23.31, Vicente J. Botet Escriba wrote:
>> Le 05/12/12 21:02, Gaetano Mendola a écrit :
>>> On 05/12/2012 19.44, Vicente J. Botet Escriba wrote:
>>>> Le 05/12/12 18:49, Gaetano Mendola a écrit :
>>>>> On 05/12/2012 18.35, Vicente J. Botet Escriba wrote:
>>>>>> Le 05/12/12 16:59, Gaetano Mendola a écrit :
>>>>>>> On 05/12/2012 16.29, Vicente Botet wrote:
>>>>>>>> Gaetano Mendola-3 wrote
>>>>>>>>> On 05/12/2012 13.42, Vicente J. Botet Escriba wrote:
>>>>>>>>>> Le 05/12/12 12:33, Gaetano Mendola a écrit :
>>>>>>>>>>> On 05/12/2012 09.16, Anthony Williams wrote:
>>>>>>>>>>>> On 04/12/12 18:32, Gaetano Mendola wrote:
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>> I was investigating a rare deadlock when issuing an
>>>>>>>>>>>>> interrupt and
>>>>>>>>>>>>> a timed_join in parallel. I come out with the the following
>>>>>>>>>>>>> code
>>>>>>>>>>>>> showing the behavior.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The deadlock is rare so sometime you need to wait a bit.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I couldn't try it with boost 1.52 because the code is invalid
>>>>>>>>>>>>> due the precondition of "thread joinable" when issuing the
>>>>>>>>>>>>> timed_join.
>>>>>>>>>>>>
>>>>>>>>>>>> That's a hint.
>>>>>>>>>>>>
>>>>>>>>>>>>> Is the code not valid or a real bug?
>>>>>>>>>>>>
>>>>>>>>>>>> The code is invalid: you keep trying to interrupt and join
>>>>>>>>>>>> even
>>>>>>>>>>>> after
>>>>>>>>>>>> the thread has been joined! Once the thread has been
>>>>>>>>>>>> joined, the
>>>>>>>>>>>> thread
>>>>>>>>>>>> handle is no longer valid, and you should exit the loop.
>>>>>>>>>>>
>>>>>>>>>>> I haven't seen this statement in the documentation.
>>>>>>>>>>> The loop was meant to exploit exactly this, then you are
>>>>>>>>>>> confirming
>>>>>>>>>>> that interrupting a joined thread is not valid. How do I safely
>>>>>>>>>>> interrupt then a thread?
>>>>>>>>>>> There is no "atomic" check_joinable_then_interrupt,
>>>>>>>>>>> whatching at
>>>>>>>>>>> the
>>>>>>>>>>> interrupt code it seems that the check is done inside. I'm
>>>>>>>>>>> lost.
>>>>>>>>>> Boost.Thread and std::thread are designed so that there is only
>>>>>>>>>> one
>>>>>>>>>> owner of the thread. That is only one thread can
>>>>>>>>>> join/interrupt a
>>>>>>>>>> thread
>>>>>>>>>> safely.
>>>>>>>>>
>>>>>>>>> Unless I have totally missed it the documentation doesn't mention
>>>>>>>>> anything about thread safety (would that be an hint about it?).
>>>>>>>>
>>>>>>>> From the 1.48 documentation
>>>>>>>> "Member function timed_join()
>>>>>>>>
>>>>>>>> bool timed_join(const system_time& wait_until);
>>>>>>>>
>>>>>>>> template<typename TimeDuration>
>>>>>>>> bool timed_join(TimeDuration const& rel_time);
>>>>>>>>
>>>>>>>> Preconditions:
>>>>>>>>
>>>>>>>> this->get_id()!=boost::this_thread::get_id()
>>>>>>>>
>>>>>>>> Postconditions:
>>>>>>>>
>>>>>>>> If *this refers to a thread of execution on entry, and
>>>>>>>> timed_join
>>>>>>>> returns true, that thread of execution has completed, and *this no
>>>>>>>> longer
>>>>>>>> refers to any thread of execution. If this call to timed_join
>>>>>>>> returns
>>>>>>>> false,
>>>>>>>> *this is unchanged.
>>>>>>>> "
>>>>>>>>
>>>>>>>> Your second call doesn't satisfy the pre-conditions, so that the
>>>>>>>> outcome of
>>>>>>>> this second call is undefined.
>>>>>>>
>>>>>>> That precondition tests that your are not interrupting yourself
>>>>>>> doesn't say anything about thread safety. Am I missing something ?
>>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> no function is considered been thread-safe until it is stated
>>>>>> explicitly.
>>>>>
>>>>> You are completely right, it was my own fault considering it thread
>>>>> safe.
>>>>>
>>>>> So even thread_group is not thread safe? The implementation is full
>>>>> thread safe.
>>>> How it is thread-safe? Only one thread can call to the thread_group
>>>> functions at a time.
>>>
>>> The implementation is straight.
>>> The boost::thread_group keeps a std::list of boost::threads pointers.
>>>
>>> the public methods:
>>>
>>> create_thread
>>> add_thread
>>> remove_thread
>>> join_all
>>> interrupt_all
>>> size
>>>
>>> are all protecting the internal status and then all the threads in the
>>> list with a common mutex.
>>>
>>> It appears completely thread safe to me.
>> You are right. I didn't see the implementation. Anthony, I don't know
>> what was the intent of the initial design. Do you have an idea?
>> What do you suggest? update the documentation?
>
> Even the implementation can be improved, even if it's thread safe it
> doesn't avoid to issue a join to an already joined thread, or for the
> matter to interrupt a thread already joined/interrupted.
>
I don't know if you comment has something to be with these two tickets

https://svn.boost.org/trac/boost/ticket/7668: thread_group::join_all()
should check whether its threads are joinable
https://svn.boost.org/trac/boost/ticket/7669: thread_group::join_all()
should catch resource_deadlock_would_occur

> I also believe the create_thread returning a boost::thread pointers is
> very dangerous, indeed doing so the boost::thread_group doesn't have
> anymore the ownership of it, it would be better if create_thread would
> have returned a thread id instead.
I don't want to invest to much on the thread_group class, as for me this
class was designed before threads were movable and a complete redesign
should be done. You can of course create a Trac ticket requesting this
new feature.
>
> Sure the programmer can shoot himself in the foot in other more
> spectacular way but at least having a thread group interface obbeying
> the principle of least astonishment can be a plus.
Best,
Vicente


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk