Boost Users :

Subject: [Boost-users] [Thread] Timed join returning true before thread terminated
From: John Rocha (jrr_at_[hidden])
Date: 2012-03-08 22:32:05


Hello,

I'm working on a multi-threaded application that uses boost threads. The threads
are deployed in a cascading mechanism such as:

Main thread creates thread1

thread 1 creates threads 2, 3 and 4

thread 2 creates threads 5-14

The shutdown mechanism is to send a thread interrupt and then do a timed join to
wait for the child thread[s] to finish their shutdown.

So Main sends a thread interrupt to thread1, and then waits for X seconds

thread 1 sends a thread interrupt to thread 2 and does a timed join on it, then
does the same for thread 3, and then for thread 4

thread 2 does the same for each of its children: send a thread interrupt and
then wait.
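
To make the pattern concrete, here is a rough sketch of the shutdown sequence described above. It is not my actual code, and it uses only the standard library so that it stands alone: an atomic stop flag plays the role of the thread interrupt, and std::future::wait_for plays the role of the timed join. The names Worker, run_worker and demo_shutdown are made up for illustration, and the uniform fan-out of 3 children per level is a simplification of the real 1 / 3 / 10 layout.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stand-in for one child thread in the cascade.
struct Worker {
    std::atomic<bool> stop{false};  // plays the role of the interrupt request
    std::future<int> done;          // becomes ready when the thread function returns
};

// Thread body: start `fanout` children (while depth > 0), idle until asked to
// stop, then shut the children down one at a time with a bounded wait each.
// Returns the number of timed-out waits observed in this subtree.
int run_worker(std::atomic<bool>& stop, int depth, int fanout,
               std::chrono::milliseconds timeout) {
    std::vector<std::unique_ptr<Worker>> children;
    for (int i = 0; depth > 0 && i < fanout; ++i) {
        std::unique_ptr<Worker> w(new Worker);
        Worker* raw = w.get();
        w->done = std::async(std::launch::async, [raw, depth, fanout, timeout] {
            return run_worker(raw->stop, depth - 1, fanout, timeout);
        });
        children.push_back(std::move(w));
    }

    // Simulated work: poll the stop flag until the parent requests shutdown.
    while (!stop.load()) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }

    int late = 0;
    for (auto& child : children) {
        child->stop.store(true);  // "send the interrupt"
        if (child->done.wait_for(timeout) == std::future_status::ready) {
            late += child->done.get();  // child finished in time
        } else {
            ++late;  // "timed join" expired; the child is still running
        }
    }
    // Note: futures from std::async block in their destructor, so any late
    // child is still waited for when `children` goes out of scope here.
    return late;
}

// Plays the role of the main thread: start the root worker with two further
// levels below it, let it run briefly, then request shutdown.
int demo_shutdown() {
    std::atomic<bool> stop(false);
    std::future<int> root = std::async(std::launch::async, [&stop] {
        return run_worker(stop, 2, 3, std::chrono::milliseconds(2000));
    });
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    stop.store(true);
    return root.get();
}
```

Calling demo_shutdown() starts 13 threads in three levels and returns the number of waits that timed out. In this sketch every worker stops promptly, so no wait should expire.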

The problem is that on rare occasions (1 out of 474 attempts in my last test
cycle), thread 1 returns early from timed_join with a return value of true,
indicating that the child thread has terminated, when in fact it has not.

I have timestamped logging that shows when a specific thread's shutdown
activities start and stop. From it I can see that thread 1 isn't waiting the
full 10 seconds for the child to exit, and that the child is still running.

Any suggestions for this? The only thing I can think of right now is that maybe
thread 1 is itself being interrupted, and that this causes it to leave its
timed_join early. I haven't looked into the Boost code for this yet. I'm hoping
that this is something others have encountered and already solved.

Or maybe other debugging tips could be provided?
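
One cross-check I could add in the meantime: have each thread set a flag as the very last statement of its thread function, and compare that flag with what the join reports. Below is a minimal sketch of the idea. Again, it uses standard-library stand-ins rather than Boost.Thread (a stop flag instead of an interrupt, std::future::wait_for instead of timed_join), and TrackedThread, start_tracked and checked_join are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

enum class JoinCheck { FinishedConfirmed, FinishedButFlagUnset, TimedOut };

// Hypothetical wrapper that tracks a thread's own view of its completion.
struct TrackedThread {
    std::atomic<bool> stop{false};    // stand-in for the interrupt request
    std::atomic<bool> exited{false};  // set by the thread itself, last thing it does
    std::future<void> done;           // stand-in for the joinable thread handle
};

// Start a thread that idles until asked to stop, then spends `shutdown_cost`
// on simulated cleanup before marking itself as exited.
void start_tracked(TrackedThread& t, std::chrono::milliseconds shutdown_cost) {
    t.done = std::async(std::launch::async, [&t, shutdown_cost] {
        while (!t.stop.load()) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        std::this_thread::sleep_for(shutdown_cost);  // simulated shutdown work
        t.exited.store(true);                        // final statement of the thread
    });
}

// Request shutdown, wait up to `timeout`, and cross-check the wait result
// against the flag the thread sets itself.
JoinCheck checked_join(TrackedThread& t, std::chrono::milliseconds timeout) {
    t.stop.store(true);
    if (t.done.wait_for(timeout) != std::future_status::ready) {
        return JoinCheck::TimedOut;
    }
    // With these stand-ins the flag is always set at this point. In the real
    // application, "join says finished but flag unset" is the mismatch to log.
    return t.exited.load() ? JoinCheck::FinishedConfirmed
                           : JoinCheck::FinishedButFlagUnset;
}
```

If the Boost version of this ever logged the equivalent of FinishedButFlagUnset, that would point at the join returning early rather than at a problem in my logging.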

Thanks in advance,

-=John


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net