From: Martin Wille (mw8329_at_[hidden])
Date: 2004-07-03 05:01:58


Vladimir Prus wrote:
> Martin Wille wrote:
>
>
>>Recently a problem came up with program_options/cmdline_test_dll.
>>Several times, my computer crashed and I haven't been able to figure
>>out the reason for it. Today, I was lucky to see that test eating up
>>all the memory and CPU. So it looks like it ran into an infinite loop.
>>Several times before, this has been an indicator of something going
>>wrong with the signal handling in Boost.Test; this time, too, it looks
>>like Boost.Test is the culprit. strace shows output similar to the
>>other cases:
>>
>>--- SIGABRT (Aborted) @ 0 (0) ---
>>rt_sigprocmask(SIG_SETMASK, [RTMIN], NULL, 8) = 0
>>rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
>>kill(21453, SIGABRT) = 0
>
>
> Hmm.... the test worked OK for me! I'm really interested to figure out where
> the SIGABRT comes from: maybe some assert fires.

The short version:
Usually *some* signal is raised and caught. The handler then
siglongjmp()s to some other location on the call stack, which
confuses exception handling. When an exception is thrown, the
(confused) exception handling mechanism thinks it is invalid
and calls terminate(), which in turn calls abort(). abort()
raises SIGABRT, and the signal handler gets invoked again.
Longer versions can be found in the mailing list archives;
this problem has been reported a few times already.

We're deep into UB land, and the signal handling code is
known to fail on como. Apparently, it also fails on gcc 2
under certain circumstances. I wouldn't be surprised at all
if it also failed on other compilers.
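
For readers who haven't seen the pattern, here is a minimal sketch of
sigaction/siglongjmp based signal handling of the kind discussed here.
It is not Boost.Test's actual code; run_guarded, on_signal and
g_jump_target are hypothetical names made up for illustration. It only
shows the benign case, where the jump skips no frames that matter to
exception handling.

```cpp
#include <cassert>
#include <setjmp.h>   // sigjmp_buf, sigsetjmp, siglongjmp (POSIX)
#include <signal.h>   // sigaction, raise (POSIX)

// Hypothetical names throughout; this is NOT Boost.Test's implementation.
static sigjmp_buf g_jump_target;

extern "C" void on_signal(int signo) {
    // Jump back to the sigsetjmp() call, skipping every frame in between.
    siglongjmp(g_jump_target, signo);
}

// Runs fn() and returns 0 if it finished, or the number of the signal caught.
int run_guarded(void (*fn)()) {
    struct sigaction sa = {};
    struct sigaction old_sa = {};
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    int rc = sigaction(SIGABRT, &sa, &old_sa);
    assert(rc == 0);
    (void)rc;

    // Second argument 1: save the signal mask so siglongjmp() restores it.
    int caught = sigsetjmp(g_jump_target, 1);
    if (caught == 0) {
        fn();  // first pass: run the guarded code
    }
    // Reached either normally or via siglongjmp(); restore the old handler.
    sigaction(SIGABRT, &old_sa, nullptr);
    return caught;
}

void finishes_normally() {}
void raises_abort() { raise(SIGABRT); }
```

Here run_guarded(finishes_normally) returns 0 and
run_guarded(raises_abort) returns SIGABRT. The trouble described above
starts once the jump skips frames that the exception handling
mechanism still knows about.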

>>This type of failure is a showstopper for testing. I suggest
>>disabling the sigaction based signal handling in Boost.Test at
>>least for gcc 2 and for como. Perhaps, other compilers are also
>>affected.
>
>
> You mean the problem is only on those two toolsets? Yes, I think disabling
> signal handling in Boost.Test to see where the test fails would be very
> desirable. BTW, you mention como, but I don't see that toolset in linux
> regression results.

The problem is known to exist on como. That's actually the major
reason for como not being on the list of regression results.
The code relies on UB, and I expect it to cause problems
with other toolsets, too, some day. However, other than
disabling the sigaction/siglongjmp based signal handling, I have
no suggestion for fixing it.

Regards,
m

