From: Jim Douglas (jim_at_[hidden])
Date: 2006-03-02 13:36:27
A recent problem I encountered that caused the total failure of a
regression test run has highlighted what I (and others) believe to be a
design flaw in execution_monitor::catch_signals (in file
IIUC, the purpose of the code in question is to intercept UNIX signals
and convert them to exceptions that carry a diagnostic message
indicating the signal type. For most signals this should work reliably.
However, we suggest that this mechanism is prone to failure when it
attempts to deal with a SIGSEGV, as happened in my case.
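
To make the mechanism concrete, here is a minimal sketch of one way to
turn a signal into a C++ exception on a POSIX system, using
sigsetjmp/siglongjmp. It is not the Boost.Test code: the namespace
sketch and the names jumping_handler, run_and_translate and try_run are
made up for this example. Note that the throw happens after the jump,
inside the same (possibly damaged) process.

```cpp
#include <setjmp.h>
#include <signal.h>
#include <stdexcept>
#include <string>

namespace sketch {  // hypothetical names, for illustration only

static sigjmp_buf g_jump_target;             // where the handler jumps back to
static volatile sig_atomic_t g_last_signal;  // signal number seen by the handler

// Signal handler: record the signal and jump back into run_and_translate.
static void jumping_handler(int signo) {
    g_last_signal = signo;
    siglongjmp(g_jump_target, 1);
}

// Run f() with a handler installed for signo. If the signal arrives,
// control returns here and the signal is reported as an exception.
inline int run_and_translate(int (*f)(), int signo) {
    struct sigaction sa = {};
    struct sigaction old = {};
    sa.sa_handler = jumping_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, &old) != 0)
        throw std::runtime_error("sigaction failed");

    if (sigsetjmp(g_jump_target, 1) == 0) {
        int result = f();                 // normal path: no signal arrived
        sigaction(signo, &old, nullptr);  // restore the previous handler
        return result;
    }

    // We get here via siglongjmp from the handler.
    sigaction(signo, &old, nullptr);
    int seen = g_last_signal;
    // The throw allocates the exception object, so it depends on the
    // allocator still being usable.
    throw std::runtime_error("signal " + std::to_string(seen));
}

// Convenience wrapper: returns "ok" or the diagnostic message.
inline std::string try_run(int (*f)(), int signo) {
    try {
        run_and_translate(f, signo);
        return "ok";
    } catch (const std::runtime_error& e) {
        return e.what();
    }
}

}  // namespace sketch
```

With a harmless signal such as SIGUSR1 raised from inside f, try_run
returns the diagnostic string instead of letting the process die. The
same pattern applied to SIGSEGV depends on the process still being in a
usable state when the throw executes.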
What I observed was a test process that had stalled, blocked on a
mutex. Detailed analysis showed that the test had first failed with a
memory segment violation, which raised a SIGSEGV.
In brief, the primary execution failure corrupts the program heap. When
the C++ exception is thrown at execution_monitor:462, the exception
handling mechanism calls __cxa_allocate_exception, which in turn calls
std::malloc. Because the heap is corrupted, this call blocks on the
malloc mutex and never returns.
This is the specific behaviour on QNX; other OSs will fail in different
ways. The general point I would like to make is that after a memory
segment violation, any process's memory will be in an undefined state,
and it is unreasonable to assume that the process can continue
executing, as the present design does. I propose that there is no safe
way to intercept a SIGSEGV and that it should be allowed to terminate
the process without intervention.
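
One way to read this proposal in code: keep handlers for the signals
that can reasonably be translated, but leave SIGSEGV at its default
disposition so that the OS terminates the process. This is a minimal
sketch assuming POSIX sigaction; install_except_segv,
has_default_disposition and noop_handler are hypothetical names, not
part of Boost.Test.

```cpp
#include <signal.h>
#include <initializer_list>

namespace sketch {  // hypothetical names, for illustration only

// Placeholder handler; a real monitor would translate the signal.
static void noop_handler(int) {}

// Install `handler` for every listed signal except SIGSEGV, which is
// explicitly reset to its default action (terminate the process).
// Returns the number of signals for which `handler` was installed.
inline int install_except_segv(std::initializer_list<int> signals,
                               void (*handler)(int)) {
    int installed = 0;
    for (int signo : signals) {
        struct sigaction sa = {};
        sigemptyset(&sa.sa_mask);
        sa.sa_handler = (signo == SIGSEGV) ? SIG_DFL : handler;
        if (sigaction(signo, &sa, nullptr) == 0 && signo != SIGSEGV)
            ++installed;
    }
    return installed;
}

// Query the current disposition without changing it.
inline bool has_default_disposition(int signo) {
    struct sigaction cur = {};
    return sigaction(signo, nullptr, &cur) == 0 && cur.sa_handler == SIG_DFL;
}

}  // namespace sketch
```

Called as install_except_segv({SIGFPE, SIGSEGV, SIGUSR1},
sketch::noop_handler), this installs the handler for two signals and
leaves SIGSEGV to the OS, so a crashed test terminates instead of
trying to run more code on a corrupted heap.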