From: David Abrahams (dave_at_[hidden])
Date: 2003-09-26 12:17:32


"Rozental, Gennadiy" <gennadiy.rozental_at_[hidden]> writes:

>> "Gennadiy Rozental" <gennadiy.rozental_at_[hidden]> writes:
>> > In fact there is a series of SEHs that look pretty "recoverable".
>> > Most of them are related to some kind of arithmetic error (float
>> > or integer). If the user is willing to take the risk of a wrong
>> > result, it seems perfectly legit to continue.
>>
>> Careful. That can only work if you have continuation-model
>> EH. If you have termination-model EH (like in C++), there
>> will be trouble if the arithmetic error occurs in a region
>> that's expected to be non-throwing. Furthermore, because of
>> FP pipelining, FP exceptions may occur some random number of
>> instructions after the actual error, so controlling where the
>> "recoverable" exceptions arise is well-nigh impossible.
>
> When I say "recoverable" I mean that the test program could skip the
> current test case and proceed with the next one - IOW it should not
> affect further testing.

What makes you think the failure hasn't corrupted the rest of the
test program?

> I admit the possibility of some missing destructors

?? How??

> but this is the user's call - after all, in a test program it may not
> be that major a crime

For large test suites, stopping the test for user interaction can be
problematic.

>> I disagree. The next thing that usually happens after a
>> crash during testing is debugging.
>
> That's the key. Once you switched from scenario 2 to scenario 2 you
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
??

> do want to know where it crashed. In Boost.Test terms it means that
> you pass the extra argument --catch_system_errors=no.

I think you got the wrong default.

>> > 9th from 10 test cases. Why should we throw out the work done?
>>
>> Who said anything about throwing out the work done? I was
>> just saying you probably shouldn't try to do any *more* work.
>
> It would be difficult and inconvenient to access reporting mechanisms
> from inside the SE handler.

Why? You can call any function you like in there.
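
[Illustration, not part of the original message: a minimal sketch of reporting from inside the handler and then exiting, with no unwinding. It uses a POSIX signal handler in a child process as a stand-in for an SEH filter, which is an assumption; the thread is about Windows SEH. Unlike an SEH filter, a signal handler is limited to async-signal-safe calls, hence write() and _exit(). The names report_and_exit, run_isolated, Outcome and the exit code 3 are hypothetical.]

```cpp
#include <cassert>
#include <csignal>
#include <cstddef>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Descriptor the handler reports to (hypothetical; set in the child before the test runs).
static volatile std::sig_atomic_t g_report_fd = -1;

// Report from inside the handler, then leave immediately: no unwinding,
// no destructors, no catch blocks touching possibly corrupted objects.
extern "C" void report_and_exit(int) {
    const char msg[] = "fatal signal: reported without unwinding\n";
    ssize_t ignored = write(g_report_fd, msg, sizeof msg - 1);
    (void)ignored;
    _exit(3);  // hypothetical "crashed" exit code
}

struct Outcome {
    int exit_code;       // child's exit code, or -1 on setup failure / abnormal end
    std::string report;  // whatever the handler managed to write
};

// Runs one test case in a child process and collects the handler's report.
Outcome run_isolated(void (*test_case)()) {
    int fds[2];
    if (pipe(fds) != 0) return {-1, ""};
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return {-1, ""};
    }
    if (pid == 0) {  // child: install the handler, run the case, exit
        close(fds[0]);
        g_report_fd = fds[1];
        std::signal(SIGFPE, report_and_exit);
        test_case();
        _exit(0);
    }
    close(fds[1]);   // parent: read until the child closes its end
    std::string report;
    char buf[128];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        report.append(buf, static_cast<std::size_t>(n));
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    return {WIFEXITED(status) ? WEXITSTATUS(status) : -1, report};
}

// Two hypothetical test cases: one simulates an arithmetic fault, one passes.
void failing_case() { std::raise(SIGFPE); }
void passing_case() {}
```

[Because each case runs in its own process, the parent can go on to the next case without trusting any state the failed case left behind.]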

> Instead I convert it to a special fatal error notification. I catch
> it on the Monitor level and invoke shutdown routines. It may skip
> some destructors, or we could get unlucky and crash again. But this
> way the Boost.Test code is much cleaner.

How so?

> Also I think that many NT compilers may produce some kind of window
> with an error message - that would be a major inconvenience for a
> regression tester.

Not if you exit, only if you rethrow to invoke JIT debugging.

>> > For testing this is not true - we are not going to invoke the
>> > debugger.
>>
>> But hopefully you're going to report *some* useful
>> information about the crash, and unwinding can easily
>> interfere with your ability to do so.
>
> Why?

Because corrupted objects may be destroyed (or manipulated in catch
blocks), causing another crash which obscures the reasons for the
original crash.
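
[Illustration, not part of the original message: a minimal sketch of the ordering problem, assuming the system error has already been translated into a C++ exception. Destructors of live objects run during unwinding, before the handler gets a chance to report. Fixture, events and run_with_unwinding are hypothetical names. If the destructor itself crashed on corrupted state, the handler's report would never be produced.]

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical event log standing in for a test framework's report.
std::vector<std::string> events;

// Hypothetical fixture whose destructor touches state that may be corrupted.
struct Fixture {
    ~Fixture() { events.push_back("~Fixture ran during unwinding"); }
};

// A test case that fails while a Fixture is alive.
void failing_test_case() {
    Fixture f;
    events.push_back("failure detected");
    throw std::runtime_error("simulated system error");
}

// Runs the case the way a catching monitor would and returns the event log.
std::vector<std::string> run_with_unwinding() {
    events.clear();
    try {
        failing_test_case();
    } catch (const std::exception& e) {
        // By the time control arrives here, ~Fixture has already executed.
        events.push_back(std::string("handler reports: ") + e.what());
    }
    return events;
}
```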

> I do NOT propose to employ unwinding in production code. Moreover,
> we seem to be in total agreement.

Nope.

> In some rare cases the user may be willing to do so - and I do not
> see a reason to prevent it.
>
>> You have not done any of the required legwork to make sure
>> that you actually have recoverability from SEHes, and you
>> can't -- it requires explicit and painstaking cooperation
>> from the program dropped into your testing framework.
>
> I am talking under the assumption that in the majority of cases the
> test cases in a unit test program are more or less independent. That
> means that if you got an "integer divide by zero" error in one test
> case you could fail that one and continue.

Maybe. It's guesswork. Is it really worth the risk that additional
errors during unwinding will obscure information about the original
problem?

>> > In most cases I am willing to take the risk and resort to
>> > Boost.Test shutdown procedures, that will show results report.
>>
>> There's no reason you can't show the results report without unwinding.
>
> I may look into the possibility of doing so. But I'm afraid it may
> break encapsulation. IOW it will be inconvenient.

The choice to take that risk should be the user's, in any case, and
should be made explicitly.

>> > Though if one wants it, the second option should also be available.
>>
>> You seem convinced that trying to recover from SEHes is a
>> reasonable default behavior.
>
> For a production program I do not argue for the possibility of
> recovery. In a test program, in some rare cases, it seems pretty
> innocent. I am also arguing for the way I perform reporting even if
> we are not going to continue.

Well, I said I was done trying to convince you in my last message, but
I guess I lied. Now I'm *really* done.

>> > P.S. I want to emphasize that this discussion should also be
>> > applicable to fatal signals caught in a signal-capable
>> > environment.
>>
>> I don't believe so. Signals are truly asynchronous and can
>> occur during nothrow regions such as destructors. Nothing
>> warrants unwinding at that point.
>
> What about SIGALRM? I use one to time out a test case.
> And again, what will happen if I get SIGFPE while in a destructor and
> employ unwinding?

How should I know? I think it's undefined behavior.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com
