
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2019-12-05 13:01:03


On 2019-12-05 14:22, Alexander Grund via Boost wrote:
>> No, you're giving too much credit to compilers. Compilers cannot
>> analyze at the same level as humans do. For example, if the valid
>> index is stored in a data member in one method of a class and used in
>> another, and the first method is required to be called first (which
>> might also be enforced by a runtime check via another member
>> variable), the compiler is not able to know that the index is always
>> valid. At least, I haven't seen such cleverness.
>
> That's why there is usually a checked and an unchecked method. Your
> usual accesses though are: `for(auto c: string) ...`, `for(int i=0;
> i<string.size();i++) do(string[i]);`, `if(i+1<string.size())
> bar(string[i], string[i+1])` and the like. Easy on optimizers.
>
> The scenario you describe is hard to check for a human too: are you sure
> this invariant always holds if it is enforced in some other method which
> needs to have been called before? And for every iteration of the code as
> it evolves?

Yes, and such use cases appear more often than you might think. Call
that first method a constructor or init(), and it becomes obvious to the
human reviewer that that method must be called before the object is used
further. You may ask what happens if that init() method is not called,
and the answer depends on how probable that is. Maybe that method is
only called from a few places in your program, like factory methods, and
you can easily ensure the precondition is not violated. If it is a real
possibility that you want to protect against, add a flag or some other
check that it was called. More often, though, I would just make it a
documented precondition and allow UB if users don't follow it.
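
Roughly this kind of thing (a made-up sketch; the names don't come from
any real library):

#include <cstddef>
#include <string>
#include <utility>

class frame_parser
{
    std::string m_buffer;
    std::size_t m_pos = 0u;

public:
    // Documented precondition: parse() must not be called unless
    // init() has been called and returned true.
    bool init(std::string buf)
    {
        if (buf.empty())
            return false;
        m_buffer = std::move(buf);
        m_pos = m_buffer.size() / 2u; // known to be a valid index
        return true;
    }

    char parse() const
    {
        // No at() here: the index was validated in init(); the compiler
        // simply cannot prove that across two member functions.
        return m_buffer[m_pos];
    }
};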

> If
> so then better use the string.unsafe_get_i_have_checked_the_index(i)
> method (made-up). This is a clear signal to any reviewer/reader that the
> index was indeed checked and the unsafe call is deliberately used.

Such a method is called operator[]. I understand you want that name for
the checked version, but that is not the case, for historical reasons
(for which I'm glad).

And again, where a check is needed, I will have written it myself, the
way I want it (which may not involve throwing an exception). Where I
haven't written it, I don't want it at all, and I'm responsible for the
consequences. I would prefer that such a use pattern didn't require me
to jump through hoops.
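
For example, something like this (a made-up sketch of the kind of check
I mean):

#include <cstddef>
#include <string>

// Report failure the way the caller prefers, without throwing.
bool get_char(const std::string& str, std::size_t i, char& ch)
{
    if (i >= str.size())
        return false; // the check I chose, placed where I chose

    ch = str[i];      // unchecked access; the check above already ran
    return true;
}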

>> Mistakes happen, I'm not arguing with that. But defensive programming
>> is not the solution because it doesn't save you from these mistakes.
>> Code that passes invalid index to at() is just as incorrect as the one
>> that does so with operator[].
> No. Defensive programming is a safety net.

It is not. It doesn't make your code correct, it doesn't make it work,
it doesn't help you fix the issue. It only prevents your code from
crashing, which is a bad thing (because a crash - with a backtrace - is
better).

> If you always use at() and
> only in specific scenarios [] then at least your application will crash
> with a possibly good stacktrace or you can catch the exception in
> main(), prepare a meaningful error, log it, send it to your server and
> play jingle bells.

The thing is, you can't prepare a meaningful error in main(). You don't
have a backtrace at that point and you don't know which of the thousands
of at() calls threw. Sure, your application won't crash via a signal,
but, as I said, that is a bad thing.
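
To illustrate (a made-up sketch; run_application() stands in for
whatever the real program does):

#include <exception>
#include <iostream>
#include <string>

void run_application()
{
    // Stand-in for the real program; somewhere deep inside,
    // some at() call throws.
    std::string s = "abc";
    s.at(42);
}

int main()
{
    try
    {
        run_application();
    }
    catch (std::exception const& e)
    {
        // All you have here is e.what(), typically something like
        // "basic_string::at": no backtrace, no hint which of the
        // thousands of at() calls threw.
        std::cerr << "Unhandled exception: " << e.what() << std::endl;
        return 1;
    }
    return 0;
}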

>>> No, the check gets optimized away if you do everything right.
>>
>> As I said above, you can't count on that.
> So what? Marginally lower performance if even measurable but safe from
> buffer overflows.

I don't want that marginal performance drop, especially given that I get
nothing in return.

>> You can't recover from an unexpected exception. And it is unexpected
>> by account that we're talking about a human mistake.
> You can: For e.g. a connection/parser/... you terminate the connection
> and log the issue. At the least you can catch it in main or some
> exception hook to log a backtrace and exit gracefully.

You could terminate the connection and things like that, but ultimately
that exception doesn't help you fix the problem. When connections are
dropping like flies on your production server and you have no idea where
the problem is, I'd prefer to get just one backtrace from a crash
instead and then fix the bug in a matter of minutes.

>> The correct way of tackling mistakes is human reviewing, testing and
>> debugging. For the debugging part, a crash with a saved backtrace is
>> more useful than an exception with a meaningless error message like
>> "fixed_string::at", as many standard libraries like to do.
>
> You don't get a crash, at least not with a "meaningful backtrace" if you
> don't use exceptions. You get UB, a security issue and possibly a crash
> in some other part later instead of where the error was detected. And
> why would "human reviewing, testing and debugging" be more reliable than
> "human codewriting"? It is still a human and humans make mistakes.
> Better use a safety net.

With asserts enabled you'll get a crash at the point of error.
Hopefully, you'll discover the bug soon enough during testing. But even
in release builds, when asserts are disabled, chances are high that the
crash will still point you to the problematic code. Of course, memory
access bugs are nasty and painful to deal with, but I honestly haven't
had to deal with one in quite a long time, even though I've never used
at(). You're more likely to mess up memory when working with raw
pointers.
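
A rough sketch of what I mean by an assert instead of at() (made up, not
from any actual implementation):

#include <cassert>
#include <cstddef>

// In a debug build an out-of-range index aborts right here, with this
// frame on top of the backtrace; with NDEBUG defined the check is gone.
inline char get_unchecked(const char* data, std::size_t size, std::size_t i)
{
    assert(i < size);
    return data[i];
}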

