Subject: Re: [boost] [variant2] Andrzej's review -- design
From: Andrzej Krzemienski (akrzemi1_at_[hidden])
Date: 2019-04-05 09:06:49

On Thu, 4 Apr 2019 at 21:13, Emil Dotchevski via Boost <boost_at_[hidden]> wrote:

> On Wed, Apr 3, 2019 at 11:56 PM Mike via Boost <boost_at_[hidden]>
> wrote:
> >
> > > Going back to variant, in one case, we are defining an empty state,
> > > which forces the user to provide explicit handling for that state,
> > > *everywhere*.
> > You keep repeating that the user would have to explicitly handle that
> > everywhere. I thought I just showed you that this is not the case.
> > Except for visit, you don't have to explicitly handle the valueless
> > state at all
> Maybe you're thinking it's not a lot of work, but introducing a new state
> in any type has overhead. Every time you touch a variant object, you must
> consider the possibility of it being valueless. Your program must define
> behavior for that case. You have to write unit tests for that behavior,
> because people will be passing valueless variants when they shouldn't,
> because they can.

This is your view of things. Let me offer a different one. I agree with
what your words imply: if possible avoid adding new state to the type, also
avoid any kind of degenerate or partially formed state. A type is always
more robust without these states. But sometimes we cannot afford to follow
this rule because of other constraints/objectives we have.

Let's consider a pointer: either a raw pointer or unique_ptr or shared_ptr.
Not only does it have the degenerate nullptr state, but what is far worse,
you get this state *by default*. I think this is the harm that pointers do,
as well as other types that use default constructor to form a degenerate or
a partially formed state. But even for these types I would not go as far as
checking in every function that gets or returns one, if it is null. This is
what we have preconditions and postconditions for. If you split your
function into smaller sub-functions and you wanted to check for it in every
single sub-function it would kill your performance.
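
A minimal sketch of that idea, with hypothetical names (`average` and
`sum_of` are made up for illustration): the precondition is stated and
checked once at the boundary, and the helper relies on it without checking
again.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper. Precondition: data != nullptr.
// It relies on the caller's guarantee and does not re-check it.
inline int sum_of(const int* data, std::size_t n)
{
    int total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

// Hypothetical entry point. Precondition: data != nullptr && n > 0.
// The precondition is checked once here (in debug builds only).
inline double average(const int* data, std::size_t n)
{
    assert(data != nullptr && n > 0);
    return static_cast<double>(sum_of(data, n)) / static_cast<double>(n);
}
```

Whether the check at the boundary is an assert, a contract annotation, or
only documentation is a separate choice; the point is that the sub-function
does not repeat it.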

Now imagine an alternative design to a unique_ptr, let's call it
unique_ptr2. It differs in the set of constructors: it does not have the
default constructor, it does not have the constructor taking nullptr_t, and
in constructors that take a raw pointer, if a null pointer is passed an
exception is thrown. There is no way to *initialize* this pointer to a null
pointer value, except by using the move or copy constructor. The only way the
null pointer can get into this type is when unique_ptr2 is moved from. But
such a case is special: either a move constructor is called on a temporary,
and the language semantics guarantee that only the destructor will be invoked
and no-one else will observe the object, or the programmer explicitly calls
std::move(). But in the latter case programmers are already warned that
using such a moved-from object for anything other than destroying or
resetting it is dangerous and could corrupt the program. This is because
other member functions of the type can have narrow contracts: they may be
valid to call before the move, but invalid after it.
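
To make the design concrete, here is a rough sketch of what such a
unique_ptr2 could look like. It is only an illustration under my own
assumptions: the member names, the choice of std::invalid_argument, the
deleted copy operations (mirroring std::unique_ptr) and the
`is_moved_from()` observer are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>

// Illustrative sketch only; not a real library type.
template <class T>
class unique_ptr2
{
    T* ptr_;

public:
    // No default constructor and no constructor from nullptr_t.
    unique_ptr2() = delete;
    unique_ptr2(std::nullptr_t) = delete;

    // A null raw pointer is rejected with an exception (assumed type).
    explicit unique_ptr2(T* p) : ptr_(p)
    {
        if (p == nullptr)
            throw std::invalid_argument("unique_ptr2: null pointer");
    }

    // Moving is the only way the null state can appear.
    unique_ptr2(unique_ptr2&& other) noexcept : ptr_(other.ptr_)
    {
        other.ptr_ = nullptr;
    }

    unique_ptr2& operator=(unique_ptr2&& other) noexcept
    {
        if (this != &other) {
            delete ptr_;
            ptr_ = other.ptr_;
            other.ptr_ = nullptr;
        }
        return *this;
    }

    // Assumption: move-only, like std::unique_ptr.
    unique_ptr2(const unique_ptr2&) = delete;
    unique_ptr2& operator=(const unique_ptr2&) = delete;

    ~unique_ptr2() { delete ptr_; }

    // Narrow contract: must not be called on a moved-from object.
    T& operator*() const
    {
        assert(ptr_ != nullptr);
        return *ptr_;
    }

    // Hypothetical observer, present only to make the sketch testable.
    bool is_moved_from() const noexcept { return ptr_ == nullptr; }
};
```

With this shape, `unique_ptr2<int> p(nullptr);` does not compile, and
`unique_ptr2<int> p(static_cast<int*>(nullptr));` throws, so the null state
is reachable only through an explicit std::move().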

If some function f() returns unique_ptr2, I will not be checking if it is
in the special state. I know it is technically possible, but I trust that
the author of f() has done her job, and does not do nasty things.
Otherwise, if she does nasty things, the value of a unique_ptr2 is the
least of my problems. The natural course of action to me is to trust that
every party does their part of the contract, and I can just have a global
assumption that objects of this type, when passed to or returned from a
function, are never null. Technically the invariant allows the null pointer,
but I get the practical guarantee that I should never be concerned with it.

Someone could say, "but people will be returning null unique_ptr2, because
they can". But I do not consider this a reason to insert defensive checks
everywhere, or apply strange modifications to type unique_ptr2. I work
under the assumption that people write code not just because they can, but
because they want to achieve some goal. And I assume that this goal is not to
corrupt the program. I may get the null pointer from f() if f() has a bug.
And it can even lead to UB in the program or a crash. But that bug needs to
be fixed inside f(), not by the users of f().

In a similar vein, I would consider it the wrong decision if someone
proposed to change unique_ptr2<T> so that in its constructor it preallocates
some T on the heap, and this T can be used when an object is moved from:
instead of nullptr, we assign the preallocated T, and owing to that our
object is never null, even after the move.
It is possible to do this, but the run-time cost and the bizarre logic are
not worth the benefit of a stronger invariant.

My point in short: in this discussion we have to distinguish
partially-formed states that are easy to construct from those that are only
reached in circumstances that already require special attention.

> But what's the upside? Why is the empty state preferable to another, valid
> state? Either way we must define behavior, but in the latter case the
> behavior is already defined, there is nothing to do.

I do not mind the never-empty guarantee in principle. My problem is that
providing it involves too much overhead. The costs outweigh the benefit. If
there was a way to provide the never-empty guarantee for free, I would be a
strong proponent of it.

> template <class T>
> void f( std::vector<T> & x, std::vector<T> const & y )
> {
> x = y;
> }
> Above, if the assignment fails, the content of x is unspecified, but it is
> guaranteed to be a valid vector, containing valid objects (if not empty).

This is the type of code that I consider "requiring special attention". If
the function throws it leaves x in a valid but unspecified state. I do not
care much about "valid", but my concern is about "unspecified". I would
never want anyone to observe the state of this object. Therefore I would
rather change the signature of function f() like this:

template <class T>
std::vector<T> f(std::vector<T> const & y )
{
  return std::vector<T>{y};
}

So that it is guaranteed that upon exception no-one can see the value. Or
if such a rewrite is impossible, I would make sure that the object referred
to by reference `x` is destroyed before any catch-handler tries to stop the
stack unwinding.

In other words, the problem I see is the code that allows anyone to observe
the unspecified (but valid) state, not the unspecified state itself.
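
A small sketch of the difference, using a hypothetical element type `Flaky`
whose copies can be made to throw (the names `Flaky`, `fail_copies`,
`assign_into` and `copy_of` are invented for this illustration). With the
out-parameter form the caller still holds `x` after the exception; with the
value-returning form the result either exists in full or does not exist at
all, and an object the result would have been assigned to keeps its old
value.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical switch used to simulate a failure while copying elements.
static bool fail_copies = false;

// Hypothetical element type: copying throws while fail_copies is set.
struct Flaky
{
    int value;
    explicit Flaky(int v) : value(v) {}
    Flaky(const Flaky& other) : value(other.value)
    {
        if (fail_copies) throw std::runtime_error("copy failed");
    }
    Flaky& operator=(const Flaky& other)
    {
        if (fail_copies) throw std::runtime_error("copy failed");
        value = other.value;
        return *this;
    }
};

// Out-parameter form: on exception, x stays visible to the caller
// in a valid but unspecified state.
template <class T>
void assign_into(std::vector<T>& x, const std::vector<T>& y)
{
    x = y;
}

// Value-returning form: on exception, no result object exists.
template <class T>
std::vector<T> copy_of(const std::vector<T>& y)
{
    return std::vector<T>(y);
}

// Returns true if the copy threw and the target kept its old value.
inline bool value_form_leaves_target_untouched()
{
    std::vector<Flaky> y;
    y.push_back(Flaky(1));
    y.push_back(Flaky(2));
    std::vector<Flaky> x;
    x.push_back(Flaky(9));

    fail_copies = true;
    bool threw = false;
    try { x = copy_of(y); }  // the temporary fails before x is touched
    catch (const std::runtime_error&) { threw = true; }
    fail_copies = false;

    return threw && x.size() == 1 && x[0].value == 9;
}
```

Nothing can be asserted about the contents of `x` after `assign_into`
throws, which is exactly why I would not want any code to look at it.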

