From: William E. Kempf (wekempf_at_[hidden])
Date: 2002-12-15 10:04:28


Fernando Cacciola said:

[ snip long description on design ]

> Suppose we go on with a 1-Element-Sequence model:

BTW, that's what I'd do, as it's more intuitive.

> Since the sequence contains at most one element,
> and since ref() is already undefined for the case the sequence is
> empty(), one possibility would be to replace ref() by operator T&()
> provided the initializing
> constructor is explicit and empty() is not replaced by a conversion to
> bool (safe or not).

A bad idea on many levels. This "implicit cast" can lead to ambiguities
(and I'm still going to assume the presence of the safe-bool, as I think
it's the valid choice). If nothing else, it's easy to wind up with a cast
in cases you didn't expect, which if the optional were uninitialized would
lead to unexpected undefined behavior. There's a reason I used ref() in
my interface suggestion.
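
To make the concern concrete, here is a minimal sketch of how the implicit conversion can fire unnoticed. The names implicit_opt, twice, call_twice and conversion_trapped are hypothetical and not part of any proposal in this thread. As a loudly stated assumption, the conversion throws on an empty object so the trap can be observed; the interface under discussion would leave that case undefined. The sketch also assumes T is default-constructible, purely to keep the storage simple.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical 1-element sequence with the implicit conversion under discussion.
template <class T>
class implicit_opt {
public:
    implicit_opt() : has_(false), value_() {}
    explicit implicit_opt(T const& v) : has_(true), value_(v) {}

    bool empty() const { return !has_; }

    // The proposed operator T&(). Assumption: this sketch throws on an empty
    // object; the real proposal would leave that case undefined.
    operator T&() {
        if (!has_) throw std::logic_error("conversion from empty optional");
        return value_;
    }

private:
    bool has_;
    T value_;
};

inline int twice(int n) { return 2 * n; }

// Nothing at the call site hints that 'o' may hold no value.
inline int call_twice(implicit_opt<int>& o) { return twice(o); }

// Returns true when the silent conversion was attempted on an empty object.
inline bool conversion_trapped(implicit_opt<int>& o) {
    try {
        call_twice(o);
    } catch (std::logic_error const&) {
        return true;
    }
    return false;
}
```

The call twice(o) compiles without any mention of the optional's state, which is the kind of unexpected cast described above.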

> bool optional<T>::empty() ;
> operator T const&() const ;
> operator T& () ;
>
> This would look on the user code as:
>
> optional<int> foo() { return optional<int>(3) ; }
> void bar()
> {
>   optional<int> f = foo();
>   if ( !f.empty() )
>   {
>     bar(f);
>     f = 2 ;
>     ...
>     if ( f == 2 )
>       gee(f);
>   }
> }
>
>
> This interface gives optional<> an almost strict value semantics.
> It would almost look like an ordinary T variable, with the only
> exception being the explicit empty() call.
>
> AFAICT, this interface is completely consistent and provides completely
> sound semantics.
>
> But there is a problem... And the only way for me to show you the
> problem is to ask you to trust the fact that I've been using different
> versions of optional<> for more than two years on dozens of different
> projects and I had been caught already by most of the traps that some
> interfaces hide.
> {I've just counted 436 appearances of the word 'optional<' in my code
> base}

I don't have to trust you on this to believe the following ;).

> I originally used _a lot_ the interface shown above.
> The problem with this interface is precisely the fact that it shows
> ordinary value semantics. IOWs, the fact that it really looks like an
> ordinary variable is where the trap is hidden. It is too easy for a dumb
> programmer like me to forget about testing the initialization state.
>
> IOWs, when you see code like this:
>
> opt = 2 ;
> int n = opt ;
> if ( opt == 3 )
> foo(opt)
> etc...
>
> it is hard, or at least it was hard for me, to keep in mind that
> unlike ordinary values, 'opt' might have no value at all.
> That is, I used to lose track of the fact that all the expressions
> above can have undefined behavior; they look too familiar
> for me to remember that; and the _declaration_ of opt is
> usually not near enough to account for the lack of context.

Get rid of the implicit casts and return to ref() and I believe these
kinds of mistakes will go away. The explicit code required to get the
value out should stand as a reminder that there may be no value to get.
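
A minimal sketch of the explicit-access style, assuming a hypothetical class seq_opt and helper value_or (neither name comes from this thread). The assert inside ref() is only an illustration of the precondition, and T is assumed default-constructible to keep the storage trivial.

```cpp
#include <cassert>

// Hypothetical container-like optional: the value is reached only through ref().
template <class T>
class seq_opt {
public:
    seq_opt() : has_(false), value_() {}
    explicit seq_opt(T const& v) : has_(true), value_(v) {}

    bool empty() const { return !has_; }

    // Precondition: !empty(). Checked with assert here only for illustration.
    T& ref() { assert(has_); return value_; }
    T const& ref() const { assert(has_); return value_; }

    // Stores a value whether or not one was present before.
    void reset(T const& v) { value_ = v; has_ = true; }

private:
    bool has_;
    T value_;
};

// Each access names ref(), which is the reminder to test empty() first.
inline int value_or(seq_opt<int> const& o, int fallback) {
    return o.empty() ? fallback : o.ref();
}
```

With this shape, a line such as int n = o; does not compile, so the value cannot be read without spelling out ref().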

> As a side note, notice that empty() must remain explicit
> if operator T&() is used instead of ::ref(), otherwise, expressions like
> if ( opt ) or ( opt == 0 ) would be ambiguous.

Definitely, but if you go back to ref() to prevent the misuses that would
occur with an implicit cast then the safe-bool becomes, once again, a
viable and useful addition.

> I believe that it is always important that syntax alone communicates
> meaning. This is where the 'operator T&()' interface fails. And I _had_
> been caught by this so much as to go through the trouble of changing it.

Yep... but I never suggested that interface ;).

> Fortunately, a possible remedy to the problem of lack of
> context telling that expressions like:
>
> (opt = 2 ), or ( int x = opt), or (opt == 2 )
>
> can be undefined, is to keep the explicit ref() member function
> as William suggested.
> This way, the expressions would look like:
>
> (opt.ref() = 2 ), or ( int x = opt.ref()), or (opt.ref() == 2 )
>
> and at least won't look deceptively familiar.
>
> Therefore, I can see that an interface with an explicit member function
> used to access the possibly uninitialized value is pretty sound.

Good.

> Suppose we go on with an explicit member function as the value access
> method. How about safe_bool() as a replacement for empty()?

I wouldn't go as far as a total replacement. I believe some people will prefer
the use of empty(), since it can make the code easier to understand. I'd
leave both.

> Safe_bool will allow the familiar idioms:
>
> if ( opt )
> if ( !opt )
> if ( opt == 0 )
> if ( opt != 0 )
>
> The first form, which is probably the most used, communicates exactly
> what it means.
> But the second form, using operator ==, appears to me confusing for
> value-semantic objects because a comparison against 0 as indication of
> uninitialization is typical of pointers but not of containers.

Some would consider it confusing, and they'd just avoid using it. Others
won't, and I see no reason to prevent them from using this.
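
For readers unfamiliar with the idiom, the following is a minimal sketch of a safe-bool conversion written as a conversion to a pointer-to-member type. The names sb_opt, true_tag, make and count_true are hypothetical, and T is assumed default-constructible. It only illustrates that all four spellings quoted above compile and agree; it is not meant as the actual optional<> implementation.

```cpp
#include <cassert>

// Hypothetical optional using the classic safe-bool idiom: a conversion to a
// pointer-to-member type, usable in conditions but not in arithmetic.
template <class T>
class sb_opt {
    typedef void (sb_opt::*safe_bool)() const;
    void true_tag() const {}

public:
    sb_opt() : has_(false), value_() {}
    explicit sb_opt(T const& v) : has_(true), value_(v) {}

    bool empty() const { return !has_; }
    T const& ref() const { assert(has_); return value_; }

    // Non-null member pointer when a value is present, null otherwise.
    operator safe_bool() const { return has_ ? &sb_opt::true_tag : 0; }

private:
    bool has_;
    T value_;
};

inline sb_opt<int> make(bool with_value) {
    return with_value ? sb_opt<int>(3) : sb_opt<int>();
}

// Counts how many of the four spellings treat 'o' as holding a value.
inline int count_true(sb_opt<int> const& o) {
    int n = 0;
    if (o) ++n;
    if (!(!o)) ++n;
    if (!(o == 0)) ++n;
    if (o != 0) ++n;
    return n;
}
```

Keeping empty() next to the conversion, as suggested above, costs nothing in this sketch: both report the same state.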

> No container-like object that I know of uses that idiom to indicate
> emptiness, so I believe it would be wrong to have it in this sort of
> interface.
>
> As a conclusion, I agree that it is possible to model optional<>
> entirely as a 1-element-sequence provided that its
> interface uses explicit member functions for both value access and
> emptiness testing.

I can live with that, but I honestly see no reason to avoid the safe-bool.

> I notice in particular that this interface allows for a consistent
> definition of relational operators.
>
> --CHECKPOINT-- Do we agree this far?

I do.

> The quest for a pointer-like interface:
>
> Let's get back to the point I realized that:
>
> if ( opt == 3 )
> int x = opt ;
> opt = 1 ;
> opt.foo();
>
> had a syntax too familiar with non-optional values, and hid the fact
> that any of the expressions above would be undefined had 'opt' been
> uninitialized.
>
> William's interface does not show this problem because the syntax is not
> familiar.
>
> Back then, I wondered which familiar syntax conveys the precise notion
> that the expressions above can be undefined unless the 'initialized' or
> 'value-exist' condition is met.
> I've noticed that pointers do just that.
> When you see:
>
> if ( *opt == 3 )
> int x = *opt ;
> *opt = 1 ;
> opt->foo();
>
> it is clear by syntax alone that 'opt' _must_ refer
> to an existing value and that this might not be the case
> so you know that a test for this condition must be present
> somewhere before the expressions are reached.
>
> If we think about this per se, we should agree that this interface is
> even more communicative than the .ref() interface:
>
> if ( opt.ref() == 3 )
> int x = opt.ref() ;
> opt.ref() = 1 ;
> opt.ref().foo();
>
> The above indicates that there is something 'peculiar' with 'opt', but
> it doesn't communicate what.
> OTOH, the pointer-like interface is self-evident.
>
> I concluded then, and I still think, that
> for access to a possibly nonexistent value,
> operator*() and ->() allow for a very familiar and communicative
> syntax which expresses _exactly_ the semantics of the operation.

I don't agree. ref() on its own may not indicate anything, but since the
type's name is "optional", I think everything is conveyed nicely. On the
other hand, operator*() and operator->() are only unambiguous in meaning
when the type fully models a pointer, which optional cannot.
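
For reference, the pointer-like spelling being debated might be sketched as follows. The names ptr_like_opt, point and sum_or_zero are hypothetical, the asserts only illustrate the preconditions, and T is assumed default-constructible. The sketch shows the syntax only and deliberately says nothing about relational operators, which are the sticking point.

```cpp
#include <cassert>

// Hypothetical optional exposing only the pointer-like access operations.
template <class T>
class ptr_like_opt {
public:
    ptr_like_opt() : has_(false), value_() {}
    explicit ptr_like_opt(T const& v) : has_(true), value_(v) {}

    // Precondition for all four operators: the object holds a value.
    T& operator*() { assert(has_); return value_; }
    T const& operator*() const { assert(has_); return value_; }
    T* operator->() { assert(has_); return &value_; }
    T const* operator->() const { assert(has_); return &value_; }

    // Null when empty, otherwise the address of the contained value.
    T* get() { return has_ ? &value_ : 0; }
    T const* get() const { return has_ ? &value_ : 0; }

private:
    bool has_;
    T value_;
};

struct point { int x; int y; };

// Reads like pointer code, although 'p' owns its value and is copied by value.
inline int sum_or_zero(ptr_like_opt<point> const& p) {
    return p.get() ? p->x + (*p).y : 0;
}
```

Note that copying a ptr_like_opt copies the contained point, which a real pointer would not do; that difference is one place where the type does not fully model a pointer.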

> It is not just a minor convenience IMO.
>
> And to be honest, the fact that this interface allows optional<>
> to be replaced by a pointer is not the reason why I adopted it,
> and having thought about dropping it, I realized that it is not
> even the reason why the pointer interface is really good.
> It is really good because it communicates _clearly_ and _each time_ you
> access an optional value that the operation will be undefined unless the
> initialized condition is met.
> For dumb programmers like me, this is a very important feature
> of the interface; and I know this based on experience, since I stopped
> being caught often by uninitialized optional<>s once I adopted it.

Are you sure that you've not just become used to the interface, and thus
don't find the "mixed metaphor" intuitive? Did you ever use an interface
with an explicit ref() and actually find it difficult to understand?

> Anyway, I concede that the pointer interface is really just a visual
> aid. And of course, smart programmers don't need this aid, and can
> safely deal with whatever consistent interface you gave them.
> But I'd like to stress that this particular aid is significantly
> important.

I'm not so sure.

> Anyway, I notice that _only_ operator *() and ->(), which are used to
> access the value,
> are the part of the pointer-like interface that I consider really
> important. operator safe_bool() is just handy and it is consistent with
> optional<> (a value-based container) even though it allows expressions
> of the form:
>
> if ( opt == 0 )
>
> just because it goes along with the pointer-like interface.
> Because this expression (with the intended meaning) is used with
> pointers. I think that if the pointer-like interface is dropped, this
> should be dropped too.

By dropping bool conversion entirely? I think you'll find this
inconvenient, especially with Mr. Dimov's classic:

if ((opt = foo()) != 0)

> The problem with the pointer-like interface.
>
> Mostly thanks to William Kempf, we've come to realize
> that the pointer-like interface makes optional<> appear like a
> pointer, while it is not, and that this false appearance
> evidences itself when we face the fact that the
> relational operators that are perfectly defined
> for optional<> per se, since it is a value-based container,
> conflict with its pointer-like interface.
>
> Thinking about this problem we realize that operators *() and ->() shape
> optional<> into a pointer-like thing even though it isn't.
>
> It is clear to me after much discussion that optional<> is
> not a pointer at all but a container.
>
> OTOH, it is true that pointers can either be null or point to a valid
> object,
> so it is true that there is some analogy between a pointer and an
> optional<>.
> This analogy is what makes operators *() and ->() viable options.
> However, optional<> should not be considered a pointer just like
> an iterator should not be considered a pointer either.
>
> The semantics of value access for optional<> is clearly defined,
> thus the following are just three different syntactic variants
> of the same:
>
> (1) (opt = 2), ( 3 == opt), ( x = opt )
> (2) (opt.ref() = 2), ( 3 == opt.ref()), ( x = opt.ref())
> (3) (*opt = 2), ( 3 == *opt), ( x = *opt)
>
> I argued that
>
> (1) is problematic
> (2) only lacks self-evident syntax.
> (3) is ideal (in itself).
>
> However, operator*() should be seen on optional as a visual aid and not
> as implying that optional<> is a pointer.
>
>
> OTOH, I came to figure that resembling a pointer w.r.t. value access
> is a good idea because it is true that pointers do model perfectly well
> the notion of optional pointees.
> IOWs, there exists the well-defined concept of OptionalValue, as Augustus
> showed.
> This model is well defined by a part of the pointer interface:
>
> operator*, operator-> and conversion to bool.
>
> And pointers have historically been used to deal with optional
> objects precisely because they model this concept.
>
> Therefore, it makes sense to me to have optional<>, a container with
> value semantics, model the OptionalValue concept, provided that it's
> clearly understood (and documented) that it is a container
> with value semantics and not a pointer.
>
> Of course, having a value-semantic container model a concept which is
> defined via a pointer-like interface has the drawback that it makes the
> appearance that it is a pointer.
> This is an important drawback which definitely justifies reconsidering
> to which extent it is convenient to model the OptionalValue concept in
> optional<>.
>
> A pro of modeling this concept is that it allows optional<>s to be
> interchanged with true pointers; but I concede that this pro is not
> really that much important.
> A con is that it could make the programmer believe it is a pointer.

And it makes relational comparisons either non-intuitive, or difficult to
use! That's key, I think.

> Since it is fundamental that optional<> is not mistaken to be a pointer,
> but rather a value-based container modeling the OptionalValue concept,
> proper documentation is not enough.
> It is required that the interface of optional<> that looks like a
> pointer follow _exactly_ pointer semantics. This way, if it happens
> that a programmer did think that optional<> is a pointer, the code
> he wrote won't behave abnormally.

But you can't do this with relational operations.

> A way to measure this potential problem is to consider what would happen
> if optional<T> were actually replaced by T*.
> With the current definition of the OptionalValue concept, the code will
> have _exactly_ the same effect, so the potential danger of mistakenly
> considering optional<T> as a pointer is conceptual but not practical.
>
> However, and very unfortunately, this _requires_ the properly well
> defined relational operators to be disallowed, because they can
> effectively create practical problems if optional is mistaken for a
> pointer and used, for example, to test for aliased equivalence as you do
> when you compare pointers.

So, just to keep pointer-like operations you're going to make the
interface difficult to use for many valid use cases?
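
The mismatch can be sketched in a few lines. The names val_opt and same_object are hypothetical, T is assumed default-constructible, and treating two empty objects as equal is an assumption of this sketch, since the thread does not spell out that case. The point is only that == on a value-based container answers "same value?", while == on pointers answers "same object?".

```cpp
#include <cassert>

// Hypothetical value-based optional with container-style equality.
// Assumption: two empty objects compare equal.
template <class T>
class val_opt {
public:
    val_opt() : has_(false), value_() {}
    explicit val_opt(T const& v) : has_(true), value_(v) {}

    bool empty() const { return !has_; }
    T const& ref() const { assert(has_); return value_; }

    friend bool operator==(val_opt const& x, val_opt const& y) {
        if (x.has_ != y.has_) return false;
        return !x.has_ || x.value_ == y.value_;
    }
    friend bool operator!=(val_opt const& x, val_opt const& y) {
        return !(x == y);
    }

private:
    bool has_;
    T value_;
};

// With raw pointers, == asks "same object?", not "same value?".
inline bool same_object(int const* a, int const* b) { return a == b; }
```

Two distinct ints that both hold 3 are unequal as pointers but equal as val_opt<int>, so code written with one meaning in mind would silently change behavior under the other.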

> As a general conclusion, the final decision boils down, as
> Augustus said, to weighing the convenience of this syntax:
>
> if ( opt ) // or if ( opt != 0 )
> {
>   ...
>   ...
>   if ( opt2 && *opt == *opt2 )
>     foo(*opt)
> }
> else
> {
>   ...
>   ...
>   opt.reset(3);
> }
>
> against the convenience of the more conceptually consistent:
>
> if ( !opt.empty() )
> {
>   ...
>   ...
>   if ( opt == opt2 )
>     foo(opt.ref())
> }
> else
> {
>   ...
>   ...
>   opt.reset(3);
> }
>
>
> My experience shows that comparison of optionals is _far_
> less frequent than testing for initialization and accessing
> the value; so I really prefer the pointer-like interface
> with poisoned relops.

Well, you have more experience than I do, but I can't see how this would
be the case.

> Provided the documentation is reworked to clearly state
> that optional<> is a value-based container and not a pointer.
>
> Anyway, the viable choices are, AFAICT:
>
>
> (A) Strict container-like interface:
>
> optional () ;
>
> explicit optional ( T const& v ) ;
>
> optional ( optional const& rhs ) ;
>
> template<class U> optional ( optional<U> const& rhs ) ;
>
> optional& operator = ( optional const& rhs ) ;
>
> template<class U> optional& operator = ( optional<U> const& rhs ) ;
>
> bool empty() const ;
>
> T const& ref() const ;
> T& ref() ;
>
> friend void swap ( optional<T>& x, optional<T>& y ) ;
>
> bool operator == ( optional<T> const& x, optional<T> const& y ) ;
>
> bool operator != ( optional<T> const& x, optional<T> const& y ) ;
>
> (B) Pointer-like interface to model OptionalValue concept:
>
> optional () ;
>
> explicit optional ( T const& v ) ;
>
> optional ( optional const& rhs ) ;
>
> template<class U> optional ( optional<U> const& rhs ) ;
>
> optional& operator = ( optional const& rhs ) ;
>
> template<class U> optional& operator = ( optional<U> const& rhs ) ;
>
> T const* operator -> () const ;
> T* operator -> () ;
>
> T const& operator* () const ;
> T& operator* () ;
>
> T const* get() const ;
> T* get() ;
>
> operator safe_bool() const ;
>
> friend void swap ( optional<T>& x, optional<T>& y ) ;

I can live with any choice, so long as I get an optional ;). But I do
think (now) that the container design is more sound, and more useful.

William E. Kempf

