Boost logo

Boost :

From: Fernando Cacciola (fcacciola_at_[hidden])
Date: 2002-06-12 09:01:49


----- Original Message -----
From: "Douglas Gregor" <gregod_at_[hidden]>
To: <boost_at_[hidden]>
Sent: Wednesday, June 12, 2002 12:40 AM
Subject: Re: [boost] Proposal --- A type-safe union

> [snipped]
> > What about automatic casting from non-specified types?
> > (For example: initializing a union of <float, string>
> > with a double value). Personally, I think this
> > behavioral detail should be specified by a policy.
>
> I agree that we need implicit conversions to types stored in the union
(e.g.,
> your variant<float,string> implicitly constructed from a double example),
but
> I don't agree that this should be a policy. The reason will be discussed
> below where it comes up again.
>
I agree here.

> > In general, I see four separate, policy-controlled,
> > aspects:
>
> I try to avoid policy-based designs when possible, because they introduce
a
> large amount of variance in the semantics of a single component. I'll
discuss
> each of the four potential policies below:
>
> > 1) Storage: (i) boost::any (ii) on-stack
>
> When does one prefer the boost::any storage method instead of the on-stack
> method? The on-stack approach is a lightweight approach that closely
> resembles C-style unions; in the cases where variant objects are too large
to
> put on the stack, the variant objects can easily be allocated on the heap
> (just like we would do with a C-style union). The only cases where
on-stack
> allocation isn't feasible occur with recursive or incomplete types
(discussed
> later...). So, I would say that we should only use on-stack allocation but
> allow an 'escape' to heap allocation for recursion and incomplete types.
>
I'm not even sure if we *need* heap allocation (see below). If we don't
*need* it, I'll vote for stack-allocation only.

> > 2) Automatic casting: (i) never, (ii) cast to the
> > first type possible, (iii) cast if only one possible
> > type found
>
What means 'automatic' casting? If it means implicit conversions, I think
this could be achievable by having the set of operator Tn() in the variant.

> How about:
> (iv) use overload resolution to choose the best type, or fail if no best
> type exists
>
Maybe I didn't understand point (2), but how would we use overload
resolution and what for?

> Again, I don't see a need for a policy here. Overload resolution is a C++
> feature that every C++ programmer is aware of and it seems the natural
> solution for this component. Here are the reasons why I reject the other
> three:
>
> (i) implicit conversion is a feature of C++. Suppressing implicit
> conversions should be performed by user types, not by the variant, because
> the set of implicit conversions for a type is a property of the type
itself,
> not a container of that type.
I agree that a variant should be implicitely convertible to any of its
types.

> (ii) this is very unnatural. You could end up with something like:
> variant<int, float> v;
> v = 3.14159; // v stores the integer 3
> (iii) this is reasonable, but again: overload resolution is already
> understood as a mechanism for finding the best alternative, so why not
reuse
> it?
>
> Side note: the desire for overload resolution is why I am unsure of a
purely
> typelist-based approach. Getting all of the right overload candidates
there
> given just a typelist is horrific.
>
> > 3) Assignmenmt error: (i) Compile-time error, (ii)
> > initialize with default value of first type
>
> I don't understand why (ii) would ever be desirable. We should be checking
for
> errors as early as possible.
>
I agree that (i) should be the only option.

> > The fourth aspect relates to the issue of variant to
> > variant assigmnet:
> > Variant<List1> u1;
> > Variant<List2> u2;
> > ...
> > u1 = u2; file://Problem(?): u2's held type may be
> > invalid
> > // for assignment into u1
> >
> >
> > 4) Variant to variant assigmnet error: (i)
> > Compile-time error if List2 is not a sub set of List1.
> > (ii) Run-time error (iii) assign the default value of
> > the first type on List1
>
> How about:
> (iv) compile-time error if for any of the types of the source variant
there
> is no 'best candidate' type in the target variant. So something like this
> should be fine:
>
> variant<int, float, double> v1;
> variant<long, double> v2;
> v1 = v2; // okay
>
Agreed.

> Again, I can't see a reason for using (ii) or (iii) because we should be
> checking for errors as soon as possible. The behavior of (iii) could drive
> one positively mad during a late-night debugging session. (i) is okay, but
> overly strict if we want implicit conversions (I do).
>
Agreed.

> >
> > [snip]
> >
> > > * Recursive types should be allowed, i.e., a
> >
> > variant can hold a value of a
> >
> > > type that is constructed using the variant type
> > > * Incomplete types should be allowed.
> >
> > Can you clarify on these two points?
>
> I'll try to clarify by example. The canonical example for the use of
recursive
> types in variants is an expression tree. In an expression tree, each
> expression node can be, for instance, a literal value (e.g., '2'), a
variable
> (e.g., 'x'), or an operation on other expressions (e.g., 'expression +
> expression'). The last of these requires the expression type to be
recursive,
> because the types it can store depend on the expression type itself. If we
> were using C-style unions, we might represent an expression like this:
>
> struct expr {
> enum { et_literal, et_variable, et_plus, et_negate } type;
> union {
> int literal;
> variable_t* variable;
> std::pair<expr*, expr*> plus;
> expr* negate;
> } value;
> };
>
> We'd like to do the same with our generic discriminated union type. I
> personally aspire to have a variant type that is similar in expressive
power
> to that of ML, where the above might be written as:
>
> type expr =
> | Int of int
> | Var of variable
> | Plus of expr * expr
> | Negate of expr
>
> >From the implementation point of view, recursive types aren't hard to
deal
> with: just don't try to allocate them on the stack. The complexity is
stating
> the recursion when using the variant type. My personal preference is to
use a
> pseudo-keyword 'rec', e.g.,
>
> typedef variant<int, variable_t*, Plus<rec, rec>, Negate<rec> > expr;
>
> As for incomplete types, refer back to the C-style union version of
'expr'.
> The variable_t type doesn't need to be defined because we're only holding
a
> pointer to it. If we could specify that a particular type is incomplete
> (therefore requesting heap allocation), then we could easily use
incomplete
> types with the variant. It might look like this:
>
> typedef variant<int, incomplete<variable_t>, Plus<rec, rec>, Negate<rec>
>
> expr;
>
>
I don't know about incomplete types.
In my view, T* is not an incomplete type; it is a perfectly complete type of
pointer type.
If I put a pointer to T in a discriminated union, I want the 'pointer', not
the 'T' instance, to be represented by the union.
That is, the union deals with the pointer value only; to whatever it is
pointing to, and how that is allocated, is IMO, totally unrelated.
I don't think there is a really good reason to have a truly-incomplete type
in an union.

As for recursive types, the tag approach (using 'rec') would allow variant
to be fully stack-based. No need for heap allocation within variant.

> > > - The visitor is fine, but there are simpler
> >
> > syntactic constructs that could
> >
> > > be used. For instance, instead of 'visit_at' just
> >
> > use the function call
> >
> > > operator.
> >
> > The function call operator cannot be static, which is
> > somewhat restrictive.
>
> How so? Instead of a static function call operator, just default-construct
an
> empty class.
>
> > > Also, the 'Visitor' class can be an implementation
> >
> > detail with a
> >
> > > (more natural?) syntax like:
> > > Super_union<...> u0;
> > > // ...
> > > u0.visit(size_visit());
> >
> > This syntax is indeed more natural, when the visit
> > policy is stateless. However, if we make go() to
> > return *this, we end up with a very clean way for
> > applying the operation to multiple objects:
> >
> > typedef Super_union<The_list> Concrete_union;
> > Concrete_union::Visitor<size_visit>::type
> > size_visitor;
> >
> > // ...
> >
> > int result =
> > size_visitor.go(u0).go(u1).go(u2).total_;
>
> Have you needed this type of behavior before? I can't come up with any
> scenarios where it would help. Does it make the code shorter or more
obvious?
> Just the calls to 'go' don't tell the whole story, because there is a
> significant amount of overhead (in C++ code, not at run-time) with this
style
> of visitor: the visitor's 'visit_at' routine must be templated over the
> variant type and the 'Visitor' type generator must be used. Does the
benefit
> of easy application to multiple variant objects outweigh the more
complicated
> interface? If yes, there is a second part: does this benefit also outweigh
> the cost of create a new type of interface (visit_at and the Visitor type
> generator) instead of an interface based on a well-understood concept (a
> function object)?
>
I agree with Doug here. It seems to me that a multiple-visitor is merely a
syntatic aid. Being so difficult to implement, I don't think it worth it.

> Following is a partial sketch of the interface I have in mind. I've
omitted
> details that we haven't really discussed yet or that I haven't thought
about
> long enough to have a strong opinion on (tagged types, recursive types,
> stating type incompleteness, etc.).
>
> Doug
>
> template<typename T1, typename T2 = unused2, ..., typename TN = unusedN>
>
I'd like you to elaborate on the need of an extensive-list instead of a
dot-list.

> class variant
> {
> public:
> variant(); // default-construct value of type T1
>
> variant(const T1& t1); // assign value t1
> variant(const T2& t2); // assign value t2
> // ...
> variant(const TN& tN); // assign value tN
>
> ~variant();
>
> void swap(variant& other); // swap values
>
> variant& operator=(const T1& t1); // assign value t1
> variant& operator=(const T1& t1); // assign value t1
> // ...
> variant& operator=(const TN& tN); // assign value tN
>
> variant& operator=(const variant& other); // assign value from other
>
> // semantics described above
> template<typename OT1, typename OT2, ..., typename OTN>
> variant& operator=(const variant<OT1, OT2, ..., OTN>& other);
>
> const std::type_info& type() const; // typeid(type of current value)

What is the 'current' value of a variant?

> bool empty() const { return false; } // boost::any compatibility

I'll add:

operator T1 const& () const ;
operator T1& () ;

etc...

Fernando Cacciola
Sierra s.r.l.
fcacciola_at_[hidden]
www.gosierra.com


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk