From: John Max Skaller (skaller_at_[hidden])
Date: 2001-08-21 19:55:10


Peter Dimov wrote:
 
> We're using the 'variant' term in different ways, I suppose.

        Probably. That's why I wrote my previous rave describing what
a variant 'is'.

> I know about ML discriminated unions. They can be approximated in C++ by
>
> discriminated_union<xml_text, xml_tag> u;
>
> but the approximation obviously isn't a real ML union since it isn't
> statically checked.

        Yes.
 
> My variant<> (and Kevlin Henney's boost::any) solve a different problem.

        They can _sometimes_ be used to solve an interesting
class of problems for which there is no other solution,
that is, where dynamic typing is actually required by the design.
But the basic requirement is for a discriminated
union over a finite list of cases.

        When I say basic, I mean two things:

        1) that this data structure
is fundamental mathematically. Without it, your programming
language is assuredly broken, it can't even begin to be used
to build data structures -- which are needed both in themselves,
and also for building representations of primitives (class types).

        2) 90% of design requirements call for closed unions.
90% of the balance call for a closed union including
a 'don't know what to do with this' case.
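
        As a rough illustration of such a closed union, including the
'don't know what to do with this' case, here is a minimal sketch. It
assumes C++17's std::variant and std::visit, which did not exist when
this was written; unknown_node, describe and describe_node are
hypothetical names, and only xml_text and xml_tag are borrowed from
the quoted example.

```cpp
#include <string>
#include <variant>

// Hypothetical case types; only the names xml_text and xml_tag come from the thread.
struct xml_text { std::string text; };
struct xml_tag  { std::string name; };
struct unknown_node {};  // the explicit 'don't know what to do with this' case

// Closed union: the list of cases is finite and fixed at compile time.
using xml_node = std::variant<xml_text, xml_tag, unknown_node>;

// One overload per case, so the decoding is complete without a 'default'.
struct describe {
    std::string operator()(const xml_text& t) const { return "text: " + t.text; }
    std::string operator()(const xml_tag& t) const { return "tag: " + t.name; }
    std::string operator()(const unknown_node&) const { return "unknown"; }
};

// Dispatches on the case currently held by the union.
std::string describe_node(const xml_node& n) {
    return std::visit(describe{}, n);
}
```

        Adding a fourth case to xml_node without a matching overload
in describe should make the std::visit call fail to compile, which is
the static check under discussion.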

        Just about the only time you _really_ need dynamic
typing is where you must dispatch to unknown code,
which can be upgraded while the program is running.
That obviously precludes static checking.
The need is more common than actual solutions :-)

> They are used to implement dynamically checked polymorphic values

        Yes, I know, but this data structure is VERY weak,
and should be avoided at all costs. (Static completeness
requires a 'default' on every decoding)
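
        To make the 'default' point concrete, here is a minimal sketch
of decoding an open union. It uses C++17's std::any as a stand-in for
boost::any (an assumption, chosen to keep the example self-contained);
decode is a hypothetical name.

```cpp
#include <any>
#include <string>

// Decoding an open union: any type may be stored, so every decoder
// needs a fallback branch, and all the checks happen at run time.
std::string decode(const std::any& v) {
    if (const int* i = std::any_cast<int>(&v))
        return "int: " + std::to_string(*i);
    if (const std::string* s = std::any_cast<std::string>(&v))
        return "string: " + *s;
    return "ignored";  // the unavoidable 'default' case
}
```

        Nothing tells the compiler which types callers will actually
store, so a newly introduced type silently falls into the last branch.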

        It provides neither abstraction, which admits polymorphism
of the virtual function kind, nor closed unification of the ML variant
kind, but instead, it provides an open unification: this is not
abstraction, and so there is NO polymorphism here. And it is not
finite unification, so there is NO unifying type for which one
can code closed functions.

        I.e. it breaks the open/closed principle,
one of the most fundamental tenets of good design
and robust software construction.

        So what is it? In theory, it's an infinite sum.

        Usually, it's a serious design fault.
The reason is the usual rationale for using it:
one handles a set of messages/transactions using this technique,
so that if new ones get invented, the program just ignores
them -- but it still keeps running.

        In an accounting program, this is guaranteed to be
an error. If you add a new transaction type, you must modify
the whole program to cope with it (or your books won't balance).

        In a messaging system, it is almost always an error.
You just can't ignore part of a coherent protocol and expect
communications to continue.

        Sometimes, however, it is the only way to cope.
Correctness is less important than service delivery.
A web browser that ignores tags it doesn't understand
is an example. An interscript weaver that ignores method
calls that it doesn't understand is another (in both
cases the specifications require correct interpretation
of those tags that are decoded, and totally ignoring
those that are not).

        So yes, I agree, _sometimes_ the idiom is useful,
especially in continuously running real time systems.
The real problem is, people are going to use it in the 99%
of cases where it's the wrong solution -- because there is
no better solution -- and much MUCH worse, they're going
to think that it is the correct solution.

> I have found the non-ML, dynamically typed variant quite convenient,
> especially when porting, for instance, PHP code to C++. It also enables
> "poor man's typesafe vararg functions" taking std::vector<boost::any>.

        Yes. I agree. The idiom is useful for dynamic typing,
with constraints, of which finiteness is not one.
In effect, the dynamic cast to the base is a way of testing
set membership of some open union, providing a little
more discrimination than Java's 'Object' (an upcast that
always succeeds in Java :-)

        BTW: I'm NOT against the library, I'd just like to
see a proper description of its theoretical status and of when
its use is justified: dynamic typing is useful, but obviously should
be avoided wherever possible, precisely because it delays
until run time checking that can often be done
at compile time.

        In practice .. this is even worse than in theory,
because most programmers don't know any theory. :-)

        BTW2: In OCaml, there is a new kind of variant
called a 'polymorphic variant'. Basically, you can code
anything you want: invent new variants at will, combine
any set of them into a type. The difference is that
although the set is open, every single usage is
_still_ statically checked (every 'local' usage is
either finite or has a default).

-- 
John (Max) Skaller, mailto:skaller_at_[hidden] 
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
New generation programming language Felix  http://felix.sourceforge.net
Literate Programming tool Interscript     http://Interscript.sourceforge.net
