Boost logo

Boost :

From: Douglas Gregor (gregod_at_[hidden])
Date: 2002-03-26 10:12:04


On Tuesday 26 March 2002 04:26 am, you wrote:
> There have been many discussions about variant types / discriminated unions
> on Boost. At the moment I'm working on something where I could use an
> efficient, recursive variant type. Before I consider implementing a variant
> type myself, I'd like to know what is the status of current variant type
> implementations?

There doesn't seem to be anything in the works that can handle recursive
types.

> I also need to be able to build recursive variant types. Consider this
> Ocaml type definition:
>
> type value =
>
> | Int of int
> | Float of float
> | String of string
> | Aggr of value array
>
> ^^^^^
> Most of the variant type proposals I have seen do not support this.

... yet this may very well be the most important feature of variants, and
Ocaml is probably the gold standard for variant types.

> I came up with the following syntax:
>
> struct value : variant<value // B&N (probably needed)
> , int
> , float
> , string
> , aggr<value>
>
> {
> // ...
> };
>
> (This kind of recursion does not work if the data structure (in this case
> aggr<>) wants to instantiate the type.)
>
> This would, of course, require some constructors, etc... to be written by
> the user (those could be macroized). Another technique would be to use
> some kind of placeholders, but those seem to require special support for
> each data structure (in this case aggr<>), which is something I do not
> like.
>
> typedef variant<int, float, string, aggr< variant_self > > value;
>
> Does someone have an idea on how to support recursive variant types?

I think we can make the placeholders work, though perhaps under a different
syntax that above, e.g.:

  typedef variant<int, float, string, rec1<aggr, variant_self> > value;

Where:

  template<typename Arg1, typename Variant>
  struct maybe_map_to_variant {
    typedef Arg1 type;
  };

  template<typename Variant>
  struct maybe_map_to_variant<variant_self, Variant> {
    typedef Variant type;
  };

  template<template<typename> class T, typename Arg1>
  struct rec1 {
    template<typename Variant>
    struct map_to_variant {
      typedef T<typename maybe_map_to_variant<Arg1, Variant>::type> type;
    };
  };

  // rec2, rec3, etc. should be pretty obvious

As for actually handling the layout of the variant, here's a simple (but
non-optimal approach):
  - the size of the variant = max size of any non-recursive entry, or the
size of a void* (whichever is larger)
  - the alignment of the variant = max alignment of any non-recursive entry,
or the alignment of a void* (whichever is larger)
  - recursive types are stored on the heap

The non-optimality comes from the fact that some recursive types in a variant
will be no larger than a variant, and therefore don't have to be stored on
the heap. Shifting the example to something that more obviously has this
trait, consider:

type expr =
  | Int of int
  | Float of float
  | Negation of expr

So the negation of an expression is just an expression - the type is no
larger than, and should have the same alignment as, an expression; yet the
approach will allocate it on the heap. I can see two approaches to solving
this:
  1) Perform a 'test instantiation' on recursive types, using a variant that
assumes all recursive types need to be allocated on the heap. If the size and
alignment of the object are <= the size and alignment of the non-recursive
types in the variant, then this recursive type need not be allocated on the
heap. Other recursive types will, of course, have to be allocated on the heap
to avoid that pesky limitation that object sizes be finite :)

  2) Use (and write) that small_string_optimization class template we talked
about for Boost.Function a while back. The small_string_optimization class
template might look like this:
    template<int Size, int Align>
    struct small_string_optimization {
      // Allocate and construct a copy of the given T object. It will be
      // allocated within the small string optimization class if possible,
      // or on the heap if it is too large to fit in the class.
      // Return value: true if allocated in the small_string_optimization
      // class, false if allocated on the heap
      template<typename T> bool construct(const T&);

      // Destruct and deallocate the contained object of type 'T'
      template<typename T> void destroy();
      
      // Get a pointer to the contained object of type T
      template<typename T> T* get();
      template<typename T> const T* get() const;

      // Is type T small enough to fit without heap allocation?
      template<typename T>
      struct small_enough {
        BOOST_STATIC_CONSTANT(bool, value = /* ... */);
      };

      union {
        char data[Size];
        typename type_with_alignment<Align>::type align;
        void* heap_data;
      };
    };

  This handles recursive types automatically, and will also let
Boost.Function and Boost.Any include a general version of the small string
optimization without much hassle.

I'm solidly on the side of #2. I'm almost motivated enough now to write
small_string_optimization, but Signals is giving me "the look."

        Doug


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk