Boost logo

Boost :

From: David Abrahams (dave_at_[hidden])
Date: 2002-08-01 11:28:24


From: "Paul Mensonides" <pmenso57_at_[hidden]>

> > You have to prove, or at least, explain a lot more deeply, this point
> > precisely. Why different sequence-types are undoubtly useful in
run-time,
> > but not in compile-time? It all boils down to this.
>
> Primarily, because the reasons that we have multiple run-times containers
don't
> apply to functional programming. For example, the only reason that you
wouldn't
> always use std::vector as your sequence type is that insertion requires
copying
> all of the elements and possibly reallocating all of the memory. You use
> std::vector for fast access, and push_back is on average very efficient.
This
> is unlike template metaprogramming in nearly every regard. Insertion
*always*
> requires copying, and (if you count the innards of the compiler) *always*
> requires reallocating. You use std::list to avoid this cost of insertion.
It
> is fast because it only has to *modify* a few pointers, but this _is not
> possible_ in a functional environment. You have to copy the entire
sequence
> anyway. Furthermore, this copying is not particularly expensive because
it is
> mostly just copying a reference into some compiler's internal structure
of
> identifiers.

I think this is a slightly hasty analysis. In general we need to know a bit
about the internals of the compiler to say what goes on. For example,
insertion in a dot-style list may only require copying the head part of the
list:

Original list:

  +---+ +---+ +---+ +---+ +---+ +---+
  | A +-->| B +-->| C +-->| D +-->| E +-->| F +-->*
  +---+ +---+ +---+ +---+ +---+ +---+
                    ^
After inserting X: |
  +---+ +---+ +-+-+
  | A +-->| B +-->| X |
  +---+ +---+ +---+

It's easy to imagine a compiler implementation which models this
representation internally.

It's also easy to imagine an implementation which uses an array of
pointers-to-types to represent the parameters of a template instantiation.
So, while technically there are O(N) copies involved in any vector
transformation, those would be fast pointer copies and the
newly-instantiated vector<...> type would be the only dynamic allocation
required internally:

Before:
  +---+---+---+---+---+---+---+
  | . | . | . | . | . | . | . |
  +-|-+-|-+-|-+-|-+-|-+-|-+-|-+
    | | | | | | |
    | | | | | | |
    | | | | | | |
    | | | | +---+---+
    | | | | |
    V V V V V
    A B C D E nil
    ^ ^ ^ ^ ^ ^
    | | | | | |
    | | | | | +---+
    | | | | | | |
  +-|-+-|-+-|-+-|-+-|-+-|-+-|-+
  | . | . | . | . | . | . | . |
  +---+---+---+---+---+---+---+
After^

It's hard to overestimate the impact of the low constant factor on this
sort of O(N) operation on linear structures. That's part of the reason
sorted vectors are so effective. We seldom notice that insertion in a
sorted vector is linear.

> Given that this is the case, the most efficient solution would be to
always use
> a tuple-like (i.e. vector-like) container that can hold a huge number of
> elements.

I'm not sure. When things get really large, push_front on a cons-style list
could easily be faster than copying all those pointers.

> On the other hand, the most extensible solution would be to use a
> cons-style list. One or the other is fine, but both is unnecessary.

I agree with you and Andrei that some numbers would really help with the
analysis. However, I think even with the numbers I'd be reluctant to give
up the flexibility at this stage, especially if it's not costing us much.
There are other reasons to be able to work with different sequences. For
example, if someone already has a program full of Loki typelists and wants
to use an MPL algorithm, they can do it. A program that manipulates
boost::tuple can treat the tuple's types as an MPL sequence. I realize that
these all have cons-style structures, but their types /are/ expressed
differently from mpl type_list, and from one another. Not being hardcoded
to use a single type list structure still has great advantages, and as
Emily said I appreciate the flexibility to experiment with other
structures.

-Dave

-----------------------------------------------------------
           David Abrahams * Boost Consulting
dave_at_[hidden] * http://www.boost-consulting.com


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk