Boost logo

Boost :

From: David A. Greene (greened_at_[hidden])
Date: 2002-12-16 03:48:21


Over the weekend I realized just what strange and wonderful things
tuples are. This message is as much to get my thoughts recorded as it
is a request for discussion about the various classes and bits of
code presented within. Some of this stuff may be useful for Boost and
some may not. I'd like to hear what you all have to say. It may be
that I'm simply restating the obvious to some, but I found the whole
thing rather intriguing. I'll start from the beginning.

Last week I quickly wrote a prototype symbol manager for the students
in a compiler class I am helping to teach. The implementation is a
typical stack of symbol table objects so that names can be resolved at
the proper scope. Naturally, I wanted to provide an iterator
interface so they could examine the contents of the symbol manager.

Because each symbol table is an autonomous entity living on the scope
stack, there is no way to iterate "across" scopes. I've run into this
sort of problem several times and have long wished for an iterator
that can cross container boundaries. In the interest of time, I
hacked together an interface to allow the students to iterate at a
particular scope, but I'm not very satisfied with that solution.

Over the weekend I had a bit of free time so I wrote up a
spanning_iterator based on the iterator_adaptors framework. To throw
in a bit more complication, I wanted something generic enough that I
could span different kinds of container. There should be a way to
span a std::vector and a std::list as long as their value_types are
the same. The spanning_iterator holds two tuples -- one for the
"current" iterators and the other for the end iterators for each
container the spanning_iterator spans. The idea is that when the
current "current" iterator reaches its end (the corresponding element
in the "ends" tuple) the next iterator in the "current" tuple becomes
current and so on until all "current" iterators reach their ends.

Actually implementing the iterator_adaptor policies for this is
conceptually straightforward: simply walk the "current" tuple until an
iterator is found that is not at its end. Then simply operate on
that iterator (dereference it, increment it, etc.). Getting this to
work correctly with the tuple interface is non-trivial, however.

The fundamental problem is that it's inconvenient to iterate through a
tuple. All we have is the get<> template to access tuple elements.
Iterating is again conceptually simple -- just increment an index.
But the fact that get<> is a template implies the index must be a
compile-time constant, meaning we need a compile-time loop (or, if
you prefer, programmer-controlled loop unrolling) to iterate.

As I went through design after design, I realized why this was so
difficult to wrap my brain around: the tuple is neither a strictly
compile-time nor run-time entity. On one end of the spectrum, we have
MPL sequences. These are clearly compile-time only constructs. They
hold type information that is completely available at compile-time.
On the other end we have something like std::vector which provides
only a run-time interface.

Tuples sit somewhere in-between. They hold run-time values, but they
are indexed at compile-time. This means that the sorts of things we
want to do with tuples and how we want to iterate over them slide
along a spectrum -- they change depending on how we want to view the
tuple.

For example, in the spanning_iterator, I want to apply an operation to
a member of the tuple, but I don't know _which_ member until run-time.
Because the spanning_iterator can hold different types of iterator, I
don't even know at compile-time what kind of iterator is going to be
affected. This has two consequences: we must use run-time checks to
find the current element to operate upon and the operation itself must
be passed to the looping construct because the loop cannot return a
tuple element (how would we code the return type?). A static loop
does us no good because the execution condition (whether an iterator
is at its end) returns a run-time value. Furthermore, we need a loop
that can "early out" once it finds the correct iterator. We don't
want to go operating upon all of the iterators after the current one
as well.

Here's my solution: do_if_first. The idea is to have a loop with a
run-time condition that is checked for executing the body and
terminating and a compile-time condition for checking whether to
iterate. We need static iteration because tuples are indexed
statically. We need run-time execution because tuple values change at
run-time. The if_first part tells us that the loop will terminate
once it finds a tuple element that satifies the execution condition.

   namespace detail {
     template<typename Body,
              typename Result = typename Body::result_type>
     struct stop {
       // Called if no element satifies the execution condition
       static Result execute(typename Body::argument_type a) {
         return(Body::default_result(a));
       };
       // ... More execute members for different arity
     };

     // ... More stop for void return type

     template<typename Body,
              typename Result = typename Body::result_type>
     struct do_if_first {
       static Result execute(typename Body::argument_type a) {
         // Execute the body
         if (Body::execute_condition(a)) {
           // Early out
          return(Body::execute(a));
         }
         // Iterate
         return(boost::mpl::if_<typename Body::iterate_condition,
                               do_if_first<typename Body::next>,
                                // Note: call with current Body to
                                // prevent invalid tuple indexing
                               stop<Body> >::type::execute(a));
       };
       // ... More execute members for different arity
     };

     // ... More do_if_first for void return type
   };

   template<typename Body, typename Result = typename Body::result_type>
   struct do_if_first {
     static Result execute(typename Body::argument_type a) {
       return(detail::do_if_first<Body, Result>::execute(a));
     };
     // ... More execute members for different arity
   };

   // ... More do_if_first for void return type

Here's how do_if_first is used by the spanning_iterator dereference
policy:

   template<typename IteratorAdaptor, typename Argument,
            typename Result, int i>
   struct body_base {
     typedef IteratorAdaptor iterator_adaptor;
     // Assume base_type has methods for getting iterators out of the
     // "current" and "end" tuples.
     typedef typename iterator_adaptor::base_type base_type;
     typedef typename base_type::iterator_tuple iterator_tuple;
     typedef Argument argument_type;
     typedef Result result_type;

     // loop variables
     enum {
       first = 0,
       last = boost::tuples::length<iterator_tuple>::value,
       index = i,
     };

     // Don't want to instantiate with invalid tuple index. We only
     // iterate if the next index will be a valid one
     typedef boost::mpl::less<boost::mpl::integral_c<int, index>,
                             boost::mpl::integral_c<int, last - 1> >
     iterate_condition;

     static bool execute_condition(const base_type &a) {
       // Is the current iterator not at the end?
       return((a.template get_iterator<index>()
              != a.template get_end<index>()));
     };
   };

   struct spanning_iterator_policy {
     // ...
     template<typename IteratorAdaptor, int i>
     struct dereference_body :
       public body_base<IteratorAdaptor,
                        const typename IteratorAdaptor::base_type &,
                        typename IteratorAdaptor::reference,
                        i> {
       typedef body_base<IteratorAdaptor,
                         const typename IteratorAdaptor::base_type &,
                         typename IteratorAdaptor::reference,
                         i> base_type;

       typedef dereference_body<typename base_type::iterator_adaptor,
                                index + 1> next;

       static typename base_type::result_type
       execute(typename base_type::argument_type a) {
         // Remember, only called if get_iterator<index>() is not at end
         return(*a.template get_iterator<index>());
       };

       // Gets called if all iterators at end. Gets called with the
       // iteration state corresponding to the last "current" iterator.
       static typename base_type::result_type
       default_result(typename base_type::argument_type a) {
         assert(!"No valid dereference candidate");
         // Satisfy the compiler
         // Would probably (hopefully!) crash
         return(*a.template get_iterator<index>());
       };
     };

     template <class IteratorAdaptor>
     typename IteratorAdaptor::reference
     dereference(const IteratorAdaptor& x) const {
       // Start iterating (unrolling) at tuple index 0
       return(do_if_first<dereference_body<IteratorAdaptor, 0> >::
              execute(x.base()));
     };
   };

Similar constructs are used for increment and decrement. I haven't
yet made spanning_iterator smart enough to present itself as the
"weakest" category of iterator contained in its tuples, but that
should be a relatively easy matter.

After thinking about this some more, I realized that several different
looping constructs could be useful for tuples depending on whether
they are viewed "more statically" or "more dynamically." Here's a
(likely partial) list:

   do_all : operate on all tuple elements
   do_if_all : operate on all tuple elements satisfying some condition
   do_if_first : operate on first element satisfying some condition
                 ("early out")
   do_until : operate as long as some condition doesn't hold
                 ("early out")
   do_after : operate only after seeing that an element satisfies
                 some condition ("late in")

We could imagine more complicated loops such as do_if_until, etc.
Then do_until simply becomes do_if_until with an always-true execute
condition. We can then imagine coding similar loops that are more
static (executing on elements with even indicies, for example, making
the execute_condition static as well). All these loops are likely
useful outside of the tuple domain but I haven't put any thought into
that.

Is any of this stuff useful for Boost? I think spanning_iterator
is a useful thing and a (relatively) conveneient way to iterate
over a tuple is nice to have. What are your thoughts?

                          -Dave

-- 
"Some little people have music in them, but Fats, he was all music,
  and you know how big he was."  --  James P. Johnson

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk