Boost logo

Boost :

From: Dan Rosen (dan.rosen_at_[hidden])
Date: 2005-03-27 22:25:52


Hi all,

Over the past few weeks I've been slowly working on (what I hope to
be) a candidate for a boost::tree. I've reviewed the comments on
previous submissions by Kasper Peeters
(http://www.damtp.cam.ac.uk/user/kp229/tree) and Justin Gottschlich
(http://nodeka.com/TreePart1.pdf), and I'm trying to put together a
set of concepts to address some of the issues that were discussed
here.

Of the two "how to design a generic tree container" camps, I'm finding
myself mostly in the one where individual nodes are not exposed to
users, and the tree is an opaque structure. Also, nodes are not
identical to iterators, and the iterator concept is not extended in
any significant way from the standard. (Anybody who would like more
detail, I'm happy to provide my draft of the concepts, but I don't
want to get into that detail here -- it's not particularly relevant to
my questions that follow.)

I feel like I've been making decent progress so far, mostly working on
paper, just documenting the concepts, and doing some coding exercises
for proofs of concept. But I don't have anything I'd consider ready
for primetime yet -- not even remotely -- and in fact as I progress
more solidly into the coding phase, I'm finding myself getting
somewhat mired in the details. So I thought I'd solicit opinions from
experienced Boosters here on some of the issues I'm trying to work
through.

I'm primarily finding myself conflicted about what degree of
flexibility the concepts and their concrete implementations need to
offer. I thought it would be useful to allow clients of tree<> to be
able to specify the container type for a node's children. In the code
I'm currently playing with, I have a couple "selector" structs,
basically identical to Boost Graph Library's vecS, listS, etc, and the
tree<> class template takes one of these selectors as a template
parameter. That much is all well and good, but I quickly run into
trouble once I start dealing with nodes and iterators.

First of all, I'd like to provide access to the children of a given
node as a sequence of the type specified by the client. So if you have
an iterator to an element in the tree (the structure of the tree being
otherwise opaque), I want to be able to say something like:

  tree<foo, list_selector> t;
  tree<foo, list_selector>::some_kind_of_iterator i;
  // assume t contains something useful and i points to an element in t
  std::list<foo> children = t.children(i);

At this point, I've got a problem: internal to the tree<>, the lists
representing child sequences contain not instances of type "foo" but
instances of something like node<foo>. So I can't simply provide
access like this unless I can guarantee that my nodes are convertible
in all contexts to their stored data. That seems like a rat hole.

Another thing I'm stuck on is, even if I don't expose the container
structure like this, I still have a problem with iterators. The above
example omits the actual iterator type, but to get into some
specifics, one type of iterator I'd like to provide is a pre-order
traversal iterator. In my current code, this iterator is basically
just a wrapper for iterators into the flat sequences I've been
instructed to use, and some minimal code that knows how to jump
between these sequences appropriately (for example: if a node has
children, operator++ yields the first child of that node). It's easy
to get an iterator into the sequence containing a node's children
(internally, it's just something like node.children.begin()), but it's
not so easy to get an iterator into the sequence containing a node's
parent, unless a node contains not a pointer to its parent but an
iterator instead. And that scheme breaks down for children of the root
node, which is not in any sequence, as it has no siblings. I'm sure
there are other options for how to implement the node structure and
iterator relationships, but none jump out at me.

In short, although I'd like my concepts for trees to allow users to
specify and access the sequences representing the children of a node,
my implementation of this is proving very clumsy. So I'm wondering two
things:

1. Is this a good idea in the first place? Are there really any
compelling use cases, where you'd want one tree to store its node
structure in vectors and another in lists? Or is this a case where a
policy-based approach is misplaced effort?

2. If this actually is a good idea (which I'm starting to get the
feeling is not the case), my experience so far is that there's some
significant overhead -- both in terms of the size of the structures,
and in terms of speed (extra pointer traversals when doing anything
with iterators). Does anybody have any brilliant ideas about how to
minimize this overhead?

If these questions are too vague, or lacking in context, I'm more than
willing to provide the rough draft I've got of the structural
concepts, and the very incomplete code (such as it is).

Thanks in advance for your expert advice!
dr


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk