Boost logo

Boost :

From: Joao Abecasis (jpabecasis_at_[hidden])
Date: 2005-03-28 08:15:39


Hi Dan!

Even though I'm not an "experienced booster" nor do I consider this to
be "expert advice"... I've spent some time playing with designs and
implementations for a generic tree framework and hope you'll find my
comments useful.

Dan Rosen wrote:
> Of the two "how to design a generic tree container" camps, I'm finding
> myself mostly in the one where individual nodes are not exposed to
> users, and the tree is an opaque structure.

In my design nodes were made an implementation detail but I also
attempted to define some node concepts to allow advanced users to plug
their own node types into pre-defined trees.

All pre-defined trees would take a TreeNode parameter which could
(should!) be ignored for most use cases.

> Also, nodes are not
> identical to iterators, and the iterator concept is not extended in
> any significant way from the standard. (Anybody who would like more
> detail, I'm happy to provide my draft of the concepts, but I don't
> want to get into that detail here -- it's not particularly relevant to
> my questions that follow.)

I'd be interested in looking at the details, however draft they are
(Feel free to mail them privately).

> I feel like I've been making decent progress so far, mostly working on
> paper, just documenting the concepts, and doing some coding exercises
> for proofs of concept.

This is where I am ;-)

> But I don't have anything I'd consider ready
> for primetime yet -- not even remotely -- and in fact as I progress
> more solidly into the coding phase, I'm finding myself getting
> somewhat mired in the details. So I thought I'd solicit opinions from
> experienced Boosters here on some of the issues I'm trying to work
> through.
>
> I'm primarily finding myself conflicted about what degree of
> flexibility the concepts and their concrete implementations need to
> offer. I thought it would be useful to allow clients of tree<> to be
> able to specify the container type for a node's children. In the code
> I'm currently playing with, I have a couple "selector" structs,
> basically identical to Boost Graph Library's vecS, listS, etc, and the
> tree<> class template takes one of these selectors as a template
> parameter. That much is all well and good, but I quickly run into
> trouble once I start dealing with nodes and iterators.

I'm not sure if you're referring to the standard containers here. I
initially considered using those, after considering allocator issues and
the (potentially) large per-node overhead I almost completely dropped
the idea.

Right now, I'm pondering on a node concept based on boost::range, plus
some free functions (insert, update, erase...) that would allow greater
flexibility and extensibility, with no overhead.

> First of all, I'd like to provide access to the children of a given
> node as a sequence of the type specified by the client. So if you have
> an iterator to an element in the tree (the structure of the tree being
> otherwise opaque), I want to be able to say something like:
>
> tree<foo, list_selector> t;
> tree<foo, list_selector>::some_kind_of_iterator i;
> // assume t contains something useful and i points to an element in t
> std::list<foo> children = t.children(i);

Personally, I prefer a tree-iterator concept that allows traversal of
the tree without needing to carry the tree around. Having children as a
member function of the tree prevents this.

> At this point, I've got a problem: internal to the tree<>, the lists
> representing child sequences contain not instances of type "foo" but
> instances of something like node<foo>. So I can't simply provide
> access like this unless I can guarantee that my nodes are convertible
> in all contexts to their stored data. That seems like a rat hole.

That's one of the reasons I decided to define a TreeIterator concept
that overlaps with the standard Iterator concept when it comes to
sibling traversal.

> Another thing I'm stuck on is, even if I don't expose the container
> structure like this, I still have a problem with iterators. The above
> example omits the actual iterator type, but to get into some
> specifics, one type of iterator I'd like to provide is a pre-order
> traversal iterator.

On the thread Justin Gottschlich started on this issue
(http://tinyurl.com/44e3u), I suggested pre-order and other
linearizations of a tree could be provided externally. I think putting
these on the tree unnecessarily complicates the design of the tree and
the iterators.

> In my current code, this iterator is basically
> just a wrapper for iterators into the flat sequences I've been
> instructed to use, and some minimal code that knows how to jump
> between these sequences appropriately (for example: if a node has
> children, operator++ yields the first child of that node). It's easy
> to get an iterator into the sequence containing a node's children
> (internally, it's just something like node.children.begin()), but it's
> not so easy to get an iterator into the sequence containing a node's
> parent, unless a node contains not a pointer to its parent but an
> iterator instead. And that scheme breaks down for children of the root
> node, which is not in any sequence, as it has no siblings. I'm sure
> there are other options for how to implement the node structure and
> iterator relationships, but none jump out at me.

I solved parent/root issue by having the tree class promise only the
most basic TreeIterator: for traversal from parent to child. I think
this is the basic form of tree traversal all tree-iterators must provide.

 From the initial root iterator one could get SiblingIterators for
iteration over its children.

Depending on the underlying node, both iterators could actually be the
same. This could be inspected through separate traits classes.

> In short, although I'd like my concepts for trees to allow users to
> specify and access the sequences representing the children of a node,
> my implementation of this is proving very clumsy. So I'm wondering two
> things:
>
> 1. Is this a good idea in the first place? Are there really any
> compelling use cases, where you'd want one tree to store its node
> structure in vectors and another in lists? Or is this a case where a
> policy-based approach is misplaced effort?

In my design I planned to use a fixed size array for n-ary trees and a
single or double-linked list for general trees.

> 2. If this actually is a good idea (which I'm starting to get the
> feeling is not the case), my experience so far is that there's some
> significant overhead -- both in terms of the size of the structures,
> and in terms of speed (extra pointer traversals when doing anything
> with iterators). Does anybody have any brilliant ideas about how to
> minimize this overhead?

Right! My idea was to write the basic single and double-linked lists
operations from scratch. We don't be need the whole std::list interface
for the nodes anyway.

João


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk