Boost logo

Boost :

From: David B. Held (dheld_at_[hidden])
Date: 2005-02-27 22:52:26


Justin Gottschlich wrote:
> "David B. Held" <dheld_at_[hidden]> wrote in message
> news:cvqlqp$fnk$1_at_sea.gmane.org...
> [...]
>>Second, trees are complicated because there are two sets of
>>iterators that need to be addressed. First is the iterator
>>over objects in the root, and second is the iterator over
>>children. However, they are two separate kinds of iterators.
>
> Well, I think this depends on how the tree's structure is formed.
> I think it's possible to achieve the tree building with a single
> iterator type and then add additional iterators as needed. The
> reason I say this is that if you consider the initial tree object
> as the "root node" (owning one of the contained objects of the
> tree), you should only need a single iterator type since by
> definition a tree can only have one root node. Since you wouldn't
> need to iterator over the objects contained within the root level
> as by definition, they'd be children of the root node. At least,
> this is how I handled this problem in my tree container.
> [...]

It sounds to me like you are not considering the general case
where a given node can have K keys and N children. Now typically,
K = 1, which is why I made it the default in the sample code. That
is what you are talking about. But for a Btree, you often have
K = N - 1. That is, each node contains K "objects" or "keys",
and N children. Yes, you could provide a single iterator type
that iterates over keys both *within* a node and *among* nodes,
but it would be a fat iterator (at least algorithmically, if not
structurally). I think a better decomposition is to recognize that
for trees with large aggregate nodes, the fundamental operations
are to access the keys within the node, and access the children
separately.

To illustrate, let me offer an exaggeration. Suppose we build a
Btree for use in a disk-based container. We could build it so that
K = 64, N = 4. That is, each node contains 64 "keys", but only
4 children. There are numerous ways to distribute keys and children
in such a configuration, and there is no a priori reason for an
implementation to exclude the possibility of any given distribution.
So such a tree would have a narrow branching factor, but fat nodes.
That might better correspond to the size of disk sectors while
not wasting too much space on a wide branching factor.

While it may seem wasteful to treat the key set for a node as an
array, I suspect that for the degenerate cases where K = 1, the
array access key_[0] will get optimized at compile time (and we
could certainly provide convenience functions that aid in this
optimization).

Dave


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk