Boost logo

Boost :

Subject: Re: [boost] [stl_ext_adv] ] Interest in advanced data structures
From: Nathan Ridge (zeratul976_at_[hidden])
Date: 2011-12-03 18:43:23


> > > > > * Question #1: why augmenting on B+ trees? The augmenting idea
> > > > > can be as easily built on top of RB-trees, which are
> > > > > the weapon of choice for implementing associative containers.
> > > > > Unless proved otherwise, RB-trees are assumed to be faster than B+
> > > > > trees for in-memory structures (which your own tests hint at, cf.
> > > > > "insert N random elements" table.)
> > > >
> > > > Could a B+ tree library be a basis for implementing disk-backed B+
> > > > trees? I think that would be useful. In a clever implementation
> > > > the memory-backed and disk-backed B+ tree might even be the same
> > > > class, with the difference factored out into a policy template
> > > > parameter.
> > > >
> > > > Regards,
> > > > Nate
> > > >
> > > > I understood this comment is about adding persistence or serialization
> > > support to the proposed B+ trees. Please correct if this is not the case.
> > > There are two levels of complexity in this area.
> > > A trivial one is just saving and reading data as a list, since these B+
> > > trees are build on top of doubly linked lists. This might be sufficient
> > for
> > > many practical applications, but certainly not for all.
> > > Hence there is the second and less trivial problem of saving and reading
> > > operations which preserve topology of these B+ trees that are designed to
> > > work in RAM. (Note: "topology" has been used here as a state of
> > > connectivity of nodes with data stored in the nodes).
> > > In principle, the problem can be solved for the submitted B+ trees, but I
> > > will need some Boost specific requirements to implement this feature.
> >
> > What I mean by a disk-backed B+ tree is a B+ tree where only the nodes that
> > are needed at a given time are loaded into memory from disk - the entire
> > tree on disk might be too large to even fit into memory.
> >
> > Thank you for this clarification.
> This is certainly the third level of complexity, this is why below only
> ideas about possible solutions.
> Assuming that problem #2 has been already solved the reading operation can
> be implemented through reading a sub-tree from the highest internal node
> which represents the root node of the sub-tree.
> Direct writing back the same sub-tree, which has been modified while
> working in RAM, is not allowed, since it can destroy invariants of the
> whole structure saved on disk. It seems to me the most reasonable option is
> to implement writing as efficient splice operations (in theory they have
> logarithmic cost), which are applied to the structure stored on disk. I
> know the efficient splice operations are possible for B+ trees in RAM, but
> I am not sure if these algorithms are also suitable for disk systems.
>
> Hope this quick consideration makes some useful sense.

How is it done by relational database indexes? Aren't those commonly
implemented as disk-backed B+ trees?

Regards,
Nate
                                               


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk