Boost logo

Boost :

Subject: Re: [boost] Interest in B-tree library for Boost?
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2010-09-23 11:53:13


Beman Dawes wrote:
> On Mon, Sep 20, 2010 at 7:48 PM, Phil Endecott
> <spam_from_boost_dev_at_[hidden]> wrote:
>> Does anyone have any thoughts about how atomic inserts can be implemented on
>> top of this? ?I'm not too concerned about recovering from random disk
>> corruption, but I'd like to survive in the case where the program dies (e.g.
>> another thread segfaults!) while inserting a new record. ?Or, is the
>> implementation "not quite atomic but safe enough" ?
>
> The btree classes have a flush() function that forces writing out of
> internal buffers. That is intended to help provide "not quite atomic
> but safe enough".
>
> Beyond that, I'd like to hear ideas. Is some form of journaling the
> preferred approach?

> How do SQLlite or other libraries provide safety against another
> thread crashing? Do they actually stop the other threads until the
> b-tree modification completes, have some form of recovery, or some
> other approach? Do they have protection against another process
> corrupting the B-tree?

There are a couple of pages here that describe how SQLite does it:

     http://www.sqlite.org/atomiccommit.html
     http://www.sqlite.org/wal.html

I think those are worth reading. It's also possible to find docs about
the MVCC system that PostgreSQL uses, but that seems to have a lot of
functionality for multiple concurrent transactions that isn't needed
here. I've tried to find something about how Berkely DB does it but
without success; does anyone know anything about that?

The important question for now is whether this sort of thing can be
implemented as a layer on top of your library, or whether something
more intrusive is needed. I think that the SQLite journals work in
terms of disk blocks, which are not units that are exposed in the
application API. Perhaps what is needed is the ability to insert a
layer between your code and the filesystem; SQLite has a "VFS" that I
think may do something like this.

>> Harder is the implementation of range storage (1D or e.g.
>> bounding rectangles); Beman, you mentioned that someone had implemented an
>> R-tree on top of your previous B-tree; can you say something about how that
>> is done?
>
> It has been 10 or 15 years. I can remember the general principles,
> which are that keys represent minimum bounding rectangles (MBR's), and
> that as the search down the tree proceeds you may have visit multiple
> child nodes if the shape you are looking for spans several MBR's. Very
> similar to an N-tree or Z-tree. But beyond that I'd have to dredge
> around in my memory to come up with more details.

As it happens I also found some R-tree material on the SQLite web site:

     http://www.sqlite.org/rtree.html

It looks, though I am not certain, as if they have implemented R*-trees
on top of their regular tables by creating a "virtual tree", i.e. there
are database rows that define parent-child relationships. I think this
reduces the performance from O(log N) to O((log N)^2), but I haven't
worked out the details.

Regards, Phil.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk