Subject: Re: [boost] Boost library submission (poll for interest)
From: Stefan Strasser (strasser_at_[hidden])
Date: 2010-01-05 10:54:52


On Monday 04 January 2010 23:56:32, Bob Walters wrote:
> I have a library I would like to submit for inclusion in Boost. This
> message is just soliciting for interest per the submission process.
> If interest is expressed, I'll carry on with a preliminary submission.
>

still very interesting ;-)
however, after having a look at your documentation I'm confused again about
the discussion we had off-list.

I thought the reason you have to load the entire dataset on startup and
save it in full at checkpoints and recoveries is that the user needs to
access the mapped region directly through pointers, since the user types
exist there in unserialized form. is that right?

what confuses me is the "Updating Entries" section and the fact that you do
track changes made by the user, e.g. via trans_map::update.

so if the library is aware of every change to the mapped region, what
stops you from employing a shadow paging technique and only writing to the
mapped region on transaction commit, when the modified pages are logged?

that would make your library MUCH more widely usable.

or is my assumption that the library is aware of every change incorrect?
because on the other hand, dereferencing trans_map::iterator returns a
non-const reference to value_type, indicating the opposite.
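
To make the concern concrete, here is a minimal sketch of a container that
records changes made through an update() call but cannot see writes made
through a mutable reference obtained from an iterator. TrackedMap and all of
its members are hypothetical stand-ins built on std::map; this is not the
trans_map API, only an illustration of why a non-const reference suggests
that not every change can be tracked.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for a change-tracking map (NOT the trans_map API).
class TrackedMap {
public:
    using Map = std::map<std::string, int>;

    void insert(const std::string& key, int value) {
        data_[key] = value;
        dirty_.insert(key);  // the container sees this change
    }

    // Tracked path: the container is told which entry changed.
    bool update(const std::string& key, int value) {
        Map::iterator it = data_.find(key);
        if (it == data_.end()) return false;
        it->second = value;
        dirty_.insert(key);
        return true;
    }

    // Untracked path: dereferencing the iterator yields a non-const
    // reference to value_type, so the caller can write to the mapped
    // value without the container noticing.
    Map::iterator find(const std::string& key) { return data_.find(key); }

    bool is_dirty(const std::string& key) const {
        return dirty_.count(key) != 0;
    }

    // Pretend all recorded changes were written out.
    void checkpoint() { dirty_.clear(); }

private:
    Map data_;
    std::set<std::string> dirty_;  // keys changed since the last checkpoint
};
```

After a checkpoint, a write such as `m.find("a")->second = 5;` leaves the
entry unmarked, whereas `m.update("a", 5)` marks it dirty.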

http://en.wikipedia.org/wiki/Shadow_paging
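
A rough sketch of what I mean by shadow paging, assuming the library can
intercept every write: a transaction copies a page the first time it modifies
it, works on the private copy, and only touches the shared region on commit.
PagedStore, Transaction and the 4096-byte page size are hypothetical names and
values chosen for illustration; a real implementation would also write the
modified pages to a log before installing them.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// All names here are hypothetical; this is not the proposed library's design.
constexpr std::size_t kPageSize = 4096;
using Page = std::array<std::uint8_t, kPageSize>;

// Stands in for the mapped region: a vector of zero-initialized pages.
class PagedStore {
public:
    explicit PagedStore(std::size_t pages) : committed_(pages) {}

    const Page& page(std::size_t n) const { return committed_.at(n); }
    void replace(std::size_t n, const Page& p) { committed_.at(n) = p; }

    std::uint8_t read(std::size_t offset) const {
        return page(offset / kPageSize)[offset % kPageSize];
    }

private:
    std::vector<Page> committed_;
};

class Transaction {
public:
    explicit Transaction(PagedStore& store) : store_(store) {}

    // Reads see the transaction's own shadow copy if one exists.
    std::uint8_t read(std::size_t offset) const {
        const std::size_t n = offset / kPageSize;
        std::map<std::size_t, Page>::const_iterator it = shadow_.find(n);
        const Page& p = (it != shadow_.end()) ? it->second : store_.page(n);
        return p[offset % kPageSize];
    }

    // Copy-on-write: the first write to a page copies it into the shadow set.
    void write(std::size_t offset, std::uint8_t value) {
        const std::size_t n = offset / kPageSize;
        std::map<std::size_t, Page>::iterator it = shadow_.find(n);
        if (it == shadow_.end()) {
            it = shadow_.emplace(n, store_.page(n)).first;
        }
        it->second[offset % kPageSize] = value;
    }

    // Only now is the shared region modified. A real system would log the
    // shadow pages durably before this loop.
    void commit() {
        for (const auto& kv : shadow_) store_.replace(kv.first, kv.second);
        shadow_.clear();
    }

    // Rolling back just discards the private copies.
    void rollback() { shadow_.clear(); }

    std::size_t dirty_pages() const { return shadow_.size(); }

private:
    PagedStore& store_;
    std::map<std::size_t, Page> shadow_;  // page number -> private copy
};
```

With this scheme only the pages a transaction actually touched need to be
written at commit, so the whole dataset would not have to be saved at every
checkpoint.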

> The library is an embedded database, to the tune of products like the
> embedded version of InnoDB, BerkeleyDB, etc. In this case, the

if copying the entire dataset is required after all, I'd like to
suggest changing the name of the library to avoid confusion.

BerkeleyDB's implementation of an STL container interface is called DbSTL,
and I think there are other implementations with similar names out there,
which generally refer to "database access with an STL interface". but IIUC
that is exactly the use case your library couldn't support: use as an
embedded database (with a large dataset).
something like InterprocessContainers comes to mind.

http://www.oracle.com/technology/documentation/berkeley-db/db/programmer_reference/stl.html

> Currently, I've only implemented map<> (probably the most useful
> container type in a database),

what kind of tree does trans_map use internally?

> but the plan is to continue to add
> containers as requested/needed until ACID compliant versions of a
> decent portion of the STL are available.

I can't wrap my head around the locking or MVCC used for isolation.
could you explain this in a little more detail? does the library
automatically record/lock accesses? is the user supposed to lock manually on
every access? information about transaction-local changes is saved inside a
map entry, right? how are read accesses over an entire range recorded, e.g.
when using map::equal_range()?
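
To illustrate the range question: if read accesses are recorded only per
existing map entry, a read over a range has nowhere to be stored for keys that
are not present yet. One conceivable scheme, sketched below, keeps a per-
transaction set of key intervals and checks writes by other transactions
against it. ReadSet, KeyRange and their members are hypothetical; I don't know
whether the library does anything like this.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical types for illustration only.
struct KeyRange {
    std::string lo;  // inclusive lower bound
    std::string hi;  // inclusive upper bound
};

// Keys and key ranges a transaction has read so far.
class ReadSet {
public:
    // A lookup of a single key, e.g. equal_range(key) on a map, is recorded
    // as the interval [key, key], even if no entry with that key exists.
    void record_point(const std::string& key) {
        ranges_.push_back({key, key});
    }

    // An iteration from lo to hi is recorded as the interval [lo, hi].
    void record_range(const std::string& lo, const std::string& hi) {
        ranges_.push_back({lo, hi});
    }

    // True if another transaction's insert, update or erase of `key` falls
    // inside any interval this transaction has read.
    bool conflicts_with_write(const std::string& key) const {
        for (const KeyRange& r : ranges_) {
            if (!(key < r.lo) && !(r.hi < key)) return true;
        }
        return false;
    }

private:
    std::vector<KeyRange> ranges_;
};
```

After `record_range("b", "d")`, an insert of a new key "c" by another
transaction is detected as a conflict although no entry "c" existed at the
time of the read.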

thanks,

