Boost logo

Boost :

Subject: Re: [boost] Boost library submission (stldb - poll for interest)
From: Bob Walters (bob.s.walters_at_[hidden])
Date: 2010-01-08 10:46:11


On Wed, 6 Jan 2010 09:10:57 +0100, Stefan Strasser
<strasser_at_[hidden]> wrote:

>> ....  A map implementation
>> which internally segmented itself into sections under independent mutexes
>> in order to permit highly concurrent modifications of any kind would be a
>> big help. Anyone know of any such implementations?
>
> any implementation based on shadowpaging, e.g. this open source one:
> http://1978th.net/tokyocabinet/
>
> I don't know any internals of Berkeley DB, but it also supports concurrent
> modifications.
>
> the containers I describe here:
> https://svn.boost.org/svn/boost/sandbox/persistent/libs/persistent/doc/html/persistent/advanced.html#persistent.advanced.containers
>
> also support concurrent modifications, since their container nodes are
> implemented as MVCControlled objects.
> but they are much less efficient in absolute terms than a low-level container
> implementation.

There's a problem with this approach to concurrency. I don't like the
unavoidable chance
of exceptions forcing retry.

The company I work for had a problem like this with BerkeleyDB. We
had a map which was typically
very small, but subject to a constant load of insert/update/delete
activity. Under
such a load, the prospect of page-level contention increases, and in
BerkeleyDB's case
we got a lot of 'deadlock' exceptions, which could recurr 5+ times
despite retries.
The rety-based approach can be a death spiral, in that once contention occurs,
the fact that threads are then having to try 2+ times to complete
their transactions only
increases the load against the containers, thereby also increasing the
probability of contention.
I think I can digress to the arguments which led to RDBMS'es doing blocking.

So I probably have a bias here, but I don't like the notion of the
container failing thread A
simply because thread B inserted a record which causes some
rebalancing which touched
the node pointers on the node that A wants to modify. The container should be
predictable so that end users can optimize according to its behavior.

The STLdb map does not have this problem. Inserts with different key values
will never cause a contention exception. Even when two processes do modify the
same entry, blocking is possible, as an alternative to retry logic,
because that can
be more efficient overall.

All of this is reinforcing my hope for trans_map<key,locator> - you
address the chief
persistence related problems I would have if I just put 10 gigs of
values into the STLdb database,
and I think I can keep these container implementations fast and
predictable, and meet your
desire for low-level containers.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk