Subject: Re: [boost] NuDB: A fast key/value insert-only database for SSD drives in C++11
From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2017-03-25 20:22:50
On Sat, Mar 25, 2017 at 4:01 PM, Lee Clagett via Boost wrote:
> The other responses to this thread reiterated what I thought could
> occur - there should be corruption "races" from a write call to file
> sync completion.
NuDB makes the same assumptions regarding the underlying file system
capabilities as SQLite. In particular, if there are two calls to fsync
in a row, it assumes that the first fsync will complete before the
second one starts. And that upon return from a successful call to
fsync, the data has been written.
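A minimal sketch of that assumption, using plain POSIX calls. The helpers `write_durably` and `write_in_order` are hypothetical names for illustration and are not part of NuDB. The second write is only issued after the first fsync has returned successfully.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <fcntl.h>    // open(), for callers of these helpers
#include <unistd.h>   // write(), fsync()

// Hypothetical helper: write all bytes, then block until fsync reports success.
// Returns true only if every write and the fsync succeeded.
bool write_durably(int fd, const void* data, std::size_t size)
{
    assert(fd >= 0);
    const char* p = static_cast<const char*>(data);
    while (size > 0)
    {
        ssize_t n = ::write(fd, p, size);
        if (n < 0)
        {
            if (errno == EINTR)
                continue;       // interrupted before any byte was written; retry
            return false;
        }
        p += n;                 // short writes are allowed; advance and loop
        size -= static_cast<std::size_t>(n);
    }
    // Assumption stated above: on success, the data has reached the device.
    return ::fsync(fd) == 0;
}

// Two dependent steps: the second starts only after the first fsync returned.
bool write_in_order(int fd, const std::string& first, const std::string& second)
{
    return write_durably(fd, first.data(), first.size())
        && write_durably(fd, second.data(), second.size());
}
```

Whether the device really honors fsync is outside the program's control; the sketch only shows where the assumption is relied upon.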
When there is a power loss or device failure, it is possible that
recent insertions are lost. The library only guarantees that there
will be no corruption. Specifically, any insertions that happen after
the most recent commit might be rolled back if the recover process is invoked.
Since the commit process runs every second, not much will be lost.
> Writing the blocks to the log file is superfluous because it is
> writing to multiple sectors and there is no mechanism to detect a
> partial write after power failure.
Hmm, I don't think there's anything superfluous in this library. The
log file is a "rollback file." It contains blocks from the key file in
the state they were in before being modified. During the commit phase,
nothing in the key file is modified until all of the blocks intended
to be modified are first backed up to the log file. If the power goes
out while these blocks are being written to the log file, there is no
loss, because the key file has not been touched yet.
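The ordering described above can be sketched with an in-memory model. This is an illustration under stated assumptions, not NuDB's actual code or file format: `Store`, `Update`, `backup`, `apply`, `finish`, and `recover` are all hypothetical names, and the `limit` parameter simulates power loss partway through the key file writes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the key file and the rollback (log) file.
struct Store
{
    std::vector<std::string> key_file;                     // one string per block
    std::vector<std::pair<std::size_t, std::string>> log;  // (block index, old contents)
};

using Update = std::pair<std::size_t, std::string>;        // (block index, new contents)

// Phase 1: save the current contents of every block that is about to change.
void backup(Store& s, const std::vector<Update>& updates)
{
    for (const auto& u : updates)
    {
        assert(u.first < s.key_file.size());
        s.log.emplace_back(u.first, s.key_file[u.first]);
    }
    // A real implementation would sync the log file before continuing.
}

// Phase 2: modify the key file. At most `limit` blocks are written,
// which simulates losing power in the middle of this phase.
void apply(Store& s, const std::vector<Update>& updates, std::size_t limit)
{
    for (std::size_t i = 0; i < updates.size() && i < limit; ++i)
        s.key_file[updates[i].first] = updates[i].second;
}

// Phase 3: once the key file is durable, the backups are no longer needed.
void finish(Store& s)
{
    s.log.clear();
}

// Recovery: put back every saved block, then discard the log.
// With an empty log this does nothing.
void recover(Store& s)
{
    for (auto it = s.log.rbegin(); it != s.log.rend(); ++it)
        s.key_file[it->first] = it->second;
    s.log.clear();
}
```

In this model, an interruption during `backup` leaves the key file untouched, and an interruption during `apply` is undone by `recover`, which restores the blocks to their pre-commit state.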
> I jumped into the internal fetch function which was sorted within a
> single bucket and had a linked list of spills. Reading the README first
> would've made it clear that there was more to the implementation.
The documentation for NuDB needs work! I can only vouch for the
maturity of the source code, not the documentation :)
> So the worst-case performance is a linked list if there is a hash collision.
Right, although the creation parameters are tuned such that fewer than
1% of buckets have 1 spill record, and 0% of buckets have 2 or more.
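To illustrate why spills determine the worst case, here is a small sketch of a bucket that is sorted by hash and chains to a spill when full. The types and the tiny `capacity` are assumptions made for the example; this is not NuDB's on-disk layout or its fetch implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Entry
{
    std::uint64_t hash;
    std::string value;
};

struct Bucket
{
    std::vector<Entry> entries;     // kept sorted by hash
    std::unique_ptr<Bucket> spill;  // next link in the chain, or null
};

const std::size_t capacity = 2;     // hypothetical; kept tiny to force a spill

// Orders an entry against a bare hash value, for std::lower_bound.
struct HashLess
{
    bool operator()(const Entry& e, std::uint64_t h) const { return e.hash < h; }
};

// Insert into the first link that has room, appending a spill link if needed.
void insert(Bucket& b, Entry e)
{
    Bucket* cur = &b;
    while (cur->entries.size() >= capacity)
    {
        if (!cur->spill)
            cur->spill.reset(new Bucket);   // C++11 has no std::make_unique
        cur = cur->spill.get();
    }
    auto pos = std::lower_bound(cur->entries.begin(), cur->entries.end(),
                                e.hash, HashLess());
    cur->entries.insert(pos, std::move(e));
}

// Binary search within each link, walking the spill chain linearly.
// If `visited` is non-null it receives the number of links examined.
const Entry* fetch(const Bucket& b, std::uint64_t hash,
                   std::size_t* visited = nullptr)
{
    std::size_t n = 0;
    const Entry* found = nullptr;
    for (const Bucket* cur = &b; cur && !found; cur = cur->spill.get())
    {
        ++n;
        auto pos = std::lower_bound(cur->entries.begin(), cur->entries.end(),
                                    hash, HashLess());
        if (pos != cur->entries.end() && pos->hash == hash)
            found = &*pos;
    }
    if (visited)
        *visited = n;
    return found;
}
```

A lookup costs one binary search per link visited, so the cost grows with the length of the spill chain; keeping spills rare keeps the typical fetch at a single link.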
> Was the primary decision for the default hash implementation performance?
If you're talking about xxhasher, it was chosen for being the best
balance of performance, good distribution properties, and decent
security. NuDB was designed to handle adversarial inputs since most
envisioned use-cases insert data from untrusted sources / the network.