Subject: Re: [boost] NuDB: A fast key/value insert-only database for SSD drives in C++11
From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2017-03-21 21:13:14


On Tue, Mar 21, 2017 at 5:01 PM, Peter Dimov via Boost
<boost_at_[hidden]> wrote:
> The obvious question one might have is, in what scenarios is a database that
> only supports insertions, but not updates or deletions, useful?

So this actually comes up quite a bit in decentralized systems, where
the cryptographic digest (or hash) of a binary object is used as the
key to identify the object. It's called "content-addressable storage":
https://en.wikipedia.org/wiki/Content-addressable_storage
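
As a rough illustration of the idea (this is not NuDB's API), here is a
minimal in-memory sketch of an insert-only, content-addressed store. The
names digest and content_store are hypothetical, and 64-bit FNV-1a is used
only as a stand-in for a real cryptographic digest such as SHA-256:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Stand-in digest: 64-bit FNV-1a. This is NOT cryptographic; a real
// system would use a cryptographic hash such as SHA-256 here.
inline std::uint64_t digest(const std::string& data)
{
    std::uint64_t h = 14695981039346656037ULL; // FNV-1a offset basis
    for (unsigned char c : data)
    {
        h ^= c;
        h *= 1099511628211ULL;                 // FNV-1a prime
    }
    return h;
}

// Hypothetical insert-only store: the key is derived from the content,
// so there is nothing to update and no reason to delete.
class content_store
{
    std::unordered_map<std::uint64_t, std::string> objects_;

public:
    // Stores the object under its digest and returns that key.
    // Inserting identical content a second time changes nothing.
    std::uint64_t insert(const std::string& data)
    {
        auto const key = digest(data);
        auto const result = objects_.emplace(key, data);
        // This sketch assumes no digest collisions: same key, same content.
        assert(result.first->second == data);
        return key;
    }

    // Returns a pointer to the stored object, or nullptr if absent.
    const std::string* fetch(std::uint64_t key) const
    {
        auto const it = objects_.find(key);
        return it == objects_.end() ? nullptr : &it->second;
    }

    std::size_t size() const { return objects_.size(); }
};
```

Because the key is a function of the value, a given key can only ever map
to one value, which is why insert-only semantics are sufficient here.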

In digital currencies such as Bitcoin and Ripple these values
represent immutable information such as transactions that have taken
place in the past. It also comes up in distributed storage systems
such as file sharding, to represent pieces of files.

There's no question that this library is quite specialized and
applicable only in niche use-cases. But for those cases, it works
amazingly well.

> A follow-up to that is, is it not possible to add update/delete support to
> NuDB by just appending the new data and making the key file point to the new
> location (using zero-sized data for deletions)?

That is certainly possible, but space for the unused objects created
as a result of updates and deletes could not be reclaimed in
real-time. Administrators would have to use an offline process to take
a database and re-create it to exclude those objects. This could take
quite a long time for databases in the tens of terabytes, on the order
of days or weeks.
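
To make the trade-off concrete, here is a small in-memory sketch of the
append-and-repoint scheme described in the question. It is purely
illustrative: append_log and its members are hypothetical names, and the
two containers only loosely stand in for a data file and a key file. They
do not reflect NuDB's actual file format.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical append-only log with a key index.
class append_log
{
    // "Data file": every record ever written, in order.
    std::vector<std::pair<std::string, std::string>> data_;
    // "Key file": key -> position of the newest record for that key.
    std::unordered_map<std::string, std::size_t> index_;

    void append(const std::string& key, const std::string& value)
    {
        data_.emplace_back(key, value);
        index_[key] = data_.size() - 1; // older record becomes dead space
    }

public:
    // Insert or update: append the new value and repoint the key.
    void put(const std::string& key, const std::string& value)
    {
        assert(!value.empty()); // zero size is reserved for deletions
        append(key, value);
    }

    // Delete: append a zero-sized value as a tombstone.
    void erase(const std::string& key) { append(key, std::string()); }

    // Returns nullptr if the key is missing or has been deleted.
    const std::string* get(const std::string& key) const
    {
        auto const it = index_.find(key);
        if (it == index_.end())
            return nullptr;
        const std::string& v = data_[it->second].second;
        return v.empty() ? nullptr : &v;
    }

    // Number of records occupying space, live or dead.
    std::size_t records() const { return data_.size(); }

    // "Offline" pass: copy only the live records into a fresh log.
    append_log compact() const
    {
        append_log fresh;
        for (auto const& entry : index_)
        {
            auto const& rec = data_[entry.second];
            if (!rec.second.empty())
                fresh.put(rec.first, rec.second);
        }
        return fresh;
    }
};
```

In this sketch the dead records are only dropped by compact(), which has
to walk the whole index and rewrite every live record. That full rewrite
is the step that becomes expensive for very large databases.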

It is also not a use-case required for the applications that I
developed NuDB for, so I haven't done it. But it could be explored.

> Last, the README says "Value sizes from 1 to 2^32 bytes (4GB)", but the file
> format says uint48_t. Which is it? 2^32 is perhaps a bit limiting nowadays.

Probably the documentation is a touch out of date. The database
imposes a limit of 2^32 bytes for individual values. I think that's
big enough. NuDB is designed for databases that have an enormous number
of keys, not a small number of keys with enormous values. If an
application needs a key/value store where typical values are in excess
of 4GB, it doesn't need support for billions of keys (since a billion
of those values would exceed all known storage limits).
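
For a sense of scale, the arithmetic behind that last remark can be
checked with a few lines of C++. The names below are hypothetical and
chosen only for this example: one billion values of 2^32 bytes each come
to 4,294,967,296,000,000,000 bytes, or roughly 3,800 PiB.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Per-value limit discussed above: 2^32 bytes.
constexpr std::uint64_t max_value_size = std::uint64_t(1) << 32;

// One pebibyte, 2^50 bytes, used only to express the result.
constexpr std::uint64_t pebibyte = std::uint64_t(1) << 50;

// Total bytes needed for `count` values of `value_size` bytes each.
inline std::uint64_t total_bytes(std::uint64_t value_size, std::uint64_t count)
{
    // Guard against 64-bit overflow in the multiplication.
    assert(count == 0 ||
           value_size <= std::numeric_limits<std::uint64_t>::max() / count);
    return value_size * count;
}
```

Calling total_bytes(max_value_size, 1000000000ULL) and dividing the
result by pebibyte gives the figure quoted above.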

Also I think I changed that limit from 2^48 down to 2^32 to make NuDB
work identically when built for 32-bit processor targets as when built
for 64-bit processor targets.
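
A small sketch of why a 48-bit size field is awkward on a 32-bit target:
the field can encode sizes that a 32-bit size type cannot represent. The
helper names are hypothetical, and the big-endian byte order is an
assumption made for this example rather than a statement about NuDB's
file format.

```cpp
#include <cassert>
#include <cstdint>

// Decode a 6-byte (48-bit) unsigned integer from a buffer.
// Big-endian byte order is assumed for illustration.
inline std::uint64_t read_u48(const unsigned char* p)
{
    assert(p != nullptr);
    std::uint64_t v = 0;
    for (int i = 0; i < 6; ++i)
        v = (v << 8) | p[i];
    return v;
}

// True if n can be held by a 32-bit size type without truncation.
inline bool fits_32bit(std::uint64_t n)
{
    return n <= 0xFFFFFFFFULL;
}
```

Any decoded size for which fits_32bit returns false could be handled by
a 64-bit build but not by a 32-bit one, so capping the limit keeps the
two builds behaving the same.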

Thanks for the questions!

