Subject: Re: [boost] NuDB: A fast key/value insert-only database for SSD drives in C++11
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2017-03-27 15:08:58


>> Durability means that corruption to the database will not cause further
>> data loss during subsequent use. For example, if you use a single bit to
>> indicate that a record is deleted, and corruption flips that bit to
>> deleted, and your implementation has no way of noticing the corruption,
>> you have lost data after the corruption. Ideally when a user next
>> accesses that record, they should see an error like "Record corrupt".
>>
>> There is a fun history of corruption and SQLite at
>> https://www.sqlite.org/howtocorrupt.html. Last time I looked, there was
>> a popular fork of SQLite which implements per-row checksumming, but the
>> default build does not (the canonical advice is: "use a proper filing
>> system like ZFS if you don't want bit errors"). But SQLite is very
>> carefully written to check consistency during modifications, and where
>> it can it will refuse to modify data when the metadata doesn't match up.
>>
>> So, a database can be durable and not detect arbitrary damage to user
>> data. It just cannot lose further user data due to corruption of its own
>> structures.
>>
>
> Given that SQLite doesn't do any checksumming of its data (i.e. its pages),
> I don't see how it could be durable in the way you seem to imply, Niall.

I would agree with you (and those maintaining the fork of SQLite which
does checksum its rows) that row checksumming should be done if one is
claiming durability.

But I can see the point of those in SQLite who say that the code does
carefully check that metadata is sensible before changing things. I
personally don't think that goes far enough, but equally, if a database
did just that and claimed durability I'd grudgingly accept it because of
SQLite's stature.

But you are right: in my personal opinion, SQLite cannot claim that it
implements durability. It needs to do more, and not much more at that:
the code allows a checksum to be added per row and validated very
easily. The reason it is not in the main repo is purely unresolved
philosophical differences between factions among the devs, not a lack
of technical implementation nor even much of a performance penalty.
The only rational argument I've heard
against row checksumming is that for really small embedded devices, the
extra storage and memory used could be a problem. Personally, I'd make
the row checksumming optional, and again there is no technical obstacle
to doing that. I believe the fork makes it a user-definable setting.
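
To make the earlier "deleted bit" example concrete, here is a minimal
sketch of optional per-record checksumming. Everything in it is an
assumption made for illustration: the Record layout, the names
make_record and read_record, and the use of 32-bit FNV-1a (picked only
because it is short; it is not what SQLite, its fork, or NuDB use). It
shows how a reader could report "Record corrupt" instead of silently
treating a flipped flag as a deletion.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical in-memory record; not any real database's on-disk format.
struct Record {
    unsigned char deleted;   // tombstone flag: 0 = live, 1 = deleted
    std::string   payload;   // user data
    std::uint32_t checksum;  // covers both the flag and the payload
};

// 32-bit FNV-1a over a byte range, continuing from hash state h.
inline std::uint32_t fnv1a(std::uint32_t h, const unsigned char* p,
                           std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        h ^= p[i];
        h *= 16777619u;  // FNV prime
    }
    return h;
}

// Checksum the metadata (deleted flag) together with the user data.
inline std::uint32_t record_checksum(const Record& r) {
    std::uint32_t h = 2166136261u;  // FNV offset basis
    h = fnv1a(h, &r.deleted, 1);
    h = fnv1a(h, reinterpret_cast<const unsigned char*>(r.payload.data()),
              r.payload.size());
    return h;
}

enum class ReadResult { Ok, Deleted, Corrupt };

// Build a live record and stamp it with its checksum.
inline Record make_record(std::string payload) {
    Record r{0, std::move(payload), 0};
    r.checksum = record_checksum(r);
    return r;
}

// With verify == false this behaves like a store without row checksums:
// it trusts the flag. With verify == true a mismatch is reported first.
inline ReadResult read_record(const Record& r, bool verify) {
    if (verify && record_checksum(r) != r.checksum)
        return ReadResult::Corrupt;
    return r.deleted ? ReadResult::Deleted : ReadResult::Ok;
}
```

With verification switched off, a flipped deleted flag reads back as
Deleted and the data is lost without anyone noticing. With it switched
on, the same flip reads back as Corrupt. The verify parameter stands in
for the "user definable setting" idea, so small embedded targets could
skip the extra work.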

But differences of philosophy often trump technical arguments, and
people have taken stands on opinion. It's no different there than here.

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
