Subject: Re: [boost] NuDB: A fast key/value insert-only database for SSD drives in C++11
From: Lee Clagett (forum_at_[hidden])
Date: 2017-03-29 17:01:27
On Wed, 29 Mar 2017 12:44:42 -0400
Vinnie Falco via Boost <boost_at_[hidden]> wrote:
> On Wed, Mar 29, 2017 at 12:32 PM, Lee Clagett via Boost
> <boost_at_[hidden]> wrote:
> > Read this [paper on crash-consistent applications]
> > ...
> > https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf
> This is quite helpful, thank you! I think this paragraph is relevant
> to my use-case?
> "Current file systems do not provide atomic multi-block appends;
> appends can be broken down into multiple operations. However, most
> file systems seemingly guarantee that some prefix of the data written
> (e.g., the first 10 blocks of a larger append) will be appended
> atomically."
> It sounds to me like I have this case covered with the "partial write"
> failure mode of fail_file. Or is there another case I missed?
This portion was worded poorly by the authors. If you look at Table 1,
even a single-block append is not atomic when the filesystem is **not**
doing metadata journaling. It's inconceivable that multi-block appends
would appear atomically for these configurations. Their intent was to
point out that filesystem configurations achieving single-block atomic
append could actually do up to 10 blocks atomically.
And the partial write failure test case does not cover what I am
talking about. A filesystem is allowed to write the control structures
before writing the data, and still meet your constraints for fsync. So
after a crash, the pointer to the sector can already be stored even
though the data at that sector was never written.
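
To make that failure mode concrete, here is a minimal sketch that models
it in memory. Nothing in it is NuDB's actual format or recovery code:
the record layout (length, checksum, payload), the FNV-1a checksum, and
every function name are assumptions chosen for illustration. The "file"
is just a byte vector, and the crash is modeled as the size growing
while the new bytes hold zeros or garbage instead of the record.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for an append-only file.
using Bytes = std::vector<std::uint8_t>;

// 32-bit FNV-1a, used here only as a simple illustrative checksum.
std::uint32_t fnv1a(const std::uint8_t* p, std::size_t n) {
    std::uint32_t h = 2166136261u;
    for (std::size_t i = 0; i < n; ++i) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

// Little-endian 32-bit store/load at a byte offset.
void put_u32(Bytes& f, std::size_t off, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        f[off + i] = static_cast<std::uint8_t>((v >> (8 * i)) & 0xffu);
}

std::uint32_t get_u32(const Bytes& f, std::size_t off) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(f[off + i]) << (8 * i);
    return v;
}

// Assumed record layout: [u32 length][u32 checksum][payload bytes].
// This is the append that completes: size grows AND the data lands.
void append_record(Bytes& file, const std::string& payload) {
    assert(!payload.empty());  // empty records are treated as invalid below
    const std::size_t off = file.size();
    file.resize(off + 8 + payload.size());
    const auto* data = reinterpret_cast<const std::uint8_t*>(payload.data());
    put_u32(file, off, static_cast<std::uint32_t>(payload.size()));
    put_u32(file, off + 4, fnv1a(data, payload.size()));
    std::memcpy(file.data() + off + 8, payload.data(), payload.size());
}

// Models the crash described above: the control structures (file size)
// were persisted, but the data was not. The new region holds `fill`
// (zeros or stale garbage) rather than the record that was written.
void extend_without_data(Bytes& file, std::size_t n, std::uint8_t fill) {
    file.resize(file.size() + n, fill);
}

// Recovery scan: returns the length of the longest prefix made up of
// complete records whose checksums match. Everything after it is suspect.
std::size_t valid_prefix(const Bytes& file) {
    std::size_t off = 0;
    while (file.size() - off >= 8) {
        const std::uint32_t len = get_u32(file, off);
        if (len == 0 || len > file.size() - off - 8)
            break;  // length is implausible for the bytes that remain
        if (fnv1a(file.data() + off + 8, len) != get_u32(file, off + 4))
            break;  // payload does not match its checksum
        off += 8 + len;
    }
    return off;
}
```

After two good appends followed by extend_without_data(), the file size
alone says a third record exists, but valid_prefix() stops at the end of
the second one. A "partial write" test that only truncates the file
never produces this state, because there the size and the data agree.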