Subject: Re: [boost] [afio] Formal review of Boost.AFIO
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2015-08-26 22:04:10


On 26 Aug 2015 at 1:04, Thomas Heller wrote:

> My main beef with the design you
> chose is that it implies shared ownership where it seems to not be
> necessary.

Turns out I got a spare hour tonight even if it's at 2am, so I can
reply earlier than expected. Tomorrow electricity is off for me all
day, so no internet and no replies till late.

Let us assume that file descriptors must be reference counted in user
space as they are a precious resource, and as I mentioned you need to
internally tie lifetimes of internal workaround objects to things. So
let's assume the need for std::shared_ptr<handle> has been agreed and
move on to the meat on the bone, which is the choice of shared future
semantics over unique future semantics for the
std::shared_ptr<handle> transported by afio::future.

First things first: shared future vs unique future isn't helpful
terminology. For AFIO both have equal overhead. What we're really
talking about is whether future.get() can be called exactly once
(future) or many times (shared future). From now on, I will therefore
talk in terms of consuming futures (future) vs non-consuming futures
(shared future).

I've been trying to think of an API for asynchronous filesystem and
file i/o which (a) uses consuming futures instead of non-consuming
futures and (b) is invariant to the concurrency engine under the
bonnet, so end user code doesn't need to be customised for the
concurrency runtime.
 
Ok, let's revisit the original pattern code I mentioned:

EXAMPLE A:

shared_future h=async_file("niall.txt");
// Perform these in any order the OS thinks best
for(size_t n=0; n<100; n++)
  async_read(h, buffer[n], 1, n*4096);

This expands into these continuations:

EXAMPLE B:

shared_future h=async_file("niall.txt");
// Call these continuations when h becomes ready
for(size_t n=0; n<100; n++)
  // Each of these initiates an async read, so queue depth = 100
  h.then(detail::async_read(buffer[n], 1, n*4096));

Here we explicitly say we don't care about ordering for the
async_reads with respect to one another and the OS can choose any
ordering it thinks best, but we do insist that no async_reads occur
before the async_file (which opens "niall.txt" for reading).

Let's imagine that with value-consuming semantics:

EXAMPLE C:

future h=async_file("niall.txt");
// Call these continuations when h becomes ready
for(size_t n=0; n<100; n++)
  // Each of these initiates an async read, so queue depth = 100
  h.then(detail::async_read(buffer[n], 1, n*4096));

On the second h.then() executed, i.e. when n==1, you should see a
future_error exception carrying future_errc::no_state thrown, because
the first .then() is considered to have consumed the future state.
This is because future.get() can be called exactly once.

So, to have it not throw an exception, you need to do this:

EXAMPLE D:

future h=async_file("niall.txt");
shared_future h2(std::move(h));
// Call these continuations when h becomes ready
for(size_t n=0; n<100; n++)
  // Each of these initiates an async read, so queue depth = 100
  h2.then(detail::async_read(buffer[n], 1, n*4096));

Now we consume the consuming future into a non-consuming future, and
non-consuming futures are not consumed by their continuations, so you
can add as many as you like (in this case 100).

Where I was coming from in the presented AFIO API design was that (a)
you the programmer are always trying to give the OS as much
opportunity to parallelise (i.e. no ordering) as possible, hence the
pattern of adding more than one continuation to a future is very
common, and that would throw an exception if the programmer forgot to
first convert the consuming future into a non-consuming future; and
(b) I can't see any way of adding type safety to enforce single
consumption of consuming futures such that Example C refuses to
compile.

However this discussion makes me wonder if I'm missing a trick. The
big value in non-consuming futures is that they force you to think about
better design. I'd like to not lose that if possible. So, perhaps
Example A could be rewritten as:

EXAMPLE E:

future h=async_file("niall.txt");
// Perform this after h
future r=async_read(h, buffer0, 1, 0);
// Perform these in any order the OS thinks best
// after r
for(size_t n=0; n<100; n++)
  parallel_async_read(r, buffer[n], 1, n*4096);

In other words, for the case where you issue multiple continuations
onto a consuming future, you must use an API with a parallel_*
prefix. Otherwise it will throw on the second continuation add.

Thoughts?

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/ 
http://ie.linkedin.com/in/nialldouglas/


