Subject: [boost] Universal async i/o (was: Re: [Fibers] Performance)
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2014-01-24 07:46:05

On 24 Jan 2014 at 13:13, Bjorn Reese wrote:

> > I'm having trouble understanding this. A chained operation must by
> > definition be one operation being called as some other operation
> > completes, and can never possibly refer to operations running in parallel.
> Think of the execution of chained operations as analogous to the
> execution of CPU instructions.
> Niall has already explained the situation where all chained operations
> should be passed to the scheduler to avoid latency. This is analogous
> to avoiding a flush of the CPU pipeline.

That's a good analogy, but there are significant differences in
orders of scaling. Where a pipeline stall in a CPU may cost you 10x,
and a main memory cache line miss may cost you 200x, you're talking a
50,000x cost for a warm filing system cache miss. There are also
very different queue depth scaling behaviours, so for example the
SATA AHCI driver on Windows gets exponentially slower if you queue
more than a few hundred ops to it simultaneously, whereas the
Windows FS cache layer will happily scale to tens of thousands of
simultaneous ops without blinking. How many FS cache layer ops turn
into how many SATA
AHCI driver ops is very non-trivial, and essentially it becomes a
statistical analysis of black box behaviour which I would assume is
not even static across OS releases.

> You can also have chained operations that are commutative, so the
> scheduler can reorder them for better performance. This is analogous
> to out-of-order CPU execution.

Indeed that is the very point of chaining: you can tell AFIO that
this group A of operations may complete in any order and I don't
care, but that no operation in this group B may begin until the very
last operation in group A completes. This affords maximum scope to
the OS kernel to reorder operations to complete as fast as possible
without losing data integrity or causing races. It's this sort of
metadata that the ASIO callback model simply cannot express.

It's actually really unfortunate that more of this stuff isn't
documented explicitly in OS documentation. If you're into filing
systems, then you know it, but otherwise people just assume that
reading and writing persistent data is just like any other kind of
i/o. The Unix abstraction of making fds identical for any kind of
i/o, when there are very significant differences in semantics
underneath, is mainly to blame, I assume.


Currently unemployed and looking for work in Ireland.
Work Portfolio:
