Boost logo

Boost :

Subject: Re: [boost] [afio] Formal review of Boost.AFIO
From: Thomas Heller (thom.heller_at_[hidden])
Date: 2015-08-30 16:47:44


On 08/30/2015 06:01 PM, Niall Douglas wrote:
> On 29 Aug 2015 at 22:30, Thomas Heller wrote:
>
>>> For me this is an absolute red line which cannot be crossed. I will
>>> never, *ever* agree to a library API design which makes end users
>>> even feel tempted to use reference capturing semantics in a
>>> concurrency context. Period. They are banned as they are nuclear
>>> bombs in the hands of the typical C++ programmer.
>>
>> Sure, whatever, this is just derailing ... You do realize though that
>> you demote your users library to idiots not knowing what they do?
>> Despite promoting your library as something very niche only used by experts?
>> This is by giving up your own promoted "Don't pay for what you don't
>> use" principle. But let's move on...
>
> No, this stuff is at the core of where we are diverging in approach
> and why the AFIO API is so displeasing to you.
>
> Where I am coming from is this:
>
> "If the cost of defaulting to guaranteed memory safety to the end
> user is less than 0.5% overhead, I will default to guaranteed memory
> safety."

Where did you get this number from?

>
> Which in the case of a shared_ptr relative to ANY filing system
> operation is easily true, hence the shared_ptr.

I am slowly starting to understand your need of your lightweight future
... however, this makes my opposition even stronger ...

>
> I am also assuming that for power end users who really do care about
> this, it is trivial for them to extract from the shared_ptr an
> afio::handle&. The only difference here from my perspective is what
> is the default.

Sure, one could always do that, but what then? You'd have to convert it
back every time you need it for library, no?
If the answer is here is, no you can just call the functions that take a
handle, then take my and hartmuts advise: Don't overcomplicate your API,
it makes your live miserably. As you effectively demonstrated after my
gist it is very easy to avoid "atomic bombs" with the single most easy
interface. The only loss here is that it saves you an incredible amount
of time maintaining all the different "frontends" to your library.

>
> I appreciate that from your perspective, it's a question of good
> design principles, and splashing shared_ptr all over the place is not
> considered good design. For the record, I *agree* where the overhead
> of a shared_ptr *could* be important - an *excellent* example of that
> case is std::future<T> which it is just plain stupid that those use
> memory allocation at all, and I have a non memory allocating
> implementation which proves it in Boost.Outcome. But for AFIO, where
> the cost of a shared_ptr will always be utterly irrelevant compared
> to the operation cost, this isn't an issue.

And it all starts to make sense ... since you have that memory
allocation already for handle and friends, you seem to try to optimize
it away in the future? As said earlier, your API functions should not
operate on a future to a handle in any form.

>
> I also have a second big reason I haven't mentioned until now for the
> enthusiasm for shared_ptr. In addition to needing to bind internal
> workaround objects to lifetimes of things, AFIO's dispatcher and
> handle will be bound into other languages such as .NET, C, Python
> etc. All these languages use reference counting as part of their
> garbage collection, and therefore if the lifetime of AFIO objects is
> also reference count managed it makes getting the interop in
> situations where say a Python interpreter is loaded into the local
> process much easier.

Then what's stopping you from implementing that for those layers? Why
does *everyone* else has to pay for it? Again, this goes against the
very principle "Don't pay for what you don't use".
(I might want to have a class which implements some file I/O stuff, this
of course gets a handle member, in the continuations I want to access
some of my objects state, I decide to make ownership of this object
shared, as such I end up with an unnecessary extra shared ptr which you
claim is absolutely necessary to avoid atomic bombs, so I end up with
two bomb defusal kits while I would only need one, having redundancy is
cool and everything ...)

>
> I expect after the lightweight futures refactor of AFIO to add John
> Bandela's CppComponents support to AFIO. This lets you wrap AFIO
> objects into Microsoft COM objects, and from there rebinds into any
> other programming language is quite easy.

To be honest, I just don't care about that. While that might be a cool
feature for some, why does everyone has to pay for that?

>
> This is a BIG reason I am so absolute regarding ABI stability.
> Microsoft COM imposes stern requirements on the ABI layer. A big
> reason behind Monad is solving C++ exception transport through
> Microsoft COM, hence my predilection for APIs which are noexcept
> returning monad<T> as those are 100% Microsoft COM compatible.

Again, why does everyone has to pay for that?
How does this behave when having different CRTs?

>
>>> So let's replace all your reference captures with shared_ptr or value
>>> captures and thereby making them safe. I have forked your gist with
>>> these changes to:
>>>
>>> https://gist.github.com/ned14/3581d10eacb6a6dd34bf
>>>
>>> As I mentioned earlier, any of the AFIO async_* free functions just
>>> expand into a future.then(detail::async_*) continuation which is
>>> exactly what you've written. So let me collapse those for you:
>>>
>>> https://gist.github.com/ned14/392461b85e73add13e30
>>
>> Where does error handling happen here? How would the user react to
>> errors? What does depends do (not really clear from the documentation)?
>> How do I handle more than one "precondition"? How do I handle
>> "preconditions" which have nothing to do with file I/O?
>
> I literally collapsed your original code as-is with no additional
> changes, so the answers to all the above are identical to your
> original code.

No you didn't all the "Possible error handling..." parts have been
omitted. Every operation carries an unneeded
shared_future<shared_ptr<handle>> with it. I am still unsure what the
semantics of afio::future<T>::then really are and how they interact with
other operations on the object, for exampe afio::future<T>::get.

>
> The depends(a, b) function is simply something like:
>
> a.then([b](future){ return b; })

That's something I can't make any sense of when being issued on an
afio::future. I see no unwrapping support.

>
> Preconditions are nothing special. Anything you can schedule a
> continuation onto is a precondition.

when_all(...) that's all you need.

>
>> Which ultimately leads me to the question: Why not just let the user
>> standard wait composure mechanisms and let the functions have arguments
>> that they really operate on?
>> All questions would have a simple answer then.
>
> In my collapsed form of your original code, all the futures are
> std::future same as yours, and hence the need for a depends()
> function to swap the item with continues the next operation with a
> handle on which to do the operation.

Yeah, still makes no sense to me why i need a future to the handle in
the first place ...

>
> And there is absolutely nothing stopping you from writing out the
> continuations by hand, intermixed to as much or as little degree the
> default expanded continuations. I'm thinking in a revised tutorial
> for AFIO this probably ought to be on the second page to show how the
> async_* functions are simply less-typing time savers.
>
> I'm thinking, thanks to this discussion of ours, that I'll move those
> default expansions out of their detail namespace. People can use
> expanded or contracted forms as they prefer.

Just make that your API, document it properly and re-use stuff for
dependency handling from what's already there. Benefit: Massive time
saving on your part on documenting the stuff plus already defined clear
semantics which are understood by everyone who is familiar with the
Concurrency TS.

>
>> NB: I am still not sure what afio::future<>::then really implies, does
>> it invalidate the future now? When will the continuation be executed,
>> when the handler future is ready or when the "value" future is ready?
>> What does that mean for my precondition?
>
> afio::future<T> simply combines, for convenience, a future handle
> with a future T. Both become ready simultaneously with identical
> errored or excepted states if that is the case. You can think of it
> as if std::future<std::pair<handle, T>> but without the problems of
> ABI stability.

I hope you understood by now that all those questions could have been
avoided if you just reused what's already there. ABI stability is
nothing I am personally interested in and also have no experience as C++
by definition doesn't have any ABI guarantees. I think if you'd get rid
of all that, it would be far easier to discuss your design. Why isn't it
possible that consumers of AFIO implement all those ABI stability stuff?
It seems to be highly platform dependant anyway.

>
>>> This is almost identical to my second preference API for AFIO, and
>>> the one Gavin Lambert suggested in another thread.
>>>
>>> As I mentioned in that thread with Gavin, I have no problem at all
>>> with this API design apart from the fact you need to type a lot of
>>> depends() and I suspect it will be a source of unenforced programmer
>>> error. I feel the AFIO API as submitted has a nice default behaviour
>>> which makes it easy to follow the logic being implemented. Any
>>> depends() which is a formal announcement that we are departing from
>>> the default stand out like a sore thumb which draws the eye to giving
>>> it special attention as there is a fork in the default logic.
>>
>> That's exactly the problem: You have a default logic for something that
>> isn't really necessary, IMHO. The async operation isn't an operation on
>> the future, but on the handle. And to repeat myself: That's my problem
>> with that design decision.
>
> Would you be happy if AFIO provides you the option of programming
> AFIO exclusively using the expanded continuations form you posted in
> your gist?

The "expanded continuation form" is just one example of how you could
use that API I sketched. But yes, I would be happy if all futures
disapear as arguments to any of your file I/O operations.

>
> In other words, I would be handing over the decision to the end user.
> It would be entirely up to them which form, or mix of forms, they
> choose.

I strongly advise to only offer *one* form.

>
>>> Under your API instead, sure you get unique futures and shared
>>> futures and that's all very nice and everything. But what does the
>>> end user gain from it?
>>
>> 1. Clear expression of intent
>> 2. Usage of standard utilities for wait composures
>> 3. Transparent error handling
>> 4. "Don't pay for something you don't use"
>
> I appreciate from your perspective that the AFIO API is not designed
> using 100% standard idiomatic practice according to your experience.
>
> However, if as I mentioned above, if I give you the end user the
> choice to program AFIO *exactly* as your gist proposed (apart from
> the shared_ptr<handle>), would you be happy with the AFIO design?

It would make it much cleaner and easier to get, yes.

>
>>> They see a lot more verbiage on the screen. They find it harder to
>> > follow the flow of the logic of the operations and therefore spot
>> > bugs or maintain the logic.
>>
>> Verbosity isn't always a bad thing. Instead of trying to guess what is
>> the best default for your users stop treating them as they would not
>> know what they do.
>
> Excellent library design is *always* about defaults. I default to
> safest use practice where percentage of runtime and cognitive
> overhead allows me. I always provide escape hatches for the power
> programmer to escape those defaults if they choose, but wherever I am
> able I'm not going to default to semantics which the average C++
> programmer is going to mess up.

What I suggested is neither unsafe nor specifically tailored to a power
user, it is something that made the most sense to me for the reasons I
tried to give now.

>
>>> I am seeing lots of losses and few gains here apart from meeting some supposed design
>>> principle about single responsibility which in my opinion is a useful
>>> rule of thumb for inexperienced programmers, but past that isn't
>>> particularly important.
>>
>> You do realize that this is highly contradicting to anything else you
>> just said in your mail? You claim to have your library designed for the
>> inexperienced user not knowing what they do, yet your design choice
>> violates principles that are a nice "rule of thumb for inexperienced
>> programmers"?
>
> Even very experienced programmers regularly mess up memory safety
> when handed too many atomic bombs as their primitives, and debugging
> memory corruption is an enormous time waste. If you like programming
> exclusively using atomic bombs, C and assembler is the right place to
> be, not C++.

As you effecitvely demonstrated, it's very easy to avoid those bombs
even with my proposed succinct API.

>
> Atomic bomb programming primitives are a necessary evil, and the
> great thing about C++ is it gives you the choice, and when a safer
> default introduces a big overhead (e.g. std::future over
> std::condition_variable) you can offer options to library end users.
> However when the overhead of using safer programming and design
> patterns is unimportant, you always default to the safer design
> pattern (providing an escape hatch for the power programmer where
> possible).

As you effecitvely demonstrated, it's very easy to avoid those bombs
even with my proposed succinct API. So what's left to you is instead of
implementing some "clever defaults" and document them, you merely need
to document what you'd consider best practice. Which is why I think that
even though we come from different perspectives, the "easy" yet
dangerous API (however I see nothing dangerous in the API I proposed, it
is merely how you use it) is easier to be used save than the 100% fool
proof (which I highly doubt such thing exists) is harder to use "as you
need it" so you end up having two versions of basically the same. If you
do this I'll come back and say: "Why do you need those"?

>
>> Beside that, single responsibility should *always* be pursued. It makes
>> your code testable, composable and easier to reason about.
>
> There are many ways of making your code testable, composable and
> easier to reason about. You can follow rules from a text book of
> course, and that will produce a certain balance of tradeoffs.
>
> Is following academic and compsci theory always going to produce a
> superior design to not following it? For most programmers, probably
> yes most of the time. However once you reach a certain level of
> experience, I personally believe the answer is no: academic and
> compsci theory introduces a hard-to-see rigidity of its own, and it
> has the particular failing of introducing a dogmatic myopia to its
> believers.
>
> My library design is almost exclusively gut instinct driven with no
> principles whatsoever. Its single biggest advantage is a lack of
> belief in any optimality at all, because the whole thing is
> completely subjective and I wake up on different days believing
> different things, and the design evolves accordingly haphazardly. Its
> biggest failings are explainability and coherency and you probably
> saw that in the documentation because I don't really know why I
> choose a design until forced to explain myself in a review like this.

Which is the single most reason why we suggest to just stick with what's
already there. I couldn't spot a single reason where your design is
truly superior to anything that is composable to what's already there.
The big advantage: It wasn't cooked up by a single person over night who
isn't really sure if it was the right decision the next day.

>
> And thank you Thomas for enabling me to explain myself.

You are welcome.

>
>>> If you really strongly feel that writing out the logic using .then()
>>> is very important, I can add that no problem. Right now the
>>> continuation thunk types live inside a detail namespace, but those
>>> can be moved to public with very little work.
>
> Can I get a yay or nay to the idea of giving the end user the choice
> to program AFIO using expanded continuations as per your gist? If you
> yay it, I think this long discussion will have been extremely
> valuable (and please do reconsider your vote if you agree).

I am 100% aligned to what Hartmut answered to that question. I am not
going to change my vote though (reminder: it was "no as it currently
stands"), there are still too many open questions which are seeking
answers, we only scratched the surface here. Once everything has been
cleaned up, I am more than happy to review again though.

I would like to settle this now as we are starting to running in
circles. I understand your point of view you do mine. I think all has
been said. We probably won't agree 100% with each other and that's fine
for me.

>
> Niall
>
>
>
>
> _______________________________________________
> Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
>


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk