Boost logo

Boost :

Subject: Re: [boost] [boost.process] 0.6 Redesign
From: Klemens Morgenstern (klemens.morgenstern_at_[hidden])
Date: 2016-04-18 19:06:09


Thanks for the Feedback, here are my thoughts on the raised points.

> However I'm not sure if I'm for your specific formulation. Here are
> my top issues:
>
> 1. You should rip out all usage of Boost.Iostreams. It's been without
> maintainer for years now, and has a fundamentally broken design for
> async i/o. Nobody should be encouraged to use it in new code.

I do actually agree with that - boost.iostreams has an obsolete design.
But using it's file_descriptors and streams makes it so much easier for
now, which would be a lot of reimplementation. So as long as
boost.iostreams is not marked as obsolete, I wouldn't think it's
necessary to do so.

> 2. You should completely eliminate all synchronous i/o as that is
> also fundamentally broken for speaking to child processes. Everything
> needs to be async 100% of the time, it's the only sane design choice
> [2]. You can absolutely present publicly a device which appears to
> quack and waddle like a synchronous pipe for compatibility purposes,
> indeed AFIO v2 presents asynchronous i/o devices as synchronous ones
> if you use the synchronous APIs even though underneath the i/o
> service's run() loop gets pumped during i/o blocks. But underneath it
> needs to be 100% async, and therefore probably ASIO.

Actually I did not look at boost.afio, which might be a good idea.
Currently you only have the async_pipe representation, which wraps
around a either ordinary posix-pipe or a named pipe on windows.
Now I do not think, that I need to make everything async, because I can
think of enough scenarios where this is a complete overkill. For
example, I might just want to pipe from one process to another:

pipe p;
auto c1 = execute("e1", std_out>p);
auto c2 = execute("e2", std_in<p);

There I don't need any async behaviour but I need pipes. Or I have a
program where I need a strong correlation between input and output - why
would I do that async?

pipe p_in, p_out;
auto c1 = execute("c++filt", std_out>p_out, std_in<p_in);

So I could either require the pipe class to always hold a reference to
an io_service or to not implement read/write functions. Doesn't really
make sense two me, especially why less would be more here.

> 3. Instead of inventing your own i/o objects, I think you need to
> provide:
>
> (a) Async and sync objects extending ASIO's base objects with
> child_stdin, child_stdout and child_stderr - or whatever your
> preferred naming.
>
> (b) std::istream and std::ostream wrappers. These are not hard, ASIO
> helps you a lot with these once you have the native ASIO objects.

I don't get what you want to tell me here, sry. Thing is: I need a
custom object, beacuse the current asio-objets
(windows::stream_handle/posix::stream_descriptor) are bidirectional for
one handle, but I need an object that writes on one and reads on the other.

> 4. Child processes are not like threads and should not be represented
> as a first order object. They should instead be an opaque object
> represented by an abstract base class publicly and managed by a RAII
> managing class from whom the opaque object can be detached, assigned,
> transferred etc.

If I understand what you say here correctly, that this is what I
originally planned. But I decided that against that, because it is much
easier the current way and you can store all needed information in a
simple child. And actually you can detach the child.

> 5. Replace all the on_exit() machinery with future continuations i.e.
> launching a process *always* returns a future. If someone wants to
> hook code onto when the process exits, a future continuation is the
> right tool. Similarly for fetching return codes, or detaching oneself
> from the child. Python's new subprocess.run() returns exactly the
> struct your future also needs to return.
>

Well again: that would require the usage of asio all the time, I don't
think that's very sensible. Currently you can just write a simple
process call and wait for it, and it's as simple as it gets:

execute("something");

Also I need to have a callback on_exit to cancel the read operations on
async pipes. I thought about doing this automatically, but then again
one could use one pipe with several subprocesses.

BUT: allowing a std::future<int> to be set on exit seems like a neat
feature, I should add that.

> 6. Looking through your source code, I see references to
> boost::fusion and lots of other stuff. Great, but most people wanting
> a Process management library don't want to drag in a copy of Boost to
> get one. It's easier to just roll your own. So drop the Boost
> dependency.

Ok now I am confused: you want me to make everything asio, so that
you'll always need boost.asio, but you want to eliminate the dependency
on boost.fusion? Why? That's definitely not going to happen, since I do
a lot of meta-programming there, this would become too much work for a
library which clearly needs other boost libraries. The only way I see
that happening is, when I propose a similar library for the C++
Standard, and that won't happen very soon...

>
> 7. Looking through your source code, I am struck about how much
> functionality is done elsewhere by other libraries, especially ASIO.
> I think less is more for Boost.Process, I always personally greatly
> preferred the pre-peer-review Boost.Process even with its warts over
> the post-peer-review one which had become too "flowery" and "ornate"
> if that makes sense. The latter became unintuitive to program
> against, I kept having to look up the documentation and that annoys
> me. This stuff should be blindingly obvious to use. It should "just
> work".

Uhm, what so is this a bad thing, to use boost.asio that much? I don't
really get your point here.

> I conclude my mini-review by suggesting "less is more" for
> Boost.Process. 99% of users want the absolute *minimum* featureset.
> Look at Python 3.5's new subprocess module, that is a very good API
> design and featureset to follow. It's intuitive, it gets the job done
> quickly, but it exposes enough depth if you really need it to write a
> really custom solution. I'd *strongly* recommend you copy that API
> design for Boost.Python and dispense with the current API design
> entirely. The absolute clincher in Python's subprocess is you can
> never, ever race nor deadlock stdout and stderr. That makes an
> underlying async i/o implementation unavoidable. I'd personally
> suggest save yourself a ton of hassle and use ASIO's pipe/unix socket
> support facilities, it's becoming the Networking TS anyway.

Well I think we have different philosophies on what the library should
do. If you want a pure async implementation of boost.process, one could
maybe built it atop boost.process and call it apio or something.

The reasons I integrated the asio stuff are the following:
   - on windows you need named pipes for async stuff (should be
distinguished be the pipe library)
   - notification of on_exit must be integrated, so it is launched
immediatly after executing the subprocess.
   - io_service must get notified on fork

If that weren't the case I would prefere boost.process to be much
simpler, and have not async functionality whatsoever. Though
theoretically the whole pipe-functionality could be moved into another
library, which is just used by boost.process.

>
> Hope this is helpful.

Sure it was, thank you. I hope my reation doesn't seem to stubborn.
>
> Niall
>
>
> [1]:
> https://github.com/ned14/boost.afio/blob/master/include/boost/afio/v2/
> detail/child_process.hpp
>
> [2]: I refer to the stdout/stderr deadlock problem which is the
> biggest reason anyone reaches for a process management library
> instead of just using the
> syscalls directy. The internals of the child i/o needs to be 100%
> async to
> prevent deadlocking. You can absolutely present publicly a device
> which appears
> to quack and waddle like a synchronous pipe for compatibility
> purposes, indeed AFIO v2 presents asynchronous i/o devices as
> synchronous ones if you use the synchronous APIs even though
> underneath the i/o service's run() loop gets pumped during i/o
> blocks.
>

I really don't think this is the case: boost.process should first of all
be a wrapper around the syscalls, so I can use it on different platforms
without and #ifdefs.
The async stuff is second, and is a set of features, extending it. At
least that's the reason I need a process library, which is the reason
I'm working on that.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk