Boost logo

Boost :

Subject: Re: [boost] [process] Formal Review starts today, 27 October
From: Klemens Morgenstern (klemens.morgenstern_at_[hidden])
Date: 2016-10-27 14:33:46


Dear Douglas,

thank you for your E-Mail, I wasn't really sure if I should answer at
all, because it is my impression that some of your points are just a
matter personal preference. That of course is my personal impression, so
please understand, that I have no interest in arguing with you,
especially since you already gave me your verdict. Which was expected
with exactly this outcome.

However: many of the points you raised have actual reason and it has
been discussed to come up with the solutions I provided. Therefore I
consider it the right way to address the points and actually explain the
reasons for design decisions, even though you reach different
conclusions. Please don't understand this as a personal attack, I just
want other reviewers to see your criticism beside my reasoning.

Am 27.10.2016 um 15:09 schrieb Niall Douglas:
> On 27 Oct 2016 at 9:26, Antony Polukhin wrote:
>
>> We encourage your participation in this review. At a minimum, kindly state:
>> - Your knowledge of the problem domain
> Deep. I've written several of these over many years for many
> platforms and used the many incarnations of Boost.Process over many
> years.
Just out of curirosity - are any of those open-source so I could look
into them for reference?
>> You are strongly encouraged to also provide additional information:
>> - What is your evaluation of the library's:
>> * Design
> I have serious concerns:
>
> 1. I really don't like the tagged named parameter idiom for any
> constructor. It may be Python idiomatic, but it's not C++ idiomatic.
> It also adds seriously to compile times and weird error messages for
> a very low increase in usability. I particularly think tagging stdio
> redirection using operator > a mistake.
>
> If Process were ever standardised, the tagged parameter idiom would
> be removed in any case.
First of: operator> is one of three ways to express that, the others are
operator() and operator=. I personally like >, which is why it is used
in the examples. I added the other variants to accomodate people with
different taste.

The reason I chose this style is, because you have a lot of arguments
going into a child process where in most cases you only use a few.
Therefore you could either build a large function with many unused
parameters like CreateProcess, you could take the posix approach in
calling many functions or you could use a variadic function. Since the
first one looks horrible and the second incurs a runtime-overhead we
chose the third. With the variadic list you cannot get around named
parameters.
Secondly all the parameters you pass to a launched process are useless
after it is launched, so I think the named parameter approach is quite
fitting, since it's a rather unique problem.
> 2. I think supplying the process launch executable and arguments as a
> parameter sequence is a serious design error. The process executable
> should always be a filesystem::path period and it should stand alone
> maybe passed to a constructor. The child process arguments should be
> an iterator pair or a gsl::span or an initialiser list. I'd even
> accept a const lvalue ref to any STL iterable.
Well this is among the possibilites you have, and I don't like to force
people to use a rather narrow style. Additionally not every command is a
real file, you might want to use some command, e.g. "bp::system("type
my_file.txt", bp::shell)". Now granted, this particular example would
not be necessary, since we have C++ facilities, but I always assume,
that I don't know every use-case.
> 3. The ability to pass a space separated string for launch executable
> and arguments needs to be eliminated permanently. It's an enormous
> reliability pit and it's dangerous.
Sometimes you need this, e.g. "bp::system("C:/Program
Files/MyTool/foo.exe", "C:/Program Files/OtherTool/bar.exe");". This has
well defined rules.
> 4. I don't think the child constructor should launch the process. It
> should construct a child instance only, maybe the call operator
> should do the actual launch. The call operator should take the launch
> arguments, the constructor the filesystem::path to the binary. The
> call operator should be reusable, each time it is called you get a
> new child process. This lets you reuse the setup to launch many very
> similar child processes.
It's similar to std::thread in that way; if you want to reuse the
parameters you can use std::tuple to store them.
> 5. The default should not be that launching a child blocks until it
> exits in any API. It should return an object matching the
> promise-future protocol i.e. it has a .wait() and a .get(). This type
> should be told to the Coroutines TS where available so Coroutines can
> treat it as an await point. For non-Coroutines users, if they want to
> wait until the child exits, they can manually call .get() on their
> own.
That's just wrong. A child does not block it is a type matching the
thread, it has a detach and a join. It leaves the ctor as soon as the
process has started. system is the blocking function.
> 6. It is a *very* serious design flaw that i/o deadlocks if you try
> to read after the child exits. The documentation suggests this is
> because maybe because we are redirecting to a file. This is not a
> good reason to have i/o deadlocking seeing as every system I've ever
> used it is possible to configure the pipes to error if you try to
> read from a closed pipe. The default should be to always error on a
> closed pipe read, and if the user is foolish enough to redirect to a
> file then it's on them to work around it rather than forcing very
> unhelpful and bug inducing behaviour on every user.
That is not the reason. In order to use boost.asio with pipes, you need
to use named pipes on windows, which behave like file-handles. Therefore
(afaik) the pipe does not get closed automatically, hence I cannot
guarantee that a pipe will be closed automatically. The deadlock is
caused for that reason: the pipe is still open, but no one is writing
into it.
> 7. Another *very* serious design flaw is ever terminating a child
> process as part of any normal operation. If you're trying to destruct
> a running child process, blocking until it exits is the correct
> semantics and push the problem of the blocking onto the user.
> Terminating processes is a very severe act, I would even recommend
> that you remove all ability to terminate a child from Process
> entirely. If Process were ever standardised, that ability would
> definitely be removed, after all if a user super really needs it they
> can fetch the native handle and call the native terminate process
> API.
That was the original behaviour I intended. But consider this example:

opstream os;
child c("foo.exe", std_in < os);
os << std::stoi("invalid string"); //throws

Now we have a deadlock, because foo waits for input, while the program
waits for it to exit. That's bad, really bad. Therefore child is modeled
after std::thread.

Also, though std::thread doesn't have a terminate, I think that child
needs that for one simple reason: you don't have control over the
process you launch. Unlike with std::thread, where the content of the
thread is at least in your codebase, you might not even know the content
of the binary you execute. Additionally, there are applications which
don't exit automatically, like "c++filt". That's why there's no way
around providing a terminate.
Before we start the discussion: there is not portable way to provide a
terminate-request, that's why it's not in the library.
> 8. I was surprised to see async_pipe exposed in the tutorial. I would
> have thought a correctly designed Process library would have merely
> provided getters for the async stdin, stdout and stderr which
> returned some opaque thing which can be fed to ASIO's async_XXX()
> functions and it all works as expected.
It works as expected if I know what the user wants. I don't.
> This brings me onto the next set of design problems: Process as
> proposed today does far too much stuff which it doesn't need to do,
> nor has any business doing.
>
> 1. Grouping is superfluous to needs and should be removed. If you
> used the design pattern I mentioned above where the constructor
> configures the launch but the call operator does the launch with some
> args, you no longer need grouping.
As stated in the documentation, the child processes of your child
process, will not be terminated, i.e.:

child c("gdb", "foo.exe");
c.terminate(); //foo.exe still runs (on windows)

Therefore groups are provided, implemented with process groups on posix
and job objects on windows. I added them because I needed them for my
application.
> 2. I already mentioned the tagged parameter idiom is gratuituous and
> adds very little to the user experience for a high usability cost.
It only adds compile time cost and reduces runtime-cost if compared to
the alternatives.
> 3. Handlers are unnecessary and should be removed. If you were
> returning a future protocol matching object, it would implement a
> .then() anyway. For process launched callbacks etc they are very hard
> to implement bug free anyway and should be removed.
One thing I have thought of adding, is the possibility of passing
std::future<int> to get the behaviour. You can however already use the
on_exit when using boost.asio, so you can do such things. This library
is C++11, so future does not have .then.
> 4. You are providing wchar_t support on POSIX, totally unnecessary.
> In fact given how the unicode parts of all this are not ever on
> critical time paths relative to the cost of a process launch, I think
> you should make the entire API UTF-8 on all platforms apart from the
> filesystem::path. On Windows do a UTF-8 to UTF-16 conversion just
> before you call CreateProcess if and only if the environment has been
> modified from the calling process (passing NULL means inherit the
> calling process). Simpler is better.
Well there are more differences on windows (e.g. the environment
variables) and to check this is for performance. A conversion would just
be inefficient if I don't need it.
The reason it's added on posix, is so the interface stays the same. I
know that it's unnecessary, but it should be the same on both platforms.
> 5. There is zero need for your own args and environment class
> implementations when an iterator pair, an arbitrary iterable STL
> container, an initialiser list or a gsl::span is everything needed.
> Those implementations should be removed entirely as unnecessary
> complication and baggage over what's already in the STL. They add no
> value at all over STL classes.
this_process::environment allows you to modify the environment of the
current process and is convertible to process::environment. It also adds
the seperated lists like in PATH, which have different seperators on
posix and windows.
> 6. I am unconvinced that you need to be implementing your own STL
> iostreams layer for pipe i/o. I am not sure if such conventional
> boilerplate one can take from any C++ fundamentals book should be
> duplicated in a library. I'd remove it personally.
Well I need them, since I don't have any STL implementation for that
that I can use. It was you who recommended me to not encourage the usage
of boost.iostreams.
> 7. I think the vfork support needs to be dropped as unnecessary.
Explicitly requested by a potential user for embedded development.
> 8. I think shell launching support needs to be dropped as
> unnecessary, and unportable. If the user wants to invoke the shell
> they can pass a shell program to Process and do it by hand, it's not
> hard.
It has a few more properties, like already using a changed
PATH-Variable. I.e. it will find an executable in my modified path. Not
sure it's a bad thing to allow users to do more.
> 9. The windows hide, maximised, minimized etc stuff all needs to be
> removed. Launch all child processes as console apps like POSIX does.
> Make behaviour the same on both.
It's default behaviour if nothing is used and it is explicitly marked as
an extension. Not sure why this is bad.
>> - How much effort did you put into your evaluation of the review?
> A lunch break at work :) But I've been using Boost.Process for
> something like seven years now, and I was lurking through the
> previous 2011 review
> http://lists.boost.org/boost-announce/2011/03/0292.php where many of
> my above issues were also raised.
>
> As you might be gathering, my recommendation is to reject this
> library as submitted. Too many severe design flaws in its API and
> semantics, and it does far too much which is unnecessary and unneeded
> yet simultaneously does not do enough of what is necessary and is
> needed. Same as the 2011 review if I remember correctly, though for
> different reasons.
>
> I should make it clear that I think this library has *greatly*
> improved over where it was even a year ago. I thank the author for
> making so much substantial effort. But it's trying to retain some
> backwards compatibility with previous Boost.Process, and hence this
> API design is not dissimilarly flawed to all Boost.Process till now.
>
> I think the author needs to start again from scratch in API design
> and semantics. Ask himself what API design and semantics would
> dovetail perfectly with the upcoming C++ 17 standard such that if
> Process were standardised, it would look perfectly a part of the
> standard C++ 17 or 20 library. Do *the minimum necessary* as a
> standard C++ child process library.
Well for the reason statet above I think this API design is the best
solution; it has very little to do with backwards-compatibility, I broke
nearly everything there. But as statet in our last discussion: I'd be
really interested in how you would design that.
Just fyi: it actually started out as a C++14 library, and I made the
painful decision to go down to C++11, since there is a real need for a
process library, including those people still using old compiler. I also
had people asking me if I could go down to C++98, since they were
forced to use old compilers at work. Though I personally always use the
newest compiler versions, we shouldn't forget that many people are bound
to older ones. I wrote this library so it may be used and help people,
therefore I will not rewrite it in C++17, at least not now.

Thanks,

Klemens


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk