Boost logo

Boost :

Subject: [boost] [process] Formal Review
From: Nat Goodspeed (nat_at_[hidden])
Date: 2016-11-05 19:44:27


I would like to surface a number of issues, large and small.

You are clearly proud to support a number of different syntactic ways
to express the same semantic operation. To me this is a minor
negative. While in theory it sounds nice to be able to write Process
consumer code any way I want, in a production environment I spend more
time reading and maintaining other people's code than I do writing
brand-new code. Since each person contributing code to our (large)
code base will select his/her own preferred Process style, I will be
repeatedly confused as I encounter each new usage.

I admire the support for synchronous pipe I/O, while remaining
skeptical of its practical utility. Synchronous I/O with a child
process presents many legitimate opportunities for deadlock: traps for
the unwary. I would be content with a combination of:

* async I/O on pipes (yes, using Asio)
* system() for truly simple calls
* something analogous to Python's subprocess.Popen.communicate(): pass
an optional string for the child's stdin, retrieve a string (or, say,
a stringstream) for each of its stdout and stderr.

The example under I/O pipes the stdout from 'nm' to the stdin of
'c++filt'. But the example code seems completely unaware that c++filt
could be delayed for any of a number of reasons. It assumes that as
soon as nm terminates, c++filt must have produced all the output it
ever will. I worry about the Process implementation being confused
about such issues.

I'm dubious about the native_environment / environment dichotomy. As
others have questioned, why isn't 'environment' a typedef for a
map<string, string> (or unordered_map)?

I understand that you desire to avoid copying the native process
environment into such a map until necessary, but to me that suggests
something like an environment_view (analogous to string_view) that can
perform read-only operations on either back-end implementation.

Operations involving splitting and joining on ':' or ';' should be
defined purely in terms of strings and ranges of strings. They should
not be conflated with environment-map support.

The documentation so consistently uses literal ';' as the PATH
separator that I worry the code won't correctly process standard Posix
PATH strings.

At this moment in history, an example showing integration of Process
with Boost.Fiber seems as important as examples involving coroutines.

Why is the native_handle_t typedef in the boost::this_process namespace?

While in full context it makes sense to speak of "assigning" an
individual process to a process group, the method name assign() has
conventional connotations. Use add() instead.

There's a Note that says: "If a default-constructed group is used
before being used in a process launch, the behaviour is undefined." I
assume you mean "destroyed before being used," but this is a concern.
If a program has already instantiated a process group, but for any
reason decides not to (or fails to) launch any more child processes,
does that put the entire parent process at risk? What remedial action
can it take? Move the group object to the heap somewhere and let it
leak? Spawn a bogus child process for the purpose of defusing the
ticking process group instance?

If you're going to reify process group at all, you should wrap more
logic around it to give it well-defined cross-platform behavior. And
it should definitely be legal to create and destroy one without
associating any child process with it.

Given support elsewhere for splitting/joining PATH strings, the
string_type path parameter to search_path() feels oddly low-level.
Maybe accept a range of strings?

> At a minimum, kindly state:
> - Whether you believe the library should be accepted into Boost
> * Conditions for acceptance

YES, IF the present Boost.Process preserves the ability of its
predecessor to extend it without modifying it. I'm sorry, thorough
reading of the documentation plus some browsing through the code
leaves me doubtful.

Examples of such extensions:

* With Boost.Process 0.5 I can use Boost.TypeErasure to pass an
"initializer" which is a runtime collection of other initializers.
Such a feature in Process 0.6 would support people's requests to be
able to assemble an arbitrary configuration with conditional elements
and use it repeatedly.
* I can create an initializer (actually one for each of Posix and
Windows, presenting the same API, so portable code can use either
transparently) to implicitly forcibly terminate the child process
being launched when the parent process eventually terminates in
whatever way.

Please understand that I am not asking for the above features to be
absorbed into the Process library: I am asking for a Process library
extensible enough that I can implement such things with consumer code,
without having to modify the library. Perhaps Process 0.6 already is!
It's just hard for me tell.

Another feature should, in my opinion, be incorporated into the library:

* On Windows, if you desire to pass any file handles at all to the new
child process, it is completely whimsical what *other* open file
handles you may inadvertently pass -- unless you play games with
STARTUPINFOEXA and PROC_THREAD_ATTRIBUTE_LIST to enumerate exactly the
set of handles you intend to pass.
  If the library doesn't natively support that, I *must* be able to
pass a custom initializer with deep access to the relevant control
blocks to set it up. This is a showstopper, as in "you can't remove
that file because, unbeknownst to you, some other process has it
open."

> - Your knowledge of the problem domain

I have hand-written process-management code in C++ several times
before -- each with multiple implementations to support multiple
platforms. I have tested the previous candidate Boost.Process 0.5 with
a number of programs exercising a number of features.

> You are strongly encouraged to also provide additional information:
> - What is your evaluation of the library's:
> * Design

Design notes are at the top of this mail.

> * Implementation

I have only glanced over parts of the implementation. It seems
somewhat more obscure than the corresponding pieces of Process 0.5,
which is why I couldn't quickly satisfy myself as to the library's
extensibility.

> * Documentation

Others have noted that the documentation is very sketchy in places. I
too wish for more explanation.

Much of the time I spent on this review was reading through the
documentation. Apologies if I have misunderstood the library's
capabilities.

> * Usefulness

I have felt for years now that it is essential to get a child-process
management library into Boost. It grieves me to have to write
platform-specific API calls into different applications.

> - Did you attempt to use the library?

I did not, sorry, ran out of time -- as you can infer from this review
arriving at the end of the final day!

> - How much effort did you put into your evaluation of the review?

Most of my time was spent reading the documentation. I looked through
a couple of the implementation files for further information.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk