Boost logo

Boost Users :

Subject: Re: [Boost-users] Review Results for Proposed Boost.Process library
From: Boris Schaeling (boris_at_[hidden])
Date: 2011-03-08 18:45:34


On Tue, 08 Mar 2011 00:40:36 +0100, Marshall Clow <mclow.lists_at_[hidden]>
wrote:

> [...]Thank you to everyone who participated, and especially to Boris and
> those who wrote a review.

And thank you for having volunteered to be the review manager!

Now how do we proceed from here after the library has been rejected? Here
are some comments from me:

There were a couple of ideas and new proposals how a process management
library could look like. If you feel you'd like to step in and take over
Boost.Process to give it another boost - drop me a mail and let's discuss
how I can support you! I would very much welcome new developers actively
working on the library!

In case there is not much happening until summer and Boost is accepted in
this year's Google Summer of Code program again, I maybe offer to be
available as a mentor for another Boost.Process GSoC project. In the short
term I probably won't have time to work on Boost.Process. Nevertheless, I
want to write down some conclusions I drew from the review on what I think
should be done next.

Jeff had proposed an executor concept. I'm not sure if I understand
everything about the executor concept but I can see a few advantages over
the current design. For example the current design requires to use
stream_ends to configure streams of a child process. A stream_ends object
contains two handles in case the parent process wants to communicate with
the child process. However there are use cases where you don't need two
handles - for example when you want to redirect stdout of the child
process to a file. Jeff's executor concept makes it possible to pass only
one handle without a second dummy handle in a stream_ends object.

The stream_ends class wouldn't be required anymore and could be removed
 from Boost.Process. A library user would still need to create a handle for
the file to redirect to. But with the executor concept one could use
boost::path. Boost.Process could support different types and the
implementation would do the rest.

Many stream behaviors which are currently used to create stream_ends could
be removed. Others can be redesigned to make them more useful - also
outside of Boost.Process. This includes for example pipe, named_pipe and
async_pipe. These could be functions which return a read and a write end
(instead of a parent and a child end). I guess we still want a class for a
pair of handles (stream_ends was introduced to make it clear what the
parent and child end is; with std::pair you would need to remember).

With different classes like boost::path and a handle for pipes (whatever
the class would be; some classes from Boost.Iostreams were proposed) to
configure streams, the context class currently used by Boost.Process would
turn into something hard to describe. There would need to be all kind of
containers as different types could be used to configure streams. I think
this also speaks for Jeff's idea. Instead of defining a context class
which needs to support each and every possible combination of stream
settings, providing overloaded functions seems to make more sense.

There was the requirement of reusing a context (to create multiple child
processes with the same configuration). This would need to be given up
with the executor concept (stream behaviors from context are used to
create handles on demand; with the executor concept the library user would
for example need to create pipes himself to pass read or write ends to
child processes). I would be fine with this as I think a better design and
API is more important than this requirement (so I agree with Jeremy that
this is not worth the extra complexity).

When the executor is used to create a child process - whatever "use" means
here - I wonder whether it should return a child object (currently done by
Boost.Process) or a handle (eventually even a native handle?). With a
handle a library user could decide himself what to do and whether he needs
the features of the child class. The current implementation of the child
class is in so far problematic as it does something extra on Windows only
to match the behavior on POSIX (the handle of the child process is kept
open to guarantee that the exit code can be fetched). Instead of a child
class with methods, one could use a handle and free-standing functions.

There is another reason why the child class wouldn't be required anymore:
It currently stores the stream ends the parent process uses to communicate
with the child process. If we give up the context class and require
library users to create handles themselves (via pipe, named_pipe and
async_pipe) the parent process has already access to its "stream ends".
The child class would only store a handle of the child process - but for
that we don't need a child class of course. No process class wouldn't be
required either. The methods wait() and terminate() would become
free-standing functions.

Supporting asynchronous read/write operations would be as easy as it is:
The handle can be used with Boost.Asio's classes. Whether we can reuse
code from other libraries (Boost.Iostreams?) to support streams I don't
know. Currently Boost.Process defines pistream and postream (but only
because a close() method had to be added) which are based on a systembuf
class (not sure if anything is available in other Boost libraries which
could be reused here).

Even though I added support for an asynchronous wait() operation I argue
now to remove that part (at least for POSIX platforms). The function is
very easy to use because it blurs differences between systems. But this is
more Java's than C++'s realm. Or to quote Artyom: "SIGCHLD is used for
wait operation on the child to exit, it is its purpose." While one could
try to implement an asynchronous wait() operation with SIGCHLD there are
other problems. In my opinion a cleaner solution is to simply support
signal handling. Here we give up the goal of being platform-independent.
But if I look at all options I think having a signal handler for POSIX
comes closest to what C++ developers and Boost users want. The signal
handler from Dmitry Goncharov could be used for a start.

For developers on Windows support for an asynchronous wait() looks very
different. The current implementation is based on WaitForMultipleObjects()
which is called in a helper thread. This can be implemented as a
Boost.Asio service (it wouldn't be the first one based on a helper
thread). Boost.Asio doesn't ship a service based on
WaitForMultipleObjects() (and there is currently no plan that such a
service will be added to Boost.Asio) so it would need to be shipped with
Boost.Process.

The Windows implementation of find_executable_in_path() has to be changed:
It should use FindFirstFile()/FindNextFile() and PATHEXT.

If an utility function like shell() will be provided, it should use
COMSPEC on Windows (and possibly something similar on POSIX instead of a
hardcoded path to /bin/sh).

There was an interesting remark that create_child() (or an executor)
should tell the parent process if the child process couldn't be created -
even if fork() has been successful and there are already two processes. I
guess we should indeed not write an error message to STDERR_FILENO but
throw an exception. The user can then handle the error himself. In that
case we need to help him though to find out whether the exception was
thrown before fork() (then it's handled in the current process) or after
fork() (then it's handled in the copy of the current process) - for
example with two different exception classes.

Max proposed to use a DSEL. Whether I pass command line arguments to a
process via an overloaded operator[] or overloaded functions like in
Jeff's executor concept doesn't make a big difference to me in the end. I
would probably prefer Jeff's proposal as I currently don't see what
operator[] and operator% buy us. Max's proposal puts some emphasis on
piping processes though which is not supported by the current draft. I
guess it would make indeed sense to add support again (we had support in
earlier versions) - and if it's only to think about how the library would
need to evolve in the future. The biggest problem I have with Max's
proposal is that it seems to be based on the assumption that declarative
programming is automatically better than imperative programming. That may
be true for certain use cases and some examples. However once you want to
do something else not covered by the DSEL, you have to leave the DSEL
behind. From the examples I've seen the DSEL seems to be optimized for
piping processes. But changing the working directory of a child process,
accessing a STARTUPINFO structure, setting up environment variables for a
child process, terminating child processes or waiting for them and
whatever other developers would like to do would all need to be done
differently. I'm not sure whether a DSEL can be created which provides
that much flexibility - especially as some complained that not even the
current draft provides enough flexibility.

There were many more remarks about implementation details. But I think I
would bore you if I wrote them all down. I stop for now - the email is
long enough to give you lots of new opportunities to disagree. ;)

Boris


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net