Boost logo

Boost :

Subject: Re: [boost] [process] Formal Review starts today, 27 October
From: Klemens Morgenstern (klemens.morgenstern_at_[hidden])
Date: 2016-10-30 11:39:08


Am 30.10.2016 um 14:21 schrieb Bjorn Reese:
> On 10/27/2016 08:26 AM, Antony Polukhin wrote:
>
>> - Whether you believe the library should be accepted into Boost
>> * Conditions for acceptance
>
> I withhold judgement for now to give the author an opportunity to
> respond. At the moment I am unsure whether to vote for conditional
> acceptance or rejection.
Awesome, then let's do this. Thanks for checking the library out.
>
>> - Your knowledge of the problem domain
>
> Advanced on Unix.
>
>> You are strongly encouraged to also provide additional information:
>> - What is your evaluation of the library's:
>> * Design
>
> I have a couple of serious concerns.
>
> My first concern is about the use of tagged arguments, and especially
> the redirection operators, in constructors.
>
> Instead I propose that the library uses the Named Parameter idiom (see
> [1], not to be confused with what Niall labelled "tagged name parameter"
> idiom.)
>
> The bp::child object should not execute in the constructor, but rather
> via operator(). That way we can use the named parameter idiom to build
> up redirections and similar modifications. For example:
>
> auto compile = bp::child("c++").output(bp::null);
> compile("main.cpp"); // Executes
>
I personally don't like that approach though it certainly would work. I
think it's partly a matter of taste, here's my (C) association:

struct X {int std_in; const char* exe;};
X x{.std_in = 2, exe ="stuff"};

I.e. your setting properties.

Now the problem I see is this: what happens here?

auto compile = bp::child("c++");
compile.output(bp::null);
compile("main.cpp");
compile.env({"PATH", "/foo"});

So in order to do that, we'd need a child_builder class, that can be
converted, similar to boost.format. I consider this style pre C++11,
i.e. before variadics. Now granted: it seems not very C++ish, but it's a
very unique problem: most properties are inaccesible after you launched
the process, which is the reason they are not members.

The next problem is: your proposed syntax solves a very special problem,
how would I know if you want to pass the first arg (as in your example)
or the exe or anything else to the operator()? I would constraint what
you can express.

If you want to store settings, you can in fact do that:

template<typename Tuple, int ...Indexes>
child launch_impl(const std::string & file, Tuple && tup,
std::integer_sequence<Indexes...> = )
{
     return child(args=file, std::get<Indexes>(tup)...);
};

template<typename Tuple
child launch(const std::string & file, Tuple && tup)
{
     return launch_impl(args=file, std::forward<Tuple>(tup),
std::make_index_sequence<std::tuple_size<Tupel>::value>());
};

auto compile = std::tie("c++", std_out>null);
auto c = launch("main.cpp", std::move(compile));

//or much easier:
auto compile = [](const std::string & st){return child("c++", st,
std_out>null);};

I don't know how that would be possible with your approach. The idea of
this library is to be integrated in whatever framework you need; though
to be fair, I could've given an example of that. And you can't copy most
initializers - which also makes sense, considering, that you wouldn't
want several processes to write on the same stream.

To the other point with launching in the ctor-call: I think that's the
right way to implement RAII here. It's similar to std::thread in that
way. You can launch later, since child is movable:

child c;
//do something else
c = child("foo");
> My second concern is about deadlocking. The FAQ shows two ways of
> deadlocking. I would expect the latter (the ls example) to work
> without further ado because that is how any newcomer would write
> their code.
First of all: that is a problem with the nature of pipes. As explained
in the faq, windows doesn't provide the proper facilities like FD_CLOEXEC,
so I did not put that in, so the same happens on windows and posix. The
async api is safe in this regard, which is why I warn in the sync IO
sections.
This is not an issue with the library, but with the OS and the
workaround is asio. However: boost.asio might be an overkill in many
instances, hence it's not enforced.
>
> My third concern is about the chaining of pipes between processes. The
> tutorial shows an example where the output of the nm command is
> forwarded to the c++filt command. However, the forwarding is not done
> directly from the nm process to the c++filt process, but goes via the
> parent process. This can incur a performance degradation when nm
> generates much output.
>
> Is it possible to chain them directly? Besides "nm myfile | c++filt"
> that is.
>
That is what happens; the pipe just opens a file-descriptor/handle and
assigns them to the streams in both processes. This is exactly what the
shell does,
you just can additionally access the pipe from a third process if you
don't destroy it. This is directly chained, the pipe class is a thin
wrapper around the actual pipe.

> I also have the following minor comments / questions.
>
> Is it possible to redirect stderr to stdout? (like 2>&1). The reference
> documentation only mentions this under asynchronous pipe output.
If you mean to redirect the stdout to the stderr of the current process
- yes it is. This is what the FILE* overload is for, you can just write
std_out > stderr. I know it's a bit confusing, but std::cerr cannot be
used, because (1) it doesn not give access to the and secondly you can
change the rdbuf, redirecting std::cerr that way.
>
> The bp::starting_dir should be renamed to bp::current_path to be
> consistent with Boost.Filesystem.
>
Thought about that, but it's not the current_path, since the child
process can change it's execution directory.
> Consider renaming bp::spawn to avoid confusion with boost::asio::spawn
> which is how you launch coroutines with Boost.Asio.
>
Also thought about that, the other variants would be "launch" and
"execute", but they collide with older names in boost.process and spawn
reminds me of posix_spawn. Well it's not entirely correct, but that was
my assciation.
> I am missing a this_process::get_id() function to return the current
> process identifier.
>
Is defined in environment.hpp. I thought about a this_process header,
but that would just include environment.hpp anyway, so that's where it went.
>> * Implementation
>
> My major concern with the implementation is that the library is only
> header-only, so I cannot compile it as a static library if I wish to.
> There are several Boost header-only libraries that allows you to do
> that. The reason why I find this important for Boost.Process is because
> I want to avoid getting platform-specific C header files included into
> my project when I include a Boost.Process header.
Fair point, though it does only include the posix-headers, not
windows.h. If I had to include the latter, I would've done that, but I
consider the posix headers to be tolerable.
If that is an issue, I could may solve that similar to boost/winapi, but
to be honest: the posix stuff doesn't pollute that much, in my opinion.
It's mainly a header-only library, since nearly everything is a tempalte
in one form or another.
>
> The library should throw its own process_error exception which inherits
> from system_error, similar to Boost.Filesystem.
>
Not sure what this would do, the boost.filesystem error gives you more
info about which paths are concerned. This is not true for this library
so I'd need a new exception class for every error, which would be a bit
of an overkill.
> I did not investigate the implementation in greater detail.
>
>> * Documentation
>
> The documentation is lacking an entire chapter with an overall
> description of the library and the concepts used. For instance:
>
Well that's correct, but on purpose. It's a documentation meant to
explain the library, not system concepts. I chose names that correspond
to what the stuff is called in the system, and linked to the systems
documentation occasionaly. I don't think such a documentation is a book,
it should only explain what it does, not what the system does.
> * What are processes and pipes?
>
> * What are the possible invocation mechanisms? Some of this is
> mentioned in the Design Rationale, but that is not the place
> where developers seek their information.
>
You mean child/spawn/system? That's also in the tutorial under the name
"Launch Functions".
> * How to set search paths or the current working directory.
>
> * How to use environment variables.
>
> * this_process
Ok, that's so little stuff, that it's only in the reference.
>
> Tutorial / Synchronous I/O: What is the default behaviour when I/O
> redirections are not specified?
System dependent, that's in the documentation. I think normally it just
redirects to the same streams as the father process. I don't think
there's a good default to enforce.
>
> Tutorial / Asynchronous I/O: Consider including an example with
> boost::asio::spawn as a way to obtain the yield context.
I linked to the boost.asio part of the documentation. Though I
personally like coroutines very much, it's more a feature for people
already familiar with boost.asio, not one for people starting with
boost.process.
>
> Reference / child: Does not mention how the life-time of the object
> and the child process are related. For instance, the documentation on
> the destructor only mentions that it is a destructor.
The destructor has a note on it saying that it will terminate the child.
>
> Reference / cmd: bp::cmd will parse the input string as a sequence of
> space-separated arguments. Is it spaces or whitespaces? Is it possible
> to have escapes for arguments that do contain spaces?
Spaces. I don't know what you mean by the latter, but "foo arg 1 \"arg
2\"" will yield {"foo", "arg", "1", "arg 2"}. It's written in a note in
design.
>
>> * Tests
>
> Did not investigate.
>
>> * Usefulness
>
> A process library is a very useful addition to Boost.
>
>> - Did you attempt to use the library? If so:
>> * Which compiler(s)
>> * What was the experience? Any problems?
>
> Did not try.
>
>> - How much effort did you put into your evaluation of the review?
>
> About 4-5 hours (one of which was gratuitously donated by daylight
> saving time.)
>
> [1] https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Named_Parameter
>
>
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk