Boost logo

Boost :

Subject: Re: [boost] [process] Formal Review starts today, 27 October
From: Klemens Morgenstern (klemens.morgenstern_at_[hidden])
Date: 2016-11-10 04:05:00


>> Yes
>>> I know there is some difficulty because the Posix fork + exec design
>>> does not work the same as way as the Windows API. But I think there
>>> is some value in specifying some basic concept of an `executor` and
>>> its relationship with a `property`. For instance, the executor will
>>> have some function (call operator) that starts a new process using
>>> internally stored string arguments and returns a handle to the new
>>> process. The string arguments should arguably be non-modifiable
>>> once the executor is created, etc.
>> The string are only set after the executor is created, I don't know
>> how that would work. Also the flags need to be accumulated through
>> disjunction, I don't see how it would be better to this on
>> construction time, and not in on_setup. It's internal, so why bother
>> with a strange workaround there?
> What I noticed in the source is that some of the properties require
> special logic because they are manipulating other properties instead of
> the "executor". For example, the `shell` property is manipulating the
> path and arg properties but nothing else.
>
> Earlier I mentioned on working out a loose way of specifying a
> `property`. I think such a specification would not consider properties
> that only manipulate other properties to be a property (since it could
> do that separately from the `child` constructor).
Well that depends on how you model it, I consider the question if a
process is launched through the shell or not to be a property of this;
it must be set prior to launching and it affects how it runs. There is
no need to model this exactly after the excutor; that's is actually not
possible.

Also it does not modify a property, but rather modifies an initializer,
which is build from several properties.

So do I understand you correctly, that you would want this syntax?

bp::system(cmd="foo");
bp::system(shell="foo");

If so - why? Why should'n you be able to launch the same process on it's
own or through the shell? It just makes more sense to me to enable the
shell as an additional feature.
>
>> [...]
>> [...]
>>> I was asking whether you could store a handle to the `group` in some
>>> object whose call operator starts the process (like Niall originally
>>> mentioned). So the handle would be present, but potentially invalid
>>> depending on how it was implemented.
>> Yes it could, but I stated enought times why this would be a horrible
>> idea.
> Yes, and I disagreed with your analysis. I recall the primary objection
> was poor performance - specifically that space for each type of possible
> property was needed. This is not true, the binding approach (at least
> how I saw it), would return a new callable type after each binding. In
> other words, the binding was conceptually calling `push_back` on a
> fusion like tuple. Moves + ref-qualifiers meant that dynamic memory
> would not copied in many cases. So
> `launcher("foo").group(...).stdout(...)` would return the equivalent of
> `tuple<launcher, group, stdout>` while moving `std::string` memory as
> needed.
>
> After thinking about it, all that I (and I think Bjorn and Niall) were
> trying to accomplish is separating the path and arguments from the
> remainder of the properties. If the path and arguments were in a single
> property, `shell` and `cmd` could return that property instead of
> requiring custom logic internally.
Well that is one reason, though I also assumed, that you wouldn't
construct a tuple like that.

The primary reason I really dislike the idea, is that every conceivable
property would need to be in a member defninition of this tuple, and
that is horrible.
>
>>>
>> [...]
>> [...]
>> [...]
>>> `launch` returns a callable that accepts 0 or more `properties` to
>>> launch the process. Example:
>>>
>>> class launcher {
>>> std::string exe_;
>>> std::vector<std::string> args_;
>>> public:
>>> launcher(std::string exe, std::vector<std::string> args)
>>> : exe_(std::move(exe)), args_(std::move(args)) {
>>> }
>>>
>>> template<typename... Props>
>>> child operator()(Props&&... props) const {
>>> return detail::exec(exe_, args_,
>>> std::forward<Props>(props)...); }
>>> };
>>> struct launch_ {
>>> template<typename... Args>
>>> launcher operator()(std::string exe, Args&&... args) const {
>>> return {std::move(exe), {std::forward<Args>(args)...}};
>>> }
>>> };
>>> constexpr launch_ launch{};
>>>
>>> // syntax:
>>> // launch("ls", "-la", "/")(stdout(my_pipe));
>> Yeah, that still doesn't work. When do you launcht process? Please
>> think those Ideas through, if you want me to take them serious. If
>> you launch at the call of "launch("ls", "-la", "/") than you couldn't
>> bind stdout. If you launch after the binding of stdout, how would you
>> append another paremeter? It's broken.
> How is it broken? Look at the code snippet, the second function call
> takes a variadic argument list like the `child` constructor. The string
> arguments are the only thing "bound" to an object. So appending another
> property would be done like:
>
> launch("ls, "-la", "/")(stdout(my_pipe1), stdin(my_pipe2));
Oh ok, sry, misunderstood that one. But then why is that any better?
Just fyi: since you can pass environment variable to the property-list,
this doesn't even have to be the same executable.
> That style is simply separating string arguments into a separate
> variadic function call. Or another way of looking at it - there is a
> single `property` for specifying the path and args that has a fixed
> positional location.
What about the cmd style, i.e. bp::system("gcc --version"); ?
>
>>> I don't see a good reason for the `child` class to actually launch
>>> the process. A global or static function that returns a `child`
>>> object should do it instead. This would remove the need for the
>>> `shell` property, which would be a distinct function.
>> Well we had a global function for that in previous versions, but that
>> wasn't like that much; especially since people got reminded of
>> std::async. I have to agree, it's not the common C++ idiom to use
>> functions to construct object, you also don't write
>>
>> std::thread t = std::launch_thread([]{});
>>
>> Also this would be bad, and probably a common mistake. It would
>> launch the process and immediatly terminate it.
>>
> Why would it immediately terminate the process? Or do you mean a
> non-joined `child` should terminate the process like `std::thread`?
Exactly. Also this would produce a deadlock if waiting in ~child and foo
throws.

bopstream os;
bp::child c("foo", std_in < os);

os << foo() << endl; //boom, deadlock.
> Admittedly, I did forget about that case - its probably best to follow
> `std::thread` closely and terminate the current process if not `join`ed
> or `detach`ed. I don't there is anything wrong with a function
> returning a `child`, provided the behavior of ignoring the `child`
> is well documented and defined. However, there is little difference to
> the code snippet above and:
>
> bp::child c1(bp::exec("ls", "-la", "/"), bp::stdout(my_pipe1));
> bp::child c2(bp::shell("ls", "-la", "/"), bp::stdout(my_pipe2));
> bp::child c3("ls", {"-la, "/"}, bp::stdout(my_pipe3));
>
> Where the constructor in `c3` forwards to the constructor in `c1`; `c2`
> uses the `bp::exec` object returned by `bp::shell`; `c1` uses the
> `bp::exec` property to set the path and args of the executor.
>
Ok, so I consider exe/args, cmd and shell different properties, but in
the end they all get put into one initializer.
(https://github.com/klemens-morgenstern/boost-process/blob/develop/include/boost/process/detail/windows/basic_cmd.hpp#L92)
So if anything this could be made public, though at the time the
values get put into the exe-args initializrs they are already parsed and
formatted. I really don't see the benefit.

>>> Also, is there an advantage to allowing string arguments anywhere in
>>> the function call? A single `std::vector<std::string>` argument can
>>> be initialized implicitly with a `{...}`. This would easily
>>> separate the arguments from the rest of the properties without the
>>> lazy binding provided above:
>>>
>>> shell(std::string cmd, Props&&...)
>>> launch(std::string exe, std::vector<std::string> args,
>>> Props&&...)
>>>
>>> Which can be invoked like:
>>>
>>> bp::shell("ls -la /", bp::stdout(my_pipe));
>>> bp::launch("ls", {"-la", "/"}, bp::stdout(my_pipe));
>>> bp::launch("foo", {}, bp::stdout(my_pipe));
>>>
>>> Is requiring a `{}` pair less desirable than the variadic syntax?
>> Yes there is, bp::system("gcc", "main.cpp", "-o", "main.o"); looks
>> awesome. Also keep in mind, that you have the cmd-syntax, which is a
>> single string and that you can path a boost::filesystem path. What
>> you are basically proposing are positional arguments, which might
>> make sense for exe/args or cmd, but not for the test of that. There
>> is no logical order of the other arguments and secondly it is
>> variadic to allow the user to skip some arguments.
> Requiring a `{}` hardly seems detrimental to me, but I guess people
> have different tastes. I hope you at least consider positional
> arguments for the path and args. For the remainder - I agree the
> variadic constructor is probably preferable since its behavior can be
> closer to std::thread.
The only reason you can't use that is that the parameter list is
forwarded internally, hence the type-deduction fails. That syntax is
already valid:

bp::system("foo", std::vector<std::string>{"--arg", "bar"});

or

auto args = {"--arg", "bar"};
bp::system("foo", args);

Though I just checked my code, and that would work for
`std::initializer_list<char*>` but not for `std::initializer_list<const
char*>`, so I'll add a bugfix and test for the latter.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk