Boost logo

Boost :

Subject: Re: [boost] [process] Formal Review starts today, 27 October
From: Lee Clagett (forum_at_[hidden])
Date: 2016-11-12 07:35:27


On Thu, 10 Nov 2016 10:05:00 +0100
Klemens Morgenstern <klemens.morgenstern_at_[hidden]> wrote:

> [...]
> [...]
> [...]
> > What I noticed in the source is that some of the properties require
> > special logic because they are manipulating other properties
> > instead of the "executor". For example, the `shell` property is
> > manipulating the path and arg properties but nothing else.
> >
> > Earlier I mentioned on working out a loose way of specifying a
> > `property`. I think such a specification would not consider
> > properties that only manipulate other properties to be a property
> > (since it could do that separately from the `child` constructor).
> Well that depends on how you model it, I consider the question if a
> process is launched through the shell or not to be a property of this;
> it must be set prior to launching and it affects how it runs. There
> is no need to model this exactly after the excutor; that's is
> actually not possible.

Yes, whether a process is being launched in a shell must be set before
starting the process, but this is irrelevant. Is `shell` a property of
the `child` or a property of the path / arguments pair? In other words,
does the internal logic of the `child` constructor actually need to know
whether its launching a shell or just some other executable? The
implementation would suggest that `child` does not need to know.

> Also it does not modify a property, but rather modifies an
> initializer, which is build from several properties.
>
> So do I understand you correctly, that you would want this syntax?
>
> bp::system(cmd="foo");
> bp::system(shell="foo");
>
> If so - why? Why should'n you be able to launch the same process on
> it's own or through the shell? It just makes more sense to me to
> enable the shell as an additional feature.

You can launch the process on its own or through the shell with that
syntax. Its just that `shell` is modifying the path + args properties
given to the `child`.

> >
> [...]
> [...]
> [...]
> > Yes, and I disagreed with your analysis. I recall the primary
> > objection was poor performance - specifically that space for each
> > type of possible property was needed. This is not true, the binding
> > approach (at least how I saw it), would return a new callable type
> > after each binding. In other words, the binding was conceptually
> > calling `push_back` on a fusion like tuple. Moves + ref-qualifiers
> > meant that dynamic memory would not copied in many cases. So
> > `launcher("foo").group(...).stdout(...)` would return the
> > equivalent of `tuple<launcher, group, stdout>` while moving
> > `std::string` memory as needed.
> >
> > After thinking about it, all that I (and I think Bjorn and Niall)
> > were trying to accomplish is separating the path and arguments from
> > the remainder of the properties. If the path and arguments were in
> > a single property, `shell` and `cmd` could return that property
> > instead of requiring custom logic internally.
> Well that is one reason, though I also assumed, that you wouldn't
> construct a tuple like that.
>
> The primary reason I really dislike the idea, is that every
> conceivable property would need to be in a member defninition of this
> tuple, and that is horrible.

The current implementation isn't without boilerplate - traits and
processing hooks for fusion::for_each are needed for the properties
too. Although I do agree that launching the process inside of the
`child` constructor is preferable since the destructor should terminate
if not detached or joined. However, I think some of the properties need
some work.

> >
> [...]
> [...]
> [...]
> [...]
> > How is it broken? Look at the code snippet, the second function call
> > takes a variadic argument list like the `child` constructor. The
> > string arguments are the only thing "bound" to an object. So
> > appending another property would be done like:
> >
> > launch("ls, "-la", "/")(stdout(my_pipe1), stdin(my_pipe2));
> Oh ok, sry, misunderstood that one. But then why is that any better?
> Just fyi: since you can pass environment variable to the
> property-list, this doesn't even have to be the same executable.
> > That style is simply separating string arguments into a separate
> > variadic function call. Or another way of looking at it - there is a
> > single `property` for specifying the path and args that has a fixed
> > positional location.
> What about the cmd style, i.e. bp::system("gcc --version"); ?
> >
> [...]
> [...]
> > Why would it immediately terminate the process? Or do you mean a
> > non-joined `child` should terminate the process like
> > `std::thread`?
> Exactly. Also this would produce a deadlock if waiting in ~child and
> foo throws.
>
> bopstream os;
> bp::child c("foo", std_in < os);
>
> os << foo() << endl; //boom, deadlock.
> > Admittedly, I did forget about that case - its probably best to
> > follow `std::thread` closely and terminate the current process if
> > not `join`ed or `detach`ed. I don't there is anything wrong with a
> > function returning a `child`, provided the behavior of ignoring the
> > `child` is well documented and defined. However, there is little
> > difference to the code snippet above and:
> >
> > bp::child c1(bp::exec("ls", "-la", "/"), bp::stdout(my_pipe1));
> > bp::child c2(bp::shell("ls", "-la", "/"), bp::stdout(my_pipe2));
> > bp::child c3("ls", {"-la, "/"}, bp::stdout(my_pipe3));
> >
> > Where the constructor in `c3` forwards to the constructor in `c1`;
> > `c2` uses the `bp::exec` object returned by `bp::shell`; `c1` uses
> > the `bp::exec` property to set the path and args of the executor.
> >
> Ok, so I consider exe/args, cmd and shell different properties, but
> in the end they all get put into one initializer.
> (https://github.com/klemens-morgenstern/boost-process/blob/develop/include/boost/process/detail/windows/basic_cmd.hpp#L92)
> So if anything this could be made public, though at the time the
> values get put into the exe-args initializrs they are already parsed
> and formatted. I really don't see the benefit.

The benefit is it removes the special internal logic for a few specific
properties and makes what the code is _actually_ doing more clear. I've
also been pushing for a more formal specification of `properties`
because some of the IO variants are doing too much in `child`. The
constructor for `std::thread` does not throw once the thread is
created, but what about some of the IO properties for `process::child`?
Several of the `on_exec`/`on_success` hooks are doing various calls that
can throw - does `child` block internally in the constructor until the
other process completes? Does it detach? Terminate? And why is so much
being done inside of the `child` constructor when the `child` should
only need to know the pipe handles for communication. I'm mainly
thinking of the code for piping to future or mutable buffer.

>
> [...]
> [...]
> > Requiring a `{}` hardly seems detrimental to me, but I guess people
> > have different tastes. I hope you at least consider positional
> > arguments for the path and args. For the remainder - I agree the
> > variadic constructor is probably preferable since its behavior can
> > be closer to std::thread.
> The only reason you can't use that is that the parameter list is
> forwarded internally, hence the type-deduction fails. That syntax is
> already valid:

Yes I know that the initializer list deduction would fail. Which is why
I suggested the process arguments be positional.

> bp::system("foo", std::vector<std::string>{"--arg", "bar"});
>
> or
>
> auto args = {"--arg", "bar"};
> bp::system("foo", args);
>
> Though I just checked my code, and that would work for
> `std::initializer_list<char*>` but not for
> `std::initializer_list<const char*>`, so I'll add a bugfix and test
> for the latter.

Lee


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk