Boost logo

Boost Users :

Subject: Re: [Boost-users] Subject: Formal Review of Proposed Boost.Process library starts tomorrow
From: Jeff Flinn (TriumphSprint2000_at_[hidden])
Date: 2011-02-14 09:29:57


Ilya Sokolov wrote:
> On Fri, 11 Feb 2011 19:01:35 +0500, Jeff Flinn
> <TriumphSprint2000_at_[hidden]> wrote:
>> Ilya Sokolov wrote:
>>> On Thu, 10 Feb 2011 19:41:25 +0500, Jeff Flinn
>>> <TriumphSprint2000_at_[hidden]> wrote:
>>>
>>>> I've used the process library as a replacement for Qt's QProcess(which
>>>> interfers with the main event loop in our application on Mac OSX). We
>>>> were able to make boost process work across Windows and Mac OSX, but
>>>> did
>>>> not have that sense of satisfaction at the end that usually comes with
>>>> using most boost libraries.
>>>>
>>>> I do not think the library should be accepted into boost for the
>>>> following reasons.
>>>>
>>>> Problems:
>>>> [Ilya Sokolov: I've reordered some items]
>>>
>>>> Still had to create wrapper function and classes and use ifdef's to
>>>> handle platform dependencies in client code.
>>>> Stung multiple times when passing a relative path when setting the
>>>> working directory on unix systems.
>>>> Customization points lack access to many of the required Windows
>>>> CreateProcess arguments.
>>>> Overly complex io specification.
>>> Could you tell us where? Please show also how your library would
>>> solve it.
>>
>> The myriad of stream behaviour, stream ends, and methods of
>> specification to me do not best represent the problem domain.
>
> We disagree here. The only problem I see is absence of the 'Rationale'
> section in the docs.
>
>> Thinking about this last night, I came up with the concept of a Ray,
>> which is a connected pair of boost iostreams' Source and Sink concepts.
>> Pipe is too general in that it conveys flow can be in either direction.
>
> The concept of pipe is known to almost every programmer.

So is the raw pointer concept, but most programmers see the constraints
imposed by scoped/shared_ptr as being useful. Perhaps the term
file_descriptor_directed_pipe is better.

>> So assuming unix still: "uses just one internal buffer for temporary
>> storage of the flowing data, hence what goes in one direction
>> overwrites whatever is a naive programmer tries to send backward"
>
> I never saw such errors.
>
>> as stated in: http://goo.gl/RXY7p, makes Ray a more accurate term than
>> pipe. Then a model of a Ray would be file_descriptor_ray, which
>> comprises a boost iostream file_descriptor_sink and a
>> file_descriptor_source.
>>
>> So having stdin, stdout, stderr Initializers with constructors or
>> factory functions:
>>
>> stdin_from()
>> stdin_from(const path&)
>> stdin_from(const file_descriptor_source&)
>> stdin_from(const file_descriptor_ray&)
>>
>> stdout_to()
>> stdout_to(const path&)
>> stdout_to(const file_descriptor_sink&)
>> stdout_to(const file_descriptor_ray&)
>>
>> stdout_to()
>> stderr_to(const path&)
>> stderr_to(const file_descriptor_sink&)
>> stderr_to(const file_descriptor_ray&)
>>
>> is to me conceptually much simpler than dealing with stream behaviour,
>> stream end and context classes.
>
> The code above has the problem outlined here:
> http://article.gmane.org/gmane.comp.lib.boost.devel/207711

I'm starting to look at the posix io issues. There is nothing in the
above interface that prevents handling the problem mentioned. Are you
saying there is no amount of file descriptor shuffling to solve the problem?

>> It also minimizes copying and heap allocated handles.
>
> CreateProcess() and fork()/exec() calls are much more greedy than any
> copying or heap allocation.
>
>> The default constructed initializers allow me to launch apps that
>> don't have std io, such as GUI apps, without carrying that baggage
>> around.
>
> Current interface allows you the same.

The last I saw boost::process::null still creates and opens temporary
files.

>> The path overload opens the appropriate file_descriptor sink/source,
>> keeping it open until initialization is complete.
>
> If all paths in your library is of type fs::path, then what is the type(s)
> of the arguments and environment of the child process?

In the attached zip, I've updated the windows executor implementation to
use CreateProcessW. It uses wide string lpCommandLine as converted from
the interface args and arg classes using boost
lexical_cast<std::wstring>. I've not completed env var handling yet. The
CreateProcess LPVOID lpEnvironment parameter can be either wide or
narrow as long as dwCreationFlags includes CREATE_UNICODE_ENVIRONMENT
flag. This further demonstrates the flexibility and openness of the
executor architecture. There could be wide/narrow environment
initializers because the initializers have access to set the startup
state of the child process as appropriate. In fact it should be easy to
parameterize the windows executor to be narrow or wide.

The posix executor would remain string agnostic.

>> The file_descriptor_source / file_descriptor_sink overloads ensure at
>> compile time that the handle has been opened as a source/sink as
>> appropriate. The file_descriptor_ray overloads know whether to get a
>> source/sink handle. It may be that file_descriptor_ray belongs to
>> boost iostreams.
>>
>>>> Lack of synchronicity when launching.
>>>> Pay for process handles even when not used.
>>> Not sure what that means.
>>
>> Sorry, my thoughts didn't make it all the way out of my finger tips. I
>> meant boost::process::child has a std::map<stream_id, handle> data
>> member. Once the child process has been initialized/launched there is
>> no need to know about any handles. Only the process id/handle is
>> required when one needs to wait on the child process to complete. My
>> view is that any stream definition info needing to be accessed after
>> initialization does not belong in the child process class.
>
> It might be better to return something like
> boost::tuple<process, std::map<stream_id, handle> > from create_child().

Again, in my view of this problem domain, streams are used only in the
launching of the child process. There is no need to pass anything
related to streams back from 'create_child'! The parent already defines
its communication entities with it's launched children. The following is
taken from libs/process/example/create_child.cpp in the attached zip.

namespace bp = boost::process;
namespace fus = boost::fusion;

typedef bp::io_initializer::sink_type sink_type;
typedef boost::iostreams::stream<sink_type> sink_stream;
typedef bp::file_descriptor_ray ray_type;

sink_type s("C:\\TEMP\\log.txt");
sink_stream log(s, std::ios_base::out|std::ios_base::binary);

ray_type ray1;
sink_stream to_child(ray1.m_sink, std::ios_base::out|std::ios_base::binary);

to_child << "testing_input_to_child " << std::endl;
log << "- before exec:" << std::endl;

try
{
     bp::executor().exec(fus::make_vector
     ( bp::paths ("C:\\TEMP\\some.exe")
     , bp::args ("C:\\TEMP\\file.txt")
     , bp::throw_on_launch_error()
     , bp::std_out_to_path ("C:\\TEMP\\y.txt")
     , bp::std_err_to (s)
     , bp::std_in_from (ray1)
     ));

     boost::xtime xt;
     boost::xtime_get(&xt, boost::TIME_UTC);
     xt.sec += 1;

     boost::this_thread::sleep(xt);
}
catch(const std::exception& e)
{
     std::string s = e.what();

     std::cout << s;
}

>>>> Different semantics regarding success/failure of CreateProcess/exec.
>>>> Lack of launch error reporting.
>>> That could be solved by using a separate pipe for error reporting.
>>> The only library where I've seen this technique is subprocess.py.
>>>
>>>> Replication of existing boost library functionality, and functionality
>>>> that better belongs to other libraries.
>>> Where and what functionality, exactly?
>>
>> Boost filesystem IMO, is the proper currency when dealing with paths.
>> It provides path decomposition which is done manually in
>> operations.hpp. It also insulates it's clients from the whole
>> unicode/utf8 issue(or at the least mitigates the issue) as described
>> in several of the longest mailing list threads in my memory.
>
> See above. IMHO, the only problem that fs::path solves is the lack
> of string type with known encoding. fs::path doesn't have any useful
> invariant
> and all of its methods could be implemented as free functions.

The lack of filesystem invariants is inconsequential here. Using
filesystem path self documents the interface in addition to all of the
facilities it provides. Whether all of path's methods could be
implemented as free functions or as class methods, 'they' should be used
rather than duplicating the code in process.

>>>> Lack of separation of concerns in create_child
>>>> The monolithic code lacks ability to properly unit test components.
>>> I'm not sure that the 'separation of concerns' would help us to
>>> unit-test better.
>>
>> How you can unit test all that's going on within create_child as it's
>> not exposed?
>
> I'd add more extension points to the bp::context class, something like
> boost::function<void()> createproces, boost::function<void()> fork and
> boost::function<void()> exec members. It would allow to call
> shellexecute(ex)()
> instead of CreateProcess() on windows and vfork() or spawn() instead of
> fork()/exec() on posix.

IMO, that seems a lot more complicated than supplying a mock executor
and/or unit tests for the accessible nested classes supplying services
for args, environtment, file_descriptor manipulation.

Thanks, Jeff




Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net