Boost logo

Boost :

Subject: Re: [boost] [coroutine][review] Very Late Review
From: Oliver Kowalke (oliver.kowalke_at_[hidden])
Date: 2012-09-18 16:37:33


Am 17.09.2012 18:24, schrieb Giovanni Piero Deretta:
> I like the idea that has been proposed during the review that the result of
> a coroutine should be retrivable at any time via a get method.I dislike
> the proposed optional<T> result for operator(). If the previous delayed
> result functionality is implemented, operator() should just return *this,
> so that the subsequent get() call can be chained or the result tested for
> non empty. This change has the potential of deeply affecting the interface,
> so I would really like to see it subject to a mini review before acceptance.
I would not add a coroutine<>::get() method because a coroutine is a
'special function' (in the sense of multiple entry/leave). AFAIK
boost::function also does not have a get() method.

> The coroutine object has both the concept of a not-a-coroutine and of a
> completed coroutine. This is confusing, and should be conflated to 'empty'
> state, analogous to an empty std::function. In addition to simplifying the
> documentation and the interface, it should also slightly improve the
> implementation. Operator bool should return !empty as per std::function.
OK

> It has been suggested that coroutine::self be removed an have a thread
> specific object. I strongly disagree with this option. First of all it will
> make it too easy to break type safety:
>
> typedef this_coroutine<int(int)> this_coro;
>
> typedef this_coroutine<void()> that_coro;
>
> coroutine<int(int)>([]{
> that_coro::yield(); // oops compiles fine, UB at runtime.
> });
that_coro::yield() could assert if the internal 'control block'
(coroutine_base) is not
that of the current coroutine - of course at runtime.

> In practice, one need to pass the signature to any code that wants to yield
> anyway, so might as well pass the self object around, at least one can rely
> on template type deduction.
some of the community members are not happy with tracking the self
object trough the code

> Also, one does not really want arbitrary code to yield out of the coroutine
> as this can be dangerous (think of code between yield statements as a sort
> of critical section) as invariants can be broken or mutices held.
you can not prevent user doing such stuff, IMHO

> Instead I suggest doing away with a separate self type and instead making
> the first parameter to the coroutine functor a full fledged coroutine, with
> simmetric signature:
>
> coroutine<int(void)> my_coroutine([](coroutine<void(int)> caller) {
> caller(10);
> });
>
> int i = my_coroutine();
I'm sure that users asking for a typedef in coroutine<int(void)> for the
first parameter of the coroutine function

typedef coroutine< int(void) > coro_t;
typedef coro_t::caller_t caller_t;

coro_t my_coroutine( []( caller_t caller) {
     caller( 10);
});

int i = my_coroutine();

in the case of the reviewed library the coro_t::caller_t from the
example above is equivalent to
coro_t::self_t provided by the lib. The coro_t::self_t type is not a
instance of the original coroutine, it is a new instance
of coroutine_self<> (not derived from a coroutine's base classes).
If I would rename coroutine_self<>::yield() to
coroutine_self<>::operator() I get pretty much the same as you described
in your example. Of course self_t is not a coroutine but it provides
'coroutine' - semantics
(== context jumps).

> I do not like the separation between coroutines and generators. Instead the
> generator class should be dropped.
I disagree because a coroutine<> is a 'specialized' routine and a
generator<> provides a sequence of elements. the requirement of
coroutine<> is that its coroutine-fn must have return value (a last
return statement in its body) and
a generator<> requires that the generator-fn returns always void (return
values passed via self_t::yield()).

> The reason that I had generator as a
> separate class in the original coroutine SoC is because I wanted to give it
> iterator semantics (this also required internally using a reference counted
> pointer to make the iterator copiable, which I disliked).
a requirement of iterators is that its source/container is still valid
during its live-time.
why not require the same for generators? the iterators of generator<>
take only a reference to the associated generator object.

> Instead I propose
> to make every coroutine<T()> fulfill the the input range concept and every
> coroutine<void(T)> fulfill the output range concept (this is a
> functionality not covered by generators). In practice it means that for
> these classes of coroutines, begin and end are defined returning
> appropriate iterator (i.e. some very thin layers around the coroutine
> itself). As a range is required to survive its iterator anyway, the
> iterators can simply use plain pointers to the coroutine.
>
> The last suggestion works very well if coupled with the suggestion of
> making 'self' a full coroutine:
>
> std::vector<int> a = { ... };
> std::vector<int> b = { ... };
>
> std::cout << "The result of 'merge(a,b) is:\n";
> for(auto x : callcc([&](coroutine<void(int)> c) { std::merge(a.begin(),
> a.end(), b.begin(), b.end(), c.begin()); })
> {
> std::cout << x << "\n";
> }
>
> I thought modelling generalized coroutines<T(T)> as ranges, but I have yet
> to find a good model.
I've read the docu of boost.range - the requirements of a single pass
range is:

- For any Range |a|, |[boost::begin(a),boost::end(a))| is a valid range,
that is, |boost::end(a)| is reachable from |boost::begin(a)| in a finite
number of increments.
    -> might be violated by an infinite loop inside the coroutine-fn

- . A Range provides iterators for accessing a half-open range
|[first,one_past_last)| of elements and provides information about the
number of elements in the Range.
    -> it is for a coroutine or a generator impossible to determine the
number of elements

is it correct to apply the range concept to coroutine/generator?

> The nest of base
> classes required to handle all possible coroutine signatures is quite
> overwhelming though and I think it could be significantly reduced. I was
> able to trace through the execution paths I was interested into, but it
> required quite a bit of back and forward between implementation files.
I used static polymorphism and curiously recurring template pattern in
order to reduce the partial template specialisations (regarding to
return type and parameter types of the coroutine-fn).

> The coroutine object itself holds a reference counted pointer to the
> implementation. As the coroutine is movable but not copyable, I do not
> think that reference counting is needed.
The implementation uses a intrusive pointer - it has the footprint
(memory) of a raw pointer and the functions
used to increment/decrement the ref-counter are inlined - you pay only
of incrementing/decrementing an integer.
The reference count is required because I creating coroutine_self<>
which gets the ref-counted ptr.

> The coroutine itself should
> consists of two raw pointers: a pointer to the target context and a pointer
> to the result slot. You need this pointer anyway, because, if I understand
> correctly, you are planning to allow retriving the result slot at any time
> after the operator() and yield call.
no, I don't think not.

> The first branch can be removed by moving the complexity to the
> constructor: a redy-to-start context should be indistinguishable from a
> suspended context.
I was thinking about how to remove the branch in the resume() member
function - I'll try to implement your suggestion.

> The branch on is_complete can be removed by making sure that a resume on
> on-complete also pass the pointer to the return slot via the native_resume
> result.
OK

comments?

regards,
Oliver


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk