|
Boost : |
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2008-07-03 08:32:00
On Wed, Jul 2, 2008 at 6:25 PM, Giovanni Piero Deretta
<gpderetta_at_[hidden]> wrote:
>
> Wait, there is a bit of confusion. There are two kind of lazyness: the
> first kind are lazily evaluated ranges, like those that use
> transform_iterator, filter_iterator, etc... the other kind are lazily
> evaluated functions like those in Phoenix (i.e. higher order functions
> that return other functions).
>
> Let's take a map function. It can be lazy in two ways:
>
> map(range, mapper) [1]
>
> returns a lazy range (i.e. the moral equivalent of two transform_iterators).
>
> map(_1, mapper) [2]
>
> returns a lazy unary function that takes a range and returns another
> (lazy range).
> There is a third syntax that can be used:
>
> range | map(mapper) [3]
>
> Now, in the usual phoenix world, the 'map' in [1] and [2] would
> actually be two different functions (functions in phoenix are always
> lazy even if all arguments are provided). Map in [3] must be again a
> different function. This means that a lazy range algorithm should
> provide all three variants to satisfy all the use cases. Unless...
>
> Maps in [1] and [2] can actually be the same function by adding
> "lazyness on demand" (which is not hard to do). Also lazy functions
> let's us implement the third syntax easily:
> Let's assume that 'operator|(r, f)' is a global operator that returns
> 'f(r)'. This doesn't work quite right though:
>
> range | map(_1, mapper)
>
> Because map(_1, mapper) doesn't simply return an unary function, but
> it is actually an expression template, which will try to make
> operator| 'its own'. You need to stop the expression template
> machinery. In boost.lambda you use 'unlambda', in phoenix I think that
> the magic keyword is 'lambda[]':
>
> range | lambda[ map(_1, mapper) ]
>
> It should works, but the 'lambda' is a bit ugly. If we introduced a
> new placeholder, like _, we could make the syntax more lightweight:
>
> range | map(_, mapper) [4]
>
> I.e. _ instructs phoenix to automatically wrap the lazy function in a
> lambda block. Note that this syntax is quite similar to the syntax in
> [3]. if we followed FC++ lead and swapper the order of the range and
> mapper argument in 'map', we could add a rule that says that any
> missing argument is automatically substituted with a _ (in practice,
> currying, in fact FC++ itself introduced the '_' syntax for
> generalized currying):
>
> range | map(mapper, _) -> range | map(mapper) [5]
>
> Nice! Unfortunately, automatic currying is (very arguably) error
> prone, and most importantly, it makes it impossible to have variadic
> functions (for example ,in sort(range), how do you know if this is a
> curried expression or you are using the default comparator?).
>
> I think that [4] is desirable, but [5] is not worth it.
>
Thanks for the detailed explanation. I on the other hand am thinking
of something a little bit different:
Consider:
operator|(r, f) -> lazy_range<typeof(r), apply<typeof(f)> >
Now that allows you to chain operator|(r, f) infinitely, just making
sure that along the way you create an instance of a specific type
(lazy_range) which when applied a begin(...) or end(...) function will
yield the appropriately constructed iterator.
Note that in the examples we've been giving:
source | take(10) | filter(filter_function()) | map(mapper())
We'd have a lazy_range<...> which has a type:
lazy_range<
lazy_range<
lazy_range<
range_type,
taken_range_generator
>,
filer_function
>,
mapper
>
An instance of this is returned by the whole expression template (the
argument to the construction of the instance would be the original
source range), which we can call begin(...) or end(...) against, and
get the correct iterator type. If we have the above typedefed as
'some_lazy_range', doing:
begin(some_lazy_range(source)) -> the beginning of the lazy range
end(some_lazy_range()) -> the 'end' iterator given the lazy range
>>
>> for_each ( range(source) | take(10) | map(mapper()), cout << arg1 <<
>> endl ); // with Phoenix
>>
>> Assuming that there's a boost::for_each(range<T>, F) -- I don't think
>> I need to know the result type of the reducer in that example. Am I
>> missing something?
>
> There is no reducer (i.e. a fold) in the above expression. If you want
> to reduce a range you need to know the result type of the reducer
> function if you want to actually save the final result (which is
> usually the case :) ).
>
Right. Maybe that's a bad example. Considering that we can make a
reducer an external function and adapted using
'make_function_output_iterator', we can use it in std::copy (of course
something like boost::copy(range, function) ):
reducer reducer_instance;
copy ( source | take(10) | map(mapper()) ,
make_function_output_iterator(reducer_instance)
);
In this case I really don't need to know the exact type of the
range(s) created by the pipe syntax, because type deduction will do
the trick for me.
>>
>>>> Maybe this is
>>>> why Dave is asking for a better BOOST_TYPEOF. ;-)
>>>
>>> Well, as long as the result type of the final fold has been
>>> registered, it should work out of the box.
>>>
>>
>> Of course, registered manually you mean? Or do you rely on
>> compiler-specific 'typeof'?
>
> manually if there is no native typeof available.
>
Okay... I think this is the part that sucks somehow (at least before C++0x).
>>
>> Shouldn't the deforestation already have happened in the zipping and
>> application of the function to yield another tuple?
>>
>
> let say you have the rewrite rule:
> map<map<Range,A>,B> -> map<Range, compose<A,B> >
>
> If zip_with returns a map<zip<R1, R2>, A>, you can simplify the
> expression map(zip_with(r1, r2, a), b) as:
>
> map<map<zip<R1, R2>, A>, B> -> map<zip<R1, R2>, compose<A,B> >
>
> without any other special rule. If zip_with were to return a special
> zip<R1, R2, A>, you
> need a specific rule to handle this case. I.e. zip_with shouldn't be a
> primitive, but should be constructed on top of map and zip.
>
True, but zip_with can return a range: map<map<zip<transform<R1, F1>,
transform<R2, F2> > >, A>, B> which can still be re-written.
BTW, are the rewrite rules already there, or will this have to be
something that has to still be written using Boost.MPL?
-- Dean Michael C. Berris Software Engineer, Friendster, Inc.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk