Subject: Re: [boost] [Phoenix] Some questions and notes...
From: Giovanni Piero Deretta (gpderetta_at_[hidden])
Date: 2008-09-26 09:24:55


On Fri, Sep 26, 2008 at 3:06 AM, Joel de Guzman
<joel_at_[hidden]> wrote:
> Giovanni Piero Deretta wrote:
>
>>> It's not broken. As Doug noted in his review, phoenix lambda
>>> is like lambda protect (http://tinyurl.com/3sx7bo).
>>
>> I would be perfectly fine if lambda[f] worked as protect(f), but it
>> actually is subtly different.
>> What I do not like is the extra '()' you have to use to actually get
>> the protected lambda:
>>
>> int i = 0;
>> std::cout << protect(arg1)(i) ; // print 0
>
> Have you tried it? I did and I get compiler error.

Yes I did. But I just found out that I had mixed lambda placeholders
with phoenix ones (in my own code I use the argN placeholders with
lambda too, for compatibility with boost.bind). It is interesting that
when using the arg1 phoenix placeholder with lambda.protect everything
compiles and gives me what I expected... weird.

Anyways, as I said in another email, I always confuse protect with
unlambda. I initially expected lambda[] to behave as unlambda.

>
>> I now understand why you need another evaluation round, and I see the
>> need for local variables in lambdas (I've missed them in boost.lambda,
>> and it was one of the reasons I was eagerly waiting for phoenix to be
>> reviewed).
>>
>> My only objection is that a lambda[f] which doesn't have any local
>> variables should just return 'f' and not a nullary. In fact I think
>> this should be a global property of lambda expressions:
>>
>> Let 'add' be a binary lazy function:
>>
>> 'add(arg1, 0)'
>>
>> should return an unary function (as it is currently the case). OTOH:
>>
>> 'add(1, 2)'
>>
>> should immediately be evaluated and not return a nullary function. In
>> practice, 'add' would be 'optionally lazy'. This is in fact not that
>> surprising: let's substitute add with its corresponding operator:
>>
>> 'arg1 + 0'
>>
>> returns an unary function, but
>>
>> '1 + 2'
>>
>> is immediately evaluated. I know this is a bit controversial and would
>> probably require large code changes, but probably a review is the best
>> place to comment on design aspects.
>
> IMO, it's not controversial. I've considered this approach a long
> time ago. It's actually doable: evaluate immediately when there are no
> placeholders in an expression. I'm not sure about the full effect
> of this behavior, OTOH. Such things should be taken very carefully.
> Mind you, the very impact of immediate evaluation on expressions like
> above already confuses people. Sometimes, the effect is subtle
> and is not quite obvious when you are dealing with complex
> lambda expressions. I know, from experiences with ETs (prime
> example is Spirit), that people get confused when an expression
> is immediate or not. The classic example:
>
> for_each(f, l, std::cout << 123 << _1)
>
> Oops! std::cout << 123 is immediate. But that's just for starters.
> With some heavy generic code, you can't tell by just looking at the
> code which expressions are being evaluated immediately or lazily.
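
The pitfall in the quoted example can be reproduced with a minimal
sketch in standard C++ (this is not Boost.Lambda or Phoenix code). The
`placeholder` type, the `_1` object and `lazy_print` are hypothetical
stand-ins placed in a `sketch` namespace, and a std::ostringstream
replaces std::cout so the result can be inspected:

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace sketch {

// Hypothetical stand-in for a lambda placeholder.
struct placeholder {};
constexpr placeholder _1{};

// The deferred part of the expression: streams its argument when called.
struct lazy_print {
    std::ostream* os;
    template <class T>
    void operator()(const T& x) const { *os << x; }
};

// Only "stream << placeholder" is lazy in this sketch.
inline lazy_print operator<<(std::ostream& os, placeholder) {
    return lazy_print{&os};
}

inline std::string demo() {
    std::ostringstream out;
    // "out << 123" is an ordinary stream insertion and runs right here;
    // only the trailing "<< _1" is captured for later.
    auto f = out << 123 << _1;
    f(1);
    f(2);
    return out.str();
}

}  // namespace sketch
```

Here `sketch::demo()` yields "12312": the 123 is written once, when the
expression is built, and not once per call as the code seems to say.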

Ok, I'll start by saying that for me this is not just "it would be
great if...". I routinely use an extension to boost.lambda that allows
one to define lazy functions a-la phoenix, except that they are lazy
only if one or more of their parameters is a lambda expression.
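
As a rough illustration of such an 'optionally lazy' function, here is
a sketch in plain C++14. It is not my actual extension and not the
Phoenix API: `arg1_t`, `arg1` and `add_t` are made-up names, and only a
placeholder in the first position is handled.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical placeholder type and object.
struct arg1_t {};
constexpr arg1_t arg1{};

struct add_t {
    // No placeholder among the arguments: compute the sum right away.
    // The trailing return type removes this overload when a + b is
    // ill-formed, which is the case for arg1_t.
    template <class A, class B>
    auto operator()(A a, B b) const -> decltype(a + b) {
        return a + b;
    }

    // Placeholder in first position: return a unary function instead.
    template <class B>
    auto operator()(arg1_t, B b) const {
        return [b](auto x) { return x + b; };
    }
};

constexpr add_t add{};

}  // namespace sketch
```

With this, `sketch::add(1, 2)` is simply 3, while
`sketch::add(sketch::arg1, 10)` is a function object that returns 15
when called with 5.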

I usually assume that an expression is not lazy unless I see a
placeholder in the right position. I do not think that generic code
makes things worse: I carefully wrap all function parameters with the
equivalent unlambda [1] *before* I pass them to higher order
functions. In fact I always have to because in general I do not know
if a certain function is optionally lazy or not. The additional
advantages of always using lambda[] are that:

- a lambda[] stands out in the code better than a placeholder, so it
is clearer what is going on (if you think about it, pretty much every
other language that supports a lambda abstraction has a lambda
introducer).

- the rule to determine the scope of a placeholder is simpler: it
doesn't cross a lambda[] barrier.

The biggest problem with optional laziness is in fact not in generic
code, but in simple top level imperative code: most of my lazy
function objects are pure functions, with the most common exception
being the 'for_each' wrapper. Sometimes I forget to protect the
function argument with lambda[], which makes the whole function call a
lambda expression.
You always use the result of a pure function, so the compiler will
loudly complain if it gets a lambda expression instead of the expected
type, but that is usually not the case with for_each, whose result is
normally discarded. So the code will compile, but will silently be a
no-op.
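
The silent no-op can be shown with a small standard C++14 sketch. All
names here (`lambda_expr`, `make_lazy`, `protect`, `for_each_t`) are
hypothetical and only mimic the behaviour described above; they are not
the boost.lambda or Phoenix entities of the same name.

```cpp
#include <cassert>
#include <vector>

namespace sketch {

// Tag wrapper marking a callable as a "lambda expression".
template <class F>
struct lambda_expr { F f; };

template <class F>
lambda_expr<F> make_lazy(F f) { return {f}; }

// lambda[]-style barrier: strips the tag and yields a plain callable.
template <class F>
F protect(lambda_expr<F> e) { return e.f; }

struct for_each_t {
    // Plain callable: run the loop now.
    template <class Range, class F>
    void operator()(const Range& r, F f) const {
        for (const auto& x : r) f(x);
    }

    // Lambda expression argument: the whole call becomes lazy and
    // nothing happens until the returned object is invoked.
    template <class Range, class F>
    auto operator()(const Range& r, lambda_expr<F> e) const {
        return make_lazy([&r, e]() { for (const auto& x : r) e.f(x); });
    }
};

constexpr for_each_t for_each{};

inline int count_visits(bool protect_it) {
    std::vector<int> v{1, 2, 3};
    int visits = 0;
    auto body = make_lazy([&visits](int) { ++visits; });
    if (protect_it)
        for_each(v, protect(body));  // loop runs
    else
        for_each(v, body);           // compiles, result discarded, no loop
    return visits;
}

}  // namespace sketch
```

`sketch::count_visits(true)` returns 3, `sketch::count_visits(false)`
returns 0, and the compiler accepts both calls without complaint.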

[1] I have rolled my own lambda[] syntax for this (which in addition
makes a boost.lambda expression result_of compatible), and this is
why the behavior of phoenix::lambda surprised me.

>
>> Anyways, I can live with the current design, 'optional laziness' could
>> be built on top of phoenix lazy functions. My only complaint is that
>> the 'lambda[]' syntax is already taken.
>
> I'd like to get convinced. Can you give me a nice use case
> for this 'optional laziness' thing that cannot be done with
> the current interface?
>

I find it very convenient to define functions that I use very often
(for example tuple accessors, generic algorithms, etc...) as
polymorphic function objects. I put a named instance in a header file
and I use them as normal functions. Except that I can pass them to
higher order functions without monomorphizing them. In addition I can
use the exact same name in a lambda expression without having to pull
in additional headers or namespaces. After a while it just feels
natural and you wish that all functions provided the same
capabilities.
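
For readers unfamiliar with the idiom, here is a minimal sketch of a
polymorphic function object with a named instance, in standard C++.
`first_t`, `first` and `keys` are illustrative names, not code from my
library:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Polymorphic accessor: works for any type with a .first member.
struct first_t {
    template <class Pair>
    auto operator()(const Pair& p) const -> decltype(p.first) {
        return p.first;
    }
};

// The named instance that would live in a header.
constexpr first_t first{};

// Passed to a higher order function as-is: no template arguments need
// to be spelled out, unlike with a function template.
inline std::vector<std::string>
keys(const std::vector<std::pair<std::string, int>>& v) {
    std::vector<std::string> out;
    std::transform(v.begin(), v.end(), std::back_inserter(out), first);
    return out;
}

}  // namespace sketch
```

The same `sketch::first` can also be called directly, as in
`sketch::first(std::make_pair(7, 'x'))`, which gives 7.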

The following snippet is taken from production code:

            map(
               ents.to_tokens(entities, tok),
               lambda[
                    tuple(
                         ll::bind(to_utf8, arg1)
                       , newsid, quote_id, is_about
                    )
                 ]
            ) | copy(_, out) ;

Some explanations:
* 'a | f' is equivalent to 'f(a)'. In practice I often use it to
chain range algorithms, but it can be used for anything.
* '_' is similar to 'arg1', except that it can only be used as a
parameter to a lazy lambda function and the resulting unary function
is no longer a lambda expression (so the 'lambdiness' doesn't
propagate).

map returns a pair of transform iterators, copy is the range
equivalent of std::copy and tuple is equivalent to make_tuple. All
three functions can be used both inside and outside of lambdas.
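
The '|' and '_' conventions can be approximated in a few lines of
standard C++14. This is only a sketch under simplifying assumptions:
`pipeable`, `make_pipeable`, `hole_t` and this `copy` are hypothetical,
`copy` appends to a container rather than writing to an output
iterator, and none of it is the production code shown above.

```cpp
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

namespace sketch {

// Wrapper marking a unary function as usable on the right of '|'.
template <class F>
struct pipeable { F f; };

template <class F>
pipeable<F> make_pipeable(F f) { return {f}; }

// 'a | f' is f(a).
template <class A, class F>
auto operator|(A&& a, const pipeable<F>& p) {
    return p.f(std::forward<A>(a));
}

// Stand-in for the '_' argument.
struct hole_t {};
constexpr hole_t _{};

struct copy_t {
    // Eager form: copy(range, out). The trailing return type removes
    // this overload when the first argument is not a range.
    template <class Range, class Out>
    auto operator()(const Range& r, Out& out) const
        -> decltype(std::begin(r), void()) {
        for (const auto& x : r) out.push_back(x);
    }

    // copy(_, out): a plain unary function, not a lambda expression.
    template <class Out>
    auto operator()(hole_t, Out& out) const {
        return make_pipeable([&out](const auto& r) { copy_t{}(r, out); });
    }
};

constexpr copy_t copy{};

inline std::vector<int> demo() {
    std::vector<int> in{1, 2, 3}, out;
    in | copy(_, out);  // same effect as copy(in, out)
    return out;
}

}  // namespace sketch
```

`sketch::demo()` returns a vector holding 1, 2, 3.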

So, it is not really a question of power, just of convenience. You can
always have two functions, one lazy and one not, but I like to have
everything packaged in a single place.

Compile times are of course not pretty, and it takes a lot of code to
roll this syntax on top of boost.lambda. I think that a port to
Phoenix will be much simpler and probably lighter at compile time.

>>>
>>> BOOST_TEST((val(ptr)->*&Test::value)() == 1);
>>
>> This doesn't compile
>> BOOST_TEST((arg1->*&Test::value)(test) == 1);
>>
>> because 'test' is not a pointer. OTOH boost::bind(&Test::value,
>> arg1)(x) compiles for both x of type "Test&" and "Test*".
>
> Ah! Good point. But, hey, doesn't -> imply "pointer"? bind
> OTOH does not imply pointer. But sure I get your point and it's
> easy enough to implement. Is this a killer feature for you?

No, not really, in fact I can see many would consider it needless
obfuscation. I just thought that it was a nicer notation than bind.
Anyways, -> doesn't necessarily imply pointer. See optional. In fact I
think that -> is, in general, a good substitute for the lack of an
overloadable 'operator.'; sometimes I wish that reference_wrapper did
provide it.
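
The "works for both Test& and Test*" behaviour of bind-style member
access can be seen with std::mem_fn, used here as a standard library
analogue of boost::bind (it is not Phoenix code). The `Test` struct
below is a hypothetical one-member version of the one in the test file:

```cpp
#include <cassert>
#include <functional>

namespace sketch {

struct Test {
    int value;
};

// The same member-pointer wrapper accepts an object by reference...
inline int via_reference(Test& t) {
    return std::mem_fn(&Test::value)(t);
}

// ...and a pointer to the object.
inline int via_pointer(Test* t) {
    return std::mem_fn(&Test::value)(t);
}

}  // namespace sketch
```

Both functions return the stored `value`, with no need for the caller
to know whether it holds a pointer or a reference.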

>
>> BTW, what if the member function is not nullary?
>>
>> struct foo {
>> void bar(int){}
>> };
>>
>> foo * x =...;
>> int y = 0;
>> ((arg1->*&foo::bar)(arg2))(x, y);
>>
>> The above (or any simple variation I could think of) doesn't compile.
>> Is there a way to make something like this work without using bind?
>> Not that it is very compelling, I'm just curious.
>
> The examples show how. I don't know why your example does not compile.
> Again, see /test/operator/member.cpp for examples on this. Here's
> one that's not nullary:
>
> struct Test
> {
> int func(int n) const { return n; }
> };
>
> ...
>
> BOOST_TEST((val(ptr)->*&Test::func)(3)() == 3);
>
> and I just added this test for you:
>
> int i = 33;
> BOOST_TEST((arg1->*&Test::func)(arg2)(cptr, i) == i);
>
> Compiles fine.

Yes, it compiles fine... I think I was trying to stream out the result
of a void function, sorry for the noise ;).
Thanks.
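
For comparison, the bind-based spelling of the non-nullary case looks
like this with std::bind, again as a standard library analogue rather
than Phoenix or boost::bind itself. `Test::func` mirrors the quoted
test and `call_through_bind` is an illustrative helper:

```cpp
#include <cassert>
#include <functional>

namespace sketch {

struct Test {
    int func(int n) const { return n; }
};

// Counterpart of (arg1->*&Test::func)(arg2)(cptr, i).
inline int call_through_bind(const Test* p, int n) {
    using namespace std::placeholders;
    auto f = std::bind(&Test::func, _1, _2);
    return f(p, n);
}

}  // namespace sketch
```

Calling `sketch::call_through_bind(&t, 33)` on any `Test t` returns 33.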

>
>>>> Well, that's all for now, those questions are mostly to get the
>>>> discussion rolling, more to come.
>>
>> An additional question: is it possible to make lambdas always
>> assignable? ATM, if you close around a reference (something which I
>> think is very common), the lambda expression is only
>
> Not sure what you mean by "close around a reference".

Hum, "the lambda expression closure captures a reference to a local object"

int i = 2;
auto plus_i = arg1 + ll::ref(i); //closes around i by reference

decltype(plus_i) y; // error, not default constructible
plus_i = plus_i; //error plus_i not assignable

>
>> CopyConstructible. This means that an iterator that holds a lambda
>> expression by value internally (think about filter_iterator or
>> transform iterator) technically no longer conforms to the Iterator
>> concept.
>
> Good point.

In fact I think the problem is not limited to iterators. AFAIK
standard algorithms require that their functional parameters be
assignable.

>
>> A simple fix, I think, is internally converting all references to
>> reference wrappers which are assignable.
>
> I wish it was that simple. Anyway, gimme some time to ponder on
> the impact of this on the implementation. Perhaps an easier way
> is to do as ref does: store references by pointer.

Sure.
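
The difference between the two storage strategies can be checked with
type traits. The two functors below are hand-written, hypothetical
equivalents of 'arg1 + ref(i)'; they are not what boost.lambda or
Phoenix generate internally.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

namespace sketch {

// Closes around i with a plain reference member.
struct plus_by_ref {
    int& i;
    int operator()(int x) const { return x + i; }
};

// Closes around i with a reference_wrapper, i.e. a pointer inside.
struct plus_by_wrapper {
    std::reference_wrapper<int> i;
    int operator()(int x) const { return x + i.get(); }
};

static_assert(std::is_copy_constructible<plus_by_ref>::value,
              "copyable");
static_assert(!std::is_copy_assignable<plus_by_ref>::value,
              "a reference member deletes copy assignment");
static_assert(std::is_copy_assignable<plus_by_wrapper>::value,
              "reference_wrapper can be rebound");

inline int rebind_demo() {
    int a = 2, b = 40;
    plus_by_wrapper f{std::ref(a)}, g{std::ref(b)};
    f = g;        // f now refers to b
    return f(1);
}

}  // namespace sketch
```

`sketch::rebind_demo()` returns 41. Note that neither functor is
default constructible, so this addresses assignment only.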

BTW, boost.lambda has the same problem; I use a wrapper that hides the
lambda in an optional (another service provided by my lambda[]). With
in_place construction you can implement assignment as a copy
construction. But it adds overhead.
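
A sketch of that wrapper idea, assuming C++17 and using std::optional
in place of boost::optional. `assignable` and `plus_ref` are
illustrative names and this is not my actual lambda[] implementation:

```cpp
#include <cassert>
#include <optional>
#include <type_traits>
#include <utility>

namespace sketch {

// A functor that is copy constructible but not assignable.
struct plus_ref {
    int& i;
    int operator()(int x) const { return x + i; }
};

template <class F>
class assignable {
    std::optional<F> f_;

public:
    explicit assignable(F f) : f_(std::move(f)) {}
    assignable(const assignable&) = default;

    // Assignment is destroy + copy construct in place.
    assignable& operator=(const assignable& other) {
        if (this != &other) {
            f_.reset();
            if (other.f_) f_.emplace(*other.f_);
        }
        return *this;
    }

    template <class... A>
    decltype(auto) operator()(A&&... a) const {
        return (*f_)(std::forward<A>(a)...);
    }
};

static_assert(!std::is_copy_assignable<plus_ref>::value, "");
static_assert(std::is_copy_assignable<assignable<plus_ref>>::value, "");

inline int demo() {
    int i = 2, j = 40;
    assignable<plus_ref> f{plus_ref{i}};
    assignable<plus_ref> g{plus_ref{j}};
    f = g;        // f now adds j
    return f(1);
}

}  // namespace sketch
```

`sketch::demo()` returns 41. The overhead mentioned above shows up as
the engaged flag stored inside the optional and the check on each
assignment.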

-- 
gpd
