Subject: Re: [boost] [Phoenix] Some questions and notes...
From: Joel de Guzman (joel_at_[hidden])
Date: 2008-09-26 10:44:14


Giovanni Piero Deretta wrote:

>> IMO, it's not controversial. I've considered this approach a long
>> time ago. It's actually doable: evaluate immediately when there are no
>> placeholders in an expression. I'm not sure about the full effect
>> of this behavior, OTOH. Such things should be taken very carefully.
>> Mind you, the very impact of immediate evaluation on expressions like
>> above already confuses people. Sometimes, the effect is subtle
>> and is not quite obvious when you are dealing with complex
>> lambda expressions. I know, from experiences with ETs (prime
>> example is Spirit), that people get confused when an expression
>> is immediate or not. The classic example:
>>
>> for_each(f, l, std::cout << 123 << _1)
>>
>> Oops! std::cout << 123 is immediate. But that's just for starters.
>> With some heavy generic code, you can't tell by just looking at the
>> code which expressions are being evaluated immediately or lazily.
>
> Ok, I'll start by saying that for me this is not just "it would be
> great if...". I routinely use an extension to boost.lambda that allows
> one to define lazy functions a-la phoenix, except that they are lazy only
> if one or more of their parameters is a lambda expression.

Are you suggesting that you want phoenix functions to be "optionally
lazy" too? Currently, they are not. That can be done, but I need
more convincing.
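
To make the term concrete, here is a minimal sketch of what an "optionally lazy" function could look like. It is not Phoenix or Boost.Lambda code: the `sketch` namespace, `placeholder`, `arg1`, `lazy_add` and `add` are all hypothetical names, and only the first argument position is handled.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical placeholder type standing in for a library's arg1 / _1.
struct placeholder {};
constexpr placeholder arg1{};

// Deferred result of add(arg1, n): stores n by value and evaluates when called.
struct lazy_add {
    int rhs;
    int operator()(int lhs) const { return lhs + rhs; }
};

// One function object, two behaviors selected by the argument types.
struct add_fn {
    // No placeholder among the arguments: evaluate immediately.
    int operator()(int a, int b) const { return a + b; }
    // Placeholder in the first position: return a deferred computation.
    lazy_add operator()(placeholder, int b) const { return lazy_add{b}; }
};

constexpr add_fn add{};

// Exercises both paths.
inline void demo() {
    assert(add(1, 2) == 3);          // immediate
    assert(add(arg1, 2)(40) == 42);  // lazy, evaluated on the second call
}

}  // namespace sketch
```

The call site alone does not say which overload runs; that depends on whether an argument happens to be a placeholder, which is the property under discussion.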

> I usually assume that an expression is not lazy unless I see a
> placeholder in the right position. I do not think that generic code
> makes things worse: I carefully wrap all function parameters with the
> equivalent unlambda [1] *before* I pass them to higher order
> functions. In fact I always have to because in general I do not know
> if a certain function is optionally lazy or not.

This is exactly the problem I see with generic code.

> The additional
> advantages of always using lambda[] are that:
>
> - a lambda[] stands out in the code better than a placeholder, so it
> is clearer what is going on (if you think about it, pretty much every
> other language that supports a lambda abstraction has a lambda
> introducer).

Yes.

> - the rule to determine the scope of a placeholder is simpler: it
> doesn't cross a lambda[] barrier.

Good point.

> The biggest problem with optional laziness is in fact not in generic
> code, but in simple top level imperative code: most of my lazy
> function objects are pure functions, with the most common exception
> the 'for_each' wrapper. Sometimes I forget to protect the function
> argument with lambda[], which makes the whole function call a lambda
> expression.
> You always use the result of a pure function, so the compiler will
> loudly complain if it gets a lambda expression instead of the expected
> type, but it is not usually the case with for_each. So the code will
> compile, but will be silently a nop.

I'm not sure I understand this part. Can you explain this with
simple examples?

> [1] I have rolled my own lambda[] syntax for this (which in addition
> makes a boost.lambda expression result_of compatible), and this is
> why the behavior of phoenix::lambda surprised me.
>
>>> Anyways, I can live with the current design, 'optional laziness' could
>>> be built on top of phoenix lazy functions. My only complaint is that
>>> the 'lambda[]' syntax is already taken.
>> I'd like to get convinced. Can you give me a nice use case
>> for this 'optional laziness' thing that cannot be done with
>> the current interface?
>>
>
> I find it very convenient to define functions that I use very often
> (for example tuple accessors, generic algorithms, etc...) as
> polymorphic function objects. I put a named instance in an header file
> and I use them as normal functions. Except that I can pass them to
> higher order functions without monomorphizing them. In addition I can
> use the exact same name in a lambda expression without having to pull
> in additional headers or namespaces. After a while it just feels
> natural and you wish that all functions provided the same
> capabilities.
>
> The following snippet is taken from production code:
>
> map(
>     ents.to_tokens(entities, tok),
>     lambda[
>         tuple(
>             ll::bind(to_utf8, arg1)
>           , newsid, quote_id, is_about
>         )
>     ]
> ) | copy(_, out) ;
>
> Some explanations:
> * 'a | f' is equivalent to 'f(a)'. In practice I often use it to
> chain range algorithms, but it can be used for anything.
> * '_' is similar to 'arg1', except that it can only be used as a
> parameter to a lazy lambda function and the resulting unary function
> is no longer a lambda expression (so the 'lambdiness' doesn't
> propagate).
>
> map returns a pair of transform iterators, copy is the range
> equivalent of std::copy and tuple is equivalent to make_tuple. All
> three functions can be used both inside and outside of lambdas.
>
> So, it is not really a question of power, just of convenience. You can
> always have two functions, one lazy and one not, but I like to have
> everything packaged in a single place.
>
> Compile times are of course not pretty, and it requires lots of code to
> roll this syntax on top of boost.lambda. I think that a port to
> Phoenix will be much simpler and probably lighter at compile time.

That is pretty cool code. I'm still not sure of the implications
of all this, though. I know for sure that people less smart than
you tend to get bitten by expressions that are intended to
be lazy but are actually immediate. Perhaps we can arrange
for an "optionally lazy" layer on top of phoenix: it can be done,
phoenix is modular enough to have that layer.
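
The "intended to be lazy but actually immediate" trap can be reproduced without any library. The sketch below is only an illustration under stated assumptions: `sketch::placeholder`, `arg1` and `lazy_print` are hypothetical names, not Phoenix types, and a `std::ostringstream` stands in for `std::cout` so the output can be inspected.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

namespace sketch {

// Hypothetical placeholder type standing in for _1.
struct placeholder {};
constexpr placeholder arg1{};

// Deferred "stream << argument": keeps the stream by pointer, writes when called.
struct lazy_print {
    std::ostream* os;
    template <class T>
    void operator()(const T& value) const { *os << value; }
};

// Only the sub-expression that involves the placeholder becomes lazy.
inline lazy_print operator<<(std::ostream& os, placeholder) {
    return lazy_print{&os};
}

// What has been written before the lazy part is ever invoked.
inline std::string written_before_call() {
    std::ostringstream out;
    auto f = out << 123 << arg1;  // `out << 123` is ordinary iostream code: it runs now
    (void)f;                      // f is never called
    return out.str();
}

// Total output after invoking the lazy part twice.
inline std::string written_after_two_calls() {
    std::ostringstream out;
    auto f = out << 123 << arg1;
    f(4);
    f(5);
    return out.str();
}

}  // namespace sketch
```

Here `written_before_call()` yields "123" and `written_after_two_calls()` yields "12345": the constant is printed once, at construction, instead of once per element.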

In general though, I tend to avoid special cases. This
"optional laziness" is based on special casing depending
on some qualities of a lambda function. This may be outside
our subject, but this same special casing is the reason why
I rejected lambda's design to have optional-reference-capture
on the LHS. For example, this is allowed in lambda:

    int i = 0;
    i += _1;

But in Phoenix, it would have to be:

    int i = 0;
    ref(i) += _1;

All variables are captured by value, always. I know that's
off the subject, but I'm sure you see what I mean.
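
As an analogy only (plain C++11 lambdas, not Phoenix or Boost.Lambda expressions), the difference between the two capture policies can be sketched like this. The function names are made up for the illustration.

```cpp
#include <cassert>

namespace sketch {

// By-value capture: the closure updates its own copy,
// so the caller's variable is left untouched.
inline int add_through_value_capture(int start, int amount) {
    int i = start;
    auto f = [i](int x) mutable { i += x; };
    f(amount);
    return i;  // still equal to start
}

// By-reference capture: the closure updates the caller's variable,
// which is what an explicit ref(i) asks for.
inline int add_through_reference_capture(int start, int amount) {
    int i = start;
    auto f = [&i](int x) { i += x; };
    f(amount);
    return i;  // start + amount
}

// Exercises both policies.
inline void demo() {
    assert(add_through_value_capture(0, 5) == 0);
    assert(add_through_reference_capture(0, 5) == 5);
}

}  // namespace sketch
```

With an always-by-value rule, the reference case has to be spelled out at the call site, so the reader never has to guess which of the two behaviors applies.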

>>>> BOOST_TEST((val(ptr)->*&Test::value)() == 1);
>>> This doesn't compile
>>> BOOST_TEST((arg1->*&Test::value)(test) == 1);
>>>
>>> because 'test' is not a pointer. OTOH boost::bind(&Test::value,
>>> arg1)(x) compiles for both x of type "Test&" and "Test*".
>> Ah! Good point. But, hey, doesn't -> imply "pointer"? bind
>> OTOH does not imply pointer. But sure I get your point and it's
>> easy enough to implement. Is this a killer feature for you?
>
> No, not really, in fact I can see many would consider it needless
> obfuscation. I just thought that was a nicer notation than bind.
> Anyways, -> doesn't necessarily imply pointer. See optional.

Hah, I don't want to get in that optional pointer semantics argument
again :P The least I can say is that I never liked it. OptionalPointee
(http://tinyurl.com/3fwlp9) does imply pointer.
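
For reference, the bind behavior mentioned earlier (accepting both "Test&" and "Test*") can be sketched with std::bind, which follows the same rule for member pointers. The `Test` struct below is a hypothetical stand-in; the thread does not show its definition.

```cpp
#include <cassert>
#include <functional>

namespace sketch {

// Hypothetical stand-in for the Test type used in the thread.
struct Test {
    int value() const { return 1; }
};

// The same bound member function is invoked through a reference...
inline int call_on_reference(const Test& t) {
    auto get = std::bind(&Test::value, std::placeholders::_1);
    return get(t);
}

// ...and through a pointer, without any change to the bind expression.
inline int call_on_pointer(const Test* p) {
    auto get = std::bind(&Test::value, std::placeholders::_1);
    return get(p);
}

// The built-in ->* operator, by contrast, needs a pointer on its left side.
inline int call_with_builtin_operator(const Test* p) {
    return (p->*&Test::value)();
}

}  // namespace sketch
```

The bind form dispatches on the argument type, while the built-in `->*` is defined only for pointers, which is the asymmetry being discussed.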

>>>>> Well, that's all for now, those questions are mostly to get the
>>>>> discussion rolling, more to come.
>>> An additional question: is it possible to make lambdas always
>>> assignable? ATM, if you close around a reference (something which I
>>> think is very common), the lambda expression is only
>> Not sure what you mean by "close around a reference".
>
> Hum, "the lambda expression closure captures a reference to a local object"
>
> int i = 2;
> auto plus_i = arg1 + ll::ref(i); //closes around i by reference
>
> decltype(plus_i) y; // error, not default constructible
> plus_i = plus_i; //error plus_i not assignable
>
>>> CopyConstructible. This means that an iterator that holds a lambda
>>> expression by value internally (think about filter_iterator or
>>> transform iterator) technically no longer conforms to the Iterator
>>> concept.
>> Good point.
>
> In fact I think the problem is not limited to iterators. AFAIK
> standard algorithms require that their functional parameters be
> assignable.
>
>>> A simple fix, I think, is internally converting all references to
>>> reference wrappers which are assignable.
>> I wish it was that simple. Anyway, gimme some time to ponder on
>> the impact of this on the implementation. Perhaps an easier way
>> is to do as ref does: store references by pointer.
>
> Sure.

Understood. I'll see what I can do in that regard. I agree with
you. That's a good observation, and thanks for pointing it out.
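
A minimal sketch of why storing the reference as a pointer restores assignability. The two structs are hypothetical closure types written by hand for the illustration, not what Phoenix generates.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Closure that keeps the captured variable as a reference member:
// copy-constructible, but the implicit copy assignment is deleted.
struct plus_by_reference {
    int& captured;
    int operator()(int x) const { return x + captured; }
};

// Closure that keeps the same variable by pointer: fully assignable.
struct plus_by_pointer {
    int* captured;
    int operator()(int x) const { return x + *captured; }
};

static_assert(std::is_copy_constructible<plus_by_reference>::value,
              "a reference member still allows copy construction");
static_assert(!std::is_copy_assignable<plus_by_reference>::value,
              "a reference member deletes copy assignment");
static_assert(std::is_copy_assignable<plus_by_pointer>::value,
              "a pointer member keeps copy assignment");

// Assignment rebinds the pointer-based closure to another variable.
inline int rebind_example() {
    int i = 2;
    int j = 10;
    plus_by_pointer f{&i};
    plus_by_pointer g{&j};
    f = g;        // f now refers to j
    return f(1);  // 1 + 10
}

}  // namespace sketch
```

The pointer-based version is also default-constructible, which addresses the other error in the quoted example.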

Regards,

-- 
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net
