Subject: Re: [boost] [Phoenix] Some questions and notes...
From: Giovanni Piero Deretta (gpderetta_at_[hidden])
Date: 2008-09-26 11:34:21


On Fri, Sep 26, 2008 at 4:44 PM, Joel de Guzman
<joel_at_[hidden]> wrote:
> Giovanni Piero Deretta wrote:
>
>>> IMO, it's not controversial. I've considered this approach a long
>>> time ago. It's actually doable: evaluate immediately when there are no
>>> placeholders in an expression. I'm not sure about the full effect
>>> of this behavior, OTOH. Such things should be taken very carefully.
>>> Mind you, the very impact of immediate evaluation on expressions like
>>> above already confuses people. Sometimes, the effect is subtle
>>> and is not quite obvious when you are dealing with complex
>>> lambda expressions. I know, from experiences with ETs (prime
>>> example is Spirit), that people get confused when an expression
>>> is immediate or not. The classic example:
>>>
>>> for_each(f, l, std::cout << 123 << _1)
>>>
>>> Oops! std::cout << 123 is immediate. But that's just for starters.
>>> With some heavy generic code, you can't tell by just looking at the
>>> code which expressions are being evaluated immediately or lazily.
>>
>> Ok, I'll start by saying that for me this is not just "it would be
>> great if...". I routinely use an extension to boost.lambda that allows
>> one to define lazy functions a-la phoenix, except that they are lazy
>> only if one or more of their parameters is a lambda expression.
>
> Are you suggesting that you want phoenix functions to be "optionally
> lazy" too? Currently, they are not. That can be done, but I need
> more convincing.
>

I would love it if they were. But I can live with a layer on top of
that. About the convincing part... hmm, I guess the only way is to try
them and see if one finds them convenient. For me it started as an
experiment that worked.

>
>> I usually assume that an expression is not lazy unless I see a
>> placeholder in the right position. I do not think that generic code
>> makes things worse: I carefully wrap all function parameters with the
>> equivalent unlambda [1] *before* I pass them to higher order
>> functions. In fact I always have to because in general I do not know
>> if a certain function is optionally lazy or not.
>
> This is exactly the problem I see with generic code.
>

In general the compiler will loudly tell you if you get something
wrong (even if the errors might be unintelligible). Except of course
for something like the example below...

>> The biggest problem with optional laziness is in fact not in generic
>> code, but in simple top level imperative code: most of my lazy
>> function objects are pure functions, the most common exception being
>> the 'for_each' wrapper. Sometimes I forget to protect the function
>> argument with lambda[], which makes the whole function call a lambda
>> expression.
>> You always use the result of a pure function, so the compiler will
>> loudly complain if it gets a lambda expression instead of the expected
>> type, but that is not usually the case with for_each. So the code will
>> compile, but will silently be a nop.
>
> I'm not sure I understand this part. Can you explain this with
> simple examples?

Ok, let's assume for_each is optionally lazy:

   for_each(range, lambda[cout << arg1]);

will print all the elements. What if I forget the lambda?

  for_each(range, cout << arg1);

D'oh, now everything is a big unary lambda expression. It compiles,
but as the lambda is never evaluated, it is a nop. At least gcc
doesn't warn that the code doesn't do anything.
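
To make the failure mode concrete, here is a toy model of an optionally
lazy for_each. It is only a sketch under my own assumptions: Lazy,
Protected, protect, for_each_opt and run are made-up names (protect
stands in for lambda[]), none of them are Boost.Lambda or Phoenix APIs,
and I use C++0x/C++11 syntax for brevity.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Toy model only: none of these names are Boost.Lambda or Phoenix APIs.
struct Lazy {                         // a suspended unary expression
    std::function<void(int)> body;
};

struct Protected {                    // an expression wrapped by "lambda[]"
    std::function<void(int)> body;
};

// Stand-in for lambda[...]: marks the expression as a plain callback.
inline Protected protect(const Lazy& e) { return Protected{e.body}; }

// Eager overload: the callback is protected, so visit every element now.
inline void for_each_opt(const std::vector<int>& r, const Protected& f) {
    for (int x : r) f.body(x);
}

// Lazy overload: an unprotected lazy argument suspends the whole call.
// Nothing happens until somebody invokes the returned expression.
inline Lazy for_each_opt(const std::vector<int>& r, const Lazy& f) {
    return Lazy{[r, f](int) { for (int x : r) f.body(x); }};
}

// Usage: returns the sum accumulated by the callback.
inline int run(bool protect_callback) {
    int sum = 0;
    std::vector<int> v{1, 2, 3};
    Lazy add{[&sum](int x) { sum += x; }};
    if (protect_callback)
        for_each_opt(v, protect(add));  // eager: the callback runs
    else
        for_each_opt(v, add);           // lazy: result discarded, nothing runs
    return sum;
}
```

Both branches compile without complaint; only the protected call
touches sum.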

If your lazy functions are pure functions, you will always use the
value (to pass it to another function or store it in a variable or
whatever):

  int i = map(range_of_ints, lambda[ arg1 * 2]).front();

If you forget the lambda, the compiler will complain that the result of
map doesn't have a front() member.

C++0x auto will make things a bit trickier though:

  auto r2 = map(range_of_ints, arg1 * 2);

The user wanted r2 to be a range, but, as he forgot lambda[], it is
actually a lambda expression. The compiler will probably complain when
he tries to use r2, but the error will be more incomprehensible than
usual.

>>
>>>> Anyways, I can live with the current design, 'optional laziness' could
>>>> be built on top of phoenix lazy functions. My only complaint is that
>>>> the 'lambda[]' syntax is already taken.
>>>
>>> I'd like to get convinced. Can you give me a nice use case
>>> for this 'optional laziness' thing that cannot be done with
>>> the current interface?
>>>
>>
>> I find it very convenient to define functions that I use very often
>> (for example tuple accessors, generic algorithms, etc...) as
>> polymorphic function objects. I put a named instance in a header file
>> and I use them as normal functions. Except that I can pass them to
>> higher order functions without monomorphizing them. In addition I can
>> use the exact same name in a lambda expression without having to pull
>> in additional headers or namespaces. After a while it just feels
>> natural and you wish that all functions provided the same
>> capabilities.
>>
>> The following snippet is taken from production code:
>>
>> map(
>> ents.to_tokens(entities, tok),
>> lambda[
>> tuple(
>> ll::bind(to_utf8, arg1)
>> , newsid, quote_id, is_about
>> )
>> ]
>> ) | copy(_, out) ;
>>
>> Some explanations:
>> * 'a | f' is equivalent to 'f(a)'. In practice I often use it to
>> chain range algorithms, but it can be used for anything.
>> * '_' is similar to 'arg1', except that it can only be used as a
>> parameter to a lazy lambda function and the resulting unary function
>> is no longer a lambda expression (so the 'lambdiness' doesn't
>> propagate).
>>
>> map returns a pair of transform iterators, copy is the range
>> equivalent of std::copy and tuple is equivalent to make_tuple. All
>> three functions can be used both inside and outside of lambdas.
>>
>> So, it is not really a question of power, just of convenience. You can
>> always have two functions, one lazy and one not, but I like to have
>> everything packaged in a single place.
>>
>> Compile times are of course not pretty, and it takes lots of code to
>> roll this syntax on top of boost.lambda. I think that a port to
>> Phoenix will be much simpler and probably lighter at compile time.
>
> That is pretty cool code. I'm still not sure of the implications
> of all this though. I know for sure that people less smart than
> you tend to get bitten by expressions that are intended to
> be lazy but are actually immediate. Perhaps we can arrange
> for an "optionally-lazy" layer on top of phoenix: it can be done,
> phoenix is modular enough to have that layer.
>

It would be great.

> In general though, I tend to avoid special cases. This
> "optional laziness" is based on special casing depending
> on some qualities of a lambda function.

Well, I guess that is a point of view... As I see it, functions are
usually evaluated, unless some of the arguments are suspended: it is
not eager evaluation that is special, but laziness (or partial
application or whatever you want to call it).
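
A minimal sketch of that rule, assuming a single hypothetical
placeholder: the call is evaluated at once for ordinary arguments and
is suspended only when the argument is the placeholder. The names
arg1_t, arg1, optionally_lazy, suspended and times_two are invented for
this illustration and are not Phoenix or Boost.Lambda names; a real
layer would also have to recognize whole lambda expressions, not just a
bare placeholder.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Toy model only; these are not Phoenix or Boost.Lambda names.
struct arg1_t {};                 // hypothetical placeholder type
constexpr arg1_t arg1{};

// Wraps a unary function object: eager for plain values, suspended
// when handed the placeholder.
template <class F>
struct optionally_lazy {
    F f;

    // The suspended call: just remembers f until a value arrives.
    struct suspended {
        F f;
        template <class T>
        auto operator()(T&& x) const -> decltype(f(std::forward<T>(x))) {
            return f(std::forward<T>(x));
        }
    };

    // Eager path: a real argument, so call f right away.
    template <class T,
              class = typename std::enable_if<
                  !std::is_same<typename std::decay<T>::type,
                                arg1_t>::value>::type>
    auto operator()(T&& x) const -> decltype(f(std::forward<T>(x))) {
        return f(std::forward<T>(x));
    }

    // Lazy path: the argument is the placeholder, so return a callable.
    suspended operator()(arg1_t) const { return suspended{f}; }
};

// An ordinary function object to wrap.
struct times_two {
    int operator()(int x) const { return 2 * x; }
};
```

With optionally_lazy<times_two> dbl{times_two{}}, the call dbl(21)
yields an int immediately, while dbl(arg1) yields a function object
that still has to be called with a value.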

> This may be outside
> our subject, but this same special casing is the reason why
> I rejected lambda's design to have optional-reference-capture
> on the LHS. For example, this is allowed in lambda:
>
> int i = 0;
> i += _1;
>
> But in Phoenix, it would have to be:
>
> int i = 0;
> ref(i) += _1;
>
> All variables are captured by value, always. I know that's
> off the subject, but I'm sure you see what I mean.

I didn't even know that lambda captured by reference in this case (if
it is documented, I missed it)! I always use ref(i)... which raises
another question: could you add a shorthand syntax to say that a
specific variable must be captured by reference? Lambda has one, but it
is a bit cumbersome:

  int i = 0;
  var_type<int>::type r_i(var(i));
  (_1 + r_i)(0); // capture by ref

something like this is desirable (and in fact can be done easily, but
IMHO should be part of the library):

  byref<int> i = 0;
  (_1 + i)(0); // capture by ref
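
Roughly what I have in mind, as a self-contained sketch rather than a
proposal for the actual interface: byref<T> owns the value, and any
expression built from it stores a reference. Everything in namespace
sketch (arg1, byref, plus_ref and the operator+) is hypothetical and
only covers the single 'arg1 + i' case.

```cpp
#include <cassert>

namespace sketch {

struct arg1_t {};                  // hypothetical placeholder
constexpr arg1_t arg1{};

// Hypothetical byref<T>: holds a T and tells expressions to refer to it
// instead of copying it.
template <class T>
class byref {
public:
    byref(T v) : value_(v) {}
    byref& operator=(T v) { value_ = v; return *this; }
    T& get() { return value_; }
private:
    T value_;
};

// Expression node for 'arg1 + i'; rhs is a reference to the byref's value.
template <class T>
struct plus_ref {
    T& rhs;
    T operator()(T lhs) const { return lhs + rhs; }
};

// Building the expression captures the variable by reference.
template <class T>
plus_ref<T> operator+(arg1_t, byref<T>& r) {
    return plus_ref<T>{r.get()};
}

} // namespace sketch
```

Because the node holds a reference, a later assignment to the byref
variable is visible the next time the expression is called.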

BTW, yet another reason for always requiring lambda[] or something
equivalent: the library could delay the decision of capturing by
reference or by value by default up to the lambda introducer (I think
I already mentioned something like this in the past):

  int i = 0;

  lambda[ i += arg1 ];   // default capture by value
  lambda_r[ i += arg1 ]; // default capture by reference

It would make Phoenix similar to C++0x lambdas.
Without requiring lambda[] [1] I do not see how this could be
implemented (especially if you want the default capture behavior to be
by value).

[1] or a different set of placeholders... messy!
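
For comparison, this is how the two defaults look with C++0x lambdas,
written in the syntax that ended up in C++11 ([=] for by-value, [&] for
by-reference). The function names are mine and exist only for the
example.

```cpp
#include <cassert>

// [=] copies i into the closure; 'mutable' lets the body modify that
// private copy, so the caller's i is untouched.
inline int by_value_demo() {
    int i = 0;
    auto f = [=](int x) mutable { i += x; };
    f(5);
    return i;
}

// [&] refers to the caller's i, so the update is visible afterwards.
inline int by_reference_demo() {
    int i = 0;
    auto f = [&](int x) { i += x; };
    f(5);
    return i;
}
```

The capture default is chosen once, at the lambda introducer, which is
exactly the place where lambda[] and lambda_r[] would sit.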

>
>>>>> BOOST_TEST((val(ptr)->*&Test::value)() == 1);
>>>>
>>>> This doesn't compile
>>>> BOOST_TEST((arg1->*&Test::value)(test) == 1);
>>>>
>>>> because 'test' is not a pointer. OTOH boost::bind(&Test::value,
>>>> arg1)(x) compiles for both x of type "Test&" and "Test*".
>>>
>>> Ah! Good point. But, hey, doesn't -> imply "pointer"? bind
>>> OTOH does not imply pointer. But sure I get your point and it's
>>> easy enough to implement. Is this a killer feature for you?
>>
>> No, not really, in fact I can see many would consider it needless
>> obfuscation. I just thought that was a nicer notation than bind.
>> Anyways, -> doesn't necessarily imply pointer. See optional.
>
> Hah, I don't want to get in that optional pointer semantics argument
> again :P Least I can say is that I never liked it. OptionalPointee
> (http://tinyurl.com/3fwlp9) does imply pointer.
>

Ok, as I said, it is not a killer feature :)

I want to comment about switch_, but I'll do it by replying to another
thread. After that, time to write a review :)

-- 
gpd
