
Subject: Re: [boost] [yap] review part 3: tests + misc + summary
From: Zach Laine (whatwasthataddress_at_[hidden])
Date: 2018-02-21 06:16:18

On Tue, Feb 20, 2018 at 9:29 AM, Steven Watanabe via Boost <
boost_at_[hidden]> wrote:

> On 02/19/2018 08:38 PM, Zach Laine via Boost wrote:
> > On Mon, Feb 19, 2018 at 11:13 AM, Steven Watanabe via Boost <
> > boost_at_[hidden]> wrote:


> >> Unfortunately,
> >> transform/evaluate doesn't work well if evaluation
> >> involves any special constructs that affect control-flow.
> >> Just try to handle something like this:
> >> let(_a=2) [ _a + 1 ] // evaluates to 3
> >>
> >
> > That does not look problematic to me, though I don't know what the
> > intended semantics are.
> > If _a is a terminal and refers to something for which "= 2"
> > and "+1" are well-formed, I would expect even evaluate() to do the right
> > thing.
> >
> _a is a placeholder. _a = 2 is not an assignment
> to some external object. It is a variable definition
> that is scoped within the let expression. It's not really
> possible to transform this into something that can be
> evaluated in a single pass. I claim that it is impossible
> to implement `let` using Yap without duplicating all the
> work of evaluate. (Keep in mind that `let` can be nested
> and can be mixed with if_else.)

That looks like a great candidate for an example, so I made one out of it:

I took most of the examples from the documentation page you posted above,
and it works for them, including nesting. For instance:

            let(_a = 1_p, _b = 2_p)
            [
                // _a here is an int: 1

                let(_a = 3_p) // hides the outer _a
                [
                    cout << _a << _b // prints "Hello, World"
                ]
            ],
            1, " World", "Hello,"

That's verbatim from the Phoenix docs (except for the yap::evaluate() call
of course), with the same behavior. The entire example is only 158 lines,
including empty lines and some comment lines. The trick is to make let() a
regular eager function and leave everything else lazy Yap expression
stuff. I don't know if this counts as evaluation in "a single pass" as you
first mentioned, but I don't care, because the user won't care either --
she can't really tell.


> > I don't yet get what you're suggesting here. Right now, transform()
> > returns whatever you tell it to,
> The issue is not what I'm telling it to do,
> but rather what it does on its own when I
> don't tell it anything.

I take it from this that you mean the copying of nodes unmatched by the
transform object. If so, I think this is covered by transform_strict() (as
I'm provisionally calling it), which hard-errors on unmatched nodes. Does
that suffice, or are there other aspects you find problematic?

> > except that it doesn't special-case void
> > return. You can (and I have, extensively in the examples) return an
> > Expression or non-Expression from a transform, or from any subset of
> > overloads your transform object provides. What is the behavior that
> > you're suggesting?
> >
> You've chosen to make it impossible to customize
> the behavior of evaluate. I believe that Brook
> brought up evaluate_with_context, which is basically
> what I want.

Yes, I have. I don't believe you actually want that, though. Such a
tailored evaluate() would look something like this:

evaluate_with_context(expr, context, placeholder_subs...); // the subs are of course optional

What does evaluate_with_context now do? Let's say expr is "a + b". Does
the context only apply to the evaluation of a and b as terminals, or does
it apply to the plus operation as well? Are such applications of the
context conditional? How does the reader quickly grasp what the
evaluate_with_context() call does? This seems like really muddy code to
me. If you have something else in mind, please provide more detail -- I
may of course be misunderstanding you.

> >> This has the side effect that you must explicitly wrap
> >> terminals when returning, but I think that's actually a
> >> good thing, as a transform that returns unwrapped terminals,
> >> expecting them to be wrapped by the caller, may have
> >> inconsistent behaviour.
> >> In addition, there is another possible mode that has
> >> better type safety which I will call:
> >> 4. manual: No default behavior. If a node is not handled
> >> explicitly, it is a hard compile error.
> >>
> >
> > But this isn't transform() at all, because it doesn't recurse. It only
> > matches the top-level expression, or you get a hard error. Why not just
> > write my_func() that takes a certain expression and returns another?
> You're right. The only benefit is that tag
> transforms are a bit more convenient. Also,
> it allows for a more consistent interface.
> > Calling it with the wrong type of expression will result in your desired
> > hard error. Wait, is the use case that I think my transform matches all
> > subexpressions within the top-level expression, and I want to verify that
> > this is the case? I don't know how often that will come up. I can't think
> > of a single time I've written a Yap transform expecting it to match all
> > nodes, except to evaluate it. It could be useful in those cases now that I
> > think of it.
> >
> If you're building a completely new grammar
> whose meaning has no relationship to the built
> in meaning of the operators, (e.g. Spirit),
> then you basically have to handle everything
> explicitly.

Ok, I'm convinced this is a good idea. I've added a GitHub ticket about
creating a transform_strict(), as I mentioned above.

> >> - Combining transforms isn't exactly easy, because of
> >> the way transforms recurse. For example, if I have
> >> two transforms that process disjoint sets of nodes,
> >> I can't really turn them into a single transform.
> >>
> >
> > I'm not necessarily opposed to the idea, but how would combining
> > transforms work? Would each node be matched against multiple transform
> > objects, using whichever one works, or something else?
> >
> Probably the behavior is to choose the
> first one that matches. That would make
> it easy to write a transform that overrides
> some behavior of another transform. (Note
> that I really have no idea how to make
> something like this actually work.)

Well if *you* don't... :) I'll wait until you get back to me with more
details before I bite off such a feature.

> >> - How useful is it to separate the Expression concept
> >> from the ExpressionTemplate concept?
> >
> >
> > Types and templates are not the same kind of entity. I don't know how to
> > even go about combining these. Moreover, I think they should remain
> > separated, because sometimes a function takes an Expression, sometimes
> > not, sometimes it requires an ExpressionTemplate template parameter, and
> > sometimes not. These are orthogonal requirements for the caller, no?
> >
> >
> >> For example,
> >> transform requires an ExpressionTemplate for nodes
> >> that are not handled explicitly, but that isn't very
> >> clear from the documentation, which just says that
> >> it takes an Expression.
> >>
> >
> > I don't understand what you mean. transform() does not require an
> > extrinsic ExpressionTemplate template parameter, and does not use one.
> > It just recycles the ExpressionTemplate that was originally used to
> > instantiate whatever it needs to copy.
> >
> Let me rephrase the question:
> Is it useful to allow Expressions that are not
> instantiations of an ExpressionTemplate, given
> that transform can choke on such classes.

Hm. I think this is a documentation failure. I need to look and see what
the actual requirements are. I think it should always be fine to have
non-template-derived Expressions in terminals. This is where one usually
writes them anyway.

> >> - You say that it's fine to mix and match expressions that
> >> are instantiated from different templates. Why would
> >> I want to do this? The first thing that comes to mind
> >> is combining two independent libraries that both use YAP,
> >> but I suspect that that won't work out very well.
> >> It seems a bit too easy for a transform to inadvertently
> >> recurse into an unrelated expression tree in this case.
> >>
> >
> > I have two types, M and S (m and s are two objects of those respective
> > types). M is matrix-like, and has m*m in its set of well-formed
> > expressions; m[x] is ill-formed for any x. S is string-like, and has
> > s[int(2)] in its set of well-formed expressions; s*s is ill-formed. M
> > and S are in the same library, one that I maintain.
> >
> > If I want to use these in Yap expressions, I probably want to be able to
> > write m*m and s[5], but not m[2] or s*s. So I write two expression
> > templates with the right operations defined:
> >
> > template <...>
> > struct m_expr
> > {
> > // ...
> > };
> >
> > template <...>
> > struct s_expr
> > {
> > // ...
> > };
> >
> > Now I can write a Yap expression like:
> >
> > lookup_matrix(S("my_matrix")) * some_matrix
> >
> > and transform() it however I like. Requiring the two *_expr templates
> > to be unified would be weird.
> >
> It seems a bit odd to have matrices and strings in
> one library, rather than matrices and vectors, but
> I get your point. Please consider adding something
> like this to the documentation.

Will do.

> >> - The value of a terminal cannot be another expression.
> >> Is this a necessary restriction?
> >
> >
> > Not really; I probably put that restriction there due to lack of
> > imagination. Terminals' values should be treated as simple types, but
> > I'm pretty sure there's no real reason those types can't happen to model
> > Expression.
> >
> >
> >> (This is related to
> >> mixing different libraries using YAP, as the most
> >> common form is for library A to treat the expressions
> >> from library B as terminals. Since a terminal can't
> >> hold an Expression, we would need to somehow un-YAP
> >> them first.)
> >>
> >
> > As I outlined above, I don't think that's going to be the most common
> > case of mixing ExpressionTemplates in a single Yap expression.
> >
> That's why I said mixing different *libraries*, not
> just mixing different templates from a single library
> (which are most likely intended to work together).
> The basic idea is that library A creates expressions
> that are models of some concept C, while library B
> expects a terminal that models the same concept C.
> library A and library B know nothing about each other,
> and are only connected because they both know about C.

I want to see a compelling example of this causing a problem. My
expectation is that my transform overloads will only match the types or
patterns of types I specify. If I visit N nodes in a tree, and half of
them are from library A and half from library B, my transform overloads are
probably going to match only within the half that I anticipated. If this
is not the case, it's probably because I've written the overloads loosely
on purpose, or too loosely and need to fix my code.

> >> - Unwrapping terminals redux:
> >> Unwrapping terminals is convenient when you have
> >> terminals of the form:
> >> struct xxx_tag {};
> >> auto xxx = make_terminal(xxx_tag{});
> >> AND you are explicitly matching this terminal with
> >> auto operator()(call_tag, xxx_tag, T...) or the like.
> >> I expect that matching other terminals like double
> >> or int is somewhat more rare in real code. If you
> >> are matching any subexpression, then unwrapping
> >> terminals is usually wrong.
> >
> >
> > Why do you say that? The only reason it is the way it is now is
> > convenience, as you might have guessed. When I know I want to match a
> > particular terminal of a particular type, I'd rather write:
> >
> > auto operator()(some_unary_tag, some_type && x);
> > auto operator()(some_unary_tag, some_type const & x);
> >
> > versus:
> >
> > auto operator()(some_unary_tag,
> >     my_expr<yap::expr_kind::terminal, some_type &&> const & x);
> > auto operator()(some_unary_tag,
> >     my_expr<yap::expr_kind::terminal, some_type const &> const & x);
> >
> > This latter form is what I started with, and I very quickly grew tired of
> > writing all that.
> >
> You can always use a typedef or alias, e.g. my_term<some_type&&>.

I decided to conduct this experiment and see how it went. I removed the
terminal_value() function and all its uses from default_eval.hpp; this is
all that was required to disable terminal unwrapping. Almost immediately,
I ran into something I did not want to deal with. From one of the tests:

        decltype(auto) operator()(
            yap::expr_tag<yap::expr_kind::call>, tag_type, double a, double b)
        { /* ... */ }

Now, the tag type part is not so bad. As you mentioned, I can just write
my_term<tag_type> or similar. Now, what about the doubles? In order to
match one of those,
I need to pick my_term<double>, my_term<double &>, my_term<double const &>,
etc., because the normal reference binding and copying rules of parameter
passing can no longer help me. It doesn't matter that I only want to use
"b" as a double by-value; what matters is what the user happened to write
in the expression I'm transforming. The alternative is of course to write
code to catch a plain ol' double, which I object to philosophically and
pragmatically. We're back to writing a and b as AExpr and BExpr
template-parameterized types, but because those might also match some ints
or std::strings or whatever, we're also going to need to write some sfinae
or concept constraints.

> > If I match a generic expression that may or may not be a terminal, I do
> > have to use yap::as_expr(), which is definitely an inconvenience:
> >
> > template <typename Expr>
> > auto operator()(some_unary_tag, Expr const & x) {
> >     return some_function_of(yap::as_expr(x));
> > }
> >
> My intuition is that this is the most common pattern,
> and should be what we optimize for. Also, changing
> terminals to yap::expression may cause random
> weirdness if the transform returns an Expression.
> In addition, I feel that needing to call as_expr
> is more surprising than needing to write out
> the terminal as an expression.

I agree that this is likely the more common pattern. It's just that when
you *do* want to match terminals, especially more than one at the same
time, the pain of doing it without terminal unwrapping is far greater than
the pain of using as_expr() in code like the common case above.

> Actually... I have a much better idea. Why
> don't you allow transform for a non-expr to match
> a terminal tag transform?

I've read this a few times now and cannot parse. Could you rephrase?


Boost list run by bdawes at, gregod at, cpdaniel at, john at