Boost logo

Boost :

Subject: Re: [boost] [yap] review part 3: tests + misc + summary
From: Zach Laine (whatwasthataddress_at_[hidden])
Date: 2018-02-20 03:38:09


On Mon, Feb 19, 2018 at 11:13 AM, Steven Watanabe via Boost <
boost_at_[hidden]> wrote:

> AMDG
>
> I vote to *ACCEPT* YAP into Boost.
>
> Critical issues (I believe that Zach has already addressed
> most of these):
> - The eval_xxx customization points.
> - Handling of terminals for tag transforms.
> - BOOST_YAP_CONVERSION_OPERATOR_TEMPLATE.
> - Short-circuiting in evaluate.
> - A single macro for symmetric binary operators.
>

Yes, these are all done except the last one. It turns out there's a larger
usability concern, which is that there simply aren't enough macros to cover
different use cases. I think we need more that take constraining type
traits, and a call operator that only takes N parameters, perhaps with
constraints as well.

> TESTS:
>
> - External dependencies are strongly discouraged. (i.e. gtest).
> Especially do not include a copy of external components.
>

Right. I intended all along to transliterate those to Boost.Test. The
Google benchmark stuff will probably just stay on a non-Boost-submodule
branch. I never intended to check in either of those Google libraries to
an eventual Boost submodule, just to be clear.

> - Boost.Build is required.
>

Right. Will do.

> compile_test_main.cpp:
>
> - This is basically unnecessary, as the compile tests
> don't need to be linked.
>

That's just a CMake-ism. I don't know of an easier way to get CMake to
compile compile-only tests.

> copy_only_types.cpp:
>
> - This test is not useful, which is why it's #if'ed out.
> There is never a reason to create a type that can be
> copied from an lvalue but not an rvalue. (unless a copy
> isn't really a copy, in which case YAP can't handle either
> one anyway.)
>

That's cruft. I'll remove it.

> deref.cpp:
>
> - the test cases in this file only show pass/fail, they
> don't indicate the actual type returned by deref, which
> will make debugging failures harder. I typically use
> BOOST_MPL_ASSERT((is_same<X, Y>)); specifically for
> this reason (even when static_assert is available).
>

Good idea. I've made this change throughout the test files.

> - Actually, in general, these tests need to check both
> the type and the value. Checking just the type is
> insufficient.
>

Will do.

> user_expression_transform_3.cpp:
>
> - It took me a while to find these tests for transform,
> because I (incorrectly) assumed that user_expression_tranform*
> were all tests for the transform_expression customization
> point after I looked at the first two.
>

The first two are gone; they went away with the customization points. It's
a lot clearer now.

> vector_alloc_test.cpp:
>
> - Your replacement of operator new is technically
> illegal, because it can return a null pointer.
>
> - I don't think the standard guarantees that vector makes
> any specific number of allocations. The correct
> way is to set allocations = 0 after initializing
> the vectors and then verify that no further allocations
> happen. Alternately, make operator new throw a bad_alloc
> during the region where you expect no allocations.
> As a bonus, it'll be easier to find the cause of
> unexpected allocations, when they do happen.
>

Will do.

> You need some compile-fail tests:
> - invalid arguments to accessors, such as left, right, and get.
> - possibly switching the arguments of transform. (I find
> it mildly disturbing that this currently compiles, as
> transform is one function for which I may not always remember
> the correct argument order.)
>

Agreed. Will do.

> gcov says the following are not executed by the tests
> (line numbers as of master):
>
> algorithm.hpp:
> 130 (value_expr_impl),
> 232, 246 (get_impl),
> 300 (get_c),
> 408 (callable),
> 427 (argument)
> 701 (most of op_string)
>
> expression.hpp:
> 54, 62 (value),
> 69, 77 (left),
> 84, 92 (right),
> most unary operators (except negate)
> most binary operators (except multiplies and plus)
> 247 (call),
> 310 (value),
> 428-420 (pre_dec, post_inc, post_dec)
> most binary operators (except multiplies, plus, and minus)
>
> operators.hpp:
> many unary and binary operators.
>
> print.hpp
> 42 (print_value) - Isn't this overload for
> placeholders, rather than hana::llong?
> 50,52,54 (print_type),
> 87,89 (print_impl)
>
> detail/default_eval.hpp:
>
> 20 (eval_placeholder)
> many unary and binary operators.
>

I'll add better coverage for these.

> EXAMPLES:
>
> doc/user_macro_snippets.cpp:268
> - This should be is_vector not is_string.
>

Thanks, fixed.

> doc/other_snippets.cpp:90
> std::cout << evaluate(expr) << "\n"; // Prints "Yay." Wierd!
> s/Wierd/Weird/
>

I think this already got fixed. A recursive, case-insensitive grep found no
"wierd"s anywhere under yap/.

> General Notes:
>
> - There are three primary modes for processing an
> expression tree, which I will describe as:
> 1. transform: Takes a tree and produces another tree.
> 2. evaluate: Takes a tree and evaluates it in some way.
> 3. for_each: Processes each node and returns nothing.
> yap::transform can handle all three as long as you explicitly
> handle all nodes. The default behavior is (1), which
> makes (2) and (3) somewhat inconvenient. Your solution
> in the examples seems to be transform/evaluate for (2)
> and returning a dummy result for (3). Unfortunately,
> transform/evaluate doesn't work well if evaluation
> involves any special constructs that affect control-flow.
> Just try to handle something like this:
> let(_a=2) [ _a + 1 ] // evaluates to 3
>

That does not look problematic to me, though I don't know what the intended
semantics are. If _a is a terminal and refers to something for which "= 2"
and "+1" are well-formed, I would expect even evaluate() to do the right
thing.

> Returning a dummy result is a bit annoying, but shouldn't
> cause any real problems, as long as terminals are captured
> by reference in the result. All in all, I'd like to have
> some way to change the default behavior of transform, or
> perhaps have separate functions with different default behavior.
> Just an idea: switch the default behavior based on the
> result of transforming the subexpressions:
> Expression/non-Expression/void.
>

I don't yet get what you're suggesting here. Right now, transform()
returns whatever you tell it to, except that it doesn't special-case void
return. You can (I have have extensively in the examples) return an
Expression or non-Expression from a transform, or from any subset of
overloads your transform object provides. What is the behavior that you're
suggesting?

> This has the side effect that you must explicitly wrap
> terminals when returning, but I think that's actually a
> good thing, as a transform that returns unwrapped terminals,
> expecting them to be wrapped by the caller, may have
> inconsistent behaviour.
> In addition, there is another possible mode that has
> better type safety which I will call:
> 4. manual: No default behavior. If a node is not handled
> explicitly, it is a hard compile error.
>

But this isn't transform() at all, because it doesn't recurse. It only
matches the top-level expression, or you get a hard error. Why not just
write my_func() that takes a certain expression and returns another?
Calling it with the wrong type of expression will result your desired hard
error. Wait, is the use case that I think my transform matches all
subexpressions within the top-level expression, and I want to verify that
this is the case? I don't know how often that will come up. I can't think
of a single time I've written a Yap transform expecting it to match all
nodes, except to evaluate it. It could be useful in those cases now that I
think of it.

> - Combining transforms isn't exactly easy, because of
> the way transforms recurse. For example, if I have
> two transforms that process disjoint sets of nodes,
> I can't really turn them into a single transform.
>

I'm not necessarily opposed to the idea, but how would combining transforms
work? Would each node be matched against multiple transform objects, using
whichever one works, or something else?

> - How useful is it to separate the Expression concept
> from the ExpressionTemplate concept?

Types and templates are not the same kind of entity. I don't know how to
even go about combining these. Moreover, I think they should remain
separated, because sometimes a function takes an Expression, sometimes not,
sometimes it requires an ExpressionTemplate template parameter, and
sometimes not. These are orthogonal requirements for the caller, no?

> For example,
> transform requires an ExpressionTemplate for nodes
> that are not handled explicitly, but that isn't very
> clear from the documentation, which just says that
> it takes an Expression.
>

I don't understand what you mean. transform() does not require an
extrinsic ExpressionTemplate template parameter, and does not use one. It
just recycles the ExpressionTemplate that was originally used to
instantiate whatever it needs to copy.

> - You say that it's fine to mix and match expressions that
> are instantiated from different templates. Why would
> I want to do this? The first thing that comes to mind
> is combining two independent libraries that both use YAP,
> but I suspect that that won't work out very well.
> It seems a bit too easy for a transform to inadvertently
> recurse into an unrelated expression tree in this case.
>

I have two types, M and S (m and s are two objects of those respective
types). M is matrix-like, and has m*m in its set of well-formed
expressions; m[x] is ill-formed for any x. S is string-like, and has
s[int(2)] in its set of well-formed expressions; s*s is ill-formed. M and
S are in the same library, one that I maintain.

If I want to use these in Yap expressions, I probably want to be able to
write m*m and s[5], but not m[2] or s*s. So I write two expression
templates with the right operations defined:

template <...>
struct m_expr
{
    // ...
    BOOST_YAP_USER_BINARY_OPERATOR_MEMBER(times, ::m_expr)
};

template <...>
struct s_expr
{
    // ...
    BOOST_YAP_USER_BINARY_OPERATOR_MEMBER(aubscript, ::s_expr)
};

Now I can write a Yap expression like:

lookup_matrix(S("my_matrix")) * some_matrix

and transform() it how ever I like. Requiring the two *_expr templates to
be unified would be weird.

- The value of a terminal cannot be another expression.
> Is this a necessary restriction?

Not really; I probably put that restriction there due to lack of
imagination. Terminals' values should be treated as simple types, but I'm
pretty sure there's no real reason those types can't happen to model
Expression.

> (This is related to
> mixing different libraries using YAP, as the most
> common form is for library A to treat the expressions
> from library B as terminals. Since a terminal can't
> hold an Expression, we would need to somehow un-YAP
> them first.)
>

As I outlined above, I don't think that's going to be the most common case
of mixing ExpressionTemplates in a single Yap expression.

> - BOOST_YAP_BINARY_OPERATOR_MEMBER(assign) needs
> special handling because of the copy assignment
> operator.
>

Ah, thanks for pointing that out!

> - Unwrapping terminals redux:
> Unwrapping terminals is convenient when you have
> terminals of the form:
> struct xxx_tag {};
> auto xxx = make_terminal(xxx_tag{});
> AND you are explicitly matching this terminal with
> auto operator()(call_tag, xxx_tag, T...) or the like.
> I expect that matching other terminals like double
> or int is somewhat more rare in real code. If you
> are matching any subexpression, then unwrapping
> terminals is usually wrong.

Why do you say that? The only way it is the way it is now is convenience,
as you might have guessed. When I know I want to match a particular
terminal of a particular type, I'd rather write:

auto operator(some_unary_tag, some_type && x);
auto operator(some_unary_tag, some_type const & x);

versus:

auto operator(some_unary_tag, my_expr<yap::expr_kind::terminal, some_type
&&> const & x);
auto operator(some_unary_tag, my_expr<yap::expr_kind::terminal, some_type
const &> const & x);

This latter form is what I started with, and I very quickly grew tired of
writing all that.

If I match a generic expression that may or may not be a terminal, I do
have to use yap::as_expr(), which is definitely an inconvenience:

template <typename Expr>
auto operator(some_unary_tag, Expr const & x) {
    return some_function_of(yap::as_expr(x));
}

> Even in the first case,
> where unwrapping terminals is sometimes convenient,
> it prevents usage like:
> auto operator()(call_tag, decay_t<decltype(xxx)> const &);
> (Which is how I might write a matcher if xxx is not defined
> by me and xxx_tag is not a documented public symbol).
>

My intuition is that this is a far less common case than wanting to match a
particular type of terminal.

Zach


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk