Boost :

Date view	Thread view	Subject view	Author view

Subject: Re: [boost] [Boost-users] Yap's formal review is starting now!
From: Brook Milligan (brook_at_[hidden])
Date: 2018-02-16 20:25:28

Next message: Steven Watanabe: "Re: [boost] Yap's formal review is starting now!"
Previous message: Zach Laine: "Re: [boost] Yap's formal review is starting now!"
In reply to: Louis Dionne: "[boost] Yap's formal review is starting now!"
Next in thread: Zach Laine: "Re: [boost] [Boost-users] Yap's formal review is starting now!"
Reply: Zach Laine: "Re: [boost] [Boost-users] Yap's formal review is starting now!"

> On Feb 5, 2018, at 8:53 AM, Louis Dionne via Boost via Boost-users <boost-users_at_[hidden]> wrote:
>
> We encourage your participation in this review. At a minimum, please state:
>
> - Whether you believe the library should be accepted into Boost

See below.

> - Your name

Brook Milligan

> - Your knowledge of the problem domain

I have been working with Yap in production for the past year and with Proto before that. The transition has perhaps colored how I originally approached the use of Yap, which is evident in some of my earlier comments and questions. However, having implemented domain-specific languages in both Proto and Yap, I feel that I have gathered some expertise in the area of expression templates.

> * Design

Overall, the design of Yap seems excellent, although I do agree with the comments regarding potential modifications to the expression concept. In particular, I like Peter's idea of making the tag a type and the idea of making the expression template variadic (instead of using a tuple):

template<expr_kind kind> struct expression_tag {};

using plus_tag = expression_tag<expr_kind::plus>;

template < typename Tag, typename ... T > class expression;

It seems to me that this would make matching expressions much more flexible, but perhaps I am missing something. It might also some dispatching in the implementation to make things simpler?

It is clear that the key to flexibility is the ability to write transforms, which is something I have not done a lot of. Thus, it is crucial to get the expression concept right so that generic transforms are easy and matching specific expressions is also easy. My intuition suggests the definition above would be good, but this should be hashed out carefully, and the rationale documented.

> * Implementation

I have looked through much of the implementation at various points, but feel that others are better positioned to comment more substantively than me. Mostly, I have looked through this trying to figure out how various parts are working, because I could not find the information in the documentation; see below.

> * Documentation

The documentation is the weakest point of Yap. In particular, it does not establish (in my mind at least) a clear understanding of the rationale and design of the primary use case, which seems to be the evaluate(transform()) idiom discussed on the list.

Introduction: This needs to guide the reader to think about Yap idioms in the right way. The motivation sections seems clear: we need Yap because expression templates are valuable but Proto is too complicated. Perhaps, however, this needs to be recast a bit to avoid bringing in Proto at the first. Perhaps a better motivation is (i) what are expression templates and why are they useful (briefly; refer to a more detailed section that develops this more completely) and (ii) what are some quick examples (really short code snippets, not necessarily complete programs).

The features are also clear, but only with a fair bit of background. It also seems that the features are not necessarily in an order that reflects either their importance or the way a user would encounter them. A better list might be:

- A simple Expression concept easily modeled by user code. Member and non-member functions on Expressions can be added with compact macros, and a reference implementation exists for prototyping or experimentation.

- Evaluation of expression trees matches the semantics of builtin C++ expressions as closely as possible. All operator expressions are evaluated using the corresponding operator definitions for the types involved. This leads to clearer understanding of the semantics of expression tree evaluation, because the definitions are local to the types involved. This is one point at which the design of Yap diverges significantly from that of Proto; in the latter, the semantics of expression tree evaluation is determined by context classes. Consequently, it is much more difficult to reason about the semantics in Proto than in Yap. [ This last bit could also be in a Proto v. Yap section. ]

- Expression trees may be transformed explicitly in a user-defined way. This is accomplished with overloaded member functions in a transform class, each one corresponding to subexpressions in the expression tree. While these member functions may transform a subexpression into anything, a common pattern is to transform terminals into either new terminals or appropriate values and to leave other subexpressions unchanged. This evaluate(transform()) idiom is expected to be one of the most common ways of using Yap to manipulate and evaluate expression trees. Even though this use case involves copying the expression tree, it is expected that the new terminals or values will be inexpensive to copy so that the cost of transformation is small.

- Functions that operate on expression trees or create expressions. A set of functions is provided (and used by the library) to manipulate expression trees or their components. These should simplify the process of, for example, writing user-defined transforms.

- Evaluation of expression trees is generally [ what does this mean? ] equivalent to evaluating the corresponding native code. Thus, there should be no [ true? ] performance cost to using expression trees.

- Expression trees may be encapsulated within function objects that evaluate the expression when called. This provides a nice bridge with other function-based libraries.

- ... [ Maybe there are other (or fewer) features to emphasize. The point is to expand a bit on what the features mean to the user rather than just list them. It is also to organize them linearly from most to least important/relevant to the user. ]

Tutorial: if there are examples that illustrate points in the tutorial, there should be links from the tutorial page to the corresponding example.

Expressions: there is an example at the bottom of the page of a problem that recurs elsewhere. The phrase "there are lots of examples of how to do this in the Examples section" is not helpful without references to which particular examples illustrate this point. It should also be clear in the examples what points they are examples of.

Kinds of expressions: over the past year I have found myself consulting this page many times expecting a list of the kinds of expressions, i.e., all the tags, etc. Why is that not here? Why does this only refer to "special" kinds?

Operators: comments like "you can write your own operators that return Expressions if you like" would be much more useful with the following: (i) an indication of when that would be appropriate (is this just another joke, as in "be a masochist, write your own", or is this "sometimes you legitimately need to and so you can"?) and (ii) examples of how to do this. I believe there are similar points throughout the documentation to which analogous comments apply. It would also be useful to have a link to all the operator macros here.

Explicit transformations: the phrases "that will be placed into result at the corresponding place in result" seems confusing. Especially given the central role played by transforms, I feel that a much clearer description of their operation is needed. This could leverage a detailed section on expression trees generally and (another one?) on the representation of them in Yap. The latter is implicit in the expression concept, but clarity would be improved by making it explicit and by relating it to the more general concepts.

The sentence "Otherwise, N is used, and visitation of nodes in expr continues below N." seems misleading if I am understanding transforms correctly. First, the "N is used" part is very vague; used for what specifically? Second, the "visitation continues" part seems wrong. Isn't the recursion entirely under the control of the transform? If so, then how can this assertion be made? Perhaps you mean something like "sane transforms should often be written to have this property". Again, this gets at the need to provide much clearer best practices information that is clearly distinct from tutorials and reference material. These all serve different needs and should not be interspersed in the documentation.

The tag and expression forms of matching in transforms are mentioned on this page, but documented elsewhere. This is difficult organization. Reorganize this to bring the conceptually related parts together.

Evaluating expressions: Given the direction of the discussion (i.e., dropping customization points), it seems that this will be reduced to "Boost.YAP expressions are explicitly evaluated by calling the evaluate()". It seems, however, that this could be a good point to include information on (i) evaluation of operators follows native semantics for the types involved, (ii) the value of local reasoning about semantics, (iii) a brief discussion of equivalencies and differences between the underlying native code and what is evaluated by evaluate().

Operator macros: I think this makes more sense in "Operators".

How expression operands are treated: most of the information here (i.e., not the table and specifics relating to is) seem more appropriate in a section dealing with the design of Yap expressions and the implementation of them. This is trying to motivate the correspondence between C++ expressions and Yap expression trees; that needs unification elsewhere. The last bit is more about specifics and should be reference material.

Customization points: I assume this will disappear.

Transform matching: this information needs to be integrated into a unified description of transforms.

Reference: why is the reference organized around the header files? I feel it would make more sense to organize reference information around concepts, classes, functions, and the like. The headers are details that a reader generally does not care about as an index to finding information. They are clearly needed as part of the reference documentation for each class, etc., but that information is provided on those pages.

Index: the index should have links to the reference information for functions and classes. It seems that one must go through the header file links to get them (because not everything is linked here), but that does not match my expectation as a reader. Conversely, everything linked to by the header file pages should be linked in the index. Indeed, those pages are what appear to me to be the primary reference documentation, but they are needlessly difficult to navigate to.

> * Tests

I have not compiled the tests. I have however compiled and run all the examples except the autodiff one. Additionally, I have written many of my own simple use cases / tests to ensure that I understand how Yap works.

> * Usefulness

This library is extremely useful. It has enabled me to build practical applications in ways that are much easier than with Proto. I look forward to the outcome of this review, as I expect it will be even better designed and documented and therefore even more useful. If only viewed from this standpoint, the library should absolutely be accepted.

> - Did you attempt to use the library? If so:

> * Which compiler(s)?

Apple LLVM version 8.0.0 (clang-800.0.42.1); probably earlier versions as well.

> * What was the experience? Any problems?

I have encountered no compilation problems with the library. The only problems I have encountered result from my use of the library, which underscores the need for better documentation that is more precise and provides best practice information.

> - How much effort did you put into your evaluation of the review?

I don't suppose the year of using Yap counts exactly, but that has raised a lot of issues for me that I have discussed here. During the review I have regularly followed the comments on the list and have spent an hour or two on contributing additional comments or testing specific ideas that have come up. I have spent several hours going through the documentation and writing this review.

Verdict:

Yap should be CONDITIONALLY ACCEPTED into Boost. The conditions I have in mind are the following:

- The documentation should be dramatically improved. My comments above should make it clear that this includes reorganization as well as clarification.

- Alternative expression concepts should be evaluated carefully. This adds to comments on the list regarding suitable tuples, etc. Perhaps tuples are not necessary. This condition is not that the design should be changed, but instead that alternatives should be investigated and the rationale for the chosen one documented clearly in contrast to others. Perhaps this will be evident if the documentation includes more specific information about the representation, the use of transforms, etc.

Finally, I want to thank Zach for constructing a really useful library. Despite everything said here, I have found it to be valuable in its current form and would continue to use it even if no changes were made. Thus, I am offering this review from the perspective of trying to make Yap more accessible to people who have not already been through the learning curve.

Next message: Steven Watanabe: "Re: [boost] Yap's formal review is starting now!"
Previous message: Zach Laine: "Re: [boost] Yap's formal review is starting now!"
In reply to: Louis Dionne: "[boost] Yap's formal review is starting now!"
Next in thread: Zach Laine: "Re: [boost] [Boost-users] Yap's formal review is starting now!"
Reply: Zach Laine: "Re: [boost] [Boost-users] Yap's formal review is starting now!"

Date view	Thread view	Subject view	Author view

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk