From: Larry Evans (cppljevans_at_[hidden])
Date: 2008-03-28 15:01:35
It's all in the attachment. I had several other topics
I wanted to explore; however, this will have to do for now.
NOTE: The following is based on a download on 2007-02-17 of proto.zip
from boost vault.
* What is your evaluation of the design?
Fair. One problem is the mixing of grammars with transforms. As
stated here:
<boost-root>/libs/proto/doc/html/boost_proto/users_guide/expression_transformation/canned_transforms/pass_through.html
  all expression generator meta-functions (For example, posit<>,
  shift_right<>, function<>, nary_expr<>, etc.) have a pass-through
  transform by default.
which means that when, for example, matches<Expr, posit<_> > is
compiled, the associated pass-through transform is not used. If
it's not used in matches, then it should be eliminated. Of course
it is used when the transform is applied; however, a transform
template class taking a grammar as an argument (the proto::when
template seems appropriate) should be used to do that. This would
make the programmer's intention clearer. It also means another
template instantiation, thus slowing compile times; however, I
think the time saved in understanding the code would compensate
for that. Of course that's a judgment call.
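For instance, the following compile-time check (a sketch, using
BOOST_MPL_ASSERT from Boost.MPL and a released proto's convenience
header; posit<> is the reviewed snapshot's name for unary plus)
exercises only the grammar side of posit<>:

  #include <boost/proto/proto.hpp>
  #include <boost/mpl/assert.hpp>
  namespace proto = boost::proto;

  typedef proto::terminal<int>::type int_expr;
  typedef proto::posit<int_expr>::type posit_expr; // the type of +i

  // Only the grammar part of posit<> is consulted here; its default
  // pass-through transform never runs.
  BOOST_MPL_ASSERT(( proto::matches<posit_expr, proto::posit<proto::_> > ));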
This "unused transform" feature is especially obvious with and_.
Only the last transform (call it Xn) associated with the list of
and_ args can ever be used. If transforms were eliminated from the
and_ args and and_ was only used for matches, then:
when<and_<G1,...,Gn>,Xn>
could be used with the same effect as and_ with transforms
associated with the args. Here, when is more obviously a transform,
Tn,enabled by condition and_<G1,...,Gn>. Hm..., but that "looks"
(based on what if means in c++)
more like proto::if_, however I've never used proto::if_ and haven't
read proto::if_ docs; hence, maybe that conclusion is all wrong. I
have read somewhere that proto::if_'s 1st argument is a transform,
which seems odd when the if_ name would suggest it's a grammar with
a noop (unused) transform.
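For example (a sketch using the real proto::when and proto::and_;
the grammars and the transform are toy stand-ins):

  // Match any terminal that is not a float terminal; the transform
  // is stated once, explicitly, rather than hiding on each and_ arg.
  struct MyRule
    : proto::when<
          proto::and_<
              proto::terminal<proto::_>             // G1
            , proto::not_<proto::terminal<float> >  // G2
          >
        , proto::_   // Xn: here just the identity transform
      >
  {};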
What about proto::or_<T0,T1,...,Tn>? If the T0,T1,...,Tn were
changed to "pure" grammars, G0,G1,...,Gn, then how would the
*very* useful examples of transforms such as CalcArity in:
<boost-root>/libs/proto/doc/html/boost_proto/users_guide/expression_transformation/example__calculator_arity_transform.html
be done? Well, as explained in the in-source comments to
proto::switch_ in matches.hpp, proto::switch_<Cases> could be used
instead. Oops: looking at switch_ shows it contains:
  template<typename This, typename Expr, typename State, typename Visitor>
  struct result<This(Expr, State, Visitor)>
  {
      typedef typename Cases::template case_<typename Expr::proto_tag> which;
      typedef typename which::template result<void(Expr, State, Visitor)>::type type;
  };
which, along with the example code in /libs/proto/test/matches.cpp,
shows that the only test is on the tag of the expression, not on
the children.
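For example, following the pattern in those in-source comments
(a sketch):

  struct MyCases
  {
      // default case: match nothing
      template<typename Tag>
      struct case_ : proto::not_<proto::_> {};
  };

  // select this case for any expression whose tag is tag::terminal
  template<>
  struct MyCases::case_<proto::tag::terminal>
    : proto::terminal<proto::_>
  {};

  // proto::switch_<MyCases> matches terminals and nothing else; the
  // children never influence which case_ is selected.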
So maybe switch_ should be renamed switch_tag, and a new template,
switch_gram, introduced which does what the current or_ does,
except that its template arguments are instances of a new template,
proto::case_, where proto::case_ is like the current proto::when
except that:
  1) the first argument doesn't have to have a transform
     associated with it;
  2) there's a default 2nd argument to proto::case_,
     identity_t (the identity transform, which is the same as
     the transform part of the current proto::terminal):
  struct identity_t
  {
      template<typename Sig>
      struct result;

      // transform result: the expression type, unchanged
      template<typename This, typename Expr, typename State, typename Visitor>
      struct result<This(Expr, State, Visitor)>
      {
          typedef Expr type;
      };

      // transform application: return the expression unchanged
      template<typename Expr, typename State, typename Visitor>
      Expr const &operator ()(Expr const &expr, State const &, Visitor &) const
      {
          return expr;
      }
  };
So, with these additional templates, CalcArity might be:

  struct CalcArity
    : switch_gram<
          case_< terminal_g< placeholder1 >
               , mpl::int_<1>()
               >
        , case_< terminal_g< placeholder2 >
               , mpl::int_<2>()
               >
        , case_< terminal_g< _ >
               , mpl::int_<0>()
               >
        , case_< unary_expr_g< _, CalcArity >
               , CalcArity(_arg)
               >
        , case_< binary_expr_g< _, CalcArity, CalcArity >
               , mpl::max< CalcArity(_left)
                         , CalcArity(_right) >()
               >
      >
  {};
where XXX_g, for XXX = {terminal, unary_expr, binary_expr}, is the
"grammar part" of the current XXX template.
However, even in the proposed switch_gram version there's a sort of
mixture of grammar and transform. The "pure" source grammar of the
transform is the one that appeared earlier on the
example__calculator_arity_transform.html page (here renamed
CalcGram, and with XXX changed to XXX_g as in the note above):
  struct CalcGram
    : or_<
          terminal_g< placeholder1 >
        , terminal_g< placeholder2 >
        , terminal_g< _ >
        , unary_expr_g< _, CalcGram >
        , binary_expr_g< _, CalcGram, CalcGram >
      >
  {};
In order to completely separate the grammar from the transforms,
something like an algebraic homomorphism:
  http://planetmath.org/encyclopedia/HomomorphismBetweenAlgebraicSystems.html
is needed. Each function name, w_A, in the source algebra of that
reference would correspond to some grammar, G_w, in the or_:
  or_<G1,G2,...,Gn>
The bad thing about that is that the function names would be sort
of a duplication of the grammars, I think (I haven't thought it
through really). The only way I can be more certain of this is to
try to implement it; however, I'm already way late in doing this
review, so that suggestion about complete separation of grammar
from transform is just *highly* speculative.
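To make the borrowed term concrete, here is a plain C++ (and
deliberately non-proto) illustration of a homomorphism in the
relevant sense: h maps each constructor of a toy source algebra to
a fixed operation on the target algebra int, just as CalcArity does
at compile time:

  #include <algorithm>

  // source algebra: a tiny calculator expression tree
  struct Node
  {
      enum Kind { Ph1, Ph2, Lit, Unary, Binary } kind;
      Node const *left;   // 0 unless Unary or Binary
      Node const *right;  // 0 unless Binary
  };

  // target algebra: int, with operations 1, 2, 0, identity, max
  int h(Node const &n)
  {
      switch (n.kind)
      {
          case Node::Ph1:    return 1;
          case Node::Ph2:    return 2;
          case Node::Lit:    return 0;
          case Node::Unary:  return h(*n.left);
          case Node::Binary: return std::max(h(*n.left), h(*n.right));
      }
      return 0;
  }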
* What is your evaluation of the implementation?
Not done.
* What is your evaluation of the documentation?
Good. However:
1) Some transform template's are barely documented or not documented
(except in-source) at all (e.g. switch_). There really needs a full
toc like the mpl one here:
http://www.boost.org/libs/mpl/doc/refmanual/refmanual_toc.html
I almost always go there when I want to figure out what mpl
templates do.
2) Some transforms are very formally documented. IOW, their
documentation amounts to a set of rules guiding the reader, step by
step, through how the transforms work. Each step in the process is
described by a table in the documentation of the outermost
template. These tables have the before-expression in the 1st column
and the step's resulting expression in the 2nd column. This is very
good; I used it to help me understand parts of proto.
However, manual execution of rules on a given expression is hard
because the user has to keep track of the intermediate steps in his
head, and that's a lot of mental bookkeeping. It would be very
helpful if there were a "proto-debugger" which, given a source
expression, executed each of these steps (noting which table is
being executed), showed the correspondence between the names in the
"rules tables" and the names in the source or intermediate
expression, and finally stopped when the expression could not be
reduced further. Maybe proto itself could even be used to do this:
a sort of "self-aware" proto ;) This ability to take an expression
and simplify it using a set of rules would have uses in other
domains, as shown by Task 3 in Markus' post:
  http://lists.boost.org/Archives/boost/2008/03/134376.php
There's a prototype of something similar (although no association
is shown between rules and tables) for language grammar analysis in
the vault:
  http://www.boost-consulting.com/vault/index.php?&directory=Strings%20-%20Text%20Processing
It just uses proto::eval to go through the steps until no further
change is possible.
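A crude first step in that direction might look like the following
sketch (proto::eval and proto::callable_context are real proto
facilities; the tracing is my own invention, and names may differ
in the reviewed snapshot):

  #include <boost/proto/proto.hpp>
  #include <iostream>
  namespace proto = boost::proto;

  struct trace_ctx
    : proto::callable_context<trace_ctx const>
  {
      typedef int result_type;

      // reduce a terminal, reporting the step
      int operator()(proto::tag::terminal, int i) const
      {
          std::cout << "reduce terminal -> " << i << '\n';
          return i;
      }

      // reduce a plus node by first reducing its children
      template<typename L, typename R>
      int operator()(proto::tag::plus, L const &l, R const &r) const
      {
          int res = proto::eval(l, *this) + proto::eval(r, *this);
          std::cout << "reduce plus -> " << res << '\n';
          return res;
      }
  };

  int main()
  {
      proto::terminal<int>::type i = {40};
      trace_ctx ctx;
      std::cout << proto::eval(i + 2, ctx) << '\n'; // 42, tracing each step
      return 0;
  }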
3) There needs to be a more detailed description of how the 1st
argument of proto::matches, the expression argument, "relates" to
the 2nd argument, the grammar argument. The description here:
<boost-root>/libs/proto/doc/html/boost/proto/result_of/matches.html
only uses "expression grammars" and makes no mention of the proto
meta-functions for generating expressions:
<boost-root>/libs/proto/doc/html/boost_proto/users_guide/expression_construction/tags_and_meta_functions.html
Judging from the example of using a terminal grammar, terminal<X>,
to generate an expression by appending ::type:
  terminal<X>::type
I jumped to the rash conclusion that a similar technique would
work for binary_expr<Tag,L,R>; IOW, that an expression could be
generated by:
  binary_expr<Tag,L,R>::type
However, this is only true if L and R are already expressions.
This rash conclusion led to a time-consuming "debugging of my
thinking", as evidenced by the thread:
  http://lists.boost.org/Archives/boost/2008/03/133715.php
Instead, the ::type appending must be propagated to all sub-grammar
expressions, as done by the morphism template attached to:
  http://lists.boost.org/Archives/boost/2008/03/134964.php
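In code, the pitfall looks like this (a sketch; headers and tag
names per a released proto):

  #include <boost/proto/proto.hpp>
  namespace proto = boost::proto;

  typedef proto::terminal<int>::type int_expr;  // ::type works directly

  // OK: the children are already expression types
  typedef proto::binary_expr<proto::tag::plus, int_expr, int_expr>::type ok_expr;

  // NOT an expression type: the children are still meta-function
  // grammars, i.e. ::type was not propagated to them
  typedef proto::binary_expr<
      proto::tag::plus, proto::terminal<int>, proto::terminal<int>
  >::type not_an_expr;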
Avoiding the rash conclusion mentioned above would probably be
easier if proto grammars were prominently classified on some page
into the following grammar subtypes (a compile-time sketch
illustrating all three follows the list):
1) expression grammars:
   These are the grammars appearing in the matches.html page
   mentioned above. They are also expression types; hence, they
   can also be used as the first argument to matches.
   In contrast to the meta-function grammars [2) below], for any
   expression grammar, E, the following expression fails to
   compile:
     E::type
   (Digression to the design review: this "schizophrenic" feature
   of expression grammars makes understanding matches harder, in a
   similar way that untyped variables in an interpreted language
   make understanding that code's functions harder. If there were
   an as_gram<ExprType> meta-function which returned the
   corresponding meta-function grammar generator, and only
   meta-function and predicate grammars [see below] were allowed as
   the 2nd argument to matches, then this would ease understanding
   and probably debugging [since, hopefully, deciphering compiler
   error messages would be easier]. Alternatively, there could be
   an is_gram<ExprType> that, when used in some kind of debug mode,
   diagnosed, with clearer error messages than the compiler's, that
   ExprType was not a valid grammar. This is_gram could be
   implemented using the Boost concept checking library, AFAICT.)
2) meta-function grammars:
   These (except for the nullary ones) appear in the last column of
   the table on:
     <boost-root>/libs/proto/doc/html/boost_proto/users_guide/expression_construction/tags_and_meta_functions.html
   As the name, meta-function, implies, they all have a nested
   ::type which returns the corresponding expression type
   [a.k.a. expression grammar, see 1) immediately above].
   They are classified as nullary or non-nullary depending on the
   number of "sub-grammars":
   nullary)
     These are the ones formed by the proto::op::terminal template.
     They have no sub-grammars.
   non-nullary)
     These have an associated number (the arity) defining the
     number of sub-grammars. Each non-nullary template is also
     associated with its own C++ operator. The arity of the grammar
     is the same as the arity of the associated C++ operator.
   definition: meta-function grammar instance:
   basis-step)
     A nullary meta-function grammar is a meta-function grammar
     instance.
   induction-step)
     If O is a non-nullary meta-function grammar template with
     arity N, and
       O1_inst, O2_inst, ..., ON_inst
     are all meta-function grammar instances, then
       O<O1_inst, O2_inst, ..., ON_inst>
     is a meta-function grammar instance.
3) predicate grammars:
   These are grammars with no corresponding C++ operators.
   In contrast to meta-function grammars, for any predicate
   grammar, B, the following expressions fail to compile:
     B::type
     tag_of<B>::type
   One reason for this is that there may be more than one type of
   proto expression that matches a given predicate grammar. For
   example, either an expression matching T0 or one matching T1
   matches or_<T0,T1>.
   There is one special case of a predicate grammar, the wild-card
   predicate:
     proto::wildcardns_::_
   which matches any proto expression. This is analogous to a
   Boolean predicate that returns true no matter what its argument.
   Other important predicate grammars (in the proto::control
   namespace) are not_, and_ and or_. AFAICT, just as any predicate
   logic expression can be constructed using only:
     true, not, and, or
   the equivalent of all other predicate grammars, such as if_, can
   be constructed using just these four:
     _, not_, and_, or_
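A few compile-time checks illustrating the three subtypes (a
sketch; BOOST_MPL_ASSERT is Boost.MPL's, names per a released
proto):

  #include <boost/proto/proto.hpp>
  #include <boost/mpl/assert.hpp>
  namespace proto = boost::proto;

  // 2) meta-function grammar instances, built inductively:
  typedef proto::terminal<int> t_i;              // basis step (nullary)
  typedef proto::shift_right<t_i, t_i> mf_inst;  // induction step (arity 2)

  // 1) the corresponding expression grammar, also an expression type:
  typedef proto::terminal<int>::type int_expr;

  // 3) predicate grammars have no ::type, but do drive matches<>:
  BOOST_MPL_ASSERT(( proto::matches<int_expr, proto::_> ));
  BOOST_MPL_ASSERT(( proto::matches<int_expr
      , proto::or_<proto::terminal<float>, proto::terminal<int> > > ));
  BOOST_MPL_ASSERT(( proto::matches<int_expr
      , proto::not_<proto::terminal<float> > > ));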
* What is your evaluation of the potential usefulness of the library?
Very.
* Did you try to use the library? With what compiler? Did you
have any problems?
Yes:
  http://www.boost-consulting.com/vault/index.php?&directory=Strings%20-%20Text%20Processing
    &filename=cfg_lookahead_extends.zip
    &filename=toy_attract.pair.zip
* How much effort did you put into your evaluation? A glance? A
quick reading? In-depth study?
In-depth, although I still have a long way to go.
* Are you knowledgeable about the problem domain?
Somewhat; however, I'm obviously unfamiliar with the specific
problems proto was designed to solve. For example, I was unaware
of some of the proto viewpoints expressed by Eric in the last 2
paragraphs of this post:
  http://lists.boost.org/Archives/boost/2008/03/134959.php
* Do you think the library should be accepted as a Boost library?
Be sure to say this explicitly so that your other comments don't
obscure your overall opinion.
Eventually, but not yet. There needs to be more documentation
debugging and also more exploration of alternative designs, in
particular designs guided by algebraic morphisms and category
theory. One justification for this exploration is the morphism
template attached to:
  http://lists.boost.org/Archives/boost/2008/03/134964.php
Although that post observed that the morphism template:
  is correct in the limited case of those
  dual-purpose grammars-and-metafunctions.
that doesn't mean it couldn't be extended to handle more general
cases. In particular, in the case of expression grammars, I'd think
the morphism template in the attachment could just be specialized
to be an identity morphism:
  template<class Tag, class Args, int Arity>
  struct morphism<expr<Tag, Args, Arity> >
  {
      typedef expr<Tag, Args, Arity> type;
  };
A further specialization, designed to produce an error when given a
predicate grammar, would then, AFAICT, let morphism correctly
calculate the matching expression type from either a meta-function
grammar or an expression grammar.
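For instance, that further specialization might take a shape like
this (not the attachment's code, just a sketch, with a two-argument
or_ standing in for the predicate grammars):

  template<typename G0, typename G1>
  struct morphism<or_<G0, G1> >
  {
      // deliberately no nested ::type: asking for the matching
      // expression type of a predicate grammar is then a
      // compile-time error (and a static assertion here could make
      // the diagnostic clearer still)
  };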
Category theory *might* be used to guide the predicate grammar
design, as suggested (very indirectly) by parts of:
  http://www.amazon.com/Algebraic-Approaches-Semantics-Monographs-Computer/dp/0387963243
For example, Section 3.3, "The Boolean Algebra of Guards",
contains:
  Pascal allows a statement of the form:
    if x>0 and not x=5 then S_1 else S_2
  If S_1,S_2 are to be semantically interpreted as morphisms X->Y
  in a category _C_, how can "P and not Q" be interpreted, where P
  is "x>0" and Q is "x=5"? ... Equivalently, they [i.e. P and Q]
  may be represented as guard functions as in 1.5.20. ... for an
  object X in a partially additive category _C_, there is a subset,
  Guard(X), of _C_(X,X) of "guard morphisms" which form a Boolean
  algebra.
On the same page where the 1.5.20 mentioned in the above quote
appears, there's:
  22 Definition. Let (p1,...,pn) be an n-way test on X and let
  f1,...,fn be in SC(X,Y). Then a natural generalization of the
  case statement in Pascal is:
    case (p1,...,pn) of (f1,...,fn) = f1 p1 + ... + fn pn
Here the p's correspond to the left-hand sides (the grammars) of
the when's in an or_ whose elements are decorated with transforms,
and the f's correspond to the transforms on the right-hand sides of
the corresponding when's.
In conclusion, the somewhat vague similarities, noted above, of
proto with algebraic morphisms and, more generally, with category
theory strongly suggest (to me at least) that there could be a
large benefit to making these similarities more concrete and using
that knowledge to improve proto's design.