Boost logo

Boost :

Subject: Re: [boost] [metaparse] review
From: Abel Sinkovics (abel_at_[hidden])
Date: 2015-06-03 18:38:16


Hi Gordon,

Thank you for the review.

On 2015-06-03 23:49, Gordon Woodhull wrote:
>
> I only skimmed the user manual, since the content looks a little bit stale (doesn't mention build_parser) and somewhat redundant with the tutorial.
The intention with the manual is to provide more focused answers (eg.
what are the looping constructs and how they relate to each other?). You
can learn it from the tutorial as well, but finding it is more difficult.

> As Andrzej mentions, there may need to be an introduction concerning the scope of the library and what you can and can't do with this kind of EDSL, where you would use Proto instead of Metaparse, etc.
Will be added.

> I think that as the library evolves, the documentation will probably need more general parsing background, of the sort you will find in Spirit documentation. For example, the tutorial covers the unary `-` operator, but it doesn't mention why there is no ambiguity with the binary `-` operator.
>
> Also, it seems (as mentioned in the design discussion below) that the grammar must be unambiguous, and this is mentioned nowhere.
Will be added.

> Very good. However, I think the API might be fleshed out a little bit once the library gets more usage. In particular, I found myself writing a few helper metafunctions that looked like this:
>
> template<typename EdgeDecl>
> struct declare_edge {
> typedef edge_decl<typename mpl::at_c<EdgeDecl, 0>::type,
> typename mpl::at_c<EdgeDecl, 2>::type,
> typename mpl::at_c<EdgeDecl, 3>::type> type;
> };
>
> invoked from a transform like this:
>
> using edge_stmt = transform<
> sequence<ID, arrow_token, ID, opt<attr_list, mpl::vector<>>>,
> mpl::quote1<declare_edge>
> >;
>
> It may be my inexperience, but it struck me that it would be helpful to have a "spread" or "apply" transform so that the items in the mpl::vector produced by sequence would be supplied as parameters to declare_edge, instead of having to pull them out using mpl::at_c. There may be some trick I don't know here.
There are first_of, last_of, middle_of and nth_of for the cases when one
needs one element of a sequence. For your use case (instantiate a
template class with multiple elements) using at_c is what is currently
available, but some better tool can be added.

> First, I started out defining my AST nodes like this:
>
> template<typename Tail, typename Head, typename Attrs>
> struct edge_decl {};
>
> Seems like a reasonable basic type-holder to me. As shown above, this is instantiated and returned by a helper metafunction invoked by the transform which pulls stuff out of an mpl::vector.
>
> But then somewhere in metaparse, it tries to again "invoke" this result with ::type when appending it to another mpl::vector.
>
> Really? So my transform needs to return not the value, but a metafunction which returns a value?
It might be a bug. Do you still have the code that produced it?

> I couldn't figure this out, so I gave each of my AST nodes a self-referential ::type in the way that mpl::vector has:
>
> template<typename Tail, typename Head, typename Attrs>
> struct edge_decl {
> typedef edge_decl type; //why?
> };
>
> Something is not right here!
Note that when I was experimenting with lazy evaluation, then I came to
the conclusion that adding such self-referential ::type to every class
you intend to use as value in template metaprogramming makes life much
easier. You don't need to worry about where you return a value and where
you return a thunk.

However, this does not change the fact that Metaparse should not use
::type where it is not needed.

> The second problem I encountered is about ambiguous grammars. For the "dot" language it seems natural to define a statement this way:
>
> using stmt = one_of<node_stmt, edge_stmt>;
>
> where a node statement looks like
>
> ID [attr=value, attr=value];
>
> and an edge statement looks like
>
> ID -> ID [attr=value];
>
> But there's an ambiguity here, and once the parser successfully reads a node, it is going to get upset if it sees a `->`. So the rule has to be written the other way:
>
> using stmt = one_of<edge_stmt, node_stmt>;
Good point. If node_stmt succeeds, one_of won't check edge_stmt (in your
first attempt). This should be pointed out in the documentation.

> Similarly, the `;` separator for declarations is optional in the "dot" language, but I couldn't figure out how to implement this, because if you try to have an optional element at the beginning of a rule:
>
> template<typename Parser, typename Default = void>
> using opt = one_of<Parser, return_<Default>>;
>
> using stmt_list = foldlp<
> sequence<opt<semi_token>, stmt>,
> transform<stmt, mpl::quote1<mpl::vector1>>,
> // ...
>
> then the parser will greedily eat the epsilon and have nowhere to go (at least according to my amateur understanding of parsers).
>
> I gave up here and made the separator mandatory. Again, this is probably not a flaw in the design, but the documentation should describe what kinds of grammars will work, and perhaps give pointers to other resources about disambiguating grammars. Probably there is a different construct which would help here.
I'm not sure what error you ended up with in this case. The way you
implement the optional semicolon seems to be right. What I'd change here
is using last_of instead of sequence assuming that you only need the
result of stmt and not the entire sequence of the (optional) semicolon
and the stmt result. "How you should implement different kinds of
grammar elements" is difficult to present in the documentation.
Currently the tutorial shows how things intended to be used (eg. it
presents first_of and middle_of).

Regards,
   Ábel


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk