The purpose of this doc is to bring you up to speed on what proto’s compilers do, and to make you proficient at applying transformations to expressions. Maybe someday we can figure out how to move this tree-manipulation logic into Fusion, but for now, proto’s compilers are the best (only?) tool for the job.
By now, you are probably familiar enough with the proto::expr<> tree data structure, and how it is generated by proto’s operator overloads. But just to recap, here is the forward declaration of struct expr<>:
template<typename Tag, typename Args, long Arity = Args::size>
struct expr;
Args is one of proto::argsN<>, and Tag can be anything, but is usually one of the operator tags defined by proto.
proto::argsN<> is a compile-time list of types. With one exception, the types are themselves “expression types,” where an expression type is either another instantiation of expr<> or an instantiation of proto::ref<expr<> >. proto::ref<> is used to hold an expr<> by reference. The exception mentioned above is that when Tag is “tag::terminal,” Args is arg1<T> where T is allowed to be any type.
This framework allows us to build heterogeneous tree data structures where the nodes are stored by value or by reference. Often times, those trees are not in an ideal form for processing. Some tree transformations must be applied first. That’s where proto’s compilers come in.
The goal of proto’s compilers is to provide a general and somewhat declarative means for expressing transformations on proto parse trees. A secondary goal is to automate as much as possible and move common operations into the library.
The top-level function “proto::compile()” is used to apply a transformation to a parse tree. However, it is not an algorithm like “fusion::transform()”. “fusion::transform()” applies a transformation to each element in a flat(tened) sequence. But when transforming a parse tree, one transform is usually not enough. Rather, many small transforms must be applied, where each is only relevant to small parts of the tree. “proto::compile()” is just a shell that dispatches to user-specified transforms (a.k.a., compilers) that are looked up according to the properties of the tree currently being transformed.
The only concern of proto’s compilers is: “What transformation do I apply at a particular node in a parse tree?” There are two considerations:
These considerations lead naturally to a design where transformations are double-dispatched on the node type and domain tag. The exact form the dispatching has taken in proto is called “compiler<>” and it has the following forward declaration:
template<typename OpTag, typename DomainTag, typename EnableIf = void>
struct compiler;
You will direct proto to transform a parse tree by specializing struct compiler<> for the operator tags that appear in your DSEL and on your domain tag(s).
Since proto parse trees are like Fusion data structures in that they are heterogeneous and statically typed, yet contain runtime data, proto compilers have two jobs:
Therefore a compiler<> specialization has both a compile-time meta-function and a runtime function associated with it.
As with all statically bound, heterogeneous algorithms, compilers must operate in a “pure functional” environment where new values are computed from old, and state is accumulated, not mutated. Therefore, in addition to accepting the expression to transform, a compiler must also accept a “state” parameter, which represents the state of the transformation so far. Based on the requirements so far, our compiler<> might look like this:
template<>
struct compiler< tag::right_shift, spirit2 >
{
template<typename Expr,
struct apply
{
typedef … type;
};
template<typename Expr,
static typename apply<Expr, State …>::type
call( Expr const &expr, State const & state … )
{
return …;
}
};
The nested apply<> template is used for the type computation, and the static call() function does the runtime transformation.
For many reasons, it is often desirable to pass along a bundle of mutable data along with the immutable expression and state parameters. For this reason, proto compilers also accept a visitor parameter. You are free to use the visitor for anything. So the final form of the proto compilers is:
template<>
struct compiler< tag::right_shift, spirit2 >
{
template<typename Expr,
struct apply
{
typedef … type;
};
template<typename Expr,
static typename apply<Expr, State, Visitor>::type
call( Expr const &expr, State const & state, Visitor & visitor )
{
return …;
}
};
Note that the visitor is passed by non-const reference. This reflects its role at a bundle of mutable data.
This function is nothing more than a thin wrapper that dispatched to a appropriate user-specified compiler. Aside from some minor details regarding the handing of literals and references, what it does is essentially this:
template<typename Expr,
typename meta::compile<Expr, State, Visitor, Domain>::type
compile(
Expr const & expr, // The expr to compile
State const & state, // The state of the transformation so far
Visitor & visitor, // The visitor (is mutable)
Domain /*tag*/ // The domain tag
)
{
typedef typename meta::tag<Expr>::type tag_type;
return compiler<tag_type, Domain>::call( expr, state, visitor );
}
The compilers are free to do anything they want. Usually, they rip the expression apart, recursively compile the constituents (possibly in a different domain), and then reassemble the results in some interesting way. Some transformations seem to show up again and again. It makes sense to factor these common patterns out. Here are the compilers that proto defines, along with high-level descriptions of their purpose:
fold_compiler:
For binary trees only. Compile the left sub-tree and use the result as “state” while compiling the right sub-tree. This is useful for flattening left-associative binary tree structures into lists.
reverse_fold_compiler:
For binary trees only. Compile the right sub-tree and use the result as “state” while compiling the left sub-tree. This is useful for flattening right-associative binary tree structures into lists.
transform_compiler:
Apply a user-specified transformation to a tree, then continue compiling the result with either the default compiler (the one proto::compile() would dispatch to), or with a user-specified compiler. In continuing with the “compiler” metaphor, the transform_compiler is like a preprocessor.
branch_compiler:
Compile the current tree as a “branch” and replace the branch with the result. This is accomplished in two steps. First, the “state” passed in is ignored, and the tree is compiled instead with a new user-specified “state” (and in a different domain). Then, the result (along with the original state and visitor) is passed to a user-specified compiler for further processing.
conditional_compiler:
Sometimes the double-dispatch on operator type and domain is not enough to pick a transform, and some other properties of the Expr, State and/or Visitor types must be queried in order to select the proper transform to apply. The conditional_compiler handles that. You pass it a compile-time predicate to evaluate against the (Expr,State,Visitor) 3-tuple, an IF compiler to use if the predicate evaluates to “true_” and an ELSE compiler to use otherwise.
pass_through_compiler:
For use when no transformation should be applied at a particular node. It recursively calls compile on the sub-trees and assembles the result back into an expr<>.
list_compiler:
Builds a fusion cons-list by using Expr as the head and State as the tail.
The following program uses the list_compiler and the reverse_fold_compiler<> to turn a tree of terminals in sequence with operator+ into a fusion cons-list.
#include <boost/xpressive/proto/proto.hpp>
#include <boost/xpressive/proto/compiler/list.hpp>
#include <boost/xpressive/proto/compiler/fold.hpp>
using namespace boost;
struct my_domain {};
namespace boost { namespace proto
{
template<>
struct
compiler<tag::terminal, my_domain>
:
list_compiler
{};
template<>
struct
compiler<tag::add, my_domain>
: reverse_fold_compiler<my_domain>
{};
}}
int main()
{
int
visitor = 0;
proto::meta::terminal<int>::type _1 = {1};
// cons<
// terminal<int>::type(1)
// , cons<
// terminal<double>::type(2.0)
// , cons<
// terminal<char>::type('3')
// , nil
// >
// >
// >
proto::compile( _1 + 2.0 + '3', fusion::nil(), visitor, my_domain() );
return 0;
}
This example is like the above example, except that it uses the transform_compiler<> to pull the arguments out of the terminal nodes. The result is a cons-list of primitives.
#include
<boost/xpressive/proto/proto.hpp>
#include
<boost/xpressive/proto/compiler/fold.hpp>
#include
<boost/xpressive/proto/compiler/list.hpp>
#include
<boost/xpressive/proto/compiler/transform.hpp>
using
namespace boost;
struct
my_domain {};
namespace
boost { namespace proto
{
template<>
struct compiler<tag::terminal,
my_domain>
: transform_compiler<
arg_transform // apply the "arg" transform, to
extract the arg
, my_domain // stay in "my_domain"
, list_compiler // send the result to the list compiler
>
{};
template<>
struct compiler<tag::add, my_domain>
: reverse_fold_compiler<my_domain>
{};
}}
int
main()
{
int visitor = 0;
proto::meta::terminal<int>::type _1 =
{1};
//
cons<
//
int(1)
//
, cons<
//
double(2.0)
//
, cons<
// char('3')
// , nil
//
>
//
>
//
>
proto::compile( _1 + 2.0 + '3',
fusion::nil(), visitor, my_domain() );
return 0;
}
Note that this example uses the canned “arg_transform” to extract the argument from the terminal nodes. “Arg_transform” is defined in proto as follows:
struct
arg_transform
{
template<typename Expr, typename,
typename>
struct apply
{
typedef typename
meta::arg<Expr>::type type;
};
template<typename Expr,
static typename apply<Expr, State,
Visitor>::type const &
call(Expr const &expr, State const
&, Visitor &)
{
return proto::arg(expr);
}
};
Proto also provides a “left_transform” and a “right_trannsform”, as well as a “compose_transforms<>” for applying two transforms in sequence, and a “conditional_transform<>” for applying a transform depending on a compile-time predicate.
The following example is an extension of the previous. As before, we use the reverse_fold_compiler to flatten a tree of “plus” nodes, but we use a branch_compiler also to flatten nested trees of “multiply” nodes. The result is a list of lists.
#include
<boost/xpressive/proto/proto.hpp>
#include
<boost/xpressive/proto/compiler/fold.hpp>
#include
<boost/xpressive/proto/compiler/list.hpp>
#include
<boost/xpressive/proto/compiler/transform.hpp>
#include
<boost/xpressive/proto/compiler/branch.hpp>
using
namespace boost;
struct
plus {};
struct
multiply {};
template<typename
T> struct my_domain {};
struct
list_branch : proto::list_compiler
{
typedef fusion::nil state_type;
};
namespace
boost { namespace proto
{
template<typename Domain>
struct compiler<tag::terminal,
my_domain<Domain> >
: transform_compiler<
arg_transform // apply the "arg" transform,
to extract the arg
, my_domain<Domain> // stay in the same domain
, list_compiler // send the result of the transform to
the list compiler
>
{};
template<>
struct compiler<tag::add,
my_domain<plus> >
:
reverse_fold_compiler<my_domain<plus> >
{};
template<>
struct compiler<tag::multiply,
my_domain<plus> >
: branch_compiler<list_branch,
my_domain<multiply> >
{};
template<>
struct compiler<tag::multiply,
my_domain<multiply> >
:
reverse_fold_compiler<my_domain<multiply> >
{};
}}
int
main()
{
int visitor = 0;
proto::meta::terminal<int>::type _1 =
{1};
//
cons<
//
cons<int(1), cons<int(1), nil> >
//
, cons<
//
cons<int(1), cons<double(2.0), nil> >
//
, cons<
//
cons<int(1),
cons<char('3'), nil> >
// , nil
//
>
//
>
//
>
proto::compile( _1*1 + _1*2.0 + _1*'3',
fusion::nil(), visitor, my_domain<plus> () );
return 0;
}
Note that when invoking the branch_compiler, we switch to another domain to compile the multiply operator. If we had stayed in the same domain, it would be an error because the branch_compiler would try to invoke itself recursively. When you want to use a branch_compiler to compile a sub-tree independently, you must switch to a different domain.
As we’ve discussed, the proto compilers are low-level and somewhat difficult to work with. The ultimate goal is to combine the tree transformation facilities with the proto meta-grammar facilities. The idea of “domains” is really just a hack to approximate different production rules within a grammar. In the future, tree transformations will be specified by writing a grammar and attaching transformations to the rules in your grammar.