
Boost-Commit :

From: lists.drrngrvy_at_[hidden]
Date: 2007-11-09 17:49:48


Author: drrngrvy
Date: 2007-11-09 17:49:47 EST (Fri, 09 Nov 2007)
New Revision: 40976
URL: http://svn.boost.org/trac/boost/changeset/40976

Log:
Added Subrules, Rule and Quick Start sections (I'm sure I've already done those, but alas! they are lost).
Added:
   sandbox/boost_docs/branches/spirit_qbking/doc/src/quick_start.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/rule.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/subrules.qbk (contents, props changed)
Text files modified:
   sandbox/boost_docs/branches/spirit_qbking/doc/src/spirit.qbk | 11 +++++++++++
   1 files changed, 11 insertions(+), 0 deletions(-)

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/quick_start.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/quick_start.qbk 2007-11-09 17:49:47 EST (Fri, 09 Nov 2007)
@@ -0,0 +1,168 @@
+
+[section Quick Start]
+
+[header Why would you want to use Spirit?]
+
+Spirit is designed to be a practical parsing tool. At the very least, the ability to generate a fully working parser from a formal EBNF specification inlined in C++ significantly reduces development time. While it may be practical to use a full-blown, stand-alone parser generator such as YACC or ANTLR when we want to develop a computer language such as C or Pascal, it is certainly overkill to bring in the big guns when we wish to write extremely small micro-parsers. At that end of the spectrum, programmers typically approach the job at hand not as a formal parsing task but through ad hoc hacks using primitive tools such as `scanf`. True, there are regular-expression libraries (such as __boost_regex__) and scanners (such as __boost_tokenizer__), but these tools do not scale well when we need to write more elaborate parsers. Attempting to write even a moderately complex parser using these tools leads to code that is hard to understand and maintain.
+
+One prime objective is to make the tool easy to use. When one thinks of a parser generator, the usual reaction is "it must be big and complex with a steep learning curve." Not so. Spirit is designed to be fully scalable. The framework is structured in layers. This permits learning on an as-needed basis, after only learning the minimal core and basic concepts.
+
+For development simplicity and ease of deployment, the entire framework consists of only header files, with no libraries to link against or build. Just put the Spirit distribution in your include path, compile and run. Code size? Very tight. In the quick start example that we shall present in a short while, the code size is dominated by the instantiation of the `std::vector` and `std::iostream`.
+
+[header Trivial Example #1]
+
+Create a parser that will parse a floating-point number.
+
+``
+ real_p
+``
+
+(You've got to admit, that's trivial!) The above code actually generates a Spirit `real_parser` (a built-in parser) which parses a floating-point number. Take note that, as a Spirit convention, parsers that are meant to be used directly by the user have names ending in `"_p"`. Spirit has many predefined parsers, and consistent naming conventions help keep you from going insane!
+
+[header Trivial Example #2]
+
+Create a parser that will accept a line consisting of two floating-point numbers.
+
+``
+ real_p >> real_p
+``
+
+Here you see the familiar floating-point numeric parser `real_p` used twice, once for each number. What's that `>>` operator doing in there? Well, they had to be separated by something, and this was chosen as the "followed by" sequence operator. The above expression creates a parser from two simpler parsers, gluing them together with the sequence operator. The result is a parser that is a composition of smaller parsers. Whitespace between numbers can implicitly be consumed depending on how the parser is invoked (see below).
+
+Note: when we combine parsers, we end up with a "bigger" parser, but it's still a parser. Parsers can get bigger and bigger, nesting more and more, but whenever you glue two parsers together, you end up with one bigger parser. This is an important concept.
+
+[header Trivial Example #3]
+
+Create a parser that will accept an arbitrary number of floating-point numbers. (Arbitrary means anything from zero to infinity.)
+
+``
+ *real_p
+``
+
+This is like a regular-expression Kleene Star, though the syntax might look a bit odd for a C++ programmer not used to seeing the `*` operator overloaded like this. Actually, if you know regular expressions it may look odd too since the star is [*before] the expression it modifies. C'est la vie. Blame it on the fact that we must work with the syntax rules of C++.
+
+Any expression that evaluates to a parser may be used with the Kleene Star. Keep in mind, though, that due to C++ operator precedence rules you may need to put the expression in parentheses for complex expressions. The Kleene Star is also known as a Kleene Closure, but we call it the Star in most places.
+
+[section:e4 Example #4 \[ A Just Slightly Less Trivial Example \]]
+
+This example will create a parser that accepts a comma-delimited list of numbers and puts the numbers in a vector.
+
+[header Step 1. Create the parser]
+
+``
+ real_p >> *(ch_p(',') >> real_p)
+``
+
+Notice `ch_p(',')`. It is a literal character parser that can recognize the comma `','`. In this case, the Kleene Star is modifying a more complex parser, namely, the one generated by the expression:
+
+``
+ (ch_p(',') >> real_p)
+``
+
+Note that this is a case where the parentheses are necessary. The Kleene star encloses the complete expression above.
+
+[header Step 2. Using a Parser (now that it's created)]
+
+Now that we have created a parser, how do we use it? Like the result of any C++ temporary object, we can either store it in a variable, or call functions directly on it.
+
+We'll gloss over some low-level C++ details and just get to the good stuff.
+
+If `r` is a rule (don't worry about what exactly rules are for now; this will be discussed later. Suffice it to say that a rule is a placeholder variable that can hold a parser), then we store the parser as a rule like this:
+
+``
+ r = real_p >> *(ch_p(',') >> real_p);
+``
+
+Not too exciting, just an assignment like any other C++ expression you've used for years. The cool thing about storing a parser in a rule is this: rules are parsers, and now you can refer to the parser by name (in this case, the name is `r`). Notice that this is now a full assignment expression, thus we terminate it with a semicolon, `";"`.
+
+That's it. We're done with defining the parser. So the next step is now invoking this parser to do its work. There are a couple of ways to do this. For now, we shall use the free parse function that takes in a `char const*`. The function accepts three arguments:
+
+* The null-terminated `const char*` input
+* The parser object
+* Another parser called the [*skip parser]
+
+In our example, we wish to skip spaces and tabs. Another parser named `space_p` is included in Spirit's repertoire of predefined parsers. It is a very simple parser that simply recognizes whitespace. We shall use `space_p` as our skip parser. The skip parser is the one responsible for skipping characters in between parser elements such as the `real_p` and the `ch_p`.
+
+Ok, so now let's parse!
+
+``
+ r = real_p >> *(ch_p(',') >> real_p);
+ parse(str, r, space_p) // Not a full statement yet, patience...
+``
+
+The parse function returns an object (called `parse_info`) that holds, among other things, the result of the parse. In this example, we need to know:
+
+
+* Did the parser successfully recognize the input `str`?
+* Did the parser [*fully] parse and consume the input up to its end?
+
+To get a complete picture of what we have so far, let us also wrap this parser inside a function:
+
+``
+ bool
+ parse_numbers(char const* str)
+ {
+ return parse(str, real_p >> *(',' >> real_p), space_p).full;
+ }
+``
+
+Note in this case we dropped the named rule and inlined the parser directly in the call to parse. Upon calling parse, the expression evaluates into a temporary, unnamed parser which is passed into the `parse()` function, used, and then destroyed.
+
+[note [*`char` and `wchar_t` operands]
+
+The careful reader may notice that the parser expression has `','` instead of `ch_p(',')` as the previous examples did. This is OK because of how C++ operator overloading works. There are `>>` operators that are overloaded to accept a `char` or `wchar_t` argument on the left or right (but not both). An operator may be overloaded if at least one of its parameters is a user-defined type. In this case, `real_p` is the second argument to `operator>>`, so the proper overload of `>>` is used, converting `','` into a character literal parser.
+
+The problem with omitting the `ch_p` call should be obvious: `'a' >> 'b'` is not a Spirit parser; it is a numeric expression, right-shifting the ASCII (or another encoding) value of `'a'` by the ASCII value of `'b'`. However, both `ch_p('a') >> 'b'` and `'a' >> ch_p('b')` are Spirit sequence parsers for the letter `'a'` followed by `'b'`. You'll get used to it, sooner or later.
+]
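The overloading rule in the note above can be checked without Spirit at all. The following is a toy sketch in plain C++11: `toy_chlit` and `toy_sequence` are hypothetical stand-ins, not Spirit's real `chlit<>` and `sequence<>` classes, and the helper function names are made up for illustration.

```cpp
#include <type_traits>

// Hypothetical stand-in for a character literal parser (not Spirit's chlit<>).
struct toy_chlit {
    explicit toy_chlit(char c) : ch(c) {}
    char ch;
};

// Hypothetical stand-in for the result of the sequence operator.
struct toy_sequence {
    toy_sequence(toy_chlit l, toy_chlit r) : left(l), right(r) {}
    toy_chlit left, right;
};

// Legal overloads: in each one, at least one operand is a user-defined type.
inline toy_sequence operator>>(toy_chlit l, char r) { return toy_sequence(l, toy_chlit(r)); }
inline toy_sequence operator>>(char l, toy_chlit r) { return toy_sequence(toy_chlit(l), r); }

// With two char operands only the built-in shift applies, so the result is an int.
// decltype does not evaluate its operand, so no shift is actually performed.
inline bool char_shift_is_plain_int()
{
    char a = 'a', b = 'b';
    return std::is_same<decltype(a >> b), int>::value;
}

// With one toy_chlit operand on either side, the overloads build a sequence.
inline bool mixed_operands_build_a_sequence()
{
    toy_sequence s1 = toy_chlit('a') >> 'b';
    toy_sequence s2 = 'a' >> toy_chlit('b');
    return s1.left.ch == 'a' && s1.right.ch == 'b'
        && s2.left.ch == 'a' && s2.right.ch == 'b';
}
```

The `explicit` constructor is a deliberate choice in this sketch: it keeps a bare `char` from converting silently, mirroring the fact that `'a' >> 'b'` can never reach a user-defined overload.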
+
+Take note that the object returned from the parse function has a member called `full` which returns true if both of our requirements above are met (i.e. the parser fully parsed the input).
+
+[header Step 3. Semantic Actions]
+
+Our parser above is really nothing but a recognizer. It answers the question /"did the input match our grammar?"/, but it does not remember any data, nor does it perform any side effects. Remember: we want to put the parsed numbers into a vector. This is done in an [*action] that is linked to a particular parser. For example, whenever we parse a real number, we wish to store the parsed number after a successful match. Semantic actions let us extract this information from the parser. They may be attached to any point in the grammar specification. These actions are C++ functions or functors that are called whenever a part of the parser successfully recognizes a portion of the input. Say you have a parser *P* and a C++ function *F*. You can make the parser call *F* whenever it matches an input by attaching *F*:
+
+``
+ P[&F]
+``
+
+Or if *F* is a function object (a functor):
+
+``
+ P[F]
+``
+
+The function/functor signature depends on the type of the parser to which it is attached. The parser `real_p` passes a single argument: the parsed number. Thus, if we were to attach a function *F* to `real_p`, we need *F* to be declared as:
+
+``
+ void F(double n);
+``
+
+For our example however, again, we can take advantage of some predefined semantic functors and functor generators ( [$__lens__] A functor generator is a function that returns a functor). For our purpose, Spirit has a functor generator `push_back_a(c)`. In brief, this semantic action, when called, appends the parsed value it receives from the parser it is attached to, to the container `c`.
+
+Finally, here is our complete comma-separated list parser:
+
+``
+ bool
+ parse_numbers(char const* str, vector<double>& v)
+ {
+ return parse(str,
+
+ // Begin grammar
+ (
+ real_p[push_back_a(v)] >> *(',' >> real_p[push_back_a(v)])
+ )
+ ,
+ // End grammar
+
+ space_p).full;
+ }
+``
+
+This is the same parser as above, this time with appropriate semantic actions attached at strategic places to extract the parsed numbers and stuff them in the vector `v`. The `parse_numbers` function returns `true` when successful.
+
+[$__lens__] The full source code can be [@__example__/fundamental/number_list.cpp viewed here]. This is part of the Spirit distribution.
+
+[endsect][/ e4]
+
+[endsect][/ quick_start]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/rule.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/rule.qbk 2007-11-09 17:49:47 EST (Fri, 09 Nov 2007)
@@ -0,0 +1,198 @@
+
+[section:rule The Rule]
+
+The [*rule] is a polymorphic parser that acts as a named place-holder capturing the behavior of an EBNF expression assigned to it. Naming an EBNF expression allows it to be referenced later. The `rule` is a template class parameterized by the type of the scanner (`ScannerT`), the rule's [link __in_depth_the_parser_context__ context] and its [# tag]. Default template parameters are provided to make it easy to use the rule.
+
+``
+ template<
+ typename ScannerT = scanner<>,
+ typename ContextT = parser_context<>,
+ typename TagT = parser_address_tag>
+ class rule;
+``
+
+Default template parameters are supplied to handle the most common case. `ScannerT` defaults to `scanner<>`, a plain vanilla scanner that acts on `char const*` iterators and does nothing special at all other than iterate through all the `char`s in the null-terminated input, one character at a time. The rule tag, `TagT`, typically used with [link __trees__ ASTs], is used to identify a rule; it is explained [# tag here]. In trivial cases, declaring a rule as `rule<>` is enough. You need not be concerned at all with the `ContextT` template parameter unless you wish to tweak the low level behavior of the rule. Detailed information on the `ContextT` template parameter is provided elsewhere.
+
+[header Order of parameters]
+
+As of v1.8.0, the `ScannerT`, `ContextT` and `TagT` can be specified in any order. If a template parameter is missing, it will assume the defaults. Examples:
+
+``
+ rule<> rx1;
+ rule<scanner<> > rx2;
+ rule<parser_context<> > rx3;
+ rule<parser_context<>, parser_address_tag> rx4;
+ rule<parser_address_tag> rx5;
+ rule<parser_address_tag, scanner<>, parser_context<> > rx6;
+ rule<parser_context<>, scanner<>, parser_address_tag> rx7;
+``
+
+[header Multiple scanners]
+
+As of v1.8.0, rules can use one or more scanner types. There are cases, for instance, where we need a rule that can work on the phrase and character levels. Rule/scanner mismatch has been a source of confusion and is the no. 1 __FAQ__. To address this issue, we now have multiple scanner support. Example:
+
+``
+ typedef scanner_list<scanner<>, phrase_scanner_t> scanners;
+
+ rule<scanners> r = +anychar_p;
+ assert(parse("abcdefghijk", r).full);
+ assert(parse("a b c d e f g h i j k", r, space_p).full);
+``
+
+Notice how rule `r` is used in both the phrase and character levels.
+
+By default, support for multiple scanners is disabled. The macro `BOOST_SPIRIT_RULE_SCANNERTYPE_LIMIT` must be defined to the maximum number of scanners allowed in a `scanner_list`. The value must be greater than `1` to enable multiple scanners. Given the example above, to define a limit of two scanners for the list, the following line must be inserted into the source file before the inclusion of Spirit headers:
+
+``
+ #define BOOST_SPIRIT_RULE_SCANNERTYPE_LIMIT 2
+``
+
+[tip __FIX_LINKS_HERE__ See the techniques section for an example of a grammar using a multiple scanner enabled rule, `lexeme_scanner` and `as_lower_scanner`.]
+
+[header Rule Declarations]
+
+The `rule` class models EBNF's production rule. Example:
+
+``
+ rule<> a_rule = *(a | b) & +(c | d | e);
+``
+
+The type and behavior of the right-hand side (rhs) EBNF expression, which may be arbitrarily complex, is encoded in the rule named `a_rule`. `a_rule` may now be referenced elsewhere in the grammar:
+
+``
+ rule<> another_rule = f >> g >> h >> a_rule;
+``
+
+[warning [*Referencing rules]
+
+When a rule is referenced anywhere in the right hand side of an EBNF expression, the rule is held by the expression by reference. It is the responsibility of the client to ensure that the referenced rule stays in scope and does not get destructed while it is being referenced.
+]
+
+``
+ a = int_p;
+ b = a;
+ c = int_p >> b;
+``
+
+[header Copying Rules]
+
+The rule is a weird C++ citizen, unlike any other C++ object. It does not have the proper copy and assignment semantics and cannot be stored and passed around by value. If you need to copy a rule you have to explicitly call its member function `copy()`:
+
+``
+ r.copy();
+``
+
+However, be warned that copying a rule will not deep copy other referenced rules of the source rule being copied. This might lead to dangling references. Again, it is the responsibility of the client to ensure that all referenced rules stay in scope and do not get destructed while they are being referenced. Caveat emptor.
+
+If you copy a rule, then you'll want to store it somewhere. The problem is how? The storage can't be another rule:
+
+``
+ rule<> r2 = r.copy(); // BAD!
+``
+
+because rules are weird and do not have the expected C++ copy-constructor and assignment semantics! As a general rule: Don't put a copied rule into another rule! Instead, use the `stored_rule` for that purpose.
+
+[header Forward declarations]
+
+A `rule` may be declared before being defined to allow cyclic structures typically found in BNF declarations. Example:
+
+``
+ rule<> a, b, c;
+
+ a = b | a;
+ b = c | a;
+``
+
+[header Recursion]
+
+The right-hand side of a rule may reference other rules, including itself. The limitation is that direct or indirect left recursion is not allowed (this is an unchecked run-time error that results in an infinite loop). This is typical of top-down parsers. Example:
+
+``
+ a = a | b; // infinite loop!
+``
+
+[note [*What is left recursion?]
+
+Left recursion happens when you have a rule that calls itself before anything else. A top-down parser will go into an infinite loop when this happens. See the FAQ for details on how to eliminate left recursion.
+]
+
+[header Undefined rules]
+
+An undefined rule matches nothing and is semantically equivalent to `nothing_p`.
+
+[header Redeclarations]
+
+Like any other C++ assignment, a second assignment to a rule is destructive and will redefine it. The old definition is lost. Rules are dynamic. A rule can change its definition anytime:
+
+``
+ r = a_definition;
+ r = another_definition;
+``
+
+Rule `r` loses the old definition when the second assignment is made. As mentioned, an undefined rule matches nothing and is semantically equivalent to `nothing_p`.
+
+[header Dynamic Parsers]
+
+Hosting declarative EBNF in imperative C++ yields an interesting blend. We have the best of both worlds. We have the ability to conveniently modify the grammar at run time using imperative constructs such as `if, else` statements. Example:
+
+``
+ if (feature_is_available)
+ r = add_this_feature;
+``
+
+Rules are essentially dynamic parsers. A dynamic parser is characterized by its ability to modify its behavior at run time. Initially, an undefined rule matches nothing. At any time, the rule may be defined and redefined, thus, dynamically altering its behavior.
+
+[header No start rule]
+
+Typically, parsers have what is called a `start` symbol, chosen to be the root of the grammar where parsing starts. The Spirit parser framework has no notion of a start symbol. Any rule can be a start symbol. This feature promotes step-wise creation of parsers. We can build parsers from the bottom up while fully testing each level or module up until we get to the top-most level.
+
+[header Parser Tags]
+
+Rules may be tagged for identification purposes. This is necessary, especially when dealing with [link __trees__ parse trees and ASTs] to see which rule created a specific AST/parse tree node. Each rule has an ID of type `parser_id`. This ID can be obtained through the rule's `id()` member function:
+
+``
+ my_rule.id(); // get my_rule's id
+``
+
+The `parser_id` class is declared as:
+
+``
+ class parser_id
+ {
+ public:
+ parser_id();
+ explicit parser_id(void const* p);
+ parser_id(std::size_t l);
+
+ bool operator==(parser_id const& x) const;
+ bool operator!=(parser_id const& x) const;
+ bool operator<(parser_id const& x) const;
+ std::size_t to_long() const;
+ };
+``
+
+[header `parser_address_tag`]
+
+The rule's `TagT` template parameter supplies this ID. This defaults to `parser_address_tag`. The `parser_address_tag` uses the address of the rule as its ID. This is often not the most convenient, since it is not always possible to get the address of a rule to compare against.
+
+[header `parser_tag`]
+
+It is possible to have specific constant integers to identify a rule. For this purpose, we can use the `parser_tag<N>`, where `N` is a constant integer:
+
+``
+ rule<parser_tag<123> > my_rule; // set my_rule's id to 123
+``
+
+[header `dynamic_parser_tag`]
+
+The `parser_tag<N>` can only specify a [*static ID], which is defined at compile time. If you need the ID to be [*dynamic] (changeable at runtime), you can use the `dynamic_parser_tag` class as the `TagT` template parameter. This template parameter enables the `set_id()` function, which may be used to set the required id at runtime:
+
+``
+ rule<dynamic_parser_tag> my_dynrule;
+ my_dynrule.set_id(1234); // set my_dynrule's id to 1234
+``
+
+If the `set_id()` function isn't called, the parser ID defaults to the address of the rule, just as it would with the `parser_address_tag` template parameter.
+
+[endsect][/ rule]
+

Modified: sandbox/boost_docs/branches/spirit_qbking/doc/src/spirit.qbk
==============================================================================
--- sandbox/boost_docs/branches/spirit_qbking/doc/src/spirit.qbk (original)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/spirit.qbk 2007-11-09 17:49:47 EST (Fri, 09 Nov 2007)
@@ -59,6 +59,7 @@
 [include introduction.qbk]
 
 Quick Start
+[include quick_start.qbk]
 
 Basic Concepts
 [include basic_concepts.qbk]
@@ -75,20 +76,30 @@
 [include primitives.qbk]
 
 Operators
+[include operators.qbk]
 
 Numerics
 [include numerics.qbk]
 
 The Rule
+[include rule.qbk]
+
 Epsilon
+[include epsilon.qbk]
+
 Directives
+[include directives.qbk]
 
 The Scanner and Parsing
 [include scanner.qbk]
 
 The Grammar
+
 Subrules
+[include subrules.qbk]
+
 Semantic Actions
+
 In-depth: The Parser
 
 In-depth: The Scanner

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/subrules.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/subrules.qbk 2007-11-09 17:49:47 EST (Fri, 09 Nov 2007)
@@ -0,0 +1,214 @@
+
+[section Subrules]
+
+Spirit is implemented using expression templates. This is a very powerful technique. Along with its power come some complications. We almost take for granted that when we write `i | j >> k` where `i`, `j` and `k` are all integers, the result is still an integer. Yet, with expression templates, the same expression `i | j >> k` where `i`, `j` and `k` are of type `T` yields a complex composite type [see __Basic_Concepts__]. Spirit expressions, which are combinations of primitives and composites, yield an infinite set of new types. One problem is that C++ offers no easy facility to deduce the type of an arbitrarily complex expression that yields a complex type. Thus, while it is easy to write:
+
+``
+ int r = i | j >> k; // where i, j, and k are ints
+``
+
+Expression templates yield an endless supply of types. Without the [link __rule__ rule], there is no easy way to do this in C++ if `i`, `j` and `k` are Spirit parsers:
+
+``
+ <what_type???> r = i | j >> k; // where i, j, and k are Spirit parsers
+``
+
+If `i`, `j` and `k` are all `chlit<>` objects, the type that we want is:
+
+``
+ typedef
+ alternative<
+ chlit<> // i
+ , sequence<
+ chlit<> // j
+ , chlit<> // k
+ >
+ >
+ rule_t;
+
+ rule_t r = i | j >> k; // where i, j, and k are chlit<> objects
+``
+
+We deliberately formatted the type declaration nicely to make it understandable. Try that with a more complex expression. While it can be done, explicitly spelling out the type of a Spirit expression template is tedious and error prone. The type declared on the left-hand side (lhs) has to mirror the type of the expression on the right-hand side (rhs). ([$__lens__] Yet, if you still wish to do it, see this [link __techniques__#no_rules link] for a technique).
+
+[info [*`typeof` and `auto`]
+
+Some compilers already support the `typeof` keyword. This can be used to free us from having to explicitly type the type (pun intentional). Using the `typeof`, we can rewrite the Spirit expression above as:
+
+``
+typeof(i | j >> k) r = i | j >> k;
+``
+
+While this is better than having to explicitly declare a complex type, it is redundant, error prone and still an eyesore. The expression is typed twice. The only way to simplify this is to introduce a macro (See this [link __techniques__#typeof link] for more information).
+
+[@http://boost-consulting.com David Abrahams] proposed in comp.std.c++ to reuse the `auto` keyword for type deduced variables. This has been extensively discussed in [@http://boost.org Boost]. Example:
+
+``
+auto r = i | j >> k;
+``
+
+Once such a C++ extension is accepted into the standard, this would be a neat solution and a nice fit for our purpose. It's not a complete solution though since there are still situations where we do not know the rhs beforehand; for instance when pre-declaring cyclic dependent rules.
+]
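Since C++11, `auto` is part of the standard, so the deduction described above can be demonstrated directly. The sketch below uses a toy expression-template family in a hypothetical `toy` namespace; the classes only mimic the shape of Spirit's `chlit<>`, `sequence<>` and `alternative<>` and are not the real ones.

```cpp
#include <type_traits>

namespace toy {

struct parser_base {};   // marks the toy parser types

struct chlit : parser_base {
    explicit chlit(char c) : ch(c) {}
    char ch;
};

template <typename A, typename B>
struct sequence : parser_base {
    sequence(A const& a, B const& b) : left(a), right(b) {}
    A left; B right;
};

template <typename A, typename B>
struct alternative : parser_base {
    alternative(A const& a, B const& b) : left(a), right(b) {}
    A left; B right;
};

// The operators are restricted to toy parsers so they cannot hijack other types.
template <typename A, typename B>
typename std::enable_if<std::is_base_of<parser_base, A>::value &&
                        std::is_base_of<parser_base, B>::value,
                        sequence<A, B> >::type
operator>>(A const& a, B const& b) { return sequence<A, B>(a, b); }

template <typename A, typename B>
typename std::enable_if<std::is_base_of<parser_base, A>::value &&
                        std::is_base_of<parser_base, B>::value,
                        alternative<A, B> >::type
operator|(A const& a, B const& b) { return alternative<A, B>(a, b); }

} // namespace toy

// >> binds more tightly than |, so the sequence nests inside the alternative.
inline bool auto_deduces_composite_type()
{
    toy::chlit i('i'), j('j'), k('k');
    auto r = i | j >> k;
    return std::is_same<decltype(r),
        toy::alternative<toy::chlit, toy::sequence<toy::chlit, toy::chlit> > >::value;
}
```

The deduced type has the same nesting as the hand-written `rule_t` typedef shown earlier, which is the type a programmer would otherwise have to spell out.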
+
+Fortunately, rules come to the rescue. A rule can capture the type of the expression assigned to it. Thus:
+
+``
+ rule<> r = i | j >> k; // where i, j, and k are chlit<> objects
+``
+
+It might not be apparent but behind the scenes, plain rules are actually implemented using a pointer to a runtime polymorphic abstract class that holds the dynamic type of the parser assigned to it. When a Spirit expression is assigned to a rule, its type is encapsulated in a concrete subclass of the abstract class. A virtual parse function delegates the parsing to the encapsulated object.
+
+Rules have drawbacks though:
+
+* A rule is coupled to a specific scanner type [see __The_Scanner_Business__].
+* The rule's parse member function has a virtual function call overhead that cannot be inlined.
+
+[section:skip_overview Static rules: subrules]
+
+The `subrule` is a fully static version of the rule. The `subrule` does not have the drawbacks listed above.
+
+* The `subrule` is not tied to a specific scanner so just about any scanner type may be used.
+* The `subrule` also allows aggressive inlining since there are no virtual function calls.
+
+``
+ template<int ID, typename ContextT = parser_context<> >
+ class subrule;
+``
+
+The first template parameter gives the subrule an identification tag. Like the `rule`, there is a `ContextT` template parameter that defaults to `parser_context`. You need not be concerned at all with the `ContextT` template parameter unless you wish to tweak the low level behavior of the subrule. Detailed information on the `ContextT` template parameter is provided [link __in_depth_the_parser_context__ elsewhere].
+
+Presented above is the public API. There may actually be more template parameters after `ContextT`. Everything after the `ContextT` parameter should not be of concern to the client and is strictly for internal use only.
+
+Apart from a few minor differences, the subrule follows the usage and syntax of the rule closely. Here's the calculator grammar using subrules:
+
+``
+ struct calculator : public grammar<calculator>
+ {
+ template <typename ScannerT>
+ struct definition
+ {
+ definition(calculator const& self)
+ {
+ first =
+ (
+ expression = term >> *(('+' >> term) | ('-' >> term)),
+ term = factor >> *(('*' >> factor) | ('/' >> factor)),
+ factor = integer | group,
+ group = '(' >> expression >> ')'
+ );
+ }
+
+ subrule<0> expression;
+ subrule<1> term;
+ subrule<2> factor;
+ subrule<3> group;
+
+ rule<ScannerT> first;
+ rule<ScannerT> const&
+ start() const { return first; }
+ };
+ };
+``
+
+[$__lens__] A fully working example with [link __semantic_actions__ semantic actions] can be [@__examples__/fundamental/subrule_calc.cpp viewed here]. This is part of the Spirit distribution.
+
+[$../theme/subrule1.png]
+
+The subrule is an efficient version of the rule. Compiler optimizations such as aggressive inlining help reduce the code size and increase performance significantly.
+
+The subrule is not a panacea, however. Subrules push the C++ compiler hard, bringing it to its knees. For example, current compilers have a limit on recursion depth that may not be exceeded. Don't even think about writing a full Pascal grammar using subrules alone. A grammar using subrules is a single C++ expression. Current C++ compilers cannot handle very complex expressions very well. Finally, a plain rule is still needed to act as a placeholder for subrules.
+
+The code above is a good example of the recommended way to use subrules. Notice the hierarchy. We have a grammar that encapsulates the whole calculator. The start rule is a plain rule that holds the set of subrules. The subrules in turn define the actual details of the grammar.
+
+[info [*Template instantiation depth]
+
+Spirit pushes the C++ compiler hard. Current C++ compilers cannot handle very complex heavily nested expressions very well. One restricting factor is the typical compiler's limit on template recursion depth. Some, but not all, compilers allow this limit to be configured.
+
+g++'s maximum can be set using a compiler flag: -ftemplate-depth. Set this appropriately if you have a relatively complex grammar.
+
+Microsoft Visual C++ can take greater than 1000 for both template class and function instantiation depths. However, the linker chokes with deep template function instantiation unless inline recursion depth is set using these pragmas:
+
+``
+#pragma inline_depth(255)
+#pragma inline_recursion(on)
+``
+
+These limitations may no longer apply to more modern compilers. Be sure to check your compiler documentation.
+]
+
+This setup gives a good balance. The `subrule`s do all the work. Each grammar will have only one `rule`: `first`. The rule is used just to hold the subrules and make them visible to the grammar.
+
+[header The `subrule` definition]
+
+Like the `rule`, the expression after the assignment `operator=` defines the subrule:
+
+``
+ identifier = expression
+``
+
+Unlike rules, subrules may be defined only once. Redefining a subrule is illegal and will result in a compile-time assertion.
+
+[header Separators \[`,`\]]
+
+While rules are terminated by the semicolon `';'`, subrules are not terminated but are separated by the comma: `','`. Like Pascal statements, the last subrule in a group may not have a trailing comma.
+
+``
+ a = ch_p('a'),
+ b = ch_p('b'),
+ c = ch_p('c'), // BAD, trailing comma
+
+ a = ch_p('a'),
+ b = ch_p('b'),
+ c = ch_p('c') // OK
+``
+
+[header The `start` subrule]
+
+Unlike rules, parsing proceeds from the start subrule. The first (topmost) subrule in a group of subrules is called the [*start subrule]. In our example above, `expression` is the start subrule. When a group of subrules is called forth, the start subrule `expression` is called first.
+
+[header IDs]
+
+Each subrule has a corresponding ID; an integral constant that uniquely specifies the subrule. Our example above has four subrules. They are declared as:
+
+``
+ subrule<0> expression;
+ subrule<1> term;
+ subrule<2> factor;
+ subrule<3> group;
+``
+
+[header Aliases]
+
+It is possible to have subrules with the same ID. A subrule with the same ID as another will be an alias of the other. Both subrules may be used interchangeably.
+
+``
+ subrule<0> a;
+ subrule<0> alias; // alias of a
+``
+
+[header Groups: scope and nesting]
+
+The scope of a subrule and its definition is the enclosing group, typically (and by convention) enclosed inside the parentheses. IDs outside a scope are not directly visible. Inner subrule groups can be nested by enclosing each sub-group inside another set of parentheses. Each group is unique and acts independently. Consequently, while it may not be advisable to do so, a subrule in a group may share the same ID as a subrule in another group since both groups are independent of each other.
+
+``
+ subrule<0> a;
+ subrule<1> b;
+ subrule<0> c;
+ subrule<1> d;
+
+ ( // outer subrule group, scope of a and b
+ a = ch_p('a'),
+ b =
+ ( // inner subrule group, scope of c and d
+ c = ch_p('c'),
+ d = ch_p('d')
+ )
+ )
+``
+
+Subrule IDs need to be unique only within a group. A grammar is an implicit group. Furthermore, even subrules in a grammar may have the same IDs without clashing if they are inside a group. Subrules may be explicitly grouped using the parentheses. Parenthesized groups have unique scopes. In the code above, the outer subrule group defines the subrules a and b while the inner subrule group defines the subrules c and d. Notice that the definition of b is the inner subrule group.
+
+[endsect][/ skip_overview]
+
+[endsect][/ subrules]
+


Boost-Commit list run by bdawes at acm.org, david.abrahams at rcn.com, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk