Boost-Commit :

From: lists.drrngrvy_at_[hidden]
Date: 2007-10-14 20:56:30


Author: drrngrvy
Date: 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
New Revision: 40038
URL: http://svn.boost.org/trac/boost/changeset/40038

Log:
Started branch to translate Boost.Spirit docs to quickbook format. So far:

* Dynamic, Utility, Symbols, Trees and Iterator sections are 'generally complete'; links aren't sorted yet (intentionally).
* Most of the section on the TOC below the major sub-sections (ie. Debugging, Error Handling, etc). FAQ, Techniques, References, Quickref (doxygen part) aren't there here yet.
* Large parts of the first sections (not sure how much) are already done, but on a 'temporarily lost' drive. They will be added when I can get at them.

Problems noted in the file doc/ISSUES.
Added:
   sandbox/boost_docs/branches/spirit_qbking/
   sandbox/boost_docs/branches/spirit_qbking/doc/
   sandbox/boost_docs/branches/spirit_qbking/doc/ISSUES (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/
   sandbox/boost_docs/branches/spirit_qbking/doc/src/acknowledgements.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/character_sets.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/confix.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/debugging.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/distinct.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/dynamic_parsers.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/error_handling.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/escape_char_parser.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/file_iterator.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/functor_parser.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/includes.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/lazy_parser.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/list_parsers.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/loops.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/multi_pass.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/organisation.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/portability.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/position_iterator.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/rationale.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/refactoring.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/regular_expression_parser.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/scoped_lock.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/select_parser.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/stored_rule.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/style.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/switch_parser.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/symbols.qbk (contents, props changed)
   sandbox/boost_docs/branches/spirit_qbking/doc/src/trees.qbk (contents, props changed)

Added: sandbox/boost_docs/branches/spirit_qbking/doc/ISSUES
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/ISSUES 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,66 @@
+
+
+trees.html
+
+ * 'See the example file xml_grammar.hpp (in libs/spirit/example/...'
+ The file doesn't exist.
+
+organisation.html
+
+ * '[*error_handling]. The framework would not be complete...'
+ The format of this is a bit confusing. Why is error_handling written like a class/function name? Is it one?
+
+confix.html
+
+ * The top-level header 'confix parsers' seems needless. I've removed it.
+ * Link needed for `as_parser`.
+ * Is 'expr' used as an abbreviation? For example in the context of 'expr parser'. It may be confused easily with the `expr` that comes up in the example code.
+
+list_parsers.html
+
+ * 'parser generators are described here and here'
+ Sort out the links here.
+
+functor_parser.html
+
+ * '(see In-depth: The Parser)'
+ Link this up
+
+regular_expression_parser.html
+
+ * Link to 'The Scanner Business'
+
+distinct.html
+
+ * Add link for ASN.1.
+
+lazy_parser.qbk
+
+ * First word should link to docs section on Spirit's Closures.
+
+ * 'When `base` is being parsed, in your semantic action, store a pointer to the selected ?( base | ('`' >> base >> '`') ) in a closure variable'
+
+select_parser.qbk
+
+ * Link at bottom of page referring to PHOENIX_LIMIT should be more direct.
+
+switch_parser.qbk
+
+ * Got a link for Sam Nabialek?
+
+portability.qbk
+
+ * 'Spirit v1.6.x'
+ Link to the newest in the series
+
+ etc... for all pages
+
+style.qbk
+
+ * Sort links on this page.
+ * It would be better if the page 'Boost coding guidelines' was hosted on the boost website, rather than a login-only yahoo groups page.
+
+rationale.qbk
+
+ * 'Spirit is greedy --using straight forward, naive RD'
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/acknowledgements.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/acknowledgements.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,110 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section acknowledgements]
+
+Special thanks to:
+
+[*Dan Nuffer] for his work on lexers, parse trees, ASTs, XML parsers, the multi-pass iterator as well as administering Spirit's site, editing, maintaining the CVS and doing the releases plus a zillion of other chores that were almost taken for granted.
+
+[*Hartmut Kaiser] for his work on the C parser, the work on the C/C++ preprocessor, utility parsers, the original port to Intel 5.0, various work on Phoenix, porting to v1.5, the meta-parsers, the grouping-parsers, extensive testing and painstaking attention to details.
+
+[*Martin Wille] who improved grammar multi thread safety, contributed the `eol_p` parser, the dynamic parsers, documentation and for taking an active role in almost every aspect from brainstorming and design to coding. And, as always, helps keep the regression tests for g++ on Linux as green as ever :-).
+
+[*Martijn W. Van Der Lee] our Web site administrator and for contributing the RFC821 parser.
+
+[*Giovanni Bajo] for last minute tweaks of Spirit 1.8.0 for CodeWarrior 8.3. Actually, I'm ashamed Giovanni was not in this list already. He's done a lot since Spirit 1.5, the first Boost.Spirit release. He's instrumental in the porting of the Spirit iterators stuff to the new Boost Iterators Library (version 2). He also did various bug fixes and wrote some tests here and there.
+
+[*Juan Carlos Arevalo-Baeza (JCAB)] for his work on the C++ parser, the position iterator, ports to v1.5 and keeping the mailing list discussions alive and kicking.
+
+[*Vaclav Vesely], lots of stuff, the no_actions directive, various patches fixes, the distinct parsers, the lazy parser, some phoenix tweaks and add-ons (e.g. new_). Also, Stefan Slapeta and wife for editing Vaclav's distinct parser doc.
+
+[*Raghavendra Satish] for doing the original v1.3 port to VC++ and his work on Phoenix.
+
+[*Noah Stein] for following up and helping Ragav on the VC++ ports.
+
+[*Hakki Dogusan], for his original v1.0 Pascal parser.
+
+[*John (EBo) David] for his work on the VM and watching over my shoulder as I code giving the impression of distance eXtreme programming.
+
+[*Chris Uzdavinis] for feeding in comments and valuable suggestions as well as editing the documentation.
+
+[*Carsten Stoll], for his work on dynamic parsers.
+
+[*Andy Elvey] and his conifer parser.
+
+[*Bruce Florman], who did the original v1.0 port to VC++.
+
+[*Jeff Westfahl] for porting the loop parsers to v1.5 and contributing the file iterator.
+
+[*Peter Simons] for the RFC date parser example and tutorial plus helping out with some nitty gritty details.
+
+[*Markus Schöpflin] for suggesting the `end_p` parser and lots of other nifty things and his active presence in the mailing list.
+
+[*Doug Gregor] for mentoring and his ability to see things that others don't.
+
+[*David Abrahams] for giving me a job that allows me to still work on Spirit, plus countless advice and help on C++ and specifically template metaprogramming.
+
+[*Aleksey Gurtovoy] for his MPL library from which I stole many metaprogramming tricks especially for less conforming compilers such as Borland and VC6/7.
+
+[*Gustavo Guerra] for his last minute review of Spirit and constant feedback, plus patches here and there (e.g. proposing the new dot behavior of the real numerics parsers).
+
+[*Nicola Musatti, Paul Snively, Alisdair Meredith] and [*Hugo Duncan] for testing and sending in various patches.
+
+[*Steve Rowe] for his splendid work on the TSTs that will soon be taken into Spirit.
+
+[*Jonathan de Halleux] for his work on actors.
+
+[*Angus Leeming] for last minute editing work on the 1.8.0 release documentation, his work on Phoenix and his active presence in the Spirit mailing list.
+
+[*Joao Abecasis] for his active presence in the Spirit mailing list, providing user support, participating in the discussions and so on.
+
+[*Guillaume Melquiond] for a last minute patch to `multi_pass` for 1.8.1.
+
+[*Peder Holt] for his porting work on Phoenix, Fusion and Spirit to VC6.
+
+To my wife [*Mariel] who did the graphics in this document.
+
+My, there's a lot in this list! And it's a continuing list. I add people to this list every time. I hope I did not forget anyone. If I missed someone you know who has helped in any way, please inform me.
+
+Special thanks also to people who gave feedback and valuable comments, particularly members of Boost and Spirit mailing lists. This includes all those who participated in the review:
+
+[*John Maddock], our review manager
+[*Aleksey Gurtovoy
+Andre Hentz
+Beman Dawes
+Carl Daniel
+Christopher Currie
+Dan Gohman
+Dan Nuffer
+Daryle Walker
+David Abrahams
+David B. Held
+Dirk Gerrits
+Douglas Gregor
+Hartmut Kaiser
+Iain K.Hanson
+Juan Carlos Arevalo-Baeza
+Larry Evans
+Martin Wille
+Mattias Flodin
+Noah Stein
+Nuno Lucas
+Peter Dimov
+Peter Simons
+Petr Kocmid
+Ross Smith
+Scott Kirkwood
+Steve Cleary
+Thorsten Ottosen
+Tom Wenisch
+Vladimir Prus]
+
+Finally thanks to [@http://sf.net SourceForge] for hosting the Spirit project and [@http://boost.org Boost]: a C++ community comprised of extremely talented library authors who participate in the discussion and peer review of well crafted C++ libraries.
+
+[endsect][/ acknowledgements]

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/character_sets.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/character_sets.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,72 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Character Sets]
+
+The character set `chset` matches a set of characters over a finite range bounded by the limits of its template parameter `CharT`. This class is an optimization of a parser that acts on a set of single characters. The template class is parameterized by the character type `CharT` and can work efficiently with 8, 16, 32 and even 64 bit characters.
+
+``
+ template <typename CharT = char>
+ class chset;
+``
+
+The `chset` is constructed from literals (e.g. 'x'), `ch_p` or `chlit<>`, `range_p` or `range<>`, `anychar_p` and `nothing_p` (see [link __primitives__]) or copy-constructed from another `chset`. The `chset` class uses a copy-on-write scheme that enables instances to be passed along easily by value.
+
+[info Sparse bit vectors
+
+To accommodate 16/32 and 64 bit characters, the `chset` class statically switches from a `std::bitset` implementation when the character type is not greater than 8 bits, to a sparse bit/boolean set which uses a sorted vector of disjoint ranges (`range_run`). The set is constructed from ranges such that adjacent or overlapping ranges are coalesced.
+
+`range_runs` are very space-economical in situations where there are lots of ranges and a few individual disjoint values. Searching is O(log n) where n is the number of ranges.
+]
+
+Examples:
+
+``
+ chset<> s1('x');
+ chset<> s2(anychar_p - s1);
+``
+
+Optionally, character sets may also be constructed using a definition string following a syntax that resembles posix style regular expression character sets, except that double quotes delimit the set elements instead of square brackets and there is no special negation `^` character.
+
+``
+ range = anychar_p >> '-' >> anychar_p;
+ set = *(range | anychar_p);
+``
+
+Since we are defining the set using a C string, the usual C/C++ literal string syntax rules apply. Examples:
+
+``
+ chset<> s1("a-zA-Z"); // alphabetic characters
+ chset<> s2("0-9a-fA-F"); // hexadecimal characters
+ chset<> s3("actgACTG"); // DNA identifiers
+ chset<> s4("\x7f\x7e"); // Hexadecimal 0x7F and 0x7E
+``
+
+The standard Spirit set operators apply (see [link __operators__]) plus an additional character-set-specific inverse (negation `~`) operator:
+
+[table Character set operators
+ [[`~a` ] [Set inverse ]]
+ [[`a | b`] [Set union ]]
+ [[`a & b`] [Set intersection]]
+ [[`a - b`] [Set difference ]]
+ [[`a ^ b`] [Set xor ]]
+]
+
+where operands `a` and `b` are both `chsets` or one of the operands is either a literal character, `ch_p` or `chlit`, `range_p` or `range`, `anychar_p` or `nothing_p`. Special optimized overloads are provided for `anychar_p` and `nothing_p` operands. A `nothing_p` operand is converted to an empty set, while an `anychar_p` operand is converted to a set having elements of the full range of the character type used (e.g. 0-255 for unsigned 8 bit chars).
+
+A special case is `~anychar_p` which yields `nothing_p`, but `~nothing_p` is illegal. Inversion of `anychar_p` is asymmetrical, a one-way trip comparable to converting `T*` to a `void*`.
+
+[table Special conversions
+ [[`chset<CharT>(nothing_p)`] [empty set]]
+ [[`chset<CharT>(anychar_p)`] [full range of `CharT` (e.g. 0-255 for unsigned 8 bit chars)]]
+ [[`~anychar_p`] [`nothing_p`]]
+ [[`~nothing_p`] [illegal]]
+]
+
+[endsect][/ character_sets]
+
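The "Sparse bit vectors" note in character_sets.qbk describes a sorted vector of disjoint ranges in which adjacent or overlapping ranges are coalesced and lookup is O(log n). The following is a minimal plain C++ sketch of that idea, assuming inclusive ranges over `unsigned` values. It is not Spirit's `range_run` implementation; the names `CharRange` and `RangeSet` are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: not Spirit's actual range_run class.
struct CharRange {
    unsigned first;  // inclusive lower bound
    unsigned last;   // inclusive upper bound
};

class RangeSet {
public:
    // Insert [first, last], merging with overlapping or adjacent ranges.
    void add(unsigned first, unsigned last) {
        // Find the first stored range that ends at or after first - 1,
        // i.e. the first one that could overlap or touch the new range.
        auto it = std::lower_bound(
            runs_.begin(), runs_.end(), first,
            [](const CharRange& r, unsigned value) {
                return r.last < value && value - r.last > 1;
            });
        // Absorb every following range that overlaps or is adjacent.
        auto stop = it;
        while (stop != runs_.end() &&
               (stop->first <= last || stop->first - last == 1)) {
            first = std::min(first, stop->first);
            last = std::max(last, stop->last);
            ++stop;
        }
        it = runs_.erase(it, stop);
        runs_.insert(it, CharRange{first, last});
    }

    // Binary search over the sorted ranges: O(log n) in the range count.
    bool test(unsigned value) const {
        auto it = std::lower_bound(
            runs_.begin(), runs_.end(), value,
            [](const CharRange& r, unsigned v) { return r.last < v; });
        return it != runs_.end() && it->first <= value;
    }

    std::size_t range_count() const { return runs_.size(); }

private:
    std::vector<CharRange> runs_;  // kept sorted, disjoint and non-adjacent
};
```

Adding `'a'`-`'f'` and then `'g'`-`'z'` leaves a single stored range, which is why this layout stays compact when a set consists of a few long runs.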

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/confix.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/confix.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,144 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section:confix Confix Parsers]
+
+Confix Parsers recognize a sequence of three independent elements: an opening, an expression and a closing. A simple example is a C comment:
+
+``
+ /* This is a C comment */
+``
+
+which could be parsed through the following rule definition:
+
+``
+ rule<> c_comment_rule
+ = confix_p("/*", *anychar_p, "*/")
+ ;
+``
+
+The `confix_p` parser generator should be used for generating the required Confix Parser. The three parameters to `confix_p` can be single characters, strings (as above) or, if more complex parsing logic is required, auxiliary parsers, each of which is automatically converted to the corresponding parser type needed for successful parsing.
+
+The generated parser is equivalent to the following rule:
+
+``
+ open >> (expr - close) >> close
+``
+
+If the `expr` parser is an `action_parser_category` type parser (a parser with an attached semantic action) we have to do something special. This happens if the user writes something like:
+
+``
+ confix_p(open, expr[func], close)
+``
+
+where `expr` is the parser matching the `expr` of the confix sequence and `func` is a functor to be called after matching the `expr`. If we did nothing, the resulting code would parse the sequence as follows:
+
+``
+ open >> (expr[func] - close) >> close
+``
+
+which in most cases is not what the user expects. (If this is what you expect, please use the `confix_p` generator function `direct()`, which inhibits the parser refactoring.) To make the confix parser behave as expected:
+
+``
+ open >> (expr - close)[func] >> close
+``
+
+the actor attached to the `expr` parser has to be re-attached to the `(expr - close)` parser construct, which will make the resulting confix parser 'do the right thing'. This refactoring is done with the help of the [link __Refactoring Parsers__]. Additionally, special care must be taken if the `expr` parser is a `unary_parser_category` type parser, as in
+
+``
+ confix_p(open, *anychar_p, close)
+``
+
+which without any refactoring would result in
+
+``
+ open >> (*anychar_p - close) >> close
+``
+
+and will not give the expected result (`*anychar_p` will eat up all the input up to the end of the input stream). So we have to refactor this into:
+
+``
+ open >> *(anychar_p - close) >> close
+``
+
+which will give the correct result.
+
+The case where the `expr` parser combines the two problems mentioned above (i.e. it is a unary parser with an attached action) is handled accordingly too, so:
+
+``
+ confix_p(open, (*anychar_p)[func], close)
+``
+
+will be parsed as expected:
+
+``
+ open >> (*(anychar_p - close))[func] >> close
+``
+
+The required refactoring is implemented here with the help of the [link __Refactoring Parsers__] too.
+
+[table Summary of Confix Parser refactorings
+ [[You write it as:] [It is refactored to:] ]
+ [[`confix_p(open, expr, close)`] [`open >> (expr - close) >> close`] ]
+ [[`confix_p(open, expr[func], close)`] [`open >> (expr - close)[func] >> close`] ]
+ [[`confix_p(open, *expr, close)`] [`open >> *(expr - close) >> close`] ]
+ [[`confix_p(open, (*expr)[func], close)`] [`open >> (*(expr - close))[func] >> close`]]
+]
+
+[h3 Comment Parsers]
+
+The Comment Parser generator template `comment_p` is a helper for generating a correct [link __Confix Parser__] from auxiliary parameters, which is able to parse comment constructs as follows:
+
+``
+ StartCommentToken >> Comment text >> EndCommentToken
+``
+
+The following types are supported as parameters: parsers, single characters and strings (see `as_parser`). If it is used with one parameter, a comment starting with the given first parser parameter up to the end of the line is matched. So for instance the following parser matches C++ style comments:
+
+``
+ comment_p("//")
+``
+
+If it is used with two parameters, a comment starting with the first parser parameter up to the second parser parameter is matched. For instance a C style comment parser could be constructed as:
+
+``
+ comment_p("/*", "*/")
+``
+
+The `comment_p` parser generator generates parsers for matching non-nested comments (as for C/C++ comments). Sometimes it is necessary to parse nested comments, as allowed for instance in Pascal.
+
+``
+ { This is a { nested } PASCAL-comment }
+``
+
+Such nested comments are parseable through parsers generated by the `comment_nest_p` generator template functor. The following example shows a parser, which can be used for parsing the two different (nestable) Pascal comment styles:
+
+``
+ rule<> pascal_comment
+ = comment_nest_p("(*", "*)")
+ | comment_nest_p('{', '}')
+ ;
+``
+
+[note
+Please note that a comment is parsed implicitly as if the whole `comment_p(...)` statement were embedded into a `lexeme_d[]` directive, i.e. during parsing of a comment no token skipping will occur, even if you've defined a skip parser for your whole parsing process.
+]
+
+[@../../fundamental/comments.cpp comments.cpp] demonstrates various comment parsing schemes:
+
+1. Parsing of different comment styles:
+ * parsing C/C++-style comment.
+ * parsing C++-style comment.
+ * parsing PASCAL-style comment.
+2. Parsing tagged data with the help of the confix_parser.
+3. Parsing tagged data with the help of the confix_parser but the semantic action is directly attached to the body sequence parser.
+
+This is part of the Spirit distribution.
+
+[endsect][/ confix]
+
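The refactored form `open >> *(anychar_p - close) >> close` described in confix.qbk can be read as "take characters one at a time, but stop as soon as `close` matches". Below is a hand-written plain C++ sketch of that behaviour, assuming string delimiters and a prefix match against the input. It does not use Spirit at all, and `match_confix` is a hypothetical name introduced for this illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Spirit: hand-written equivalent of
//     open >> *(anychar_p - close) >> close
// On success, stores the text between the delimiters in `body`.
bool match_confix(const std::string& input, const std::string& open,
                  const std::string& close, std::string& body) {
    // open
    if (input.compare(0, open.size(), open) != 0) return false;
    std::size_t pos = open.size();

    // *(anychar_p - close): consume one character at a time, but only
    // while `close` does not match at the current position.
    while (pos < input.size() &&
           input.compare(pos, close.size(), close) != 0) {
        ++pos;
    }

    // close
    if (input.compare(pos, close.size(), close) != 0) return false;
    body = input.substr(open.size(), pos - open.size());
    return true;
}
```

Calling `match_confix("/* This is a C comment */", "/*", "*/", body)` succeeds and leaves the comment text in `body`. The unrefactored form would let the body consume the closing delimiter too, so the final `close` could never match.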

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/debugging.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/debugging.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,194 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Debugging]
+
+The top-down nature of Spirit makes the generated parser easy to micro-debug using the standard debugger bundled with the C++ compiler we are using. With recursive-descent, the parse traversal utilizes the hardware stack through C++ function call mechanisms. There are no difficult to debug tables or state machines that obscure the parsing logic flow. The stack trace we see in the debugger follows faithfully the hierarchical grammar structure.
+
+Since any production rule can initiate a parse traversal, it is a lot easier to pinpoint the bugs by focusing on one or a few rules. For relatively complex parsing tasks, the same way we write robust C++ programs, it is advisable to develop a grammar iteratively on a per-module basis where each module is a small subset of the complete grammar. That way, we can stress-test individual modules piecemeal until we reach the top-most module. For instance, when developing a scripting language, we can start with expressions, then move on to statements, then functions, upwards until we have a complete grammar.
+
+At some point when the grammar gets quite complicated, it is desirable to visualize the parse traversal and see what's happening. There are some facilities in the framework that aid in the visualisation of the parse traversal for the purpose of debugging. The following macros enable these features.
+
+[section:macros Debugging Macros]
+
+[h3 `BOOST_SPIRIT_ASSERT_EXCEPTION`]
+
+Spirit contains assertions that may activate when Spirit is used incorrectly. By default these assertions use the assert macro from the standard library. If you want Spirit to throw an exception instead, `#define BOOST_SPIRIT_ASSERT_EXCEPTION` to the name of the class that you want to be thrown. This class's constructor will be passed a `const char*` stringified version of the file, line, and assertion condition, when it is thrown. If you want to totally disable the assertion, `#define NDEBUG`.
+
+[/endsect][/ BOOST_SPIRIT_ASSERT_EXCEPTION]
+
+[h3 `BOOST_SPIRIT_DEBUG`]
+
+Define this to enable debugging.
+
+With debugging enabled, special output is generated at key points of the parse process, using the standard output operator (`operator<<`) with `BOOST_SPIRIT_DEBUG_OUT` (default is `std::cout`, see below) as its left operand.
+
+[note
+In order to use Spirit's debugging support you must ensure that appropriate overloads of `operator<<` taking `BOOST_SPIRIT_DEBUG_OUT` as their left operand are available. The expected semantics are those of the standard output operator.
+
+These overloads may be provided either within the namespace where the corresponding class is declared (will be found through Argument Dependent Lookup) or [within an anonymous namespace] inside `namespace boost::spirit`, so it is visible where it is called.
+]
+
+[important
+Note in particular that when `BOOST_SPIRIT_DEBUG_FLAGS_CLOSURES` is set, overloads of `operator<<` taking instances of the types used in closures as their right operands are required.
+
+You can find an example of overloading the output operator for `std::pair` in [link output_operator __this FAQ entry__].
+]
+
+By default, if the `BOOST_SPIRIT_DEBUG` macro is defined, all available debug output is generated. To fine tune the amount of generated text you can define the `BOOST_SPIRIT_DEBUG_FLAGS` constant to be equal to a combination of the following flags:
+
+[table Available flags to fine tune debug output
+ [
+ [`BOOST_SPIRIT_DEBUG_FLAGS_NODES`]
+ [print information about nodes (general for all parsers)]
+ ]
+ [
+ [`BOOST_SPIRIT_DEBUG_FLAGS_TREES`]
+ [print information about parse trees and AST's (general for all tree parsers)]
+ ]
+ [
+ [`BOOST_SPIRIT_DEBUG_FLAGS_CLOSURES`]
+ [print information about closures (general for all parsers with closures)]
+ ]
+ [
+ [`BOOST_SPIRIT_DEBUG_FLAGS_ESCAPE_CHAR`]
+ [print information out of the `esc_char_parser`]
+ ]
+ [
+ [`BOOST_SPIRIT_DEBUG_FLAGS_SLEX`]
+ [print information out of the `SLEX` parser]
+ ]
+]
+
+[/endsect][/ boost_spirit_debug]
+
+[h3 `BOOST_SPIRIT_DEBUG_OUT`]
+
+Define this to redirect the debugging diagnostics printout to somewhere else (e.g. a file or stream). Defaults to `std::cout`.
+
+[h3 `BOOST_SPIRIT_DEBUG_TOKEN_PRINTER`]
+
+The `BOOST_SPIRIT_DEBUG_TOKEN_PRINTER` macro allows you to redefine the way characters are printed on the stream.
+
+If `BOOST_SPIRIT_DEBUG_OUT` is of type `StreamT`, the character type is `CharT` and `BOOST_SPIRIT_DEBUG_TOKEN_PRINTER` is defined to `foo`, it must be compatible with this usage:
+
+``
+ foo(StreamT, CharT)
+``
+
+The default printer requires `operator<<(StreamT, CharT)` to be defined. Additionally, if `CharT` is convertible to a normal character type (`char`, `wchar_t` or `int`), it prints control characters in a friendly manner (e.g., when it receives `'\n'` it actually prints the `\` and `n` characters, instead of a newline).
+
+[h3 `BOOST_SPIRIT_DEBUG_PRINT_SOME`]
+
+The `BOOST_SPIRIT_DEBUG_PRINT_SOME` constant defines the number of characters from the stream to be printed for diagnosis. This defaults to the first 20 characters.
+
+[h3 `BOOST_SPIRIT_DEBUG_TRACENODE`]
+
+By default all parser nodes are traced. This constant may be used to redefine this default. If this is 1 (`true`), then tracing is enabled by default, if this constant is 0 (`false`), the tracing is disabled by default. This preprocessor constant is set to 1 (`true`) by default.
+
+Please note that the following `BOOST_SPIRIT_DEBUG_...()` macros are to be used at function scope only.
+
+[h3 `BOOST_SPIRIT_DEBUG_NODE(p)`]
+
+Define this to print some debugging diagnostics for parser `p`. This macro
+
+* Registers the parser name for debugging
+* Enables/disables the tracing for parser depending on `BOOST_SPIRIT_DEBUG_TRACENODE`
+
+Pre-parse: Before entering the rule, the rule name followed by a peek into the data at the current iterator position is printed.
+
+Post-parse: After parsing the rule, the rule name followed by a peek into the data at the current iterator position is printed. Here, `'/'` before the rule name flags a successful match while `'#'` before the rule name flags an unsuccessful match.
+
+The following are synonyms for `BOOST_SPIRIT_DEBUG_NODE`:
+
+1. `BOOST_SPIRIT_DEBUG_RULE`
+2. `BOOST_SPIRIT_DEBUG_GRAMMAR`
+
+[h3 `BOOST_SPIRIT_DEBUG_TRACE_NODE(p, flag)`]
+
+Similar to `BOOST_SPIRIT_DEBUG_NODE`. Additionally allows selective debugging. This is useful in situations where we want to debug just a hand picked set of nodes.
+
+The following are synonyms for `BOOST_SPIRIT_DEBUG_TRACE_NODE`
+
+ 1. `BOOST_SPIRIT_DEBUG_TRACE_RULE`
+ 2. `BOOST_SPIRIT_DEBUG_TRACE_GRAMMAR`
+
+[h3 `BOOST_SPIRIT_DEBUG_TRACE_NODE_NAME(p, name, flag)`]
+
+Similar to `BOOST_SPIRIT_DEBUG_NODE`. Additionally allows selective debugging and lets you specify the name used during debug printout. This is useful in situations where we want to debug just a hand picked set of nodes. The name may be redefined in situations where the parser parameter does not reflect the name of the parser to debug.
+
+The following are synonyms for `BOOST_SPIRIT_DEBUG_TRACE_NODE_NAME`
+
+ 1. `BOOST_SPIRIT_DEBUG_TRACE_RULE_NAME`
+ 2. `BOOST_SPIRIT_DEBUG_TRACE_GRAMMAR_NAME`
+
+Here's the original calculator with debugging features enabled:
+
+``
+ #define BOOST_SPIRIT_DEBUG ///$$$ DEFINE THIS BEFORE ANYTHING ELSE $$$///
+ #include "boost/spirit.hpp"
+
+ /***/
+
+ /*** CALCULATOR GRAMMAR DEFINITIONS HERE ***/
+
+ BOOST_SPIRIT_DEBUG_RULE(integer);
+ BOOST_SPIRIT_DEBUG_RULE(group);
+ BOOST_SPIRIT_DEBUG_RULE(factor);
+ BOOST_SPIRIT_DEBUG_RULE(term);
+ BOOST_SPIRIT_DEBUG_RULE(expr);
+``
+
+[tip
+Be sure to add the macros inside the grammar definition's constructor. Now here's a sample session with the calculator.
+]
+
+[pre
+ Type an expression...or [q or Q] to quit
+
+ 1 + 2
+
+ grammar(calc): "1 + 2"
+ rule(expression): "1 + 2"
+ rule(term): "1 + 2"
+ rule(factor): "1 + 2"
+ rule(integer): "1 + 2"
+ push 1
+ /rule(integer): " + 2"
+ /rule(factor): " + 2"
+ /rule(term): " + 2"
+ rule(term): "2"
+ rule(factor): "2"
+ rule(integer): "2"
+ push 2
+ /rule(integer): ""
+ /rule(factor): ""
+ /rule(term): ""
+ popped 1 and 2 from the stack. pushing 3 onto the stack.
+ /rule(expression): ""
+ /grammar(calc): ""
+ -------------------------
+ Parsing succeeded
+ result = 3
+ -------------------------
+]
+
+We typed in `"1 + 2"`. Notice that there are two successful branches from the top rule `expr`. The `push` and `popped` lines are generated by the parser's semantic actions while the others are generated by the debug-diagnostics of our rules. Notice how the first integer rule took `"1"`, the first term rule took `"+"` and finally the second integer rule took `"2"`.
+
+Please note the special meaning of the first characters appearing on the printed lines:
+
+ * a single `'/'` starts a line containing the information about a successfully matched parser node (`rule<>, grammar<>` or `subrule<>`)
+ * a single `'#'` starts a line containing the information about a failed parser node
+ * a single `'^'` starts a line containing the first member (return value/synthesised attribute) of the closure of a successfully matched parser node.
+
+Check out [@../../example/fundamental/calc_debug.cpp calc_debug.cpp] to see debugging in action.
+
+[endsect][/ macros]
+
+[endsect][/ debugging]
+
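debugging.qbk says that a custom token printer must be usable as `foo(StreamT, CharT)` and that the default printer shows control characters in a readable form. The sketch below shows one possible printer of that shape in plain C++, assuming `StreamT` is `std::ostream` and `CharT` is `char`. The names `escaped_token_printer` and `escape_for_debug` are hypothetical, and this is not the printer Spirit ships.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical printer with the foo(StreamT, CharT) shape described above.
// It writes common control characters as visible escape sequences.
inline void escaped_token_printer(std::ostream& out, char ch) {
    switch (ch) {
        case '\n': out << "\\n"; break;
        case '\r': out << "\\r"; break;
        case '\t': out << "\\t"; break;
        case '\0': out << "\\0"; break;
        default:   out << ch;    break;
    }
}

// Convenience wrapper (also hypothetical) for trying the printer
// on a whole string rather than one character at a time.
inline std::string escape_for_debug(const std::string& text) {
    std::ostringstream out;
    for (char ch : text) escaped_token_printer(out, ch);
    return out.str();
}
```

Keeping the printer a free function taking the stream and one character keeps it compatible with the usage pattern the text requires, independent of how the diagnostics stream is configured.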

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/distinct.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/distinct.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,92 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section:distinct Distinct Parser]
+
+[section:parsers Distinct Parsers]
+
+The distinct parsers are utility parsers which ensure that matched input is not immediately followed by a forbidden pattern. Their typical usage is to distinguish keywords from identifiers.
+
+[h3 `distinct_parser`]
+
+The basic usage of the `distinct_parser` is to replace the `str_p` parser. For example the declaration_rule in the following example:
+
+``
+ rule<ScannerT> declaration_rule = str_p("declare") >> lexeme_d[+alpha_p];
+``
+
+would correctly match the input `"declare abc"`, but it would also match the input `"declareabc"`, which is usually not intended. To avoid this, we can use `distinct_parser`:
+
+``
+ // keyword_p may be defined in the global scope
+ distinct_parser<> keyword_p("a-zA-Z0-9_");
+
+ rule<ScannerT> declaration_rule = keyword_p("declare") >> lexeme_d[+alpha_p];
+``
+
+The `keyword_p` works in the same way as the `str_p` parser but matches only when the matched input is not immediately followed by one of the characters from the set passed to the constructor of `keyword_p`. In the example, `"declare"` can't be immediately followed by any alphabetic character, any digit or an underscore.
+
+See the full example [@../../example/fundamental/distinct/distinct_parser.cpp here].
+
+[h3 `distinct_directive`]
+
+For more sophisticated cases, for example when keywords are stored in a symbol table, we can use `distinct_directive`.
+
+``
+ distinct_directive<> keyword_d("a-zA-Z0-9_");
+
+ symbols<> keywords = "declare", "begin", "end";
+ rule<ScannerT> keyword = keyword_d[keywords];
+``
+
+[h3 `dynamic_distinct_parser` and `dynamic_distinct_directive`]
+
+In some cases a set of forbidden follow-up characters is not sufficient. For example, ASN.1 naming conventions allow identifiers to contain dashes, but not double dashes (which mark the beginning of a comment). Furthermore, identifiers can't end with a dash. So, a matched keyword can't be followed by any alphanumeric character or by exactly one dash, but it can be followed by two dashes.
+
+This is when `dynamic_distinct_parser` and the `dynamic_distinct_directive` come into play. The constructor of the `dynamic_distinct_parser` accepts a parser which matches any input that must NOT follow the keyword.
+
+``
+ // Alphanumeric characters and a dash followed by a non-dash
+ // may not follow an ASN.1 identifier.
+ dynamic_distinct_parser<> keyword_p(alnum_p | ('-' >> ~ch_p('-')));
+
+ rule<ScannerT> declaration_rule = keyword_p("declare") >> lexeme_d[+alpha_p];
+``
+
+Since the `dynamic_distinct_parser` internally uses a rule, its type is dependent on the scanner type. So, the `keyword_p` shouldn't be defined globally, but rather within the grammar.
+
+See the full example [@../../example/fundamental/distinct/distinct_parser_dynamic.cpp here].
+
+[endsect][/ parsers]
+
+[section:hiw How it works]
+
+When the `keyword_p_1` and the `keyword_p_2` are defined as
+
+``
+ distinct_parser<> keyword_p_1(forbidden_chars);
+ dynamic_distinct_parser<> keyword_p_2(forbidden_tail_parser);
+``
+
+the parsers
+
+``
+ keyword_p_1(str)
+ keyword_p_2(str)
+``
+
+are equivalent to the rules
+
+``
+ lexeme_d[chseq_p(str) >> ~epsilon_p(chset_p(forbidden_chars))]
+ lexeme_d[chseq_p(str) >> ~epsilon_p(forbidden_tail_parser)]
+``
+
+[endsect][/ hiw]
+[endsect][/ distinct]
+
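Note on distinct.qbk: the matching rule in "How it works" can be sketched in plain C++ without Spirit, which may help when reasoning about edge cases such as a keyword at the very end of the input. This is only an illustration under stated assumptions: `match_distinct` and `is_identifier_char` are hypothetical names, not part of the library, and the predicate stands in for the character set `"a-zA-Z0-9_"`.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical predicate standing in for the character set "a-zA-Z0-9_".
bool is_identifier_char(char ch)
{
    return std::isalnum(static_cast<unsigned char>(ch)) != 0 || ch == '_';
}

// Hypothetical helper (not a Spirit API): returns the length of `keyword` if
// it occurs at `pos` in `input` and is NOT immediately followed by a character
// accepted by `forbidden`; returns -1 otherwise.
std::ptrdiff_t match_distinct(const std::string& input, std::size_t pos,
                              const std::string& keyword,
                              bool (*forbidden)(char))
{
    // The keyword itself must match character for character.
    if (pos > input.size() ||
        input.compare(pos, keyword.size(), keyword) != 0)
        return -1;

    // Look ahead one character without consuming it.
    const std::size_t next = pos + keyword.size();
    if (next < input.size() && forbidden(input[next]))
        return -1;

    return static_cast<std::ptrdiff_t>(keyword.size());
}
```

With this sketch, `"declare abc"` matches the keyword `"declare"` while `"declareabc"` does not, which mirrors the behaviour the section describes for `keyword_p`.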

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/dynamic_parsers.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/dynamic_parsers.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,82 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section:dynamic_p Dynamic Parsers]
+
+We see dynamic parsing everywhere in Spirit. A special group of parsers, aptly named dynamic parsers, form the most basic building blocks to dynamic parsing. This chapter focuses on these critters. You'll notice the similarity of these parsers with C++'s control structures. The similarity is not a coincidence. These parsers give an imperative flavor to parsing, and, since imperative constructs are not native to declarative EBNF, mimicking the host language, C++, should make their use immediately familiar.
+
+Dynamic parsers modify the parsing behavior according to conditions. Constructing dynamic parsers requires a condition argument and a body parser argument. Additional arguments are required by some parsers.
+
+[h3 Conditions]
+
+Functions or functors returning values convertible to `bool` can be used as conditions. When the evaluation of the function/functor yields `true`, the condition is considered met.
+
+Parsers can be used as conditions as well. When the parser matches, the condition is met. Parsers used as conditions work in an all-or-nothing manner: the scanner will not be advanced when they don't match.
+
+A failure to meet the condition will not result in a parse error.
+
+[h3 `if_p`]
+
+`if_p` can be used with or without an else-part. The syntax is:
+
+``
+ if_p(condition)[then-parser]
+``
+
+or
+
+``
+ if_p(condition)[then-parser].else_p[else-parser]
+``
+
+When the condition is met the then-parser is used next in the parsing process. When the condition is not met and an else-parser is available the else-parser is used next. When the condition isn't met and no else-parser is available then the whole parser matches the empty sequence.
+
+[note Note: older versions of `if_p` report a failure when the condition isn't met and no else-parser is available.]
+
+Example:
+
+``
+ if_p("0x")[hex_p].else_p[uint_p]
+``
+
+[h3 `while_p, do_p`]
+
+`while_p/do_p` syntax is:
+
+``
+ while_p(condition)[body-parser]
+ do_p[body-parser].while_p(condition)
+``
+
+As long as the condition is met the dynamic parser constructed by `while_p` will try to match the body-parser. `do_p` returns a parser that tries to match the body-parser and then behaves just like the parser returned by `while_p`. A failure to match the body-parser will cause a failure to be reported by the while/do-parser.
+
+Example:
+
+``
+ uint_p[assign_a(sum)] >> while_p('+')[uint_p[add(sum)]]
+ '"' >> while_p(~eps_p('"'))[c_escape_ch_p[push_back_a(result)]] >> '"'
+``
+
+[h3 `for_p`]
+
+`for_p` requires four arguments. The syntax is:
+
+``
+ for_p(init, condition, step)[body-parser]
+``
+
+`init` and `step` have to be 0-ary functions/functors. `for_p` returns a parser that will:
+
+1. calls `init`
+2. checks the condition; if the condition isn't met, a match is returned. The match covers everything that has been matched successfully up to this point.
+3. tries to match the body-parser. A failure to match the body-parser causes a failure to be reported by the for-parser
+4. calls `step`
+5. goes to 2.
+
+[endsect][/ dynamic_p]
+
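Note on dynamic_parsers.qbk: the five steps listed for `for_p` can be sketched as an ordinary loop. The code below is not Spirit code; `run_for` and `match_n_digits` are hypothetical names, and a "parser" is reduced to a callable that returns the number of characters it matched, or -1 on failure.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for for_p(init, condition, step)[body].
// `body` returns the number of characters it matched, or -1 on failure.
template <typename Init, typename Cond, typename Step, typename Body>
std::ptrdiff_t run_for(Init init, Cond condition, Step step, Body body)
{
    std::ptrdiff_t total = 0;
    init();                              // 1. call init
    while (condition())                  // 2. check the condition
    {
        std::ptrdiff_t len = body();     // 3. try to match the body
        if (len < 0)
            return -1;                   //    a body failure fails the loop
        total += len;
        step();                          // 4. call step, 5. go back to 2
    }
    return total;                        // condition not met: report the match
}

// Example use: match exactly `n` decimal digits at the start of `input`.
std::ptrdiff_t match_n_digits(const std::string& input, int n)
{
    int i = 0;
    std::size_t pos = 0;
    return run_for(
        [&] { i = 0; },
        [&] { return i < n; },
        [&] { ++i; },
        [&]() -> std::ptrdiff_t {
            if (pos < input.size() &&
                std::isdigit(static_cast<unsigned char>(input[pos])))
            {
                ++pos;
                return 1;
            }
            return -1;
        });
}
```

Note that a condition that is false on the first check yields an empty but successful match, whereas a failing body makes the whole loop fail, as the section states.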

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/error_handling.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/error_handling.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,145 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Error Handling]
+
+C++'s exception handling mechanism is a perfect match for error handling in the framework. Imagine a complete parser as a maze. At each branch, the input dictates where we will turn. Given an erroneous input, we may reach a dead end. If we ever reach one, it would be a waste of time to backtrack from where we came from. Instead, we supply guards in strategic points. Beyond a certain point, we put parser assertions in places where one is not allowed to go.
+
+The assertions are like springs that catapult us back to the guard. If we ever reach a brick wall given a specific input pattern, everything unwinds quickly and we are thrown right back to the guard. This can be a very effective optimization when used wisely. Right back at the guard, we have a chance to correct the situation, if possible. The following illustration depicts the scenario.
+
+[$../theme/error_handling.png]
+
+[section Parser Errors]
+
+The `parser_error` class is the generic parser exception class used by Spirit. This is the base class for all parser exceptions.
+
+``
+ template <typename ErrorDescrT, typename IteratorT = char const*>
+ class parser_error
+ {
+ public:
+ parser_error(IteratorT where, ErrorDescrT descriptor);
+ IteratorT where;
+ ErrorDescrT descriptor;
+ };
+``
+
+The exception holds the iterator position where the error was encountered in its `where` member variable. In addition to the iterator, `parser_error` also holds information regarding the error (error descriptor) in its `descriptor` member variable.
+
+Semantic actions are free to throw parser exceptions when necessary. A utility function `throw_` may be called. This function creates and throws a `parser_error` given an iterator and an error descriptor:
+
+``
+ template <typename ErrorDescrT, typename IteratorT>
+ void throw_(IteratorT where, ErrorDescrT descriptor);
+``
+
+[endsect][/ parser_errors]
+
+[section Parser Assertions]
+
+Assertions may be put in places where we have no option other than to expect parsing to succeed. If parsing fails, a specific type of exception is thrown.
+
+Before declaring the grammar, we declare some assertion objects. `assertion` is a template class parameterized by the type of error that will be thrown once the assertion fails. The following assertions are parameterized by a user-defined `Errors` enumeration.
+
+[h4 Examples]
+
+``
+ enum Errors
+ {
+ program_expected,
+ begin_expected,
+ end_expected
+ };
+
+ assertion<Errors> expect_program(program_expected);
+ assertion<Errors> expect_begin(begin_expected);
+ assertion<Errors> expect_end(end_expected);
+``
+
+The example above uses an `enum` to hold the information regarding the error, but we are free to use other types such as integers and strings. For example, `assertion<string>` accepts a string as its info. It is advisable to use light-weight objects though; after all, error descriptors are usually static. Enums are convenient for error handlers to detect and easily catch since C++ treats enums as unique types.
+
+[tip
+The `assertive_parser`
+
+Actually, the expression `expect_end(str_p("end"))` creates an `assertive_parser` object. An `assertive_parser` is a parser that throws an exception in response to a parsing failure. The `assertive_parser` throws a `parser_error` exception rather than returning an unsuccessful match to signal that the parser failed to match the input. During parsing, parsers are given an iterator of type `IteratorT`. This is combined with the error descriptor type `ErrorDescrT` of the assertion (in this case `enum Errors`). Both are used to create a `parser_error<Errors, IteratorT>` which is then thrown to signal the exception.
+]
+
+The predeclared `expect_end` assertion object may now be used in the grammar as a wrapper around a parser. For example:
+
+``
+ expect_end(str_p("end"))
+``
+
+This will throw an exception if it fails to see `"end"` from the input.
+
+[endsect][/ parser_assertions]
+
+[section The Guard]
+
+The guard is used to catch a specific type of `parser_error`. Guards are typically predeclared just like assertions. Extending our previous example:
+
+``
+ guard<Errors> my_guard;
+``
+
+`Errors`, in this example, is the error descriptor type we want to detect. This is the same `enum` as above. `my_guard` may now be used in a grammar declaration:
+
+``
+ my_guard(p)[error_handler]
+``
+
+where `p` is an expression that evaluates to a parser. Somewhere inside `p`, a parser may throw a parser exception. `error_handler` is the error handler, which may be a function or functor compatible with the interface:
+
+``
+ error_status<T>
+ f(ScannerT const& scan, ErrorT error);
+``
+
+where `scan` points to the scanner state prior to parsing and `error` is the error that arose. The handler is allowed to move the scanner position as it sees fit, possibly in an attempt to perform error correction. The handler must then return an `error_status<T>` object.
+
+[tip
+The `fallback_parser`
+
+The expression `my_guard(expr)[error_handler]` creates a `fallback_parser` object. The `fallback_parser` handles `parser_error` exceptions of a specific type. Since `my_guard` is declared as `guard<Errors>`, the `fallback_parser` catches `Errors` specific parser errors: `parser_error<Errors, IteratorT>`. The class sets up a `try` block. When an exception is caught, the catch block then calls the `error_handler`.
+]
+[endsect][/ the_guard]
+
+[section:error_status `error_status<T>`]
+
+``
+ template <typename T = nil_t>
+ struct error_status
+ {
+ enum result_t { fail, retry, accept, rethrow };
+
+ error_status(
+ result_t result = fail,
+ int length = -1,
+ T const& value = T());
+
+ result_t result;
+ int length;
+ T value;
+ };
+``
+
+Where `T` is an attribute type compatible with the match attribute of the `fallback_parser`'s subject (defaults to `nil_t`). The class `error_status` reports the result of an error handler. This result can be one of:
+
+[table `error_status` result
+ [[fail] [quit and fail. Return a `no_match`.]]
+ [[retry] [attempt error recovery, possibly moving the scanner.]]
+ [[accept] [force success returning a matching length, moving the scanner appropriately and returning an attribute value.]]
+ [[rethrow] [rethrows the error]]
+]
+
+See [@../../example/fundamental/error_handling.cpp error_handling.cpp] for a compilable example. This is part of the Spirit distribution.
+
+[endsect][/ error_status]
+
+[endsect][/ error_handling]
+
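Note on error_handling.qbk: the interplay of assertion, exception and guard can be sketched with plain C++ exceptions. This is a simplified illustration, not the Spirit implementation: `parser_error_sketch`, `expect` and `parse_block` are hypothetical names, the iterator is replaced by a plain index, and the "handler" only records the descriptor and fails (roughly what `error_status<>::fail` stands for).

```cpp
#include <cstddef>
#include <string>

enum Errors { begin_expected, end_expected };

// Simplified mirror of the parser_error template; `where` is an index here
// rather than an iterator (an assumption of this sketch).
template <typename ErrorDescrT>
struct parser_error_sketch
{
    std::size_t where;
    ErrorDescrT descriptor;
};

// Plays the role of an assertion: match `literal` at `pos` or throw.
std::size_t expect(const std::string& input, std::size_t pos,
                   const std::string& literal, Errors descriptor)
{
    if (pos > input.size() ||
        input.compare(pos, literal.size(), literal) != 0)
        throw parser_error_sketch<Errors>{pos, descriptor};
    return pos + literal.size();
}

// Plays the role of a guard: run the guarded parse in a try block and hand
// any parser_error_sketch<Errors> to a handler. Returns the end position on
// success, or -1 when the handler decides to fail.
std::ptrdiff_t parse_block(const std::string& input, Errors* seen = nullptr)
{
    try
    {
        std::size_t pos = expect(input, 0, "begin", begin_expected);
        pos = expect(input, pos, "end", end_expected);
        return static_cast<std::ptrdiff_t>(pos);
    }
    catch (const parser_error_sketch<Errors>& e)
    {
        if (seen)
            *seen = e.descriptor;   // the "error handler" records the error
        return -1;                  // and reports failure
    }
}
```

The point of the sketch is that a failed expectation unwinds straight to the enclosing try block, with no backtracking through intermediate alternatives.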

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/escape_char_parser.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/escape_char_parser.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,38 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section:escape_char_parser Escape Character Parser]
+
+The Escape Character Parser is a utility parser, which parses escaped character sequences used in C/C++, LEX or Perl regular expressions. Combined with the `confix_p` utility parser, it is useful for parsing C/C++ strings containing double quotes and other escaped characters:
+
+``
+ confix_p('"', *c_escape_ch_p, '"')
+``
+
+There are two different types of the Escape Character Parser: `c_escape_ch_p`, which parses C/C++ escaped character sequences and `lex_escape_ch_p`, which parses LEX style escaped character sequences. The following table shows the valid character sequences understood by these utility parsers.
+
+[table Summary of valid escaped character sequences
+ [
+ [`c_escape_ch_p`]
+ [`\b, \t, \n, \f, \r, \\, \", \', \xHH, \OOO`
+where: `H` is some hexadecimal digit (`0..9, a..f, A..F`) and `O` is some octal digit (`0..7`)]
+ ]
+ [
+ [`lex_escape_ch_p`]
+ [all C/C++ escaped character sequences as described above and additionally any other character, which follows a backslash.]
+ ]
+]
+
+If there is a semantic action attached directly to the Escape Character Parser, all valid escaped characters are converted to their character equivalent (e.g. a backslash followed by an `'r'` is converted to `'\r'`), which is fed to the attached actor. The number of hexadecimal or octal digits parsed depends on the size of one input character. An overflow will be detected and will generate a non-match. `lex_escape_ch_p` will strip the leading backslash for all character sequences which are not listed as valid C/C++ escape sequences when passing the unescaped character to an attached action.
+
+[caution
+Please note though, that if there is a semantic action attached to an outermost parser (for instance as in `(*c_escape_ch_p)[some_actor]`, where the action is attached to the kleene star generated parser) no conversion takes place at the moment, but nevertheless the escaped characters are parsed correctly. This limitation will be removed in a future version of the library.
+]
+
+[endsect][/ escape_char_parser]
+
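Note on escape_char_parser.qbk: the conversion of escaped sequences to their character equivalent can be sketched as a small stand-alone decoder. This is an illustration only; `decode_c_escape` and `hex_value` are hypothetical names, and the sketch assumes 8-bit characters and greedy digit consumption, treating any value above 0xFF as the overflow case.

```cpp
#include <cstddef>
#include <string>

// Returns the value of a hexadecimal digit, or -1 if `ch` is not one.
int hex_value(char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

// Hypothetical helper, not Spirit code: decodes one C-style escape sequence
// starting at input[pos]. On success stores the character in `out` and
// returns the number of input characters consumed; returns -1 for sequences
// outside the table above or for values that overflow an 8-bit character.
std::ptrdiff_t decode_c_escape(const std::string& input, std::size_t pos,
                               char& out)
{
    if (pos + 1 >= input.size() || input[pos] != '\\')
        return -1;

    const char c = input[pos + 1];
    switch (c)
    {
        case 'b':  out = '\b'; return 2;
        case 't':  out = '\t'; return 2;
        case 'n':  out = '\n'; return 2;
        case 'f':  out = '\f'; return 2;
        case 'r':  out = '\r'; return 2;
        case '\\': out = '\\'; return 2;
        case '"':  out = '"';  return 2;
        case '\'': out = '\''; return 2;
        default:   break;
    }

    unsigned value = 0;
    std::size_t i = pos + 1;
    if (c == 'x')
    {
        ++i;                                   // skip the 'x'
        const std::size_t first_digit = i;
        while (i < input.size() && hex_value(input[i]) >= 0)
        {
            value = value * 16 + static_cast<unsigned>(hex_value(input[i]));
            if (value > 0xFF) return -1;       // overflow: no match
            ++i;
        }
        if (i == first_digit) return -1;       // "\x" without digits
    }
    else if (c >= '0' && c <= '7')
    {
        while (i < input.size() && input[i] >= '0' && input[i] <= '7')
        {
            value = value * 8 + static_cast<unsigned>(input[i] - '0');
            if (value > 0xFF) return -1;       // overflow: no match
            ++i;
        }
    }
    else
    {
        return -1;                             // not a C/C++ escape sequence
    }

    out = static_cast<char>(static_cast<unsigned char>(value));
    return static_cast<std::ptrdiff_t>(i - pos);
}
```

A LEX-style variant would differ only in the final `else` branch, where it would accept the character after the backslash as-is.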

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/file_iterator.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/file_iterator.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,65 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[/ this could be done purely as a source file with quickified comments]
+
+[section File Iterator]
+
+Since Spirit is a back-tracking parser, it requires at least a forward iterator. In particular, an input iterator is not sufficient. Many times it is convenient to read the input to a parser from a file, but the STL file iterators are input iterators. To get around this limitation, Spirit has a utility class `file_iterator`, which is a read-only random-access iterator for files.
+
+To use the Spirit file iterator, simply create a file iterator with the path to the file you wish to parse, and then create an EOF iterator for the file:
+
+``
+ #include <boost/spirit/iterator/file_iterator.hpp> // the header file
+
+ file_iterator<> first("input.dat");
+
+ if (!first)
+ {
+ std::cout << "Unable to open file!\n";
+
+ // Clean up, throw an exception, whatever
+ return -1;
+ }
+
+ file_iterator<> last = first.make_end();
+``
+
+You now have a pair of iterators to use with Spirit. If your parser is fully parametrized (no hard-coded `<char const *>`), it is a simple matter of redefining the iterator type to `file_iterator`:
+
+``
+ typedef char char_t;
+ typedef file_iterator <char_t> iterator_t;
+ typedef scanner<iterator_t> scanner_t;
+ typedef rule <scanner_t> rule_t;
+
+ rule_t my_rule;
+
+ // Define your rule
+
+ parse_info<iterator_t> info = parse(first, last, my_rule);
+``
+
+Of course, you don't have to deal with the scanner-business at all if you use grammars rather than rules as arguments to the parse functions. You simply pass the iterator pairs and the grammar as is:
+
+``
+ my_grammar g;
+ parse_info<iterator_t> info = parse(first, last, g);
+``
+
+[tip
+Generic iterator
+
+The Spirit file iterator can be parameterized with any type that is default constructible and assignable. It transparently supports large files (greater than 2GB) on systems that provide an appropriate interface. The file iterator can be useful outside of Spirit as well. For instance, the [link boost.tokenizer Boost.Tokenizer] package requires a bidirectional iterator, which is provided by `file_iterator`.
+]
+
+See [@../../example/fundamental/file_parser.cpp file_parser.cpp] for a compilable example. This is part of the Spirit distribution.
+
+
+[endsect][/ file_iterator]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/functor_parser.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/functor_parser.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,81 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Functor Parser]
+
+The simplest way to write a hand-coded parser that works well with the rest of the Spirit library is to write a functor parser.
+
+A functor parser is expected to have the interface:
+
+``
+ struct functor
+ {
+ typedef T result_t;
+
+ template <typename ScannerT>
+ std::ptrdiff_t
+ operator()(ScannerT const& scan, result_t& result) const;
+ };
+``
+
+where `typedef T result_t;` is the attribute type of the parser that will be passed back to the match result (see In-depth: The Parser). If the parser does not need to return an attribute, this can simply be `nil_t`. The `std::ptrdiff_t` result is the number of characters matched by your parser. A negative value flags an unsuccessful match.
+
+A conforming functor parser can be transformed into a well-formed Spirit parser by wrapping it in the `functor_parser` template:
+
+``
+ functor_parser<functor> functor_p;
+``
+
+[h3 Example]
+
+The following example puts the `functor_parser` into action:
+
+``
+ struct number_parser
+ {
+ typedef int result_t;
+ template <typename ScannerT>
+ std::ptrdiff_t
+ operator()(ScannerT const& scan, result_t& result) const
+ {
+ if (scan.at_end())
+ return -1;
+
+ char ch = *scan;
+ if (ch < '0' || ch > '9')
+ return -1;
+
+ result = 0;
+ std::ptrdiff_t len = 0;
+
+ do
+ {
+ result = result*10 + int(ch - '0');
+ ++len;
+ ++scan;
+ } while (!scan.at_end() && (ch = *scan, ch >= '0' && ch <= '9'));
+
+ return len;
+ }
+ };
+
+ functor_parser<number_parser> number_parser_p;
+``
+
+[tip
+The full source code can be [@../../example/fundamental/functor_parser.cpp viewed here]. This is part of the Spirit distribution.
+]
+
+To further understand the implementation, see In-depth: The Scanner for the scanner API details. We now have a parser `number_parser_p` that we can use just like any other Spirit parser. Example:
+
+``
+ r = number_parser_p >> *(',' >> number_parser_p);
+``
+
+[endsect][/ functor_parser]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/includes.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/includes.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,138 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Includes]
+
+[section Modules]
+
+Spirit is designed to be header only. Generally, there are no libraries to build and link against. Certain features, however, require additional libraries; in particular, the regular expression parser requires [link __Boost.Regex__] and multithreading support requires [link __Boost.Thread__].
+
+Using Spirit is as easy as including the main header file:
+
+``
+ #include <boost/spirit.hpp>
+``
+
+Doing so will include all the header files. This might not be desirable. A low cholesterol alternative is to include only the module that you need. Each of the modules has its own header file. The master spirit header file actually includes all the module files. To avoid unnecessary inclusion of features that you do not need, it is better to include only the modules that you need.
+
+``
+ #include <boost/spirit/actor.hpp>
+ #include <boost/spirit/attribute.hpp>
+ #include <boost/spirit/core.hpp>
+ #include <boost/spirit/debug.hpp>
+ #include <boost/spirit/dynamic.hpp>
+ #include <boost/spirit/error_handling.hpp>
+ #include <boost/spirit/iterator.hpp>
+ #include <boost/spirit/meta.hpp>
+ #include <boost/spirit/symbols.hpp>
+ #include <boost/spirit/tree.hpp>
+ #include <boost/spirit/utility.hpp>
+``
+
+[endsect][/ modules]
+
+[section Sub-Modules]
+
+For even finer control over header file inclusion, you can include only the specific files that you need. Each module is in its own sub-directory:
+
+[/TODO? variablelist]
+
+[h4 `actor`]
+
+``
+ #include <boost/spirit/actor/assign_actor.hpp>
+ #include <boost/spirit/actor/assign_key.hpp>
+ #include <boost/spirit/actor/clear_actor.hpp>
+ #include <boost/spirit/actor/decrement_actor.hpp>
+ #include <boost/spirit/actor/erase_actor.hpp>
+ #include <boost/spirit/actor/increment_actor.hpp>
+ #include <boost/spirit/actor/insert_key_actor.hpp>
+ #include <boost/spirit/actor/push_back_actor.hpp>
+ #include <boost/spirit/actor/push_front_actor.hpp>
+ #include <boost/spirit/actor/swap_actor.hpp>
+``
+
+[h4 `attribute`]
+
+``
+ #include <boost/spirit/attribute/closure.hpp>
+ #include <boost/spirit/attribute/closure_context.hpp>
+ #include <boost/spirit/attribute/parametric.hpp>
+``
+
+[h4 `debug`]
+
+The `debug` module should not be directly included. See [link __Debugging__] for more info on how to use Spirit's debugger.
+
+[h4 `dynamic`]
+
+``
+ #include <boost/spirit/dynamic/for.hpp>
+ #include <boost/spirit/dynamic/if.hpp>
+ #include <boost/spirit/dynamic/lazy.hpp>
+ #include <boost/spirit/dynamic/rule_alias.hpp>
+ #include <boost/spirit/dynamic/select.hpp>
+ #include <boost/spirit/dynamic/stored_rule.hpp>
+ #include <boost/spirit/dynamic/switch.hpp>
+ #include <boost/spirit/dynamic/while.hpp>
+``
+
+[h4 `error_handling`]
+
+``
+ #include <boost/spirit/error_handling/exceptions.hpp>
+``
+
+[h4 `iterator`]
+
+``
+ #include <boost/spirit/iterator/file_iterator.hpp>
+ #include <boost/spirit/iterator/fixed_size_queue.hpp>
+ #include <boost/spirit/iterator/multi_pass.hpp>
+ #include <boost/spirit/iterator/position_iterator.hpp>
+``
+
+[h4 `meta`]
+
+``
+ #include <boost/spirit/meta/as_parser.hpp>
+ #include <boost/spirit/meta/fundamental.hpp>
+ #include <boost/spirit/meta/parser_traits.hpp>
+ #include <boost/spirit/meta/refactoring.hpp>
+ #include <boost/spirit/meta/traverse.hpp>
+``
+
+[h4 `tree`]
+
+``
+ #include <boost/spirit/tree/ast.hpp>
+ #include <boost/spirit/tree/parse_tree.hpp>
+ #include <boost/spirit/tree/parse_tree_utils.hpp>
+ #include <boost/spirit/tree/tree_to_xml.hpp>
+``
+
+[h4 `utility`]
+
+``
+ #include <boost/spirit/utility/chset.hpp>
+ #include <boost/spirit/utility/chset_operators.hpp>
+ #include <boost/spirit/utility/confix.hpp>
+ #include <boost/spirit/utility/distinct.hpp>
+ #include <boost/spirit/utility/escape_char.hpp>
+ #include <boost/spirit/utility/flush_multi_pass.hpp>
+ #include <boost/spirit/utility/functor_parser.hpp>
+ #include <boost/spirit/utility/lists.hpp>
+ #include <boost/spirit/utility/loops.hpp>
+ #include <boost/spirit/utility/regex.hpp>
+ #include <boost/spirit/utility/scoped_lock.hpp>
+``
+
+[endsect][/ sub_modules]
+
+[endsect][/ includes]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/lazy_parser.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/lazy_parser.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,92 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+
+[/ The example code used here]
+[def __lazy_parser.cpp__ [@../../example/intermediate/lazy_parser.cpp lazy_parser.cpp]]
+
+[section:lazy_p The Lazy Parser]
+
+Closures are cool. They allow us to inject stack-based local variables anywhere in our parse descent hierarchy. Typically, we store temporary variables, generated by our semantic actions, in our closure variables, as a means to pass information up and down the recursive descent.
+
+Now imagine this... Having in mind that closure variables can be just about any type, we can store a parser, a rule, or a pointer to a parser or rule, in a closure variable. Yeah, right, so what?... Ok, hold on... What if we can use this closure variable to initiate a parse? Think about it for a second. Suddenly we'll have some powerful dynamic parsers! Suddenly we'll have a full round trip from [link __Phoenix__] to Spirit and back! Phoenix semantic actions choose the right Spirit parser and Spirit parsers choose the right Phoenix semantic action. Oh MAN, what a honky cool idea, I might say!!
+
+[h3 `lazy_p`]
+
+This is the idea behind the `lazy_p` parser. The `lazy_p` syntax is:
+
+``
+ lazy_p(actor)
+``
+
+where `actor` is a [link __Phoenix__] expression that returns a Spirit parser. This returned parser is used in the parsing process.
+
+Example:
+
+``
+ lazy_p(phoenix::val(int_p))[assign_a(result)]
+``
+
+Semantic actions attached to the `lazy_p` parser expect the same signature as that of the returned parser (`int_p`, in our example above).
+
+[h3 `lazy_p` example]
+
+To give you a better glimpse (see the __lazy_parser.cpp__), say you want to parse inputs such as:
+
+``
+ dec
+ {
+ 1 2 3
+ bin
+ {
+ 1 10 11
+ }
+ 4 5 6
+ }
+``
+
+where `bin {...}` and `dec {...}` specify the numeric format (binary or decimal) that we are expecting to read. If we analyze the input, we want a grammar like:
+
+``
+ base = "bin" | "dec";
+ block = base >> '{' >> *block_line >> '}';
+ block_line = number | block;
+``
+
+We intentionally left out the `number` rule. The tricky part is that the way the `number` rule behaves depends on the result of the `base` rule. If `base` got a `"bin"`, then `number` should parse binary numbers. If `base` got a `"dec"`, then `number` should parse decimal numbers. Typically we'll have to rewrite our grammar to accommodate the different parsing behavior:
+
+``
+ block =
+ "bin" >> '{' >> *bin_line >> '}'
+ | "dec" >> '{' >> *dec_line >> '}'
+ ;
+ bin_line = bin_p | block;
+ dec_line = int_p | block;
+``
+
+While this is fine, the redundancy makes us want to find a better solution; after all, we'd want to make full use of Spirit's dynamic parsing capabilities. Apart from that, there will be cases where the set of parsing behaviors for our `number` rule is not known when the grammar is written. We'll only be given a map of string descriptors and corresponding rules (e.g. `("dec", int_p)`, `("bin", bin_p)`, etc.).
+
+The basic idea is to have a rule for binary and decimal numbers. That's easy enough to do (see [link __numerics__]). When `base` is being parsed, in your semantic action, store a pointer to the selected `base` in a closure variable (e.g. `block.int_rule`). Here's an example:
+
+``
+ base
+ = str_p("bin")[block.int_rule = &var(bin_rule)]
+ | str_p("dec")[block.int_rule = &var(dec_rule)]
+ ;
+``
+
+With this setup, your number rule will now look something like:
+
+``
+ number = lazy_p(*block.int_rule);
+``
+
+The __lazy_parser.cpp__ does it a bit differently, ingeniously using the symbol table to dispatch the correct rule, but in essence, both strategies are similar. This technique, using the symbol table, is detailed in the Techniques section: [link __nabialek_trick__]. Admittedly, when you add up all the rules, the resulting grammar is more complex than the hard-coded grammar above. Yet, for more complex grammar patterns with a lot more rules to choose from, the additional setup is well worth it.
+
+[endsect][/ lazy_p]
+
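Note on lazy_parser.qbk: the core idea, storing "which parser to use" in a variable and only resolving it when `number` is actually parsed, can be sketched without Spirit or Phoenix by using a function pointer. All names below (`number_parser`, `select_parser`, `parse_number` and so on) are hypothetical and chosen for this illustration; the closure variable `block.int_rule` is modelled by the `current` argument.

```cpp
#include <string>

// A "number parser" here is any function that returns true and stores the
// value if `token` is a number in its base.
typedef bool (*number_parser)(const std::string& token, int& value);

// Shared implementation for bases up to 10.
bool parse_in_base(const std::string& token, int base, int& value)
{
    if (token.empty())
        return false;
    int result = 0;
    for (char ch : token)
    {
        const int digit = ch - '0';
        if (digit < 0 || digit >= base)
            return false;
        result = result * base + digit;
    }
    value = result;
    return true;
}

bool parse_bin(const std::string& token, int& value)
{
    return parse_in_base(token, 2, value);
}

bool parse_dec(const std::string& token, int& value)
{
    return parse_in_base(token, 10, value);
}

// Plays the role of the semantic actions on `base`: pick the parser that
// later `number` matches should use.
number_parser select_parser(const std::string& base)
{
    if (base == "bin") return &parse_bin;
    if (base == "dec") return &parse_dec;
    return nullptr;
}

// Plays the role of `number = lazy_p(*block.int_rule)`: which parser runs is
// decided by the pointer's value at parse time, not when the grammar is built.
bool parse_number(number_parser current, const std::string& token, int& value)
{
    return current != nullptr && current(token, value);
}
```

The same token `"11"` therefore yields 3 inside a `bin` block and 11 inside a `dec` block, without duplicating the block rule per base.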

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/list_parsers.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/list_parsers.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,136 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section List Parsers]
+
+List Parsers are generated by the special predefined parser generator object `list_p`, which generates parsers recognizing list structures of the type
+
+``
+ item >> *(delimiter >> item) >> !end
+``
+
+where `item` is an expression, `delimiter` is a delimiter and `end` is an optional closing expression. As you can see, the `list_p` generated parser does not recognize empty lists, i.e. the parser must find at least one item in the input stream to return a successful match. If you wish to also match an empty list, you can make your `list_p` optional with `operator!`. An example where this utility parser is helpful is parsing comma separated C/C++ strings, which can be easily formulated as:
+
+``
+ rule<> list_of_c_strings_rule
+ = list_p(confix_p('\"', *c_escape_ch_p, '\"'), ',')
+ ;
+``
+
+The `confix_p` and `c_escape_char_p` parser generators are described in their own sections.
+
+The `list_p` parser generator object can be used to generate the following different types of List Parsers:
+
+[table List Parsers
+ [
+ [`list_p`]
+ [`list_p` used by itself parses comma separated lists without special item formatting, i.e. everything in between two commas is matched as an item, no end of list token is matched.]
+ ]
+ [
+ [`list_p(delimiter)`]
+ [generates a list parser, which recognizes lists with the given delimiter and matches everything in between them as an item, no end of list token is matched.]
+ ]
+ [
+ [`list_p(item, delimiter)`]
+ [generates a list parser, which recognizes lists with the given delimiter and matches items based on the given item parser, no end of list token is matched.]
+ ]
+ [
+ [`list_p(item, delimiter, end)`]
+ [generates a list parser, which recognizes lists with the given delimiter and matches items based on the given item parser and additionally recognizes an optional end expression.]
+ ]
+]
+
+All of the parameters to `list_p` can be single characters, strings or, if more complex parsing logic is required, auxiliary parsers, each of which is automatically converted to the corresponding parser type needed for successful parsing.
+
+If the item parser is an `action_parser_category` type (a parser with an attached semantic action), we have to do something special. This happens if the user writes something like:
+
+``
+ list_p(item[func], delim)
+``
+
+where `item` is the parser matching one item of the list sequence and `func` is a functor to be called after matching one item. If we did nothing, the resulting code would parse the sequence as follows:
+
+``
+ (item[func] - delim) >> *(delim >> (item[func] - delim))
+``
+
+which in most cases is not what the user expects. (If this is what you expect, please use one of the `list_p` generator functions `direct()`, which will inhibit refactoring of the item parser.) To make the list parser behave as expected:
+
+``
+ (item - delim)[func] >> *(delim >> (item - delim)[func])
+``
+
+the actor attached to the item parser has to be re-attached to the `(item - delim)` parser construct, which will make the resulting list parser 'do the right thing'. This refactoring is done with the help of the [link __Refactoring Parsers__]. Additionally, special care must be taken if the item parser is a `unary_parser_category` type parser, as for instance:
+
+``
+ list_p(*anychar_p, ',')
+``
+
+which without any refactoring would result in
+
+``
+ (*anychar_p - ch_p(','))
+ >> *( ch_p(',') >> (*anychar_p - ch_p(',')) )
+``
+
+and will not give the expected result (the first `*anychar_p` will eat up all the input up to the end of the input stream). So we have to refactor this into:
+
+``
+ *(anychar_p - ch_p(','))
+ >> *( ch_p(',') >> *(anychar_p - ch_p(',')) )
+``
+
+which will give the correct result.
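+
+To make the difference concrete, here is a rough hand-written equivalent of the refactored form. It is plain standard C++, not Spirit code, and the function name `split_list` is only an illustrative assumption. Each item is read as "any run of characters that is not the delimiter", which is what `*(anychar_p - ch_p(','))` expresses:
+

```cpp
#include <string>
#include <vector>

// Plain C++ sketch, not Spirit code. It mirrors the refactored form
//   *(anychar_p - ',') >> *( ',' >> *(anychar_p - ',') )
// by reading each item as a run of non-delimiter characters.
std::vector<std::string> split_list(const std::string& input, char delim)
{
    std::vector<std::string> items;
    std::string::size_type pos = 0;

    // First item: *(anychar - delim)
    std::string item;
    while (pos < input.size() && input[pos] != delim)
        item += input[pos++];
    items.push_back(item);

    // Remaining items: *(delim >> *(anychar - delim))
    while (pos < input.size() && input[pos] == delim) {
        ++pos; // consume the delimiter
        item.clear();
        while (pos < input.size() && input[pos] != delim)
            item += input[pos++];
        items.push_back(item);
    }
    return items;
}
```

+
+Because the delimiter is excluded per character rather than per item, the first item can no longer swallow the rest of the input.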
+
+The case, where the item parser is a combination of the two mentioned problems (i.e. the item parser is a unary parser with an attached action), is handled accordingly too:
+
+``
+ list_p((*anychar_p)[func], ',')
+``
+
+will be parsed as expected:
+
+``
+ (*(anychar_p - ch_p(',')))[func]
+ >> *( ch_p(',') >> (*(anychar_p - ch_p(',')))[func] )
+``
+
+The required refactoring is implemented with the help of the Refactoring Parsers.
+
+[table Summary of List Parser refactorings
+ [[You write it as:] [It is refactored to:]]
+ [
+[`list_p(item, delimiter)`]
+[`(item - delimiter)
+>> *(delimiter >> (item - delimiter))`]
+ ]
+ [
+[`list_p(item[func], delimiter)`]
+[`(item - delimiter)[func]
+>> *(delimiter >> (item - delimiter)[func])`]
+ ]
+ [
+[`list_p(*item, delimiter)`]
+[`*(item - delimiter)
+>> *(delimiter >> *(item - delimiter))`]
+ ]
+ [
+[`list_p((*item)[func], delimiter)`]
+[`(*(item - delimiter))[func]
+>> *(delimiter >> (*(item - delimiter))[func])`]
+ ]
+]
+
+The [@../../example/fundamental/list_parser.cpp list_parser.cpp] sample shows the usage of the `list_p` utility parser:
+
+1. parsing a simple `','` delimited list without item formatting.
+2. parsing a CSV list (comma separated values - strings, integers or reals)
+3. parsing a token list (token separated values - strings, integers or reals) with an action parser directly attached to the item part of the `list_p` generated parser.
+
+This is part of the Spirit distribution.
+
+[endsect][/ list_parsers]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/loops.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/loops.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,79 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Loops]
+
+So far we have introduced a couple of EBNF operators that deal with looping. We have the `+` positive operator, which matches the preceding symbol one (1) or more times, as well as the Kleene star `*` which matches the preceding symbol zero (0) or more times.
+
+Taking this further, we may want to have a generalized loop operator. To some this may seem to be a case of overkill. Yet there are grammars that are impractical and cumbersome, if not impossible, for the basic EBNF iteration syntax to specify. Examples:
+
+* A file name may have a maximum of 255 characters only.
+* A specific bitmap file format has exactly 4096 RGB color values.
+* A 32 bit binary string (1..32 1s or 0s).
+
+Other than the Kleene star `*`, the positive closure `+`, and the optional `!`, a more flexible mechanism for looping is provided by the framework.
+
+[table Loop Constructs
+ [[`repeat_p (n) [p]` ] [Repeat `p` exactly `n` times.] ]
+ [[`repeat_p (n1, n2) [p]` ] [Repeat `p` at least `n1` times and at most `n2` times.] ]
+ [[`repeat_p (n, more) [p]`] [Repeat `p` at least `n` times, continuing until `p` fails or the input is consumed.]]
+]
+
+Using the `repeat_p` parser, we can now write our examples above:
+
+A file name with a maximum of 255 characters:
+
+``
+ valid_fname_chars = /*..*/;
+ filename = repeat_p(1, 255)[valid_fname_chars];
+``
+
+A specific bitmap file format which has exactly 4096 RGB color values:
+
+``
+ uint_parser<unsigned, 16, 6, 6> rgb_p;
+ bitmap = repeat_p(4096)[rgb_p];
+``
+
+As for the 32 bit binary string (1..32 1s or 0s), of course we could have easily used the `bin_p` numeric parser instead. For the sake of demonstration however:
+
+``
+ bin32 = lexeme_d[repeat_p(1, 32)[ch_p('1') | '0']];
+``
+
+[note Loop parsers are run-time parametric.]
+
+The Loop parsers can be dynamic. Consider the parsing of a binary file of Pascal-style length prefixed string, where the first byte determines the length of the incoming string. Here's a sample input:
+
+[table
+ [[`11`][`h`][`e`][`l`][`l`][`o`][`-`][`w`][`o`][`r`][`l`][`d`]]
+]
+
+This trivial example cannot be practically defined in traditional EBNF. Although some EBNF syntaxes allow more powerful repetition constructs than the Kleene star, we are still limited to parsing fixed strings. The nature of EBNF forces the repetition factor to be a constant. On the other hand, Spirit allows the repetition factor to be variable at run time. We could write a grammar that accepts the input string above:
+
+``
+ int c;
+ r = anychar_p[assign_a(c)] >> repeat_p(boost::ref(c))[anychar_p];
+``
+
+The expression
+
+``
+ anychar_p[assign_a(c)]
+``
+
+extracts the first character from the input and puts it in `c`. What is interesting is that in addition to constants, we can also use variables as parameters to `repeat_p`, as demonstrated by
+
+``
+ repeat_p(boost::ref(c))[anychar_p]
+``
+
+Notice that [link boost.ref `boost::ref`] is used to reference the integer `c`. This usage of `repeat_p` makes the parser defer the evaluation of the repetition factor until it is actually needed. Continuing our example, since the value 11 is already extracted from the input, `repeat_p` is now expected to loop exactly 11 times.
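+
+The same idea can be sketched without Spirit. The following is plain standard C++ under the assumption that the whole input is already in a `std::string`; the function name `read_pascal_string` is hypothetical. The point it illustrates is that the loop count is only known after the first byte has been read:
+

```cpp
#include <cstddef>
#include <string>

// Plain C++ sketch, not Spirit code. The first byte holds the length,
// so the repetition count is only known once that byte has been read.
// Returns true and fills `out` if the whole payload is present.
bool read_pascal_string(const std::string& input, std::string& out)
{
    if (input.empty())
        return false;

    // Counterpart of anychar_p[assign_a(c)]
    std::size_t c = static_cast<unsigned char>(input[0]);

    // Counterpart of repeat_p(boost::ref(c))[anychar_p]
    if (input.size() - 1 < c)
        return false; // fewer than c characters remain
    out.assign(input, 1, c);
    return true;
}
```

+
+With the sample input above, the first byte yields 11 and the next 11 characters form the string.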
+
+[endsect][/ loops]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/multi_pass.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/multi_pass.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,365 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section:multi_pass The `multi_pass`]
+
+Backtracking in Spirit requires the use of the following types of iterator: forward, bidirectional, or random access. Because of backtracking, input iterators cannot be used. Therefore, the standard library classes `istreambuf_iterator` and `istream_iterator`, which fall under the category of input iterators, cannot be used. Another input iterator that is of interest is one that wraps a lexer, such as LEX.
+
+[note Input Iterators
+
+In general, Spirit is a backtracking parser. This is not an absolute requirement though. In the future, we shall see more deterministic parsers that require no more than 1 character (token) of lookahead. Such parsers allow us to use input iterators such as the istream_iterator as is.
+]
+
+Unfortunately, with an input iterator, there is no way to save an iterator position, and thus input iterators will not work with backtracking in Spirit. One solution to this problem is to simply load all the data to be parsed into a container, such as a vector or deque, and then pass the begin and end of the container to Spirit. This method can be too memory intensive for certain applications, which is why the `multi_pass` iterator was created.
+
+The `multi_pass` iterator will convert any input iterator into a forward iterator suitable for use with Spirit. `multi_pass` will buffer data when needed and will discard the buffer when only one copy of the iterator exists.
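+
+The buffering idea can be pictured with a small stand-alone class. This is only a sketch in plain standard C++ and not how `multi_pass` is implemented; the class name `replay_source` and its members are assumptions made for illustration. Characters pulled from a single-pass source are kept in a deque so that a saved position can be revisited:
+

```cpp
#include <cstddef>
#include <deque>
#include <istream>
#include <iterator>
#include <sstream>

// Plain C++ sketch of the buffering idea only; not the multi_pass
// implementation. Characters read from a single-pass stream are kept
// in a deque so that an earlier position can be read again.
class replay_source
{
public:
    explicit replay_source(std::istream& in)
        : cur_(in), end_(), pos_(0) {}

    // True when neither the buffer nor the stream has more data.
    bool at_end() const { return pos_ == buf_.size() && cur_ == end_; }

    // Return the next character, pulling from the stream only if the
    // buffer does not already hold it. Precondition: !at_end().
    char next()
    {
        if (pos_ == buf_.size()) {
            buf_.push_back(*cur_);
            ++cur_;
        }
        return buf_[pos_++];
    }

    std::size_t save() const { return pos_; }  // remember a position
    void restore(std::size_t p) { pos_ = p; }  // backtrack to it

    // Drop everything before the current position. Positions saved
    // earlier must not be restored after this call.
    void flush()
    {
        buf_.erase(buf_.begin(),
                   buf_.begin() + static_cast<std::ptrdiff_t>(pos_));
        pos_ = 0;
    }

    std::size_t buffered() const { return buf_.size(); }

private:
    std::istreambuf_iterator<char> cur_, end_;
    std::deque<char> buf_;
    std::size_t pos_;
};
```

+
+In this sketch, `save()` and `restore()` play the role of copying an iterator and going back to the copy, and `flush()` corresponds loosely to discarding the buffer once no older position is needed.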
+
+A grammar must be designed with care if the `multi_pass` iterator is used. Any rule that may need to backtrack, such as one that contains an alternative, will cause data to be buffered. The rules that are optimal to use are sequence and repetition. Sequences of the form `a >> b` will not buffer data at all. Any rule that repeats, such as `kleene_star` (`*a`) or positive (`+a`), will only buffer the data for the current repetition.
+
+In typical grammars, ambiguity and therefore lookahead is often localized. In fact, many well designed languages are fully deterministic and require no lookahead at all. Peeking at the first character from the input will immediately determine the alternative branch to take. Yet, even with highly ambiguous grammars, alternatives are often of the form `*(a | b | c | d)`. The input iterator moves on and is never stuck at the beginning. Let's look at a Pascal snippet for example:
+
+``
+ program =
+ programHeading >> block >> '.'
+ ;
+
+ block =
+ *( labelDeclarationPart
+ | constantDefinitionPart
+ | typeDefinitionPart
+ | variableDeclarationPart
+ | procedureAndFunctionDeclarationPart
+ )
+ >> statementPart
+ ;
+``
+
+Notice the alternatives inside the Kleene star in the rule `block`. The rule gobbles the input in a linear manner and throws away the past history with each iteration. As this is a fully deterministic LL(1) grammar, each failed alternative only has to peek 1 character (token). The alternative that consumes more than 1 character (token) is definitely a winner, after which the Kleene star moves on to the next.
+
+Be mindful if you use the free parse functions. All of these make a copy of the iterator passed to them.
+
+Now, after the lecture on the features to be careful with when using `multi_pass`, you may think that `multi_pass` is way too restrictive to use. That's not the case. If your grammar is deterministic, you can make use of `flush_multi_pass` in your grammar to ensure that data is not buffered when unnecessary.
+
+Again, following up on the example we started in the section on the scanner, here's an example using the `multi_pass`. This time around we are extracting our input from the input stream using an `istreambuf_iterator`.
+
+``
+ #include <fstream> // std::ifstream
+ #include <iterator> // std::istreambuf_iterator
+ #include <boost/spirit/core.hpp>
+ #include <boost/spirit/iterator/multi_pass.hpp>
+
+ using namespace boost::spirit;
+ using namespace std;
+
+ ifstream in("input_file.txt"); // we get our input from this file
+
+ typedef char char_t;
+ typedef multi_pass<istreambuf_iterator<char_t> > iterator_t;
+
+ typedef skip_parser_iteration_policy<space_parser> iter_policy_t;
+ typedef scanner_policies<iter_policy_t> scanner_policies_t;
+ typedef scanner<iterator_t, scanner_policies_t> scanner_t;
+
+ typedef rule<scanner_t> rule_t;
+
+ iter_policy_t iter_policy(space_p);
+ scanner_policies_t policies(iter_policy);
+ iterator_t first(
+ make_multi_pass(std::istreambuf_iterator<char_t>(in)));
+
+ scanner_t scan(
+ first, make_multi_pass(std::istreambuf_iterator<char_t>()),
+ policies);
+
+ rule_t n_list = real_p >> *(',' >> real_p);
+ match<> m = n_list.parse(scan);
+``
+
+[h3 `flush_multi_pass`]
+
+There is a predefined pseudo-parser called `flush_multi_pass`. When this parser is used with `multi_pass`, it will call `multi_pass::clear_queue()`. This will cause any buffered data to be erased. This will also invalidate all other copies of `multi_pass`, and they should not be used. If they are, a `boost::illegal_backtracking` exception will be thrown.
+
+[h3 `multi_pass` Policies]
+
+`multi_pass` is a templated policy driven class. The description of `multi_pass` above is how it was originally implemented (before it used policies), and is the default configuration now. But, `multi_pass` is capable of much more. Because of the open-ended nature of policies, you can write your own policy to make `multi_pass` behave in a way that we never before imagined.
+
+The `multi_pass` class has five template parameters:
+
+ * `InputT` - The type `multi_pass` uses to acquire its input. This is typically an input iterator, or functor.
+ * `InputPolicy` - A class that defines how `multi_pass` acquires its input. The `InputPolicy` is parameterized by `InputT`.
+ * `OwnershipPolicy` - This policy determines how `multi_pass` deals with its shared components.
+ * `CheckingPolicy` - This policy determines how checking for invalid iterators is done.
+ * `StoragePolicy` - The buffering scheme used by `multi_pass` is determined and managed by the `StoragePolicy`.
+
+[section Predefined policies]
+
+All predefined `multi_pass` policies are in the namespace `boost::spirit::multi_pass_policies`.
+
+[h3 Predefined `InputPolicy` classes]
+
+[variablelist
+ [
+ [`input_iterator`]
+ [This policy directs `multi_pass` to read from an input iterator of type `InputT`.]
+ ]
+ [
+ [`lex_input`]
+ [This policy obtains its input by calling `yylex()`, which would typically be provided by a scanner generated by LEX. If you use this policy your code must link against a LEX generated scanner.]
+ ]
+ [
+ [`functor_input`]
+ [This input policy obtains its data by calling a functor of type `InputT`. The functor must meet certain requirements. It must have a typedef called `result_type` which should be the type returned from `operator()`. Also, since an input policy needs a way to determine when the end of input has been reached, the functor must contain a static variable named `eof` which is comparable to a variable of `result_type`.]
+ ]
+]
+
+[h3 Predefined `OwnershipPolicy` classes]
+
+[variablelist
+ [
+ [`ref_counted`]
+ [This class uses a reference counting scheme. `multi_pass` will delete its shared components when the count reaches zero.]
+ ]
+ [
+ [`first_owner`]
+ [When this policy is used, the first `multi_pass` created will be the one that deletes the shared data. Copies will not take ownership of the shared data. This works well for Spirit, since no dynamic allocation of iterators is done. All copies are made on the stack, so the original iterator has the longest lifespan.]
+ ]
+]
+
+[h3 Predefined `CheckingPolicy` classes]
+
+[variablelist
+ [
+ [`no_check`]
+ [This policy does no checking at all.]
+ ]
+ [
+ [`buf_id_check`]
+ [`buf_id_check` keeps around a buffer id, or a buffer age. Every time `clear_queue()` is called on a `multi_pass` iterator, it is possible that all other iterators become invalid. When `clear_queue()` is called, `buf_id_check` increments the buffer id. When an iterator is dereferenced, this policy checks that the buffer id of the iterator matches the shared buffer id. This policy is most effective when used together with the `std_deque` `StoragePolicy`. It should not be used with the `fixed_size_queue` StoragePolicy, because it will not detect iterator dereferences that are out of range.]
+ ]
+ [
+ [`full_check`]
+ [This policy has not been implemented yet. When it is, it will keep track of all iterators and make sure that they are all valid.]
+ ]
+]
+
+[h3 Predefined `StoragePolicy` classes]
+
+[variablelist
+ [
+ [`std_deque`]
+ [This policy keeps all buffered data in a `std::deque`. All data is stored as long as there is more than one iterator. Once the iterator count goes down to one, and the queue is no longer needed, it is cleared, freeing up memory. The queue can also be forcibly cleared by calling `multi_pass::clear_queue()`.]
+ ]
+ [
+ [`fixed_size_queue<N>`]
+ [`fixed_size_queue` keeps a circular buffer that is size `N+1` and stores `N` elements. `fixed_size_queue` is a template with a `std::size_t` parameter that specifies the queue size. It is your responsibility to ensure that `N` is big enough for your parser. Whenever the foremost iterator is incremented, the last character of the buffer is automatically erased. Currently there is no way to tell if an iterator is trailing too far behind and has become invalid. No dynamic allocation is done by this policy during normal iterator operation, only on initial construction. The memory usage of this `StoragePolicy` is set at `N+1` bytes, unlike `std_deque`, which is unbounded.]
+ ]
+]
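+
+The "size `N+1`, stores `N` elements" arrangement is the usual ring buffer layout, where one slot is kept empty so that a full buffer can be told apart from an empty one. The following is a minimal sketch in plain standard C++; it is not the library's `fixed_size_queue`, and the name `ring_queue` and its members are illustrative assumptions:
+

```cpp
#include <cstddef>

// Plain C++ sketch of a ring buffer with N+1 slots that holds at most
// N elements. One slot always stays unused so that "full" and "empty"
// are distinguishable. Not the library's fixed_size_queue.
template <typename T, std::size_t N>
class ring_queue
{
public:
    ring_queue() : head_(0), tail_(0) {}

    bool empty() const { return head_ == tail_; }
    bool full() const { return (tail_ + 1) % (N + 1) == head_; }
    std::size_t size() const { return (tail_ + N + 1 - head_) % (N + 1); }

    // Append a value; when the queue is full, the oldest element is
    // dropped first.
    void push_back(const T& value)
    {
        if (full())
            head_ = (head_ + 1) % (N + 1);
        slots_[tail_] = value;
        tail_ = (tail_ + 1) % (N + 1);
    }

    // Oldest element still stored. Precondition: !empty().
    const T& front() const { return slots_[head_]; }

private:
    T slots_[N + 1];
    std::size_t head_; // index of the oldest element
    std::size_t tail_; // index one past the newest element
};
```

+
+Dropping the oldest element silently is what makes a trailing position unsafe in this layout: nothing records that the data it refers to has been overwritten.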
+
+[endsect][/ predefined_policies]
+
+[section:combinations Combinations: How to specify your own custom `multi_pass`]
+
+The beauty of policy based designs is that you can mix and match policies to create your own custom class by selecting the policies you want. Here's an example of how to specify a custom `multi_pass` that wraps an `istream_iterator<char>`, and is slightly more efficient than the default because it uses the `first_owner OwnershipPolicy` and the `no_check CheckingPolicy`:
+
+``
+ typedef multi_pass<
+ istream_iterator<char>,
+ multi_pass_policies::input_iterator,
+ multi_pass_policies::first_owner,
+ multi_pass_policies::no_check,
+ multi_pass_policies::std_deque
+ > first_owner_multi_pass_t;
+``
+
+The default template parameters for `multi_pass` are: `input_iterator InputPolicy, ref_counted OwnershipPolicy, buf_id_check CheckingPolicy` and `std_deque StoragePolicy`. So if you use `multi_pass<istream_iterator<char> >` you will get those pre-defined behaviors while wrapping an `istream_iterator<char>`.
+
+There is one other pre-defined class called `look_ahead`. `look_ahead` has two template parameters: `InputT`, the type of the input iterator to wrap, and a `std::size_t N`, which specifies the size of the buffer to the `fixed_size_queue` policy. While the default `multi_pass` configuration is designed for safety, `look_ahead` is designed for speed. `look_ahead` is derived from a `multi_pass` with the following policies: `input_iterator InputPolicy, first_owner OwnershipPolicy, no_check CheckingPolicy`, and `fixed_size_queue<N> StoragePolicy`.
+
+[h3 How to write a functor for use with the `functor_input InputPolicy`]
+
+If you want to use the `functor_input InputPolicy`, you can write your own functor that will supply the input to `multi_pass`. The functor must satisfy two requirements. It must have a `typedef result_type` which specifies the return type of `operator()`. This is standard practice in the STL. Also, it must supply a static variable called `eof` which is compared against to know whether the input has reached the end. Here is an example:
+
+``
+ class my_functor
+ {
+ public:
+
+ typedef char result_type;
+
+ my_functor()
+ : c('A') {}
+
+ result_type operator()() // not const: it advances c
+ {
+ if (c == 'M')
+ return eof;
+ else
+ return c++;
+ }
+
+ static result_type eof;
+
+ private:
+
+ char c;
+ };
+
+ my_functor::result_type my_functor::eof = '\0';
+
+ typedef multi_pass<
+ my_functor,
+ multi_pass_policies::functor_input,
+ multi_pass_policies::first_owner,
+ multi_pass_policies::no_check,
+ multi_pass_policies::std_deque
+ > functor_multi_pass_t;
+
+ functor_multi_pass_t first = functor_multi_pass_t(my_functor());
+ functor_multi_pass_t last;
+``
+
+[h3 How to write policies for use with `multi_pass`]
+
+[h4 `InputPolicy`]
+
+An `InputPolicy` must have the following interface:
+
+``
+ class my_input_policy // your policy name
+ {
+ public:
+
+ // class inner will be instantiated with the type given
+ // as the InputT parameter to multi_pass.
+
+ template <typename InputT>
+ class inner
+ {
+ public:
+
+ // these typedefs determine the iterator_traits for multi_pass
+ typedef x value_type;
+ typedef x difference_type;
+ typedef x pointer;
+ typedef x reference;
+
+ protected:
+
+ inner();
+ inner(InputT x);
+ inner(inner const& x);
+ // delete or clean up any state
+ void destroy();
+ // return true if *this and x have the same input
+ bool same_input(inner const& x) const;
+ void swap(inner& x);
+
+ public:
+
+ // get an instance from the input
+ result_type get_input() const;
+ // advance the input
+ void advance_input();
+ // return true if the input is at the end
+ bool input_at_eof() const;
+ };
+ };
+``
+
+Because of the way that `multi_pass` shares a buffer and input among multiple copies, class `inner` should keep a pointer to its input. The copy constructor should simply copy the pointer. `destroy()` should delete it. `same_input` should compare the pointers. For more details see the various implementations of `InputPolicy` classes.
+
+[h4 `OwnershipPolicy`]
+
+The `OwnershipPolicy` must have the following interface:
+
+``
+ class my_ownership_policy
+ {
+ protected:
+
+ my_ownership_policy();
+ my_ownership_policy(my_ownership_policy const& x);
+ // clone is called when a copy of the iterator is made
+ void clone();
+ // called when a copy is deleted. Return true to indicate
+ // resources should be released
+ bool release();
+ void swap(my_ownership_policy& x);
+
+ public:
+ // returns true if there is only one iterator in existence.
+ // std_deque StoragePolicy will free its buffered data if this
+ // returns true.
+ bool unique() const;
+ };
+``
+
+[h4 `CheckingPolicy`]
+
+The `CheckingPolicy` must have the following interface:
+
+``
+ class my_check
+ {
+ protected:
+
+ my_check();
+ my_check(my_check const& x);
+ void destroy();
+ void swap(my_check& x);
+ // check should make sure that this iterator is valid
+ void check() const;
+ void clear_queue();
+ };
+``
+
+[h4 `StoragePolicy`]
+
+A `StoragePolicy` must have the following interface:
+
+``
+ class my_storage_policy
+ {
+ public:
+
+ // class inner will be instantiated with the value_type from the InputPolicy
+
+ template <typename ValueT>
+ class inner
+ {
+ protected:
+
+ inner();
+ inner(inner const& x);
+ // will be called from the destructor of the last iterator.
+ void destroy();
+ void swap(inner& x);
+ // This is called when the iterator is dereferenced. It's a template
+ // method so we can recover the type of the multi_pass iterator
+ // and access it.
+ template <typename MultiPassT>
+ static ValueT dereference(MultiPassT const& mp);
+ // This is called when the iterator is incremented. It's a template
+ // method so we can recover the type of the multi_pass iterator
+ // and access it.
+ template <typename MultiPassT>
+ static void increment(MultiPassT& mp);
+ void clear_queue();
+ // called to determine whether the iterator is an eof iterator
+ template <typename MultiPassT>
+ static bool is_eof(MultiPassT const& mp);
+ // called by operator==
+ bool equal_to(inner const& x) const;
+ // called by operator<
+ bool less_than(inner const& x) const;
+ }; // class inner
+ };
+``
+
+A `StoragePolicy` is the trickiest policy to write. You should study and understand the existing `StoragePolicy` classes before you try to write your own.
+
+[endsect][/ combinations]
+
+[endsect][/ multi_pass]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/organisation.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/organisation.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,52 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Organisation]
+
+The framework is highly modular and is organized in layers:
+
+[pre
+ iterator actor
+
+ debug
+
+attribute dynamic error_handling symbols tree utility
+
+ meta
+
+ core
+scanner primitives composite non_terminal
+]
+
+Spirit has four layers, plus an independent top layer. The independent layer, comprising actor and iterator, does not rely on the other layers. The framework's architecture is completely orthogonal. The relationship among the layers is acyclic. Lower layers neither depend on nor know of the existence of upper layers. Modules in a layer do not depend on other modules in the same layer.
+
+The client may use only the modules that she wants without incurring any compile time or run time penalty. A minimalistic approach is to use only the core as is. The highly streamlined core is usable by itself and is well suited for tasks such as micro parsing.
+
+The [*iterator] module is independent of Spirit and may be used in other non-Spirit applications. This module is a compilation of stand-alone iterators and iterator wrappers compatible with Spirit. Over time, these iterators have been found to be most useful for parsing with Spirit.
+
+The [*actor] module, also independent of Spirit, is a compilation of predefined semantic actions that covers the most common semantics processing tasks.
+
+The [*debug] module provides library wide parser debugging. This module hooks itself up transparently into the core non-intrusively and only when necessary.
+
+The [*attribute] module introduces advanced semantic action machinery with emphasis on extraction and passing of data up and down the parser hierarchy through inherited and synthesized attributes. Attributes may also be used to actually control the parsing. Parametric parsers are a form of dynamic parsers that change their behavior at run time based on some attribute or data.
+
+The [*dynamic] module focuses on parsers with behavior that can be modified at run-time.
+
+The [*error_handling] module. The framework would not be complete without Error Handling. C++'s exception handling mechanism is a perfect match for Spirit due to its highly recursive functional nature. C++ Exceptions are used extensively by this module for handling errors.
+
+The [*symbols] module focuses on symbol table management. This module is rather basic now. The goal is to build a sub-framework that will be able to accommodate C++ style multiple scope mechanisms. C++ is a great model for the complexity of scoping that perhaps has no parallel in any other language. There are classes and inheritance, private, protected and public access restrictions, friends, namespaces, using declarations, using directives, Koenig lookup (Argument Dependent Lookup) and more. The symbol table functionality we have now will be the basis of a complete facility that will attempt to model this.
+
+[:["I wish that I could ever see, a structure as lovely as a tree...]]
+
+Parse Tree and Abstract Syntax Tree (AST) generation are handled by the [*Tree] module. There are advantages with Parse Trees and Abstract Syntax Trees over semantic actions. You can make multiple passes over the data without having to re-parse the input. You can perform transformations on the tree. You can evaluate things in any order you want, whereas with attribute schemes you have to process in a begin to end fashion. You do not have to worry about backtracking and action side effects that may occur with an ambiguous grammar.
+
+The [*utility] module is a set of parsers and support classes that were found to be useful in handling common tasks such as list processing, comments, confix expressions, etc.
+
+The [*meta] module provides metaprogramming facilities for advanced Spirit developers. This module facilitates compile-time and run-time introspection of Spirit parsers.
+
+[endsect][/ organisation]

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/portability.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/portability.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,27 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Portability]
+
+Historically, Spirit supported a lot of compilers, including (to some extent) poorly conforming compilers such as VC6. Spirit v1.6.x will be the last release that will support older poorly conforming compilers. Starting from Spirit v1.8.0, poorly conforming compilers will not be supported. If you are still using one of these older compilers, you can still use Spirit v1.6.x.
+
+The reason why Spirit v1.6.x worked on old non-conforming compilers is that the authors laboriously took the trouble of searching for workarounds to make these compilers happy. The process takes a lot of time and energy, especially when one encounters the dreaded ICE or "Internal Compiler Error". Sometimes searching for a single workaround takes days or even weeks. Sometimes, there are no known workarounds. This stifles progress a lot. And, as the library gets more progressive and takes on more advanced C++ techniques, the difficulty escalates even further.
+
+Spirit v1.6.x will still be supported. Maintenance will still happen and bug fixes will still be applied. There will still be active development for the back-porting of new features introduced in Spirit v1.8.0 (and Spirit 1.9.0) to less capable compilers; hopefully, fueled by contributions from the community. We welcome active support from the C++ community, especially those with special expertise on compilers such as older Borland and MSVC++ compilers.
+
+Spirit 1.8 has been tested to compile and run properly on these compilers:
+
+ 1. g++ 3.1 and above
+ 2. Comeau 4.24.5
+ 3. MSVC 7.1
+ 4. Intel 7.1
+
+If your compiler is sufficiently conforming, chances are you can compile Spirit as it is or with minimal portability fixes here and there. Please inform us if your compiler is known to be ISO/ANSI conforming and is not in the list above. Feel free to post feedback to [@https://lists.sourceforge.net/lists/listinfo/spirit-general Spirit-general mailing list] \[Spirit-general_at_[hidden]\].
+
+[endsect][/ portability]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/position_iterator.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/position_iterator.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,83 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Position Iterator]
+
+Often, when writing a parser that is able to detect errors in the format of the input stream, we want it to communicate to the user where the error happened within that input. The classic example is when writing a compiler or interpreter that detects syntactical errors in the parsed program, indicating the line number and maybe even the position within the line where the error was found.
+
+The class `position_iterator` is a tool provided within Spirit that allows parser writers to easily implement this functionality. The concept is quite simple: this class is an iterator wrapper that keeps track of the current position within the file, including current file, line and column. It requires a single template parameter, which should be the type of the iterator that is to be wrapped.
+
+To use it, you'll need to add the following include:
+
+``
+ #include <boost/spirit/iterator/position_iterator.hpp>
+``
+
+Or include all the iterators in Spirit:
+
+``
+ #include <boost/spirit/iterator.hpp>
+``
+
+To construct the wrapper, you need both the begin and end iterators of the input sequence, and the file name of the input sequence. Optionally, you can also specify the starting line and column numbers, which default to 1. Default construction, with no parameters, creates a generic end-of-sequence iterator, in a similar manner to the stream iterators of the standard C++ library.
+
+The wrapped iterator must belong to the input or forward iterator category, and the `position_iterator` just inherits that category.
+
+For example, to create begin and end positional iterators from an input C-string, you'd use:
+
+``
+ char const* inputstring = "...";
+ typedef position_iterator<char const*> iterator_t;
+
+ iterator_t begin(inputstring, inputstring+strlen(inputstring));
+ iterator_t end;
+``
+
+[h3 Operations]
+
+``
+ void set_position(file_position const&);
+``
+
+Call this function when you need to change the current position stored in the iterator. For example, if parsing C-style `#include` directives, the included file's input must be marked by resetting the line and column to 1 and 1 and the name to the new file's name.
+
+``
+ file_position const& get_position() const;
+``
+
+Call this function to retrieve the current position.
+
+``
+ void set_tabchars(int);
+``
+
+Call this to set the number of characters per tab. This value is necessary to correctly track the column number.
+
+[h4 `file_position`]
+
+`file_position` is a structure that holds the position within a file. Its fields are:
+
+[table `file_position` fields
+ [
+ [`std::string file;`]
+ [Name of the file. Hopefully a full pathname.]
+ ]
+ [
+ [`int line;`]
+ [Line number within the file. By default, the first line is number 1.]
+ ]
+ [
+ [`int column;`]
+ [Column position within the file. The first column is 1.]
+ ]
+]
+
+See [@../../example/fundamental/position_iterator/position_iterator.cpp position_iterator.cpp] for a compilable example. This is part of the Spirit distribution.
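+
+The bookkeeping described above can be sketched without Spirit. The following is only an illustration of how line and column tracking with a tab width might work; `toy_position`, `advance_position` and `position_after` are hypothetical names, and the tab rule (jump to the next tab stop) is an assumption, not a statement about the library's internals.
+

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for file_position: same three fields.
struct toy_position {
    std::string file;
    int line;
    int column;
};

// Advance `pos` over one input character.
// Assumption: a tab moves the column to the next tab stop, with stops
// every `tabchars` columns counted from column 1.
inline void advance_position(toy_position& pos, char c, int tabchars) {
    if (c == '\n') {
        ++pos.line;
        pos.column = 1;
    } else if (c == '\t') {
        pos.column += tabchars - (pos.column - 1) % tabchars;
    } else {
        ++pos.column;
    }
}

// Position reached after consuming all of `text`, starting at line 1, column 1.
inline toy_position position_after(const std::string& file,
                                   const std::string& text, int tabchars) {
    toy_position pos = { file, 1, 1 };
    for (std::string::size_type i = 0; i < text.size(); ++i)
        advance_position(pos, text[i], tabchars);
    return pos;
}
```

+
+With a tab width of 4, consuming `"ab\n\tc"` leaves the sketch at line 2, column 6: the newline resets the column to 1, the tab jumps to column 5, and `'c'` moves one further.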
+
+[endsect][/ position_iterator]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/rationale.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/rationale.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,101 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Rationale]
+
+[section:vf Virtual functions: From static to dynamic C++]
+
+Rules straddle the border between static and dynamic C++. In effect, a rule transforms compile-time polymorphism (using templates) into run-time polymorphism (using virtual functions). This is necessary due to C++'s inability to automatically declare a variable of a type deduced from an arbitrarily complex expression in the right-hand side (rhs) of an assignment. Basically, we want to do something like:
+
+``
+ T rule = an_arbitrarily_complex_expression;
+``
+
+without having to know or care about the resulting type of the right-hand side (rhs) of the assignment expression. Apart from this, we also need a facility to forward declare an unknown type:
+
+``
+ T rule;
+ ...
+ rule = a | b;
+``
+
+These limitations lead us to this implementation of rules. This comes at the expense of the overhead of a virtual function call, once through each invocation of a rule.
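+
+The idea can be sketched in a few lines of plain C++. This is a toy model under stated assumptions, not Spirit's actual implementation: `toy_rule`, `toy_char`, `toy_alt` and `either` are hypothetical names, and `either(a, b)` merely stands in for `a | b`.
+

```cpp
#include <cassert>
#include <memory>
#include <string>

// Run-time side: an opaque interface with one virtual function.
struct abstract_parser {
    virtual ~abstract_parser() {}
    virtual bool parse(const char*& first, const char* last) const = 0;
};

// Bridge: wraps any statically typed parser P behind the interface.
template <typename P>
struct concrete_parser : abstract_parser {
    explicit concrete_parser(P const& p) : p_(p) {}
    bool parse(const char*& first, const char* last) const override {
        return p_.parse(first, last);
    }
    P p_;
};

// Compile-time side: small parsers whose types encode the expression.
struct toy_char {
    char c;
    bool parse(const char*& first, const char* last) const {
        if (first != last && *first == c) { ++first; return true; }
        return false;
    }
};

template <typename A, typename B>
struct toy_alt {
    A a;
    B b;
    bool parse(const char*& first, const char* last) const {
        const char* save = first;
        if (a.parse(first, last)) return true;
        first = save;                      // restore before trying b
        return b.parse(first, last);
    }
};

template <typename A, typename B>
toy_alt<A, B> either(A const& a, B const& b) { return toy_alt<A, B>{a, b}; }

// The rule: can be declared first and defined later, whatever the rhs type.
class toy_rule {
public:
    template <typename P>
    toy_rule& operator=(P const& p) {
        impl_.reset(new concrete_parser<P>(p));
        return *this;
    }
    bool parse(const char*& first, const char* last) const {
        return impl_ && impl_->parse(first, last);   // one virtual call
    }
private:
    std::shared_ptr<abstract_parser> impl_;
};
```

+
+Here `toy_rule r; r = either(toy_char{'a'}, toy_char{'b'});` compiles without the caller ever naming `toy_alt<toy_char, toy_char>`; the price is the single virtual call inside `toy_rule::parse`.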
+
+[endsect][/ vf]
+
+[section:md Multiple declaration]
+
+Some BNF variants allow multiple declarations of a rule. The declarations are taken as alternatives. Example:
+
+``
+ r = a;
+ r = b;
+``
+
+is equivalent to:
+
+``
+ r = a | b;
+``
+
+[link __Spirit v1.3] allowed this behavior. However, the current version of Spirit no longer allows this because experience shows that this behavior leads to unwanted gotchas (for instance, it does not allow rules to be held in containers). In the current release of Spirit, a second assignment to a rule will simply redefine it. The old definition is destructed. This follows C++ semantics more closely and is more in line with how the user expects the rule to behave.
+
+[endsect][/ md]
+
+[section:ss Sequencing Syntax]
+
+The comma operator as in `a, b` seems to be a better candidate, syntax-wise. But then the problem is with its precedence. It has the lowest precedence in C/C++, which makes it virtually useless.
+
+[@http://research.att.com/~bs Bjarne Stroustrup], in his article [link __references__.generalised_overloading "Generalizing Overloading for C++2000"], talks about overloading whitespace. Such a feature would allow juxtaposition of parser objects exactly as we do in (E)BNF (e.g. `a b | c` instead of `a >> b | c`). Unfortunately, the article was dated April 1, 1998. Oh well.
+
+[endsect][/ ss]
+
+[section:fi Forward iterators]
+
+In general, the scanner expects at least a standard conforming forward iterator. Forward iterators are needed for backtracking where the iterator needs to be saved and restored later. Generally speaking, Spirit is a backtracking parser. The implication of this is that at some point, the iterator position will have to be saved to allow the parser to backtrack to a previous point. Thus, for backtracking to work, the framework requires at least a forward iterator.
+
+Some parsers might require more specialized iterators (bi-directional or even random access). Perhaps in the future, deterministic parsers when added to the framework, will perform no backtracking and will need just a single token lookahead, hence will require input iterators only.
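+
+A minimal sketch of why saving and restoring matters, written against plain standard iterators rather than Spirit's scanner. `match_literal` and `match_either` are hypothetical helpers; the only requirement they place on the iterator is that a saved copy stays valid after the original has been advanced, which is what the forward iterator category guarantees.
+

```cpp
#include <cassert>
#include <string>

// Try to match `lit` at `first`. On failure, restore `first` to where it was.
template <typename ForwardIt>
bool match_literal(ForwardIt& first, ForwardIt last, const std::string& lit) {
    ForwardIt save = first;                       // saved position
    for (std::string::size_type i = 0; i < lit.size(); ++i) {
        if (first == last || *first != lit[i]) {
            first = save;                         // backtrack
            return false;
        }
        ++first;
    }
    return true;
}

// Alternative: the second branch starts from the same position as the first.
template <typename ForwardIt>
bool match_either(ForwardIt& first, ForwardIt last,
                  const std::string& a, const std::string& b) {
    return match_literal(first, last, a) || match_literal(first, last, b);
}
```

+
+On the input `"abd"`, `match_either(it, end, "abc", "abd")` reads `'a'` and `'b'` twice: once for the failed first branch and once more after backtracking. A pure input iterator could not replay those characters.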
+
+[endsect][/ fi]
+
+[section:subrules Why are subrules important?]
+
+__Subrules__ open up the opportunity to do aggressive metaprogramming as well because they do not rely on virtual functions. The virtual function is the meta-programmer's hell. Not only does it slow down the program due to the virtual function indirect call, it is also an opaque wall that no metaprogram can get past. It kills all meta-information beyond the virtual function call. Worse, the virtual function cannot be templated, which means that its arguments have to be tied to actual types. Many problems stem from this limitation.
+
+While Spirit is currently classified as a non-deterministic recursive-descent parser, Doug Gregor first noted that other parsing techniques apart from top-down recursive descent may be applied. For instance, apart from non-deterministic recursive descent, deterministic LL(1) and LR(1) can theoretically be implemented using the same expression template front end. Spirit rules use virtual functions to encode the RHS parser expression in an opaque abstract parser type. While it serves its purpose well, the rule's virtual functions are the stumbling blocks to more advanced metaprogramming. Subrules are free from virtual functions.
+
+[endsect][/ subrules]
+
+[section:greedy Exhaustive backtracking and greedy RD]
+
+Spirit doesn't do exhaustive backtracking like regular expressions are expected to. For example:
+
+``
+ *chlit_p('a') >> chlit_p('a');
+``
+
+will always fail to match because Spirit's Kleene star does not back off when the rest of the rule fails to match.
+
+Actually, there's a solution to this greedy RD (recursive-descent) problem. Such a scheme is discussed in section 6.6.2 of [@http://www.cs.vu.nl/%7Edick/PTAPG.html Parsing Techniques: A Practical Guide]. The trick involves passing a tail parser (in addition to the scanner) to each parser. The start parser will then simply be: `start >> end_p;` (`end_p` is the start's tail).
+
+Spirit is greedy, using straightforward, naive RD. It is certainly possible to implement the fully backtracking scheme presented above, but there will also certainly be a performance hit. The scheme will always try to match all possible parser paths (full parser hierarchy traversal) until it reaches a point of certainty, that the whole thing matches or fails to match.
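+
+The difference between the two strategies can be seen in a hand-written sketch of `*chlit_p('a') >> chlit_p('a')`. Neither function below uses Spirit; `greedy_star_then_a` and `backtracking_star_then_a` are hypothetical names that only mimic the two behaviours on a `std::string`.
+

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Greedy RD: the star takes every 'a' it can and never gives one back.
inline bool greedy_star_then_a(const std::string& in) {
    std::size_t i = 0;
    while (i < in.size() && in[i] == 'a') ++i;    // the Kleene star
    return i < in.size() && in[i] == 'a';         // the trailing 'a': cannot succeed
}

// Exhaustive backtracking: try the longest star first, then shorter ones,
// until the trailing 'a' (the "tail") matches.
inline bool backtracking_star_then_a(const std::string& in) {
    std::size_t longest = 0;
    while (longest < in.size() && in[longest] == 'a') ++longest;
    for (std::size_t k = longest + 1; k-- > 0; ) {   // k = longest, ..., 0
        if (k < in.size() && in[k] == 'a') return true;
    }
    return false;
}
```

+
+The greedy version fails on every input, because the star stops only where the next character is not an `'a'`. The backtracking version succeeds on `"aaa"` after giving back one character, at the cost of retrying the tail for each candidate length.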
+
+[note
+Backtracking and Greedy RD
+
+Spirit is quite consistent and intuitive about when it backtracks and to where, although it may not be obvious to those coming from different backgrounds. In general, any (sub)parser will, given the same input, always match the same portion of the input (or fail to match the input at all). This means that Spirit is inherently greedy. Spirit will only backtrack when a (sub)parser fails to match the input, and it will always backtrack to the next choice point upward (not backward) in the parser structure. In other words `abb|ab` will match `"ab"`, as will `a(bb|b)`, but `(ab|a)b` won't because the `(ab|a)` subparser will always match the `'b'` after the `'a'` if it is available.
+
+--Rainer Deyke
+]
+
+There's a strong preference for the "simplicity with all the knobs when you need them" approach right now. On the other hand, the flexibility of Spirit makes it possible to have different optional schemes available. It might be possible to implement an exhaustive backtracking RD scheme as an optional feature in the future.
+
+[endsect][/ greedy]
+
+[endsect][/ rationale]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/refactoring.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/refactoring.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,99 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section:refactoring Refactoring Parsers]
+
+There are three types of Refactoring Parsers implemented right now, which help to abstract common parser refactoring tasks. Parser refactoring means that a concrete parser construct is replaced (refactored) by another, very similar parser construct. Two of the Refactoring Parsers described here (`refactor_unary_parser` and `refactor_action_parser`) are introduced to allow a simple and more expressive notation while using [link __Confix Parsers__] and [link __List Parsers__]. The third Refactoring Parser (`attach_action_parser`) is implemented to abstract some functionality required for the Grouping Parser. Nevertheless, these Refactoring Parsers may help in solving other complex parsing tasks too.
+
+[h3 Refactoring unary parsers]
+
+The `refactor_unary_d` parser generator, which should be used to generate a unary refactoring parser, transforms a construct of the following type
+
+``
+ refactor_unary_d[*some_parser - another_parser]
+``
+
+to
+
+``
+ *(some_parser - another_parser)
+``
+
+where `refactor_unary_d` is a predefined object of the parser generator struct `refactor_unary_gen<>`.
+
+The `refactor_unary_d` parser generator generates a new parser as shown above only if the original construct is an auxiliary binary parser (here the difference parser) and the left operand of this binary parser is an auxiliary unary parser (here the Kleene star operator). If the original parser isn't a binary parser, the compilation will fail. If the left operand isn't a unary parser, no refactoring will take place.
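+
+Why the two forms differ can be illustrated with match lengths on a plain string. This is a hedged model, not Spirit code: it assumes that a difference `a - b` succeeds with the length of `a` unless `b` also matches at least as much input, and it uses "any character" for `some_parser` and a single stop character for `another_parser`. All function names are hypothetical.
+

```cpp
#include <cassert>
#include <string>

// Length matched by a star over "any character": the whole input.
inline int star_any(const std::string& in) {
    return static_cast<int>(in.size());
}

// Length matched by a one-character literal at the start, or -1 for no match.
inline int literal_at_start(const std::string& in, char c) {
    return (!in.empty() && in[0] == c) ? 1 : -1;
}

// Models `*any - stop`: the difference is applied once, to the whole star.
inline int unrefactored(const std::string& in, char stop) {
    int a = star_any(in);
    int b = literal_at_start(in, stop);
    return (b >= 0 && b >= a) ? -1 : a;
}

// Models `*(any - stop)`: the difference is applied to every repetition.
inline int refactored(const std::string& in, char stop) {
    int n = 0;
    while (n < static_cast<int>(in.size()) && in[n] != stop) ++n;
    return n;
}
```

+
+In this model, on the input `"abxcd"` with stop character `'x'`, the unrefactored form consumes all five characters, while the refactored form stops after `"ab"`, which is the behaviour a confix-style "everything up to the closing delimiter" construct needs.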
+
+[h3 Refactoring action parsers]
+
+The `refactor_action_d` parser generator, which should be used to generate an action refactoring parser, transforms a construct of the following type
+
+``
+ refactor_action_d[some_parser[some_actor] - another_parser]
+``
+
+to
+
+``
+ (some_parser - another_parser)[some_actor]
+``
+
+where `refactor_action_d` is a predefined object of the parser generator struct `refactor_action_gen<>`.
+
+The `refactor_action_d` parser generator generates a new parser as shown above only if the original construct is an auxiliary binary parser (here the difference parser) and the left operand of this binary parser is an auxiliary parser generated by an attached semantic action. If the original parser isn't a binary parser, the compilation will fail. If the left operand isn't an action parser, no refactoring will take place.
+
+[h3 Attach action refactoring]
+
+The `attach_action_d` parser generator, which should be used to generate an attach action refactoring parser, transforms a construct of the following type
+
+``
+ attach_action_d[(some_parser >> another_parser)[some_actor]]
+``
+
+to
+
+``
+ some_parser[some_actor] >> another_parser[some_actor]
+``
+
+where `attach_action_d` is a predefined object of the parser generator struct `attach_action_gen<>`.
+
+The `attach_action_d` parser generator generates a new parser as shown above only if the original construct is an auxiliary action parser and the parser to which this action is attached is an auxiliary binary parser (here the sequence parser). If the original parser isn't an action parser, the compilation will fail. If the parser to which the action is attached isn't a binary parser, no refactoring will take place.
+
+[h3 Nested refactoring]
+
+Sometimes it is required to nest different types of refactoring, i.e. to transform constructs like
+
+``
+ (*some_parser)[some_actor] - another_parser
+``
+
+to
+
+``
+ (*(some_parser - another_parser))[some_actor]
+``
+
+To simplify the construction of such nested refactoring parsers the `refactor_unary_gen<>` and `refactor_action_gen<>` both can take another refactoring parser generator type as their respective template parameter. For instance, to construct a refactoring parser generator for the mentioned nested transformation we should write:
+
+``
+ typedef refactor_action_gen<refactor_unary_gen<> > refactor_t;
+ const refactor_t refactor_nested_d = refactor_t(refactor_unary_d);
+``
+
+Now we could use it as follows to get the required result:
+
+``
+ refactor_nested_d[(*some_parser)[some_actor] - another_parser]
+``
+
+An empty template parameter means not to nest this particular refactoring parser. The default template parameter is `non_nesting_refactoring`, a predefined helper structure for inhibiting nesting. Sometimes it is required to nest a particular refactoring parser with itself. This is achieved by providing the predefined helper structure `self_nested_refactoring` as the template parameter to the corresponding refactoring parser generator template.
+
+See [@../../example/fundamental/refactoring.cpp refactoring.cpp] for a compilable example. This is part of the Spirit distribution.
+
+[endsect][/ refactoring]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/regular_expression_parser.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/regular_expression_parser.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,40 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section:regex_p Regular Expression Parser]
+
+Regular expressions are a form of pattern-matching that is often used in text processing. Many users will be familiar with the usage of regular expressions. Initially there were the Unix utilities grep, sed and awk, and the programming language perl, each of which makes extensive use of regular expressions. Today the usage of such regular expressions is integrated in many more available systems.
+
+During parser construction it is often useful to have the power of regular expressions available. The Regular Expression Parser was introduced to make regular expressions accessible for Spirit parser construction.
+
+The Regular Expression Parser `rxstrlit` has a single template type parameter: an iterator type. Internally, `rxstrlit` holds the [link boost.regex Boost.Regex] object containing the provided regular expression. The `rxstrlit` attempts to match the current input stream with this regular expression. The template type parameter defaults to `char const*`. `rxstrlit` has two constructors. The first accepts a null-terminated character pointer. This constructor may be used to build `rxstrlit`'s from quoted regular expression literals. The second constructor takes in a first/last iterator pair. The function generator version is `regex_p`.
+
+Here are some examples:
+
+``
+ rxstrlit<>("Hello[[:space:]]+[W|w]orld")
+ regex_p("Hello[[:space:]]+[W|w]orld")
+
+ std::string msg("Hello[[:space:]]+[W|w]orld");
+ rxstrlit<>(msg.begin(), msg.end());
+``
+
+The generated parser object acts at the character level, so any skip parser that may have been supplied is not used during the attempt to match the regular expression (see [link __The Scanner Business__]).
+
+The Regular Expression Parser is implemented with the help of [link boost.regex the Boost Regex library], so you have to keep some limitations in mind.
+
+* [@http://www.boost.org Boost] libraries have to be installed on your computer and the Boost root directory has to be added to your compiler `#include<...>` search path. You can download the current version at the [@http://www.boost.org Boost web site].
+
+* The Boost Regex library requires the usage of bi-directional iterators. So you have to ensure this during the usage of the Spirit parser, which contains a Regular Expression Parser.
+
+* The Boost Regex library is not a header only library, as Spirit is, though it provides the possibility to include all of the sources, if you are using it in one compilation unit only. Define the preprocessor constant `BOOST_SPIRIT_NO_REGEX_LIB` before including the spirit Regular Expression Parser header, if you want to include all the Boost.Regex sources into this compilation unit. If you are using the Regular Expression Parser in more than one compilation unit, you should not define this constant and must link your application against the regex library as described in the related documentation.
+
+See [@../../example/fundamental/regular_expression.cpp regular_expression.cpp] for a compilable example. This is part of the Spirit distribution.
+
+[endsect][/ regexp_p]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/scoped_lock.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/scoped_lock.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,24 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Scoped Lock]
+
+[h3 `scoped_lock_d`]
+
+The `scoped_lock_d` directive constructs a parser that locks a mutex during the attempt to match the contained parser.
+
+Syntax:
+
+``
+ scoped_lock_d(mutex&)[body-parser]
+``
+
+Note that nesting `scoped_lock_d` directives bears the risk of deadlocks, since the order of locking depends on the grammar used and may even depend on the input being parsed. Locking order has to be consistent within an application to ensure deadlock-free operation.
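+
+The "lock for the duration of the match attempt" behaviour can be sketched with the standard `std::lock_guard`. This is an illustration only, not how the directive is implemented: `parse_locked`, `counting_mutex` and `always_fails` are hypothetical, and the counting mutex exists just to make the lock and unlock calls observable.
+

```cpp
#include <cassert>
#include <mutex>

// Hypothetical mutex that counts calls instead of blocking.
struct counting_mutex {
    int locks;
    int unlocks;
    counting_mutex() : locks(0), unlocks(0) {}
    void lock() { ++locks; }
    void unlock() { ++unlocks; }
};

// Hold `m` while `body` runs; release it whether the body matches or not.
template <typename Mutex, typename Body>
bool parse_locked(Mutex& m, Body body) {
    std::lock_guard<Mutex> guard(m);   // locked here, unlocked at scope exit
    return body();
}

// Stand-in for a body parser that does not match.
inline bool always_fails() { return false; }
```

+
+Nesting two such calls with different mutexes acquires them in the order the calls are nested, which is the ordering concern raised above: two grammars that nest the same mutexes in opposite orders can deadlock.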
+
+[endsect][/ scoped_lock]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/select_parser.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/select_parser.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,67 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section:select_p The Select Parser]
+
+Select parsers may be used to identify a single parser from a given list of parsers, which successfully recognizes the current input sequence. Example:
+
+``
+ rule<> rule_select =
+ select_p
+ (
+ parser_a
+ , parser_b
+ /* ... */
+ , parser_n
+ );
+``
+
+The parsers (`parser_a`, `parser_b` etc.) are tried sequentially from left to right until a parser matches the current input sequence. If there is a matching parser found, the `select_p` parser returns the parser's position (zero based index). For instance, in the example above, 1 is returned if `parser_b` matches.
+
+There are two predefined parsers of the select parser family: `select_p` and `select_fail_p`. These parsers differ in the way the no match case is handled (when none of the parsers match the current input sequence). While the `select_p` parser will return `-1` if no matching parser is found, the `select_fail_p` parser will not match at all.
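+
+The two return conventions can be modelled on plain strings. The sketch below is not Spirit code: `select_index` and `select_or_fail` are hypothetical functions that treat each alternative as a literal prefix, whereas the real parsers accept arbitrary sub-parsers.
+

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// select_p-like: zero-based index of the first alternative that prefixes
// `in`, or -1 if none does. The overall "parse" still counts as a match.
inline int select_index(const std::string& in,
                        const std::vector<std::string>& alts) {
    for (std::size_t i = 0; i < alts.size(); ++i)
        if (in.compare(0, alts[i].size(), alts[i]) == 0)
            return static_cast<int>(i);
    return -1;
}

// select_fail_p-like: reports failure instead of yielding -1.
inline bool select_or_fail(const std::string& in,
                           const std::vector<std::string>& alts,
                           int& index) {
    int found = select_index(in, alts);
    if (found < 0) return false;   // no match at all; `index` is left untouched
    index = found;
    return true;
}
```

+
+Alternatives are tried left to right, so when two of them could match, the one listed first determines the index.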
+
+The following sample shows how the select parser may be used very conveniently in conjunction with a [link __switch parser__]:
+
+``
+ int choice = -1;
+ rule<> rule_select =
+ select_fail_p('a', 'b', 'c', 'd')[assign_a(choice)]
+ >> switch_p(var(choice))
+ [
+ case_p<0>(int_p),
+ case_p<1>(ch_p(',')),
+ case_p<2>(str_p("bcd")),
+ default_p
+ ]
+ ;
+``
+
+This example shows a rule, which matches:
+
+ * `'a'` followed by an integer
+ * `'b'` followed by a `','`
+ * `'c'` followed by `"bcd"`
+ * a single `'d'`.
+
+For other input sequences the given rule does not match at all.
+
+[important
+`BOOST_SPIRIT_SELECT_LIMIT`
+
+The number of possible entries inside the `select_p` parser is limited by the Spirit compile time constant `BOOST_SPIRIT_SELECT_LIMIT`, which defaults to `3`. This value should not be greater than the compile time constant given by `PHOENIX_LIMIT` (see __phoenix__). Example:
+
+``
+// Define these before including anything else
+#define PHOENIX_LIMIT 10
+#define BOOST_SPIRIT_SELECT_LIMIT 10
+``
+
+][/ important]
+
+[endsect][/ select_p]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/stored_rule.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/stored_rule.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,75 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Storable Rules]
+
+The rule is a weird C++ citizen, unlike any other C++ object. It does not have the proper copy and assignment semantics and cannot be stored and passed around by value. You cannot store rules in STL containers (vector, stack, etc) for later use and you cannot pass and return rules to and from functions by value.
+
+EBNF is primarily declarative. Like in functional programming, an EBNF grammar is a static recipe and there's no notion of 'do this, then that'. However, in Spirit, we managed to coax imperative C++ to take in declarative EBNF. Hah! Fun!... We did that by masquerading the C++ assignment operator to mimic EBNF's `::=`. To do that, we gave the rule class' assignment operator and copy constructor a different meaning and semantics. The downside is that doing so made the rule unlike any other C++ object. You can't copy it. You can't assign it.
+
+We want to use the dynamic nature of C++ to our advantage. We've seen dynamic Spirit in action here and there. There are indeed some interesting applications of dynamic parsers using Spirit. Yet, we will not fully utilize the power of dynamic parsing unless we have a rule that behaves like any other good C++ object. With such a beast, we can write full parsers that are defined at run time, as opposed to compile time.
+
+We now have dynamic rules: the `stored_rule` class. Basically, stored rules are rules with perfect C++ assignment/copy-constructor semantics. This means that a `stored_rule` can be stored in containers and/or dynamically created at run-time.
+
+``
+ template<
+ typename ScannerT = scanner<>,
+ typename ContextT = parser_context<>,
+ typename TagT = parser_address_tag>
+ class stored_rule;
+``
+
+The interface is exactly the same as with the `rule` class (see the [link __section on rules__] for more information regarding the API). The only difference is with the copy and assignment semantics. Now, with `stored_rule`, we can dynamically and algorithmically define our rules. Here are some samples...
+
+Say I want to dynamically create a rule for:
+
+``
+ start = *(a | b | c);
+``
+
+I can write it dynamically step-by-step:
+
+``
+ stored_rule<> start;
+
+ start = a;
+ start = start.copy() | b;
+ start = start.copy() | c;
+ start = *(start.copy());
+``
+
+Later, I changed my mind and want to redefine it (again dynamically) as:
+
+``
+ start = (a | b) >> (start | b);
+``
+
+I write:
+
+``
+ start = b;
+ start = a | start.copy();
+ start = start.copy() >> (start | b);
+``
+
+Notice the statement:
+
+``
+ start = start.copy() | b;
+``
+
+Why is `start.copy()` required? Well, because like rules, stored rules are still embedded by reference when found in the RHS (one reason is to avoid cyclic-shared-pointers). If we write:
+
+``
+ start = start | b;
+``
+
+We have left-recursion! Making a copy of `start` avoids self-referencing. What we are doing is making a copy of `start`, ORing it with `b`, then destructively assigning the result back to `start`.
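+
+The same pitfall exists with any callable that is rebuilt in terms of itself. As a rough analogy (not Spirit code), the sketch below uses `std::function` as a hypothetical stand-in for a storable rule over single characters; the by-value lambda captures play the role of `start.copy()`.
+

```cpp
#include <cassert>
#include <functional>

typedef std::function<bool(char)> toy_stored_rule;

// Builds the equivalent of `a | b | c` step by step.
inline toy_stored_rule build_a_or_b_or_c() {
    toy_stored_rule start = [](char c) { return c == 'a'; };

    toy_stored_rule copy1 = start;    // snapshot of the old definition
    start = [copy1](char c) { return copy1(c) || c == 'b'; };

    toy_stored_rule copy2 = start;    // snapshot again before redefining
    start = [copy2](char c) { return copy2(c) || c == 'c'; };

    // Capturing `start` by reference instead of using the snapshots would
    // make each new definition call itself: the analogue of left recursion.
    return start;
}
```

+
+Each redefinition refers to a frozen copy of the previous definition, so the chain of calls is finite.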
+
+[endsect][/ storable_rules]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/style.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/style.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,78 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section:style Style Guidelines]
+
+At some point, especially when there are lots of semantic actions attached to various points, the grammar tends to be quite difficult to follow. In order to keep an easy-to-read, consistent and aesthetically pleasing look to the Spirit code, the following coding style guide is advised.
+
+This coding style is adapted and extended from the ANTLR/PCCTS style (Terrence Parr) and [@http://groups.yahoo.com/group/boost/files/coding_guidelines.html Boost coding guidelines] (David Abrahams and Nathan Myers) and is the combined work of Joel de Guzman, Chris Uzdavinis and Hartmut Kaiser.
+
+ * Rule names use std C++ (Boost) convention. The rule name may be very long.
+ * The `'='` is neatly indented 4 spaces below. Like Boost, use spaces instead of tabs.
+ * Breaking the operands into separate lines puts the semantic actions neatly to the right.
+ * Semicolon at the last line terminates the rule.
+ * The adjacent parts of a sequence should be indented so that everything belonging to one level sits at the same indentation level.
+
+``
+ program
+ = program_heading [heading_action]
+ >> block [block_action]
+ >> '.'
+ | another_sequence
+ >> etc
+ ;
+``
+
+ * Prefer literals in the grammar instead of identifiers. e.g. `"program"` instead of `PROGRAM`, `'>='` instead of `GTE` and `'.'` instead of `DOT`. This makes it much easier to read. If this isn't possible (for instance where the used tokens must be identified through integers) capitalized identifiers should be used instead.
+ * Breaking the operands may not be needed for short expressions. e.g. `*(',' >> file_identifier)` as long as the line does not exceed 80 characters.
+ * If a sequence fits on one line, put spaces inside the parentheses to clearly separate them from the rules.
+
+``
+ program_heading
+ = as_lower_d["program"]
+ >> identifier
+ >> '('
+ >> file_identifier
+ >> *( ',' >> file_identifier )
+ >> ')'
+ >> ';'
+ ;
+``
+
+ * Nesting directives: If a rule does not fit on one line (80 characters) it should be continued on the next line, indented by one level.
+ * The brackets of directives, semantic expressions (using [link __Phoenix__] or [link __LL lambda__] expressions) or parsers should be placed as follows.
+
+``
+ identifier
+ = nocase
+ [
+ lexeme
+ [
+ alpha >> *(alnum | '_') [id_action]
+ ]
+ ]
+ ;
+``
+
+ * Nesting unary operators (e.g. Kleene star)
+ * Unary rule operators (Kleene star, `'!'`, `'+'`, etc.) should be moved out one space before the corresponding indentation level if the rule has a body or a sequence after it that does not fit on one line. This makes the formatting more consistent and moves the rule 'body' to the same indentation level as the rule itself, highlighting the unary operator.
+
+``
+ block
+ = *( label_declaration_part
+ | constant_definition_part
+ | type_definition_part
+ | variable_declaration_part
+ | procedure_and_function_declaration_part
+ )
+ >> statement_part
+ ;
+``
+
+[endsect][/ style]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/switch_parser.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/switch_parser.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,108 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section:switch_p The Switch Parser]
+
+Switch parsers may be used to simplify certain alternation constructs. Consider the following code:
+
+``
+ rule<> rule_overall =
+ ch_p('a') >> parser_a
+ | ch_p('b') >> parser_b
+ // ...
+ | ch_p('n') >> parser_n
+ ;
+``
+
+Each of the alternatives is normally evaluated in a sequential manner. This tends to be inefficient, especially for a large number of alternatives. To avoid this inefficiency and to make it possible to write such constructs in a more readable form, Spirit contains the `switch_p` family of parsers. The `switch_p` parser allows us to rewrite the previous construct as:
+
+``
+ rule<> rule_overall =
+ switch_p
+ [
+ case_p<'a'>(parser_a),
+ case_p<'b'>(parser_b),
+ // ...
+ case_p<'n'>(parser_n)
+ ]
+ ;
+``
+
+This `switch_p` parser takes the next character (or token) from the input stream and tries to match it against the given integral compile time constants supplied as the template parameters to the `case_p` parsers. If this character matches one of the `case_p` branches, the associated parser is executed (i.e. if `'a'` is matched, `parser_a` is executed, if `'b'` is matched, `parser_b` is executed, and so on). If no `case_p` branch matches the next input character, the overall construct does not match at all.
+
+[tip
+Nabialek trick
+
+The [link __nabialek trick__ "Nabialek trick"] (named after its inventor, Sam Nabialek) can also improve the rule dispatch from linear non-deterministic to deterministic. This is similar to the `switch_p` parser, yet it can handle grammars where a keyword (`operator`, etc.), instead of a single character or token, precedes a production.
+]
+
+Sometimes it is desirable to add handling of the default case (none of the `case_p` branches matched). This may be achieved with the help of a `default_p` branch:
+
+``
+ rule<> rule_overall =
+ switch_p
+ [
+ case_p<'a'>(parser_a),
+ case_p<'b'>(parser_b),
+ // ...
+ case_p<'n'>(parser_n),
+ default_p(parser_default)
+ ]
+ ;
+``
+
+This form chooses the `parser_default` parser if none of the cases matches the next character from the input stream. Please note that, obviously, only one `default_p` branch may be added to the `switch_p` parser construct.
+
+Moreover, it is possible to omit the parentheses and body from the `default_p` construct, in which case, no additional parser is executed and the overall `switch_p` construct simply returns a match on any character of the input stream, which does not match any of the `case_p` branches:
+
+``
+ rule<> rule_overall =
+ switch_p
+ [
+ case_p<'a'>(parser_a),
+ case_p<'b'>(parser_b),
+ // ...
+ case_p<'n'>(parser_n),
+ default_p
+ ]
+ ;
+``
+
+There is another form of the `switch_p` construct. This form allows us to explicitly specify the value to be used for matching against the `case_p` branches:
+
+``
+ rule<> rule_overall =
+ switch_p(cond)
+ [
+ case_p<'a'>(parser_a),
+ case_p<'b'>(parser_b),
+ // ...
+ case_p<'n'>(parser_n)
+ ]
+ ;
+``
+
+where `cond` is a parser or a nullary function or function object (functor). If it is a parser, then it is tried and its return value is used to match against the `case_p` branches. If it is a nullary function or functor, then its return value will be used.
+
+Please note that during its compilation, the `switch_p` construct is transformed into a real C++ switch statement. This makes the runtime execution very efficient.
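To make the efficiency argument concrete, here is a minimal standalone sketch in plain C++ (it does not use Spirit). The functions `run_a`, `run_b` and `run_n` are hypothetical stand-ins for `parser_a`, `parser_b` and `parser_n`; the sketch only contrasts trying alternatives one by one with dispatching through a real `switch` statement, and is not how Spirit itself is implemented.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for parser_a, parser_b and parser_n: each one
// simply reports which branch was taken.
inline std::string run_a() { return "a"; }
inline std::string run_b() { return "b"; }
inline std::string run_n() { return "n"; }

// Alternation style: every alternative is tested in turn until one fits.
inline std::string dispatch_sequential(char c)
{
    if (c == 'a') return run_a();
    if (c == 'b') return run_b();
    if (c == 'n') return run_n();
    return std::string();           // nothing matched
}

// switch_p style: the character selects the branch directly.
inline std::string dispatch_switch(char c)
{
    switch (c)
    {
        case 'a': return run_a();
        case 'b': return run_b();
        case 'n': return run_n();
        default:  return std::string();   // no case branch: no match
    }
}
```

Both functions give the same answers; the difference is that the compiler is free to turn the `switch` into a jump table instead of a chain of comparisons.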
+
+[important
+`BOOST_SPIRIT_SWITCH_CASE_LIMIT`
+
+The number of possible `case_p`/`default_p` branches is limited by the Spirit compile time constant `BOOST_SPIRIT_SWITCH_CASE_LIMIT`, which defaults to `3`. There is no theoretical upper limit for this constant, but most compilers won't allow you to specify a very large number.
+
+Example:
+
+``
+// Define this before including switch.hpp
+#define BOOST_SPIRIT_SWITCH_CASE_LIMIT 10
+``
+
+][/ important]
+
+[endsect][/ switch_p]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/symbols.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/symbols.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,183 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Symbols]
+
+The class `symbols` implements a symbol table. The symbol table holds a dictionary of symbols where each symbol is a sequence of `CharT`s (a `char`, `wchar_t`, `int`, enumeration etc.). The template class, parameterized by the character type (`CharT`), can work efficiently with 8, 16, 32 and even 64 bit characters. Mutable data of type `T` is associated with each symbol.
+
+Traditionally, symbol table management is maintained separately outside the BNF grammar through semantic actions. Contrary to standard practice, the Spirit symbol table class `symbols` is-a parser, and an instance of it may be used anywhere in the EBNF grammar specification. It is an example of a dynamic parser. A dynamic parser is characterized by its ability to modify its behavior at run time. Initially, an empty `symbols` object matches nothing. At any time, symbols may be added, thus dynamically altering its behavior.
+
+Each entry in a symbol table has an associated mutable data slot. In this regard, one can view the symbol table as an associative container (or map) of key-value pairs where the keys are strings.
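The "map that is also a parser" idea can be illustrated with a small standalone sketch in plain C++. The type `toy_symbols` and its `match` function are hypothetical and are not part of Spirit; the sketch assumes that the longest stored symbol found at the start of the input wins, and it uses a `std::map` rather than Spirit's own set implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical toy model of a symbol table that can be matched against
// input. Keys are the symbols, values are the mutable data slots.
struct toy_symbols
{
    std::map<std::string, int> table;

    // Tries to match a stored symbol at the start of `input`.
    // Returns the number of characters matched (0 means no match) and
    // points `data` at the matching entry's data slot (or at null).
    std::size_t match(std::string const& input, int*& data)
    {
        std::size_t best = 0;
        data = 0;
        for (std::map<std::string, int>::iterator it = table.begin();
             it != table.end(); ++it)
        {
            std::string const& key = it->first;
            // Keep the longest key that is a prefix of the input.
            if (key.size() > best && input.compare(0, key.size(), key) == 0)
            {
                best = key.size();
                data = &it->second;
            }
        }
        return best;
    }
};
```

An empty table matches nothing; once symbols are added, the same object starts to match, and the caller gets write access to the associated data through the returned pointer.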
+
+The symbols class expects two template parameters (actually there is a third, see detail box). The first parameter `T` specifies the data type associated with each symbol (defaults to `int`) and the second parameter `CharT` specifies the character type of the symbols (defaults to `char`).
+
+``
+ template
+ <
+ typename T = int,
+ typename CharT = char,
+ typename SetT = impl::tst<T, CharT>
+ >
+ class symbols;
+``
+
+[info Ternary Search Trees
+
+The actual set implementation is supplied by the `SetT` template parameter (3rd template parameter of the `symbols` class). By default, this uses the `tst` class, which is an implementation of the Ternary Search Tree.
+
+Ternary Search Trees are faster than hashing for many typical search problems especially when the search interface is iterator based. Searching for a string of length k in a ternary search tree with n strings will require at most O(log n+k) character comparisons. TSTs are many times faster than hash tables for unsuccessful searches since mismatches are discovered earlier after examining only a few characters. Hash tables always examine an entire key when searching.
+
+For details see [@http://www.cs.princeton.edu/~rs/strings/].
+]
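For readers unfamiliar with the data structure, the following is a minimal standalone ternary search tree in plain C++. It is a sketch under stated assumptions: the class name `tiny_tst`, the fixed `int` data type and the `add`/`find` signatures are all hypothetical and do not reproduce Spirit's `impl::tst`. It only shows why a failed lookup can stop after a few character comparisons.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical minimal ternary search tree mapping non-empty strings to int.
class tiny_tst
{
    struct node
    {
        char ch;                          // character stored at this node
        bool has_data;                    // true if a key ends here
        int data;                         // data slot for that key
        std::unique_ptr<node> lo, eq, hi; // smaller / next char / larger
        explicit node(char c) : ch(c), has_data(false), data(0) {}
    };
    std::unique_ptr<node> root_;

public:
    // Adds key with data. Returns a pointer to the stored data, or a null
    // pointer if the key is empty or was already present.
    int* add(std::string const& key, int data)
    {
        if (key.empty()) return 0;
        std::unique_ptr<node>* slot = &root_;
        std::size_t i = 0;
        for (;;)
        {
            if (!*slot) slot->reset(new node(key[i]));
            node& n = **slot;
            if (key[i] < n.ch)      slot = &n.lo;
            else if (key[i] > n.ch) slot = &n.hi;
            else if (i + 1 == key.size())
            {
                if (n.has_data) return 0;   // duplicate symbol
                n.has_data = true;
                n.data = data;
                return &n.data;
            }
            else { slot = &n.eq; ++i; }     // matched one char, go deeper
        }
    }

    // Returns a pointer to the data for key, or a null pointer if absent.
    int* find(std::string const& key)
    {
        if (key.empty()) return 0;
        node* n = root_.get();
        std::size_t i = 0;
        while (n)
        {
            if (key[i] < n->ch)      n = n->lo.get();
            else if (key[i] > n->ch) n = n->hi.get();
            else if (i + 1 == key.size())
                return n->has_data ? &n->data : 0;
            else { n = n->eq.get(); ++i; }
        }
        return 0;   // ran off the tree: mismatch found early
    }
};
```

A lookup for a key that is not stored falls off the tree at the first character that has no branch, without examining the rest of the key.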
+
+Here are some sample declarations:
+
+``
+ symbols<> sym;
+ symbols<short, wchar_t> sym2;
+
+ struct my_info
+ {
+ int id;
+ double value;
+ };
+
+ symbols<my_info> sym3;
+``
+
+After having declared our symbol tables, symbols may be added statically using the construct:
+
+``
+ sym = a, b, c, d ...;
+``
+
+where `sym` is a symbol table and `a`..`d` etc. are strings.
+
+[note Note that the comma operator is separating the items being added to the symbol table, through an assignment. Due to operator overloading this is possible and correct (though it may take a little getting used to) and is a concise way to initialize the symbol table with many symbols. Also, it is perfectly valid to make multiple assignments to a symbol table to iteratively add symbols (or groups of symbols) at different times.]
+
+Simple example:
+
+``
+ sym = "pineapple", "orange", "banana", "apple", "mango";
+``
+
+Note that it is invalid to add the same symbol multiple times to a symbol table, though you may modify the value associated with a symbol arbitrarily many times.
+
+Now, we may use sym in the grammar. Example:
+
+``
+ fruits = sym >> *(',' >> sym);
+``
+
+Alternatively, symbols may be added dynamically through the member functor `add` (see `[link __symbol_inserter__]` below). The member functor `add` may be attached to a parser as a semantic action taking in a begin/end pair:
+
+``
+ p[sym.add]
+``
+
+where `p` is a parser (and `sym` is a symbol table). On success, the matching portion of the input is added to the symbol table.
+
+`add` may also be used to directly initialize data. Examples:
+
+``
+ sym.add("hello", 1)("crazy", 2)("world", 3);
+``
+
+Assuming of course that the data slot associated with `sym` is an integer.
+
+The data associated with each symbol may be modified any time. The most obvious way of course is through semantic actions. A function or functor, as usual, may be attached to the symbol table. The symbol table expects a function or functor compatible with the signature:
+
+[variablelist
+ [
+ [Signature for functions:]
+ [`void func(T& data);`]
+ ]
+ [
+ [Signature for functors:]
+ [`struct ftor
+ {
+ void operator()(T& data) const;
+ };`]
+ ]
+]
+
+where `T` is the data type of the symbol table (the `T` in its template parameter list). When the symbol table successfully matches something from the input, the data associated with the matching entry in the symbol table is reported to the semantic action.
+
+[h3:utilities Symbol table utilities]
+
+Sometimes, one may wish to deal with the symbol table directly. Provided are some symbol table utilities.
+
+[variablelist
+ [
+ [`add`]
+ [`template <typename T, typename CharT, typename SetT>
+ T* add(symbols<T, CharT, SetT>& table, CharT const* sym, T const& data = T());`
+
+Adds a symbol `sym` (C string) to a symbol table `table`, plus optional data `data` associated with the symbol. Returns a pointer to the data associated with the symbol, or `NULL` if the add failed (e.g. when the symbol has already been added).]
+ ]
+ [
+ [`find`]
+ [`template <typename T, typename CharT, typename SetT>
+ T* find(symbols<T, CharT, SetT> const& table, CharT const* sym);`
+
+Finds a symbol `sym` (C string) in a symbol table `table`. Returns a pointer to the data associated with the symbol, or `NULL` if it is not found.]
+ ]
+]
+
+[h3 `symbol_inserter`]
+
+The `symbols` class holds an instance of this class named `add`. This can be called directly just like a member function, passing in a first/last iterator and optional data:
+
+``
+ sym.add(first, last, data);
+``
+
+Or, passing in a C string and optional data:
+
+``
+ sym.add(c_string, data);
+``
+
+where `sym` is a symbol table. The `data` argument is optional. The nice thing about this scheme is that it can be cascaded. We've seen this applied above. Here's a snippet from the [@../../example/fundamental/roman_numerals.cpp roman numerals parser]:
+
+``
+ // Parse roman numerals (1..9) using the symbol table.
+
+ struct ones : symbols<unsigned>
+ {
+ ones()
+ {
+ add
+ ("I" , 1)
+ ("II" , 2)
+ ("III" , 3)
+ ("IV" , 4)
+ ("V" , 5)
+ ("VI" , 6)
+ ("VII" , 7)
+ ("VIII" , 8)
+ ("IX" , 9)
+ ;
+ }
+
+ } ones_p;
+``
+
+Notice that the user-defined struct `ones` is subclassed from `symbols`. Then, at construction time, we add all the symbols using the `add` `symbol_inserter`.
+
+The full source code can be [@../../example/fundamental/roman_numerals.cpp viewed here]. This is part of the Spirit distribution.
+
+Again, `add` may also be used as a semantic action since it conforms to the action interface (see [link __semantic actions__]):
+
+``
+ p[sym.add]
+``
+
+where `p` is a parser of course.
+
+[endsect][/ symbols]
+

Added: sandbox/boost_docs/branches/spirit_qbking/doc/src/trees.qbk
==============================================================================
--- (empty file)
+++ sandbox/boost_docs/branches/spirit_qbking/doc/src/trees.qbk 2007-10-14 20:56:27 EDT (Sun, 14 Oct 2007)
@@ -0,0 +1,750 @@
+[/
+/ Copyright © 1998-2003 Joel de Guzman
+/ Copyright © 2007 Darren Garvey
+/
+/ Distributed under the Boost Software License, Version 1.0. (See accompanying
+/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+/]
+
+[section Trees]
+
+[section:why Why use parse trees]
+
+Parse trees are an in-memory representation of the input with a structure that conforms to the grammar.
+
+The advantages of using parse trees instead of semantic actions:
+
+* You can make multiple passes over the data without having to re-parse the input.
+* You can perform transformations on the tree.
+* You can evaluate things in any order you want, whereas with attribute schemes you have to process in a begin to end fashion.
+* You do not have to worry about backtracking and action side effects that may occur with an ambiguous grammar.
+
+[h3 Example]
+
+Now that you think you may want to use trees, I'll give an example of how to use them and you can see how easy they are to use. So, following with tradition (and the rest of the documentation) we'll do a calculator. Here's the grammar:
+
+``
+ integer
+ = token_node_d[ (!ch_p('-') >> +digit_p) ]
+ ;
+
+ factor
+ = integer
+ | '(' >> expression >> ')'
+ | ('-' >> factor)
+ ;
+
+ term
+ = factor
+ >> *( ('*' >> factor)
+ | ('/' >> factor)
+ )
+ ;
+
+ expression
+ = term
+ >> *( ('+' >> term)
+ | ('-' >> term)
+ )
+ ;
+``
+
+Now, you'll notice the only thing different in this grammar is the `token_node_d` directive. This causes the match of the integer rule to end up in one node. Without `token_node_d`, each character would get its own node. Further note that `token_node_d` is an implicit lexeme (that means no `lexeme_d` is needed to switch to character level parsing). As you'll soon see, it's easier to convert the input into an `int` when all the characters are in one node. Here is how the parse is done to create a tree:
+
+``
+ tree_parse_info<> info = pt_parse(first, expression);
+``
+
+`pt_parse()` is similar to `parse()`. Of the five overloads listed under Usage, three take a pair of first and last iterators and two take a character string. Three of them take a skipper parser and the other two do not.
+
+The `tree_parse_info` struct contains the same information as a `parse_info` struct as well as one extra data member called `trees`. When the parse finishes, `trees` will contain the parse tree.
+
+Here is how you can use the tree to evaluate the input:
+
+``
+ if (info.full)
+ {
+ cout << "parsing succeeded\n";
+ cout << "result = " << evaluate(info) << "\n\n";
+ }
+``
+
+Now you ask, where did `evaluate()` come from? Is it part of spirit? Unfortunately, no, `evaluate()` is only a part of the sample. Here it is:
+
+``
+ long evaluate(const tree_parse_info<>& info)
+ {
+ return eval_expression(info.trees.begin());
+ }
+``
+
+So here you see `evaluate()` simply calls `eval_expression()` passing the `begin()` iterator of `info.trees`. Now here's the rest of the example:
+
+``
+ // Here are some typedefs to simplify things
+ typedef char const* iterator_t;
+ typedef tree_match<iterator_t> parse_tree_match_t;
+ typedef parse_tree_match_t::const_tree_iterator iter_t;
+
+ // Here are the function prototypes that we'll use. One function for each
+ // grammar rule.
+ long evaluate(const tree_parse_info<>& info);
+ long eval_expression(iter_t const& i);
+ long eval_term(iter_t const& i);
+ long eval_factor(iter_t const& i);
+ long eval_integer(iter_t const& i);
+
+ // i should be pointing to a node created by the expression rule
+ long eval_expression(iter_t const& i)
+ {
+ // first child points to a term, so call eval_term on it
+ iter_t chi = i->children.begin();
+ long lhs = eval_term(chi);
+ for (++chi; chi != i->children.end(); ++chi)
+ {
+ // next node points to the operator. The text of the operator is
+ // stored in value (a vector<char>)
+ char op = *(chi->value.begin());
+ ++chi;
+ long rhs = eval_term(chi);
+ if (op == '+')
+ lhs += rhs;
+ else if (op == '-')
+ lhs -= rhs;
+ else
+ assert(0);
+ }
+ return lhs;
+ }
+
+ long eval_term(iter_t const& i)
+ {
+ // ... see ``[@../../example/fundamental/parse_tree_calc1.cpp parse_tree_calc1.cpp]`` for complete example
+ // (it's rather similar to eval_expression() ) ...
+ }
+
+ long eval_factor(iter_t const& i)
+ {
+ // ... again, see ``[@../../example/fundamental/parse_tree_calc1.cpp parse_tree_calc1.cpp]`` if you want all the details ...
+ }
+
+ long eval_integer(iter_t const& i)
+ {
+ // use the range constructor for a string
+ string integer(i->value.begin(), i->value.end());
+ // convert the string to an integer
+ return strtol(integer.c_str(), 0, 10);
+ }
+``
+
+[tip
+The full source code can be [@../../example/fundamental/parse_tree_calc1.cpp viewed here]. This is part of the Spirit distribution.
+]
+
+So, you like what you see, but maybe you think that the parse tree is too hard to process? With a few more directives you can generate an abstract syntax tree (ast) and cut the amount of evaluation code by at least [*50%]. So without any delay, here's the ast calculator grammar:
+
+``
+ integer
+ = leaf_node_d[ (!ch_p('-') >> +digit_p) ]
+ ;
+
+ factor
+ = integer
+ | inner_node_d[ch_p('(') >> expression >> ch_p(')')]
+ | (root_node_d[ch_p('-')] >> factor)
+ ;
+
+ term
+ = factor
+ >> *( (root_node_d[ch_p('*')] >> factor)
+ | (root_node_d[ch_p('/')] >> factor)
+ )
+ ;
+
+ expression
+ = term
+ >> *( (root_node_d[ch_p('+')] >> term)
+ | (root_node_d[ch_p('-')] >> term)
+ )
+ ;
+``
+
+The differences from the parse tree grammar are the `leaf_node_d`, `inner_node_d` and `root_node_d` directives. The `inner_node_d` directive causes the first and last nodes generated by the enclosed parser to be discarded, since we don't really care about the parentheses when evaluating the expression. The `root_node_d` directive is the key to ast generation. A node that is generated by the parser inside of `root_node_d` is marked as a root node. When a root node is created, it becomes a root or parent node of the other nodes generated by the same rule.
+
+To start the parse and generate the ast, you must use the `ast_parse` functions, which are similar to the `pt_parse` functions.
+
+``
+ tree_parse_info<> info = ast_parse(first, expression);
+``
+
+Here is the `eval_expression` function (note that to process the ast we only need one function instead of four):
+
+``
+ long eval_expression(iter_t const& i)
+ {
+ if (i->value.id() == parser_id(&integer))
+ {
+ // convert string to integer
+ string integer(i->value.begin(), i->value.end());
+ return strtol(integer.c_str(), 0, 10);
+ }
+ else if (i->value.id() == parser_id(&factor))
+ {
+ // factor can only be unary minus
+ return - eval_expression(i->children.begin());
+ }
+ else if (i->value.id() == parser_id(&term))
+ {
+ if (*i->value.begin() == '*')
+ {
+ return eval_expression(i->children.begin()) *
+ eval_expression(i->children.begin()+1);
+ }
+ else if (*i->value.begin() == '/')
+ {
+ return eval_expression(i->children.begin()) /
+ eval_expression(i->children.begin()+1);
+ }
+ }
+ else if (i->value.id() == parser_id(&expression))
+ {
+ if (*i->value.begin() == '+')
+ {
+ return eval_expression(i->children.begin()) +
+ eval_expression(i->children.begin()+1);
+ }
+ else if (*i->value.begin() == '-')
+ {
+ return eval_expression(i->children.begin()) -
+ eval_expression(i->children.begin()+1);
+ }
+ }
+
+ return 0;
+ }
+``
+
+[tip
+An entire working example is [@../../example/fundamental/ast_calc.cpp ast_calc.cpp]. Hopefully this example has been enough to whet your appetite for trees. For more nitty-gritty details, keep on reading the rest of this chapter.
+]
+
+[endsect][/ why]
+
+[section Usage]
+
+[section `pt_parse`]
+
+To create a parse tree, you can call one of the five free functions:
+
+``
+ template <typename FactoryT, typename IteratorT, typename ParserT, typename SkipT>
+ tree_parse_info<IteratorT, FactoryT>
+ pt_parse(
+ IteratorT const& first_,
+ IteratorT const& last_,
+ parser<ParserT> const& parser,
+ SkipT const& skip_,
+ FactoryT const & factory_ = FactoryT());
+ template <typename IteratorT, typename ParserT, typename SkipT>
+ tree_parse_info<IteratorT>
+ pt_parse(
+ IteratorT const& first_,
+ IteratorT const& last_,
+ parser<ParserT> const& parser,
+ SkipT const& skip_);
+ template <typename IteratorT, typename ParserT>
+ tree_parse_info<IteratorT>
+ pt_parse(
+ IteratorT const& first_,
+ IteratorT const& last_,
+ parser<ParserT> const& parser);
+ template <typename CharT, typename ParserT, typename SkipT>
+ tree_parse_info<CharT const*>
+ pt_parse(
+ CharT const* str,
+ parser<ParserT> const& parser,
+ SkipT const& skip);
+ template <typename CharT, typename ParserT>
+ tree_parse_info<CharT const*>
+ pt_parse(
+ CharT const* str,
+ parser<ParserT> const& parser);
+``
+
+[endsect][/ pt_parse]
+
+[section `ast_parse`]
+
+To create an abstract syntax tree (ast for short) you call one of the five free functions:
+
+``
+ template <typename FactoryT, typename IteratorT, typename ParserT, typename SkipT>
+ tree_parse_info<IteratorT, FactoryT>
+ ast_parse(
+ IteratorT const& first_,
+ IteratorT const& last_,
+ parser<ParserT> const& parser,
+ SkipT const& skip_,
+ FactoryT const & factory_ = FactoryT());
+ template <typename IteratorT, typename ParserT, typename SkipT>
+ tree_parse_info<IteratorT>
+ ast_parse(
+ IteratorT const& first_,
+ IteratorT const& last_,
+ parser<ParserT> const& parser,
+ SkipT const& skip_);
+ template <typename IteratorT, typename ParserT>
+ tree_parse_info<IteratorT>
+ ast_parse(
+ IteratorT const& first_,
+ IteratorT const& last_,
+ parser<ParserT> const& parser);
+ template <typename CharT, typename ParserT, typename SkipT>
+ tree_parse_info<CharT const*>
+ ast_parse(
+ CharT const* str,
+ parser<ParserT> const& parser,
+ SkipT const& skip);
+ template <typename CharT, typename ParserT>
+ tree_parse_info<CharT const*>
+ ast_parse(
+ CharT const* str,
+ parser<ParserT> const& parser);
+``
+
+[endsect][/ ast_parse]
+
+[section `tree_parse_info`]
+
+The `tree_parse_info` struct returned from `pt_parse` and `ast_parse` contains information about the parse:
+
+``
+ template <typename IteratorT = char const*>
+ struct tree_parse_info
+ {
+ IteratorT stop;
+ bool match;
+ bool full;
+ std::size_t length;
+
+ typename tree_match<IteratorT>::container_t trees;
+ };
+``
+
+[table `tree_parse_info`
+ [[`stop`] [points to the final parse position (i.e. parsing processed the input up to this point).]]
+
+ [[`match`] [`true` if parsing is successful. This may be full (the parser consumed all the input), or partial (the parser consumed only a portion of the input).]]
+
+ [[`full`] [`true` when we have a full match (when the parser consumed all the input).]]
+
+ [[`length`] [The number of characters consumed by the parser. This is valid only if we have a successful match (either partial or full).]]
+
+ [[`trees`] [Contains the root node(s) of the tree.]]
+]
+
+[endsect][/ tree_parse_info]
+
+[section `tree_match`]
+
+When Spirit is generating a tree, the parser's `parse()` member function will return a `tree_match` object instead of a `match` object. `tree_match` has three template parameters. The first is the `Iterator` type, which defaults to `char const*`. The second is the node factory, which defaults to `node_val_data_factory`. The third is the attribute type stored in the match. A `tree_match` has a member variable named `trees`, which is a container (a `std::vector`) of `tree_node` objects. For efficiency reasons, when a `tree_match` is copied, the trees are not copied; they are moved to the new object, and the source object is left with an empty tree container. `tree_match` supports the same interface as the `match` class: it has an `operator bool()` so you can test it for a successful match (`if (matched)`), and you can query the match length via the `length()` function. The class has this interface:
+
+``
+ template <typename IteratorT = char const*, typename NodeFactoryT = node_val_data_factory<> >
+ struct tree_match
+ {
+ typedef typename NodeFactoryT::template factory<IteratorT> node_factory_t;
+ typedef typename node_factory_t::node_t parse_node_t;
+ typedef tree_node<parse_node_t> node_t;
+ typedef typename node_t::children_t container_t;
+ typedef typename container_t::iterator tree_iterator;
+ typedef typename container_t::const_iterator const_tree_iterator;
+
+ tree_match();
+ tree_match(std::size_t length, parse_node_t const& n);
+ tree_match(tree_match const& x);
+ explicit tree_match(match const& x);
+ tree_match& operator=(tree_match const& x);
+ void swap(tree_match& x);
+ operator bool() const;
+ int length() const;
+
+ container_t trees;
+ };
+``
+
+When a parse has successfully completed, the `trees` data member will contain the root node of the tree.
+
+[note vector?
+
+You may wonder, why is it a vector then? The answer is that it is partly for implementation purposes, and also if you do not use any rules in your grammar, then trees will contain a sequence of nodes that will not have any children.
+]
+
+Having Spirit create a tree is similar to how a normal parse is done:
+
+``
+ tree_match<> hit = expression.parse(tree_scanner);
+
+ if (hit)
+ process_tree_root(hit.trees[0]); // do something with the tree
+``
+
+[endsect][/ tree_match]
+
+[section `tree_node`]
+
+Once you have created a tree by calling `pt_parse` or `ast_parse`, you have a `tree_parse_info` which contains the root node of the tree, and now you need to do something with the tree. The data member trees of `tree_parse_info` is a `std::vector<tree_node>`. `tree_node` provides the tree structure. The class has one template parameter named `T`. `tree_node` contains an instance of type `T`. It also contains a `std::vector<tree_node<T> >` which are the node's children. The class looks like this:
+
+``
+ template <typename T>
+ struct tree_node
+ {
+ typedef T parse_node_t;
+ typedef std::vector<tree_node<T> > children_t;
+ typedef typename children_t::iterator tree_iterator;
+ typedef typename children_t::const_iterator const_tree_iterator;
+
+ T value;
+ children_t children;
+
+ tree_node();
+ explicit tree_node(T const& v);
+ tree_node(T const& v, children_t const& c);
+ void swap(tree_node<T>& x);
+ };
+``
+
+This class is simply used to separate the tree framework from the data stored in the tree. It is a generic node: any type can be stored inside it and accessed via the data member `value`. The default type for `T` is `node_val_data`.
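As a rough illustration of how such a generic node is walked, here is a standalone sketch in plain C++. The `tree_node` below is a cut-down replica of the shape shown above, not the Boost header, and it stores a `std::string` instead of `node_val_data`; the helper names `count_leaves` and `collect_leaf_text` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Cut-down replica of the tree_node shape (not the Boost header).
template <typename T>
struct tree_node
{
    typedef std::vector<tree_node<T> > children_t;
    typedef typename children_t::const_iterator const_tree_iterator;

    T value;
    children_t children;

    explicit tree_node(T const& v) : value(v) {}
};

// Counts the nodes that have no children.
template <typename T>
std::size_t count_leaves(tree_node<T> const& n)
{
    if (n.children.empty())
        return 1;
    std::size_t total = 0;
    for (typename tree_node<T>::const_tree_iterator it = n.children.begin();
         it != n.children.end(); ++it)
        total += count_leaves(*it);
    return total;
}

// Appends the text of every leaf to `out`, left to right.
inline void collect_leaf_text(tree_node<std::string> const& n, std::string& out)
{
    if (n.children.empty())
    {
        out += n.value;
        return;
    }
    for (tree_node<std::string>::const_tree_iterator it = n.children.begin();
         it != n.children.end(); ++it)
        collect_leaf_text(*it, out);
}
```

The traversal code only touches `value` and `children`, which is the point of keeping the tree framework separate from the stored data.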
+
+[endsect][/ tree_node]
+
+[section `node_val_data`]
+
+The `node_val_data` class contains the actual information about each node. This includes the text or token sequence that was parsed, an id that indicates which rule created the node, a boolean flag that indicates whether the node was marked as a root node, and an optional user-specified value. This is the interface:
+
+``
+ template <typename IteratorT = char const*, typename ValueT = nil_t>
+ struct node_val_data
+ {
+ typedef typename std::iterator_traits<IteratorT>::value_type value_type;
+ typedef std::vector<value_type> container_t;
+ typedef typename container_t::iterator iterator_t;
+ typedef typename container_t::const_iterator const_iterator_t;
+
+ node_val_data();
+ node_val_data(IteratorT const& _first, IteratorT const& _last);
+ template <typename IteratorT2>
+ node_val_data(IteratorT2 const& _first, IteratorT2 const& _last);
+ void swap(node_val_data& x);
+
+ container_t::iterator begin();
+ container_t::const_iterator begin() const;
+ container_t::iterator end();
+ container_t::const_iterator end() const;
+
+ bool is_root() const;
+ void is_root(bool b);
+
+ parser_id id() const;
+ void id(parser_id r);
+
+ ValueT const& value() const;
+ void value(ValueT const& v);
+ };
+``
+
+[endsect][/ node_val_data]
+
+[section:parser_id `parser_id`, checking and setting]
+
+If a node is generated by a rule, it will have an id set. Each rule has an id that it sets on all nodes generated by that rule. The id is of type `parser_id`. The default id of each rule is set to the address of that rule (converted to an integer). This is not always the most convenient, since the code that processes the tree may not have access to the rules, and may not be able to compare addresses. So, you can override the default id used by each rule by giving it a specific ID. Then, when processing the tree you can call `node_val_data::id()` to see which rule created that node.
+
+[endsect][/ parser_id]
+
+[section structure/layout of a parse tree]
+
+[section parse tree layout]
+
+The tree is organized by the rules. Each rule creates a new level in the tree. All parsers attached to a rule create a node when a successful match is made. These nodes become children of a node created by the rule. Consider the following code:
+
+``
+ rule_t myrule = ch_p('a') >> ',' >> *ch_p('b');
+ char const* input = "a,bb";
+ scanner_t scanner(input, input + strlen(input));
+ tree_match<> m = myrule.parse(scanner);
+``
+
+When executed, this code would return a `tree_match` whose `m.trees[0]` would contain a tree like this:
+
+[$../theme/trees1.png]
+
+The root node would not contain any text, and its id would be set to the address of `myrule`. It would have four children. Each child's id would be set to the address of `myrule`, would contain the text as shown in the diagram, and would have no children.
+
+[endsect]
+
+[section ast layout]
+
+When calling `ast_parse`, the tree gets generated differently. It mostly works the same as when generating a parse tree. The difference arises when a rule generates only one sub-node. In that case `ast_parse` does not create a new level; it just leaves the existing node. So, this code:
+
+``
+ rule_t myrule = ch_p('a');
+ char const* input = "a";
+ ast_scanner_t scanner(input, input+strlen(input));
+ tree_match<> m = myrule.parse(scanner);
+``
+
+will generate a single node that contains `'a'`. If `tree_match_policy` had been used instead of `ast_match_policy`, the tree would have looked like this:
+
+[$../theme/trees2.png]
+
+`ast_match_policy` has the effect of eliminating intermediate rule levels which are simply pass-through rules. This is not enough to generate abstract syntax trees; `root_node_d` is also needed. `root_node_d` will be explained later.
+
+[endsect][/ ast layout]
+
+[section:switching switching: `gen_pt_node_d[]` & `gen_ast_node_d[]`]
+
+If you want to mix and match the parse tree and ast behaviors in your application, you can use the `gen_pt_node_d[]` and `gen_ast_node_d[]` directives. When parsing passes through the `gen_pt_node_d` directive, parse tree creation behavior is turned on. When the `gen_ast_node_d` directive is used, the enclosed parser will generate a tree using the ast behavior. Note that you must pay attention to how your rules are declared if you use a rule inside of these directives. The match policy of the scanner will have to correspond to the desired behavior. If you avoid rules and use primitive parsers or grammars, then you will not have problems.
+
+[endsect][/ switching]
+
+[section Directives]
+
+There are a few more directives that can be used to control the generation of trees. These directives only affect tree generation. Otherwise, they have no effect.
+
+[h3 `no_node_d`]
+
+This directive is similar to `gen_pt_node_d` and `gen_ast_node_d`, in that it modifies the scanner's match policy used by the enclosed parser. As its name implies, it does no tree generation; it turns it off completely. This is useful if there are parts of your grammar which are not needed in the tree, for instance keywords and operators (`*, -, &&`, etc.). By eliminating these from the tree, both memory usage and parsing time can be lowered. This directive has the same requirements with respect to rules as `gen_pt_node_d` and `gen_ast_node_d` do. See the [*example file] [@../../example/application/xml xml_grammar.hpp] (in libs/spirit/example/application/xml directory) for example usage of `no_node_d[]`.
+
+[h3 `discard_node_d`]
+
+This directive has a similar purpose to `no_node_d`, but works differently. It does not switch the scanner's match policy, so the enclosed parser still generates nodes. The generated nodes are discarded and do not appear in the tree. Using `discard_node_d` is slower than `no_node_d`, but it does not suffer from the drawback of having to specify a different rule type for any rule inside it.
+
+[h3 `leaf_node_d/token_node_d`]
+
+Both `leaf_node_d` and `token_node_d` work the same. They create a single node for the match generated by the enclosed parser. Unlike with earlier versions of Spirit, this directive is an implicit lexeme and alters the scanner (see [link __scanner_business__]).
+
+[h3 `reduced_node_d`]
+
+This directive groups together all the nodes generated by the enclosed parser. For earlier versions of Spirit `leaf_node_d` and `token_node_d` were implemented this way. The new implementation of those directives is a lot faster, so `reduced_node_d` is primarily provided for portability and can be useful when using a custom node factory (see [link ___advanced tree generation___], below).
+
+[h3 `infix_node_d`]
+
+This is useful for removing separators from lists. It discards all the nodes in even positions. Thus this rule:
+
+``
+ rule_t intlist = infix_node_d[ integer >> *(',' >> integer) ];
+``
+
+would discard all the comma nodes and keep all the integer nodes.
+
+[h3 `discard_first_node_d`]
+
+This discards the first node generated.
+
+[h3 `discard_last_node_d`]
+
+This discards the last node generated.
+
+[h3 `inner_node_d`]
+
+This discards the first and last node generated.
+
+[section:ast_generation `root_node_d` and ast generation]
+
+The `root_node_d` directive is used to help out ast generation. It has no effect when generating a parse tree. When a parser is enclosed in `root_node_d`, the node it generates is marked as a root. This affects the way it is treated when it's added to the list of nodes being generated. Here's an example:
+
+``
+ rule_t add = integer >> *(root_node_d[ ch_p('+') ] >> integer);
+``
+
+When parsing `5+6` the following tree will be generated:
+
+[$../theme/trees3.png]
+
+When parsing `1+2+3` the following will be generated:
+
+[$../theme/trees4.png]
+
+When a new node is created the following rules are used to determine how the tree will be generated:
+
+[pre
+ Let a be the previously generated node.
+ Let b be the new node.
+
+ If b is a root node then
+
+ b's children become a + b's previous children.
+ a is the new first child of b.
+
+ else if a is a root node and b is not, then
+
+ b becomes the last child of a.
+
+ else
+
+ a and b become siblings.
+]
+
+After parsing leaves the current rule, the root node flag on the top node is turned off. This means that the `root_node_d` directive only affects the current rule.
+
+[tip
+The example [@../../example/fundamental/ast_calc.cpp ast_calc.cpp], which is part of the Spirit distribution, demonstrates the use of `root_node_d` and `ast_parse`. The full source code can be [@../../example/fundamental/ast_calc.cpp viewed here].
+]
+
+[endsect][/ ast_generation]
+
+[section `parse_tree_iterator`]
+
+The `parse_tree_iterator` class allows you to parse a tree using Spirit. The class iterates over the tokens in the leaf nodes in the same order in which they were created. The `parse_tree_iterator` is templated on `ParseTreeMatchT`. It is constructed with a container of trees and a position at which to start. Here is an example usage:
+
+``
+ rule_t myrule = ch_p('a');
+ char const* input = "a";
+
+ // generate parse tree
+ tree_parse_info<> i = pt_parse(input, myrule);
+
+ typedef parse_tree_iterator<tree_match<> > parse_tree_iterator_t;
+
+ // create a first and last iterator to work off the tree
+ parse_tree_iterator_t first(i.trees, i.trees.begin());
+ parse_tree_iterator_t last;
+
+ // parse the tree
+ rule<parse_tree_iterator_t> tree_parser =...
+ tree_parser.parse(first, last);
+``
+
+[endsect][/ parse_tree_iterator]
+
+[section:advanced advanced tree generation]
+
+[h3 node value]
+
+The `node_val_data` can contain a value. By default it contains a `void_t`, which is an empty class. You can specify the type using a template parameter; a value of that type is then stored in every node. The type must be default constructible and assignable. You can get and set the value using
+
+``
+ ValueT const& node_val_data::value() const;
+``
+
+and
+
+``
+ void node_val_data::value(ValueT const& value);
+``
+
+To specify the value type, you must use a different `node_val_data` factory than the default. The following example shows how to modify the factory to store and retrieve a double inside each `node_val_data`.
+
+``
+ typedef node_val_data_factory<double> factory_t;
+ my_grammar gram;
+ my_skip_grammar skip;
+ tree_parse_info<iterator_t, factory_t> i =
+ ast_parse<factory_t>(first, last, gram, skip);
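+ // access the double in the root node: the tree node's `value` member
+ // is the node_val_data, whose value() accessor returns the stored double
+ double d = i.trees.begin()->value.value();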
+``
+
+[h3 `access_node_d`]
+
+Now, you may be wondering, "What good does it do to have a value I can store in each node, but not to have any way of setting it?" Well, that's what `access_node_d` is for. `access_node_d` is a directive. It allows you to attach an action to it, like this:
+
+``
+ access_node_d[...some parsers...][my_action()]
+``
+
+The attached action will be passed three parameters: a reference to the root node of the tree generated by the parser, and the current first and last iterators. The action can set the value stored in the node.
+
+[section:factories Tree node factories]
+
+By setting the factory, you can control what type of nodes are created and how they are created. There are 3 predefined factories: `node_val_data_factory`, `node_all_val_data_factory`, and `node_iter_data_factory`. You can also create your own factory to support your own node types.
+
+Using factories with grammars is quite easy: you just need to specify the factory type as an explicit template parameter to the free `ast_parse` function:
+
+``
+ typedef node_iter_data_factory<int> factory_t;
+ my_grammar gram;
+ my_skip_grammar skip;
+ tree_parse_info<iterator_t, factory_t> i =
+ ast_parse<factory_t>(first, last, gram, skip);
+``
+
+Using the factory directly with rules is slightly harder, because the factory is a template parameter to the scanner match policy, so you must use a custom scanner:
+
+``
+ typedef spirit::void_t value_t;
+ typedef node_val_data_factory<value_t> factory_t;
+ typedef tree_match<iterator_t, factory_t> match_t;
+ typedef ast_match_policy<iterator_t, factory_t> match_policy_t;
+ typedef scanner<iterator_t, scanner_policies<iter_policy_t, match_policy_t> > scanner_t;
+ typedef rule<scanner_t> rule_t;
+
+ rule_t r =...;
+
+ scanner_t scan = scanner_t(first, last);
+ match_t hit = r.parse(scan);
+``
+
+[h3 `node_val_data_factory`]
+
+This is the default factory. It creates `node_val_data` nodes. Leaf nodes contain a copy of the matched text, and intermediate nodes don't. `node_val_data_factory` has one template parameter: `ValueT`. `ValueT` specifies the type of value that will be stored in the `node_val_data`.
+
+[h3 `node_all_val_data_factory`]
+
+This factory also creates `node_val_data`. The difference between it and `node_val_data_factory` is that every node contains all the text that spans it. This means that the root node will contain a copy of the entire parsed input sequence. `node_all_val_data_factory` has one template parameter: `ValueT`. `ValueT` specifies the type of value that will be stored in the `node_val_data`.
+
+[h3 `node_iter_data_factory`]
+
+This factory creates the `parse_tree_iter_node`. This node stores iterators into the input sequence instead of making a copy. It can use a lot less memory. However, the input sequence must stay valid for the life of the tree, and it's not a good idea to use the `multi_pass` iterator with this type of node. All levels of the tree will contain a begin and end iterator. `node_iter_data_factory` has one template parameter: `ValueT`. `ValueT` specifies the type of value that will be stored in the `node_val_data`.
+
+[h3 custom]
+
+You can create your own factory. It should look like this:
+
+``
+ class my_factory
+ {
+ public:
+
+ // This inner class is so that the factory can simulate
+ // a template template parameter
+
+ template <typename IteratorT>
+ class factory
+ {
+ public:
+
+ // This is your node type
+ typedef my_node_type node_t;
+
+ static node_t create_node(
+ IteratorT const& first, IteratorT const& last, bool is_leaf_node)
+ {
+ // create a node and return it.
+ }
+
+ // This function is used by the reduced_node directive.
+ // If you don't use it, then you can leave this function
+ // unimplemented.
+
+ template <typename ContainerT>
+ static node_t group_nodes(ContainerT const& nodes)
+ {
+ // Group all the nodes into one and return it.
+ }
+ };
+ };
+
+ // Typedefs to use my_factory
+ typedef my_factory factory_t;
+ typedef tree_match<iterator_t, factory_t> match_t;
+ typedef tree_match_policy<iterator_t, factory_t> match_policy_t;
+
+ // Typedefs if you are using rules instead of grammars
+ typedef scanner<iterator_t, scanner_policies<iter_policy_t, match_policy_t> > scanner_t;
+ typedef rule<scanner_t> rule_t;
+``
+
+[endsect][/ factories]
+
+[endsect][/ advanced]
+
+[endsect][/ trees]

