Boost-Commit :

Subject: [Boost-commit] svn:boost r54082 - in trunk/libs/spirit/doc: . karma lex
From: hartmut.kaiser_at_[hidden]
Date: 2009-06-18 22:35:09


Author: hkaiser
Date: 2009-06-18 22:35:08 EDT (Thu, 18 Jun 2009)
New Revision: 54082
URL: http://svn.boost.org/trac/boost/changeset/54082

Log:
Some work on Lexer docs
Text files modified:
   trunk/libs/spirit/doc/karma/quick_reference.qbk | 32 ++++----
   trunk/libs/spirit/doc/lex/lexer_quickstart2.qbk | 56 --------------
   trunk/libs/spirit/doc/lex/lexer_semantic_actions.qbk | 150 ++++++++++++++++++++++++++++++++++++++++
   trunk/libs/spirit/doc/spirit2.qbk | 1
   4 files changed, 169 insertions(+), 70 deletions(-)

Modified: trunk/libs/spirit/doc/karma/quick_reference.qbk
==============================================================================
--- trunk/libs/spirit/doc/karma/quick_reference.qbk (original)
+++ trunk/libs/spirit/doc/karma/quick_reference.qbk 2009-06-18 22:35:08 EDT (Thu, 18 Jun 2009)
@@ -150,44 +150,44 @@
 of the attribute of `a`, and `B` is the type of the attribute of `b`, then the
 type of the attribute of `a >> b` will be `tuple<A, B>`.
 
-[table /Spirit.Karma/ compound generator attribute types
+[table Spirit.Karma compound generator attribute types
     [[Expression] [Attribute]]
-
- [[sequence (`<<`)]
+
+ [[sequence (`<<`)]
 [``a: A, b: B --> (a << b): tuple<A, B>
 a: A, b: Unused --> (a << b): A
 a: Unused, b: B --> (a << b): B
 a: Unused, b: Unused --> (a << b): Unused
 a: A, b: A --> (a << b): vector<A>``
 ]]
-
- [[alternative (`|`)]
+
+ [[alternative (`|`)]
 [``a: A, b: B --> (a | b): variant<A, B>
 a: A, b: Unused --> (a | b): variant<Unused, A>
 a: Unused, b: B --> (a | b): variant<Unused, B>
 a: Unused, b: Unused --> (a | b): Unused``
 a: A, b: A --> (a | b): A`]]
-
- [[kleene (`*`)]
+
+ [[kleene (`*`)]
 [``a: A --> *a: vector<A>
 a: Unused --> a: Unused``]]
-
- [[plus (`+`)]
+
+ [[plus (`+`)]
 [``a: A --> +a: vector<A>
 a: Unused --> a: Unused``]]
-
- [[list (`%`)]
+
+ [[list (`%`)]
 [``a: A, b: B --> (a % b): vector<A>
 a: Unused, b: B --> (a % b): Unused``]]
-
- [[repetition]
+
+ [[repetition]
 [``a: A --> repeat(...,...)[a]: vector<A>
 a: Unused --> repeat(...,...)[a]: Unused``]]
-
- [[optional (`-`)]
+
+ [[optional (`-`)]
 [``a: A --> -a: optional<A>
 a: Unused --> -a: Unused``]]
-
+
     [[and predicate (`&`)] [`a: A --> &a: Unused`]]
     [[not predicate (`!`)] [`a: A --> !a: Unused`]]
 ]

Modified: trunk/libs/spirit/doc/lex/lexer_quickstart2.qbk
==============================================================================
--- trunk/libs/spirit/doc/lex/lexer_quickstart2.qbk (original)
+++ trunk/libs/spirit/doc/lex/lexer_quickstart2.qbk 2009-06-18 22:35:08 EDT (Thu, 18 Jun 2009)
@@ -60,60 +60,8 @@
 associated with a token definition gets executed after the recognition of a
 matching input sequence. The code above uses function objects constructed using
 __phoenix2__, but it is possible to insert any C++ function or function object
-as long as it exposes the interface:
-
- void f (Iterator& start, Iterator& end, pass_flag& matched, Idtype& id, Context& ctx);
-
-[variablelist where:
- [[`Iterator& start`] [This is a the iterator pointing to the begin of the
- matched range in the underlying input sequence. The
- type of the iterator is the same as specified while
- defining the type of the `lexertl_lexer<...>`
- (its first template parameter). The semantic action
- is allowed to change the value of this iterator
- influencing, the matched input sequence.]]
- [[`Iterator& end`] [This is a the iterator pointing to the end of the
- matched range in the underlying input sequence. The
- type of the iterator is the same as specified while
- defining the type of the `lexertl_lexer<...>`
- (its first template parameter). The semantic action
- is allowed to change the value of this iterator
- influencing, the matched input sequence.]]
- [[`pass_flag& matched`] [This value is pre/initialized to `pass_normal`.
- If the semantic action sets it to `pass_fail` the
- behaves as if the token has not been matched in
- the first place. If the semantic action sets this
- to `pass_ignore` the lexer ignores the current
- token and tries to match a next token from the
- input.]]
- [[`Idtype& id`] [This is the token id of the type Idtype (most of
- the time this will be a `std::size_t`) for the
- matched token. The semantic action is allowed to
- change the value of this token id, influencing the
- if of the created token.]]
- [[`Context& ctx`] [This is a reference to a lexer specific,
- unspecified type, providing the context for the
- current lexer state. It can be used to access
- different internal data items and is needed for
- lexer state control from inside a semantic
- action.]]
-]
-
-When using a C++ function as the semantic action the following prototypes are
-allowed as well:
-
- void f (Iterator& start, Iterator& end, pass_flag& matched, Idtype& id);
- void f (Iterator& start, Iterator& end, pass_flag& matched);
- void f (Iterator& start, Iterator& end);
- void f ();
-
-Even if it is possible to write your own function object implementations (i.e.
-using Boost.Lambda or Boost.Bind), the preferred way of defining lexer semantic
-actions is to use __phoenix2__. In this case you can access the four parameters
-described in the table above by using the predefined __spirit__ placeholders:
-`_1` for the iterator range, `_2` for the token id, `_3` for the reference
-to the boolean value signaling the outcome of the semantic action, and `_4` for
-the reference to the internal lexer context.
+as long as it exposes the proper interface. For more details please refer
+to the section __sec_lex_semactions__.
 
 [heading Associating Token Definitions with the Lexer]
 

Modified: trunk/libs/spirit/doc/lex/lexer_semantic_actions.qbk
==============================================================================
--- trunk/libs/spirit/doc/lex/lexer_semantic_actions.qbk (original)
+++ trunk/libs/spirit/doc/lex/lexer_semantic_actions.qbk 2009-06-18 22:35:08 EDT (Thu, 18 Jun 2009)
@@ -7,4 +7,154 @@
 ===============================================================================/]
 
 [section:lexer_semantic_actions Lexer Semantic Actions]
+
+The main task of a lexer normally is to recognize tokens in the input.
+Traditionally this has been complemented with the possibility to execute
+arbitrary code whenever a certain token has been detected. __lex__ has been
+designed to support this mode of operation as well. We borrow from the concept
+of semantic actions for parsers (__qi__) and generators (__karma__). Lexer
+semantic actions may be attached to any token definition. These are C++
+functions or function objects that are called whenever a token definition
+successfully recognizes a portion of the input. Say you have a token definition
+`D`, and a C++ function `f`, you can make the lexer call `f` whenever it matches
+an input by attaching `f`:
+
+ D[f]
+
+The expression above links `f` to the token definition, `D`. The required
+prototype of `f` is:
+
+ void f (Iterator& start, Iterator& end, pass_flag& matched, Idtype& id, Context& ctx);
+
+[variablelist where:
+ [[`Iterator& start`] [This is the iterator pointing to the beginning of the
+ matched range in the underlying input sequence. The
+ type of the iterator is the same as specified while
+ defining the type of the `lexertl_lexer<...>`
+ (its first template parameter). The semantic action
+ is allowed to change the value of this iterator,
+ influencing the matched input sequence.]]
+ [[`Iterator& end`] [This is the iterator pointing to the end of the
+ matched range in the underlying input sequence. The
+ type of the iterator is the same as specified while
+ defining the type of the `lexertl_lexer<...>`
+ (its first template parameter). The semantic action
+ is allowed to change the value of this iterator,
+ influencing the matched input sequence.]]
+ [[`pass_flag& matched`] [This value is pre-initialized to `pass_normal`.
+ If the semantic action sets it to `pass_fail` the
+ lexer behaves as if the token has not been matched in
+ the first place. If the semantic action sets this
+ to `pass_ignore` the lexer ignores the current
+ token and tries to match the next token from the
+ input.]]
+ [[`Idtype& id`] [This is the token id of the type Idtype (most of
+ the time this will be a `std::size_t`) for the
+ matched token. The semantic action is allowed to
+ change the value of this token id, influencing the
+ id of the created token.]]
+ [[`Context& ctx`] [This is a reference to a lexer specific,
+ unspecified type, providing the context for the
+ current lexer state. It can be used to access
+ different internal data items and is needed for
+ lexer state control from inside a semantic
+ action.]]
+]
+
+When using a C++ function as the semantic action the following prototypes are
+allowed as well:
+
+ void f (Iterator& start, Iterator& end, pass_flag& matched, Idtype& id);
+ void f (Iterator& start, Iterator& end, pass_flag& matched);
+ void f (Iterator& start, Iterator& end);
+ void f ();
+
+[heading The context of a lexer semantic action]
+
+The last parameter passed to any lexer semantic action is a reference to an
+unspecified type (see the `Context` type in the table above). This type is
+unspecified because it depends on and is implemented by the token type returned
+by the lexer. Nevertheless, any context type is expected to expose a couple of
+functions that allow influencing the behavior of the lexer. The following table
+gives an overview and a short description of the available functionality.
+
+[table Functions exposed by any context passed to a lexer semantic action
+ [[Name] [Description]]
+ [[`Iterator const& get_eoi() const`]
+ [The function `get_eoi()` may be used to access the end iterator of
+ the input stream the lexer has been initialized with.]
+ ]
+ [[`Iterator const& less(Iterator const& it, int n) `]
+ [The function `less()` returns an iterator positioned to the nth input
+ character beyond the current start iterator (i.e. by passing the return
+ value to the parameter `end` it is possible to return all but the
+ first n characters of the current token back to the input stream).]
+ ]
+ [[`void more()`]
+ [The function `more()` tells the lexer that the next time it matches a
+ rule, the corresponding token should be appended onto the current token
+ value rather than replacing it.]
+ ]
+ [[`bool lookahead(std::size_t id)`]
+ [The function `lookahead()` can be used, for instance, to implement
+ lookahead for lexer engines not supporting constructs like flex' `a/b`
+ (match `a`, but only when followed by `b`). It invokes the lexer on the
+ input following the current token without actually moving forward in the
+ input stream. The function returns whether the lexer was able to match a
+ token with the given token-id `id`.]
+ ]
+ [[`std::size_t get_state() const` and `void set_state(std::size_t state)`]
+ [The functions `get_state()` and `set_state()` may be used to introspect
+ and change the current lexer state.]
+ ]
+]
+
+[heading Lexer Semantic Actions Using Phoenix]
+
+Even though it is possible to write your own function object implementations (e.g.
+using Boost.Lambda or Boost.Bind), the preferred way of defining lexer semantic
+actions is to use __phoenix2__. In this case you can access the parameters
+described above by using the predefined __spirit__ placeholders:
+
+[table Predefined Phoenix placeholders for lexer semantic actions
+ [[Placeholder] [Description]]
+ [[`_start`] [Refers to the iterator pointing to the begin of the
+ matched input sequence. Any modifications to this
+ iterator value will be reflected in the generated
+ token.]]
+ [[`_end`] [Refers to the iterator pointing past the end of the
+ matched input sequence. Any modifications to this
+ iterator value will be reflected in the generated
+ token.]]
+ [[`_pass`] [References the value signaling the outcome of the
+ semantic action. This is pre-initialized to
+ `lex::pass_flags::pass_normal`. If this is set to
+ `lex::pass_flags::pass_fail`, the lexer will behave as
+ if no token has been matched; if it is set to
+ `lex::pass_flags::pass_ignore`, the lexer will ignore
+ the current match and proceed trying to match tokens
+ from the input.]]
+ [[`_tokenid`] [Refers to the token id of the token to be generated. Any
+ modifications to this value will be reflected in the
+ generated token.]]
+ [[`_state`] [Refers to the lexer state the input has been matched in.
+ Any modifications to this value will be reflected in the
+ lexer itself (the next match will start in the new
+ state). The currently generated token is not affected
+ by changes to this variable.]]
+ [[`_eoi`] [References the end iterator of the overall lexer input.
+ This value cannot be changed.]]
+]
+
+[heading Support functions callable from semantic actions]
+
+[table Support functions
+ [[Plain function] [Phoenix function] [Description]]
+
+[[`ctx.more()`][`lex::more()`][]]
+[[`ctx.less()`][`lex::less()`][]]
+[[`ctx.lookahead()`][`lex::lookahead()`][]]
+
+]
+
 [endsect]
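
Reviewer note (not part of the changeset): the five-parameter prototype documented in lexer_semantic_actions.qbk above may be easier to follow with a small model. The sketch below is a toy that uses only the standard library. `pass_flag`, `Iterator`, `Idtype` and `Context` are stand-ins that merely reuse the names from the documentation. `classify`, `run_action`, `Result` and the two token ids are hypothetical names introduced for illustration. It is not how __lex__ invokes actions internally; it only shows what an action with this signature is allowed to change.

```cpp
#include <cstddef>
#include <string>

// Toy stand-ins that only mirror the names used in the documentation above.
// They are NOT the Boost.Spirit types.
enum pass_flag { pass_fail, pass_normal, pass_ignore };
typedef std::string::const_iterator Iterator;
typedef std::size_t Idtype;
struct Context {};  // hypothetical: the real context type is unspecified

// Hypothetical token ids for this sketch.
const Idtype ID_WORD = 1;
const Idtype ID_KEYWORD = 2;

// A semantic action using the full five-parameter prototype.
void classify(Iterator& start, Iterator& end, pass_flag& matched,
              Idtype& id, Context& /*ctx*/)
{
    if (start == end)
        return;                  // empty match: leave everything untouched
    if (*start == '#') {
        matched = pass_ignore;   // ask the caller to skip this token
        return;
    }
    if (*(end - 1) == ':')
        --end;                   // shrink the matched range by one character
    if (std::string(start, end) == "if")
        id = ID_KEYWORD;         // change the id of the created token
}

// What the toy driver reports after the action has run.
struct Result {
    pass_flag flag;
    Idtype id;
    std::string text;
};

// Hypothetical driver: treats the whole input as one matched token,
// pre-initializes the flag to pass_normal and then calls the action.
Result run_action(const std::string& input)
{
    Iterator start = input.begin();
    Iterator end = input.end();
    pass_flag matched = pass_normal;
    Idtype id = ID_WORD;
    Context ctx;
    classify(start, end, matched, id, ctx);
    Result r = { matched, id, std::string(start, end) };
    return r;
}
```

In this model, calling `run_action("label:")` yields the text `label` with the id left at `ID_WORD`, while an input starting with `#` comes back flagged `pass_ignore`. A real lexer would react to that flag by discarding the match and continuing with the next token.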

Modified: trunk/libs/spirit/doc/spirit2.qbk
==============================================================================
--- trunk/libs/spirit/doc/spirit2.qbk (original)
+++ trunk/libs/spirit/doc/spirit2.qbk 2009-06-18 22:35:08 EDT (Thu, 18 Jun 2009)
@@ -80,6 +80,7 @@
 [def __sec_lex_primitives__ [link spirit.lex.abstracts.lexer_primitives Lexer Primitives]]
 [def __sec_lex_tokenvalues__ [link spirit.lex.abstracts.lexer_primitives.lexer_token_values About Tokens and Token Values]]
 [def __sec_lex_attributes__ [link spirit.lex.abstracts.lexer_attributes Lexer Attributes]]
+[def __sec_lex_semactions__ [link spirit.lex.abstracts.lexer_semantic_actions Lexer Semantic Actions]]
 
 [def __sec_ref_lex_token__ [link spirit.lex.reference.concepts.token Token Reference]]
 [def __sec_ref_lex_token_def__ [link spirit.lex.reference.concepts.tokendef TokenDef Reference]]
