|
Boost : |
From: Larry Evans (jcampbell3_at_[hidden])
Date: 2001-05-29 23:52:28
Douglas Gregor wrote:
> On Tuesday 29 May 2001 06:41 am, you wrote:
> > Douglas Gregor wrote:
> > >> I thinks that the more fundamential classes should be created first:
> > >> BNF, automaton, probably NFA/DFA.
> > >> [snip]
> > >> making
> > >> Acceptor interface for existing DFA is better than making Acceptor
> > >> interface primary.
> > >
> > >This depends on what one considers the primary purpose of an automaton. If
> > >the primary purpose is to take a string an either "accept" or "reject" it,
> > >then the primary interface we want to define is based on accept/reject. If
http://groups.yahoo.com/group/boost/message/12270 has an attachment
containing a pccts grammar with terminals TOK_SLC and TOK_PRED. As
explained in that post, I'd like the boost automata builder which
allows easy substitution of another language for these terminals, more
specifically, I'd like a "predicate expression" language to be
substituted for TOK_PRED and a "straight-line-code" language to be
substituted for TOK_SLC. This is an example of "substitution" as
explained in theorum 6.2 of _Introduction to Automata Theory,
Languages, and Computation_, by Hopcroft and Ullman, Addison-Wesley
(1979). In the following definitions make the subsequent quote
clearer:
CFL means the set of context free languages
Term means some terminal vocabulary
variable means non-terminal
<= means is subset of
G means some grammar
L(G) means the language defined by grammar G
The following is the proof of theorum 6.2 quoted from the book:
Let L be a CFL, L <= Term*, and for each a in Term, let L[a] be a
CFL. Let L be L(G) and for each a in Term let L[a] be L(G[a]).
Without loss of generality assume that the variables of G and the
G[a]'s are disjoint. Construct a grammar G' as follows. The
variables of G' are the variables of G and the G[a]'s; the terminals
of G' are the terminals of the G[a]'s. The start symbol of G' is
the start symbol of G. The productions of G' are all the
productions of the G[a]'s together with those productions formed by
taking a production A-> alpha of G and substituting S[a], the start
symbol of G[a] for each instance of an a in Term appearing in alpha.
My first guess on how to do this is implemented is that each grammar
class would look something like:
template<typename Terminals>
struct Lgrammar
: public Terminals
{ enum VariableIndex
{ Start
, V1
, V2
...
, End
};
Variable vars[End];
Lgrammar(void)
{ Terminals& terms = Terminals::vars
; vars[Start]
= vars[V1]
| vars[V3]
| terms[T1]
; vars[V1]
= vars[V5]
>> vars[V20]
>> terms[T1]
| terms[T6]
;}
};
Obviously using an index into a vector is more wordy than the Spirit
examples we've seen so far, and it also means that whatever a
Variables is, it must be a pointer or smart pointer.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk