|
Boost : |
From: Vladimir Prus (ghost_at_[hidden])
Date: 2001-05-25 06:30:29
Since there seem to be interest in automata building library, I suggest some
more detailed discussion.
I thinks that the more fundamential classes should be created first: BNF,
automaton, probably NFA/DFA. Those have foundation in computer science, and
hopefully will not depend on personal preferences as much as high level
classes. All the other suggestions can be evaluated later -- e.g. making
Acceptor interface for existing DFA is better than making Acceptor interface
primary.
Now to the point! I have implementation for LALR(1) and SLR(1), which at
least works, but prefer to discuss interface first. Here's first attempt:
namespace Grammar {
// This represent symbol from the high point of view -- essentially the name
class Symbol {
public:
std::string name;
virtual bool is_terminal() const = 0;
bool is_nontermonal() const;
bool operator==(const Symbol&) const;
bool operator!=(const Symbol&) const;
bool operator<(const Symbol&) const;
};
// In real life user will want to add some more information to Symbol,
// Terminal and Nonterminal -- subclassing will suffice, I think.
class Terminal : public Symbol {};
class Nonterminal : public Symbol {};
class BNF {
public:
class Rule {
public:
// Q which ctors needed?
Symbol* left;
std::vector<Symbol*> right;
};
std::vector<Rule> rules;
std::vector<Symbol*> terminals, nonterminals;
Symbol* grammar_start;
// Is it good to expose the variables above?
// If not, what is the best set of methods
// probably
// void add(auto_ptr<Terminal>)
// void add(auto_ptr<Nonterminal>)
// void add(const Rule&);
};
}
namespace LR {
// Represent parser/automaton view of symbol -- essentially a number
class Symbol {
bool is_terminal() const;
int terminal_number() const;
bool is_nonterminal() const;
int nonterminal_number() const;
bool is_eof() const; // ?
bool is_error() const; // ?
// kind of named ctor
static Symbol for_terminal(int t);
};
// Or another kind of named ctor
// class Terminal : public Symbol {};
class LR_automaton {
public:
LR_Action action(int state, const Symbol& s) const;
int goto(int state, const Symbol& s) const;
int rule_length(int rule) const;
};
void build_slr1(const Grammar::BNF&, LR_automaton&);
void build_lalr1(const Grammar::BNF&, LR_automaton&);
// What interface for reporting conflicts, priorities &c. ?
template<class Scanner>
class LR_parser {
public:
LR_parser(const LR_automaton&, Scanner&);
bool parse();
virtual void reduction(int rule);
};
// How to implement semantic values? Make it another template parameter and
// change reduction to
// SemanticValue* reduction(int rule, Iterator begin, Iterator end)?
}
Hope for futher disccussion and suggestions.
-- Regards, Vladimir
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk