|
Boost : |
From: Vladimir Prus (ghost_at_[hidden])
Date: 2001-05-30 09:51:32
1. Symbol representaion.
John Max Skaller wrote:
>What has subclassing got to do with it??
>There is No OO here. Terminal and Nonterminal are concrete
>template argument types.
We must distinguish between many various "Symbols" and "(Non)Terminals".
i) those user-written scanner want to return and those nonterminals
user reduction code deals with. "Semantical" value is likely stored here.
ii) those parser deals with. and
iii) those construction algorithms deal with
iv) those used to describe grammar to the construction algorithm.
They are different and these considerations apply:
i) We cannot limit this type of symbols in any way.
ii) and iii) we cannot choose -- it should be convenient number.
iv) This is the question. I think having separate classes for this purpose
is ok and even desirable. Suppose the library is to be used by standalone
parser generator. Storing information about where terminal and nonterminals
are defined in input file seems natural. On the other, side, I find using
i) here not very reasonable -- e.g. if i)::Terminal has methos line() and
column(), it is totally alien here, isn't it?
2. BNF
Douglas Gregor wrote:
>It is possible to consider a grammar as simply a grammar induced by a given
>start symbol. I am most interested in this definition (for the reason you
>mention below - reusing parts of parsers) because it seems most pure.
>I think the same thing of a tree. A 'tree' structure makes little sense to
me
>in that a 'tree' is merely the tree induced by a given root node. Any
>recursive data structure can be thought of in this way, and in my opinion it
>is more clean to think of a nonterminal/tree node/list node/etc in isolation
>instead of bringing in the larger context of a grammar.
>> 2. Elimitating BNF would require that each nonterminal know all the rules
>> for it. This will result in a net of pointers, and no single entity
>> responsible for deleting all that (or responsible for all that in
>> general).
>This net of pointers will have to be somewhere regardless. Why not in the
>place most local to its usage?
From theoretical view, this might be more clean. But consider:
i) every book defines grammar as
(Terminals, Nonterminals, Rules, Start Symbol).
why change it?
ii) using arbitrary nonterminal as start one is definetely advanced usage.
BNF is more suitable for baseline usase.
iii) With BNF net of pointers is much more simpler and BNF, which keeps
pointers to each Symbol, has no problems deleting it. Without BNF, you'd have
to use reference counting -- not simply shared_ptr, but full-scale garbage
collector, because of possible cycles.
3. Parser interface
>I mean that the parser should never know about the scanner. Instead, tokens
>should be pushed into the parser from some source(s). Again, I return to the
>Acceptor interface where the parser knows nothing about where its input
comes
>from, but merely takes tokens via an acceptor.
Now I understand. Then the scanner should know about types of terminal and
nonterminal. Right? The question is what type of terminal should be and how
scanner will turn it into terminal number. This settled, and Scanner template
parameter may be removed.
4. Polymorphism.
>> If I understood Douglas correctly, he proposes just the opposite :-).
>> I really don't think that runtime polymorphism will cost much -- it is not
>> used in the most critical parts.
>Actually, I'm agreeing with Joel on this :). We'd be better off with static
>(compile time) polymorphism as far as we can take it.
What's motivation? As I stated in point 1 above, Grammar::Symbol is
completely non performace critical. The only other place where 'virtual'
appears is 'reduction' method. Think we need some profiling there.
-- Regards, Vladimir
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk