Boost logo

Boost :

From: David Abrahams (dave_at_[hidden])
Date: 2004-12-25 22:10:00


Joel de Guzman wrote:
> David Abrahams wrote:
>> Joel de Guzman wrote:
>>
>>>Using first-follow makes a parser deterministic, it can't be both.
>>>An LL(1) parser, for instance, doesn't do backtracking.
>>
>> Yes, but LL(1) is not synonymous with "recursive descent with first/follow."
>
> Hmmm... I don't recall mentioning that. Did I?

No. I'm arguing that the mere use of first/follow does not in and of
itself make a parser deterministic.

>>>>>typically with one symbol lookahead (LL(1)). A strictly predictive
>>>>>parser is a lot more restrictive in the grammars that one can use,
>>>>>however. There's no way to make it parse adhoc grammars like the
>>>>>C pre-processor or, ahem, c++. What I am inclined to do is to mix
>>>>>determinism with classic spirit's non-determinism. Exactly the
>>>>>recipe that you are alluding to.
>>>>
>>>>I'm not sure if it is or not -- I'm just getting more confuseder. ;^)
>>>
>>>Classic LL(1) does not have the "dynamic failure conditions" that
>>>you mention.
>>
>> Yes, I know that. But then, LL(1) is not synonymous with "recursive
>> descent with first/follow."
>
> Sure. But I don't see the point. What am I missing?

Well, I say "you could use first/follow to prune unproductive parses
while still using dynamic failure conditions to get nondeterminism," and
you come back with "Classic LL(1) doesn't have those." I assume you're
using that as an argument against my statement, but it's a hollow
argument because classic LL(1) is only one way to use first/follow with
recursive descent.

>>>NLL(1) grammars cannot have ambiguity. If the grammar
>>>is ambiguous, first follow computation will fail (first follow
>>>conflicts).
>>
>>
>> Hum? The algorithm for computing the first and follow sets of a symbol
>> don't have an "otherwise the computation fails" branch, and the
>> definitions of these sets are simple:
>>
>> first(a) = the set of tokens that can begin strings derived from a.
>> follow(N) = the set of tokens that can follow N in grammatical sequences
>>
>> You can use this information to eliminate hopeless branches of a parse
>> and avoid wasting effort backtracking without requiring the grammar to
>> be LL(1) or LL(2) or... even to be unambiguous at all.
>
> Yes. In theory.

So "first follow computation will fail" is false. BTW, I don't know
what NLL(1) means. Nondeterministic LL(1)? I can't even picture that.

>> Maybe we're talking about the same thing.
>
> Yes! That was what I was trying to point out from the start.
> However, it's not as simple as it seems.

Why not?

>> If you're talking about what I'm describing, I don't think there's
>> anything particularly radical about it, and I'd be very surprised if it
>> hadn't been done before.
>
> AFAIK, no. Correct me if I'm wrong.

I once sent you a comparative study of some NLP parsing strategies. One
of those was using first/follow computation and parsing ambiguous
grammars. I'm pretty sure they were using backtracking or that they
told me that they had thought of adding it.

>>>An alternative scheme is to let the user switch
>>>strategies (possibly using spirit directives).
>>>
>>>Another scheme, used by PCCTS, is to make the parser predictive
>>>by default, and allow the user to use syntactic and semantic
>>>predicates with arbitrary lookahead on specific points in
>>>the grammar.
>>
>>
>> Eewww, I smell a kludge ;-)
>
> Not really. There's a big problem with backtracking in general:
> backtracking and semantic actions do not mix well (e.g. false
> triggering).

It's not a problem if you build a parse tree and run the analysis on
that (which is sometimes required for synthesized attributes anyway).

> In a syntactic and semantic predicate, all actions
> are intentionally inhibited.

Have you considered the strategy of
http://docs.biostat.wustl.edu/cgi-bin/info2html?(bison.info.gz)GLR%2520Parsers
(http://tinyurl.com/694kq)
??

-- 
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk