Boost logo

Boost :

From: Joel de Guzman (joel_at_[hidden])
Date: 2004-12-26 21:34:38


Larry Evans wrote:
> On 12/25/2004 11:43 PM, Joel de Guzman wrote:

>> The problem is that each alternative can potentially have
>> different types. A production such as:
>
>
> Could you explain what this means. I'm guessing it means
> each alternate would have a different synthesized attribut
> calculated for it.

Hi Larry!

Ehm... no, the alternatives themselves have different types.
Consider the grammar:

    r = ch_p('a') | ch_p('b') >> ch_p('c');

the type of the left alternative is chlit<char>. The type of
the right alternative is sequence<chlit<char>, chlit<char> >.

>> a | b | c;

> Wouldn't the most "natural" attribute for this be a
> "named" variant which map contain the same type for
> two or more different "names"? IOW, instead of the
> variant currently in boost, the sum type in:
>
> sandbox/boost/indexed_types/sum.hpp

A variant. Yes, that's the plan. But that's a different matter.
Anyway. Now that you've mentioned it. Yes, a named
variant would be nice!

> would represent the result of parsing the alternative expression:
>
> a | b | c
>
> and the names could be 1,2,3 (all unsigned) or possibly names,
> such as elements of an enumeration:
>
> enum alt_abc{ alt_a, alt_b, alt_c};

Very nice!

>> where A, B and C are the types of a, b and c. So, now, when
>> creating the parser, you'll have to make a table of alternates
>> tuple<A, B, C> which dispatches properly based on information
>> known only at runtime: 1) the current input character and
>
>
> Of course, I understand how this must be only known at runtime.
>
>> 2) the first follow sets of A, B and C.
>>
> Now this could be calculated at compile time if there were no
> cycles in the productions. For example, the simplest case would
> be:
>
> A = 'a' A | ''
>
> i.e. a sequence of a's. The cycle, or recursion, is caused by
> the A on the rhs. Of course, we know the solution in this
> simple case, but other's would, I guess, couldn't be solved or
> would be too difficult to solve. (I'm really guessing just based
> on my inability to figure out how). The point is, some could be

That adds another complication. But then my point was that:

1) the dynamic nature of spirit parsers makes it impossible
    to compute the first/follow sets at compile time.

2) You use this first/follow information to
     A) manipulate and build a parse table in the form of a
        tuple with heterogeneous types (not an array or vector
        of homogeneous types) and
     B) make a prediction of which alternative to choose in a
        tuple<alternatives...>; so basically, you'll have a
        runtime variable N (index) based on the prediction, but
        being in a tuple, choosing the right alternative requires
        a static constant N (index) ( i.e. get<N>(alt)
        instead of alt[N] ).

The impedance mismatch between compile time and runtime is what
makes this so darned difficult.

> while other's might not be; hence, a more accurate statement
> would be:
> 2) some of the first and follow sets of A, B, and C could not be
> calculated at compile time or would be "too difficult" to
> calculate at compile time.

Yes!

Regards,

-- 
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk