From: Eric Niebler (eric_at_[hidden])
Date: 2004-05-27 11:25:03


Joel de Guzman wrote:

> Simply, "=" in Spirit should not be confused with assignment.
> This behavior in fact killed the pascal grammar. Consider this:
>
> rule<> a;
> rule<> b;
>
> a = b; // alias. a shares b
> // (points to nothing_p since b is still undefined)
>
> b = int_p; // now b is an int_p
>
> // at this point, a still points to nothing_p
> // and did not (cannot) follow b's change.
>

Why is that confusing? That's how most other types in C++ behave:

int a,b; // a and b are undefined
a = b; // a and b are still undefined
b = 1; // a is undefined, b is 1
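
To make the analogy concrete for something rule-shaped, here is a minimal sketch that uses std::function as a stand-in for a rule. The Rule alias and the function name are hypothetical, chosen only for illustration; this is not Spirit's rule<> class. Assignment copies the current (empty) state, and a later definition of b leaves a untouched:

```cpp
#include <functional>

// Hypothetical stand-in for a rule: any callable taking an int.
// An empty std::function plays the role of an undefined rule.
using Rule = std::function<bool(int)>;

// Returns true only if 'a' somehow picked up the later definition of 'b'.
bool copy_tracks_later_definition() {
    Rule a, b;                          // both start out empty (undefined)
    a = b;                              // a copies b's current, empty state
    b = [](int x) { return x > 0; };    // b is defined afterwards
    return static_cast<bool>(a);        // a is still empty
}
```

Calling copy_tracks_later_definition() yields false, which is the same outcome as the int example: the copy is a snapshot, not an alias.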

> Also, be wary of cycles. It is quite common that a rule references
> another rule which indirectly references the start rule, in a cycle.
> No, shared_ptr can't handle this common situation. This scheme (using
> shared_ptr) can't handle forward declared rules like the humble
> calculator:
>
> group = '(' >> expression >> ')';
> factor = integer | group;
> term = factor >> *(('*' >> factor) | ('/' >> factor));
> expression = term >> *(('+' >> term) | ('-' >> term));
>
> Note that group is defined *before* expression is defined. Note
> too that this is cyclic. expression indirectly references group,
> while group references expression.
>
> There's only one solution: use plain references. When a rule
> is referenced in the RHS of another rule, it is held by reference.

There is another solution. Give rules value semantics by default as I
suggest, but require people to use special syntax to get reference
semantics. Your example becomes:

     group = '(' >> &expression >> ')';
     factor = integer | group;
     term = factor >> *(('*' >> factor) | ('/' >> factor));
     expression = term >> *(('+' >> term) | ('-' >> term));

A rule that has not yet been initialized is included by reference:
apply the address-of operator to it (or wrap it in a call to ref(),
if that's more palatable). Every other rule is copied by value.
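
As a rough sketch of the idea, assuming a toy combinator set rather than Spirit's or xpressive's real classes: every name below (Parser, ch, digits, seq, alt, star, by_ref, calc_accepts) is hypothetical. Combinators copy their operands, and by_ref() plays the role of &expression / ref(expression) by holding a plain reference that is only dereferenced at parse time:

```cpp
#include <cstddef>
#include <functional>
#include <string>

// A parser consumes input starting at position i and advances i on success.
using Parser = std::function<bool(const std::string&, std::size_t&)>;

// Match one literal character.
Parser ch(char c) {
    return [c](const std::string& s, std::size_t& i) {
        if (i < s.size() && s[i] == c) { ++i; return true; }
        return false;
    };
}

// Match one or more decimal digits.
Parser digits() {
    return [](const std::string& s, std::size_t& i) {
        std::size_t start = i;
        while (i < s.size() && s[i] >= '0' && s[i] <= '9') ++i;
        return i > start;
    };
}

// Sequence: operands are captured by copy (value semantics).
Parser seq(Parser a, Parser b) {
    return [a, b](const std::string& s, std::size_t& i) {
        std::size_t save = i;
        if (a(s, i) && b(s, i)) return true;
        i = save;  // backtrack on failure
        return false;
    };
}

// Alternative: try a, then b from the same position.
Parser alt(Parser a, Parser b) {
    return [a, b](const std::string& s, std::size_t& i) {
        std::size_t save = i;
        if (a(s, i)) return true;
        i = save;
        return b(s, i);
    };
}

// Kleene star: zero or more repetitions; always succeeds.
Parser star(Parser a) {
    return [a](const std::string& s, std::size_t& i) {
        for (;;) {
            std::size_t save = i;
            if (!a(s, i)) { i = save; break; }
            if (i == save) break;  // no progress; avoid looping forever
        }
        return true;
    };
}

// Explicit reference semantics: the target is looked up at parse time,
// so it may be defined after this call. The target must outlive the result.
Parser by_ref(const Parser& target) {
    return [&target](const std::string& s, std::size_t& i) {
        return target && target(s, i);  // an undefined target matches nothing
    };
}

// The calculator grammar from above, with one explicit reference.
bool calc_accepts(const std::string& input) {
    Parser expression;  // declared, not yet defined
    Parser group = seq(ch('('), seq(by_ref(expression), ch(')')));
    Parser factor = alt(digits(), group);
    Parser term = seq(factor, star(alt(seq(ch('*'), factor), seq(ch('/'), factor))));
    expression = seq(term, star(alt(seq(ch('+'), term), seq(ch('-'), term))));
    std::size_t i = 0;
    return expression(input, i) && i == input.size();
}
```

With these definitions, calc_accepts("(1+2)*3") succeeds and calc_accepts("(1+2") fails. Only the forward reference needs by_ref(); group, factor and term are copied, so the single plain reference is what breaks the ownership cycle.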

This is currently how xpressive behaves, BTW. I suppose it comes down to
expectations and common usage. Will people have to wrap everything in
ref() because they can't figure out when it's not needed? Do people
expect reference semantics from a DSL for EBNF in C++, even though C++
has value semantics by default? The answers may be different in the
regular expression domain, where there is not a long tradition of
regexes referring to other regexes.

I dunno. I like value semantics.

-- 
Eric Niebler
Boost Consulting
www.boost-consulting.com
