From: Hartmut Kaiser (hartmutkaiser_at_[hidden])
Date: 2004-12-28 08:08:20
Dave Handley wrote:
> >How do you specify production rules? Your above sample specifies how
> >to use the recognition part only ;-) The numbers are the token IDs, I
> >assume?
>
> The production rules are wrapped up in the different (and
> extensible) concept of a token. The token_float for example
> will match a float, then store the matched number as a float
> inside the token. The normal token stores nothing, since for
> something like a keyword match you don't need the overhead of
> the original text, but the token_string stores the string
> that was matched. I am contemplating an interface that
> allows you to use a functor to perform some basic string
> manipulation before the token is stored - for example
> dropping the enclosing quotes or similar. Although this is
> clearly similar to functionality already available in Spirit.
Ok, got it.
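
To make the token concept described above concrete, here is a rough sketch of how such token types might look. All names and signatures (token_base, token, token_float, token_string, strip_quotes, identity) are assumptions made for illustration; the thread does not show the real classes.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical base class: every token carries at least its ID.
struct token_base {
    explicit token_base(int id) : id_(id) {}
    virtual ~token_base() {}
    int id() const { return id_; }
private:
    int id_;
};

// Plain token: stores nothing beyond the ID, which is enough for keywords.
struct token : token_base {
    token(int id, const std::string& /*matched text is discarded*/)
        : token_base(id) {}
};

// Stores the matched text converted to a float.
struct token_float : token_base {
    token_float(int id, const std::string& text)
        : token_base(id), value(std::strtof(text.c_str(), nullptr)) {}
    float value;
};

// Default transform: keep the matched text unchanged.
struct identity {
    std::string operator()(const std::string& s) const { return s; }
};

// Example transform: drop one pair of enclosing double quotes, if present.
struct strip_quotes {
    std::string operator()(const std::string& s) const {
        if (s.size() >= 2 && s[0] == '"' && s[s.size() - 1] == '"')
            return s.substr(1, s.size() - 2);
        return s;
    }
};

// Stores the matched string after running it through a functor.
template <typename Transform = identity>
struct token_string : token_base {
    token_string(int id, const std::string& text, Transform f = Transform())
        : token_base(id), value(f(text)) {}
    std::string value;
};
```

With these definitions, token_string<strip_quotes>(2, "\"hi\"") would store hi without the quotes, while token_string<> keeps the text as matched.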
Wave as it is today uses a single token type only. From the beginning I
intended to use different token types to support tools built on top of the
preprocessor (for storing symbol table entries and such), but I haven't
gotten around to implementing this yet. So this is a second point that makes
collaborating with your team very interesting for me!
> Yes the numbers are token IDs - I have contemplated using
> const char * template parameters as well, but at present I
> haven't done that because of the added difficulty of ensuring
> that your names have external linkage.
> Perhaps in the final version we can support both.
Token IDs should be sufficient in most cases, IMHO.
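
For readers unfamiliar with the linkage issue mentioned above, the following is a small illustration of the two kinds of template parameter. The names id_rule, named_rule and kw_while are hypothetical. In C++98/03 a pointer used as a template argument had to refer to an object with external linkage (C++11 relaxed this), and a string literal cannot be passed directly in either case.

```cpp
#include <cstring>

// Integer token ID as a template parameter: no linkage concerns at all.
template <int ID>
struct id_rule {
    static int id() { return ID; }
};

// A named array with external linkage; 'extern' is needed because a
// const object at namespace scope has internal linkage by default.
extern const char kw_while[] = "while";

// Pointer template parameter: the argument must name an object such as
// kw_while; writing named_rule<"while"> would not compile.
template <const char* Name>
struct named_rule {
    static const char* name() { return Name; }
};

// Small helper showing how the name can be used at run time.
inline bool is_while_rule() {
    return std::strcmp(named_rule<kw_while>::name(), "while") == 0;
}
```

The extra declaration for every name is the added difficulty referred to above; plain integer IDs avoid it entirely.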
> >Is your token type a predefined one or is it possible to use my own
> >token types instead?
> >I'm asking because Wave in principle allows the use of any token type
> >for its work, but it requires the tokens to expose a certain (very
> >small) interface.
>
> In the system as it stands, all token types must inherit from a
> base version.
> This base version provides lots of useful virtual functions
> that derived token types can work with. The iterator
> interface then returns the token_base from its de-reference
> function. You can define new token types, but accessing the
> specific interface on the derived token means doing one
> of:
>
> 1) Ensuring the interface is exposed in the token_base.
> 2) Using polymorphic despatch to get the correct type, through a
> visitor for example.
> 3) If only a single token type is used in the lexer then use a
> static_cast.
> 4) Using a dynamic_cast.
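
The four options listed above can be sketched roughly as follows. This is only an illustration under assumed names (token_base, token_visitor, token_float, token_string, float_sum and the helper functions are all hypothetical), not the actual interface of the library under discussion.

```cpp
#include <string>

struct token_float;
struct token_string;

// Visitor interface used for option 2.
struct token_visitor {
    virtual ~token_visitor() {}
    virtual void visit(const token_float&) = 0;
    virtual void visit(const token_string&) = 0;
};

struct token_base {
    virtual ~token_base() {}
    // Option 1: expose the interface in the base class.
    virtual std::string text() const { return std::string(); }
    // Option 2: polymorphic dispatch through a visitor.
    virtual void accept(token_visitor& v) const = 0;
};

struct token_float : token_base {
    explicit token_float(float v) : value(v) {}
    void accept(token_visitor& v) const override;
    float value;
};

struct token_string : token_base {
    explicit token_string(const std::string& s) : value(s) {}
    std::string text() const override { return value; }
    void accept(token_visitor& v) const override;
    std::string value;
};

// Defined after both token types are complete.
inline void token_float::accept(token_visitor& v) const { v.visit(*this); }
inline void token_string::accept(token_visitor& v) const { v.visit(*this); }

// Example visitor: adds up the values of all float tokens it sees.
struct float_sum : token_visitor {
    float total = 0.0f;
    void visit(const token_float& t) override { total += t.value; }
    void visit(const token_string&) override {}
};

// Option 3: static_cast, valid only if the lexer is known to produce
// token_float and nothing else (undefined behaviour otherwise).
inline float as_float_unchecked(const token_base& t) {
    return static_cast<const token_float&>(t).value;
}

// Option 4: dynamic_cast, checked at run time; returns a null pointer
// when the token is not a token_string.
inline const token_string* as_string(const token_base& t) {
    return dynamic_cast<const token_string*>(&t);
}
```

Options 1 and 2 keep client code free of casts at the price of a wider base class or an extra visitor type; options 3 and 4 leave the base class small but move the type knowledge into the caller.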
Since tokens have to be copied around in a type-safe manner, I'd have expected
something similar to boost::variant, but maybe I've got you wrong... How do
you plan to achieve this? Copying through a base class won't work, AFAICT?
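
To illustrate the concern: copying a derived token by value through its base class slices off the derived part, whereas a variant copies whichever alternative it currently holds. The sketch below uses std::variant (C++17) as a stand-in for boost::variant, and the token types keyword_token, float_token and string_token are hypothetical.

```cpp
#include <string>
#include <variant>

// Hypothetical token types; no common base class is needed here.
struct keyword_token { int id; };
struct float_token   { int id; float value; };
struct string_token  { int id; std::string value; };

// One value type that can hold any of the token kinds.
using any_token = std::variant<keyword_token, float_token, string_token>;

// Access a member common to all alternatives without knowing which
// alternative is stored.
inline int token_id(const any_token& t) {
    return std::visit([](const auto& tok) { return tok.id; }, t);
}

// Copying an any_token copies the full alternative, including the
// stored payload, so nothing is sliced away.
inline any_token copy_token(const any_token& t) {
    return t;
}
```

A copy made with copy_token still holds a float_token with its value when the original did, which is the type-safe copy behaviour asked about above.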
> You can also define new rule classes - at present we have
> written a rule class for conventional tokens, and another
> rule class for tokens that have a dynamic (run-time) ID
> rather than the static (compile-time) ID in the template parameter.
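
As an illustration of the two kinds of rule class mentioned above, the following sketch contrasts an ID fixed at compile time with one chosen at run time. The names rule_base, static_rule and dynamic_rule are hypothetical, and matching against a fixed literal is only a placeholder for whatever matching the real rules perform.

```cpp
#include <string>

// Hypothetical common interface for rules.
struct rule_base {
    virtual ~rule_base() {}
    virtual int id() const = 0;
    virtual bool matches(const std::string& text) const = 0;
};

// Static rule: the token ID is a template parameter, fixed at compile time.
template <int ID>
struct static_rule : rule_base {
    explicit static_rule(const std::string& literal) : literal_(literal) {}
    int id() const override { return ID; }
    bool matches(const std::string& text) const override {
        return text == literal_;  // placeholder matching logic
    }
private:
    std::string literal_;
};

// Dynamic rule: the token ID is an ordinary value supplied at run time.
struct dynamic_rule : rule_base {
    dynamic_rule(int id, const std::string& literal)
        : id_(id), literal_(literal) {}
    int id() const override { return id_; }
    bool matches(const std::string& text) const override {
        return text == literal_;  // placeholder matching logic
    }
private:
    int id_;
    std::string literal_;
};
```

Both kinds can be used through rule_base, so a lexer could mix static_rule<42>("if") with a dynamic_rule whose ID is read from configuration.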
This should be sufficient to integrate with Wave.
Looking forward to it, and please keep me in the loop!
Regards Hartmut