Boost Users :

From: Hartmut Kaiser (hartmut.kaiser_at_[hidden])
Date: 2005-11-01 15:04:03


Andreas Sæbjørnsen wrote:

> After looking at the documentation of Wave I believe it can
> be used to extract the preprocessor grammar and I am
> therefore looking for some tips on how to get started on my
> problem. I want to adapt wave into a mechanism for
> extracting all preprocessor grammar from a source file
> without modifying the library itself, and I will try to
> explain what I want to use wave for. What I want to do is:
> *extract the position and value of
> preprocessor tokens like #ifdef, #define, #warning, etc.

Let me elaborate a bit. Wave is built as a layered iterator. At the bottom
(on top of the iterators of the input stream) there is the lexer component
which constructs the C++ tokens from the input. The tokens you are looking
for (#ifdef etc.) are contained in the token sequence generated by the lexer
iterators.
On top of the lexer we have the preprocessing component which does the
actual preprocessing (as you might have expected). The token sequence
produced by the preprocessing component obviously doesn't contain these
tokens anymore.

What you could do in this situation is
- build your own lexer which intercepts the tokens you're interested in and
stores the information you need somewhere else. The drawback is that this
makes it very difficult to track down the (virtual) position of these tokens
in the preprocessed token stream.
- add some additional hooks to the library which notify you about these
tokens. I'm not sure of the implications, though, or how much this differs
from the first bullet :-P
- Perhaps you have another idea on this?
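
To make the first bullet more concrete, here is a minimal sketch of the kind of information such an intercepting layer would record. It deliberately does not use Wave at all: DirectiveInfo and collect_directives are hypothetical names made up for this illustration, and the scanner works on plain lines, so it ignores line continuations, comments and trigraphs, which a real lexer has to handle.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for this sketch; not a Boost.Wave type.
struct DirectiveInfo {
    std::size_t line;    // 1-based line number in the input
    std::size_t column;  // 1-based column of the '#'
    std::string name;    // directive name, e.g. "ifdef", "define", "warning"
    std::string text;    // directive text from '#' to the end of the line
};

// Collect every line whose first non-blank character is '#'.
// Assumption: one directive per physical line, no line continuations.
std::vector<DirectiveInfo> collect_directives(const std::string& source) {
    std::vector<DirectiveInfo> result;
    std::istringstream in(source);
    std::string line;
    std::size_t line_no = 0;
    while (std::getline(in, line)) {
        ++line_no;
        std::size_t hash = line.find_first_not_of(" \t");
        if (hash == std::string::npos || line[hash] != '#')
            continue;  // not a directive line
        // Skip optional blanks between '#' and the directive name.
        std::size_t begin = hash + 1;
        while (begin < line.size() && (line[begin] == ' ' || line[begin] == '\t'))
            ++begin;
        // The name is the following run of identifier characters.
        std::size_t end = begin;
        while (end < line.size() &&
               (std::isalnum(static_cast<unsigned char>(line[end])) || line[end] == '_'))
            ++end;
        DirectiveInfo info = {line_no, hash + 1,
                              line.substr(begin, end - begin),
                              line.substr(hash)};
        result.push_back(info);
    }
    return result;
}
```

Calling collect_directives on the text of a source file gives one record per directive, with positions relative to the original file. These positions are not the (virtual) positions in the preprocessed token stream, which is exactly the difficulty mentioned in the first bullet.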

> *evaluate the preprocessor conditionals

What do you have in mind? Do you mean the (macro-)expanded conditional
expression?

> *after evaluating the preprocessor conditional,
> extract the portion which was evaluated as false as a string
>
> #define positive
> #ifdef int positive
> #endif /*Extract this false part as string*/ int x; #endif

Hmmm. This one is tough. The preprocessor is designed to skip this
information, so I'll have to look at the code base to see how best to access
the corresponding code fragments. Perhaps a special hook could be introduced
which gets called for every skipped token.
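
As a rough illustration of what such a hook could deliver, here is a toy, line-based sketch. It is not Wave code and not a proposal for the actual hook signature: SkipHook and scan_with_skip_hook are hypothetical names, and the scanner only understands #define NAME, #ifdef, #ifndef, #else and #endif. The callback is invoked per skipped line rather than per skipped token, just to keep the example short.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical hook type: called with the line number and text of each skipped line.
using SkipHook = std::function<void(std::size_t line, const std::string& text)>;

// Read one identifier-like word from `s` starting at `pos`, skipping leading blanks.
std::string read_word(const std::string& s, std::size_t& pos) {
    while (pos < s.size() && (s[pos] == ' ' || s[pos] == '\t')) ++pos;
    std::size_t begin = pos;
    while (pos < s.size() &&
           (std::isalnum(static_cast<unsigned char>(s[pos])) || s[pos] == '_'))
        ++pos;
    return s.substr(begin, pos - begin);
}

// Walk `source` line by line and report every line inside an inactive
// conditional branch. The conditional directives themselves are not reported.
void scan_with_skip_hook(const std::string& source, const SkipHook& on_skipped) {
    struct Frame { bool parent_active; bool condition; bool in_else; };
    std::vector<Frame> stack;       // one frame per open #ifdef/#ifndef
    std::set<std::string> defined;  // names seen in active #define lines

    auto active = [&stack]() -> bool {
        if (stack.empty()) return true;
        const Frame& f = stack.back();
        return f.parent_active && (f.in_else ? !f.condition : f.condition);
    };

    std::istringstream in(source);
    std::string line;
    std::size_t line_no = 0;
    while (std::getline(in, line)) {
        ++line_no;
        std::size_t pos = 0;
        while (pos < line.size() && (line[pos] == ' ' || line[pos] == '\t')) ++pos;
        std::string name, arg;
        if (pos < line.size() && line[pos] == '#') {
            ++pos;
            name = read_word(line, pos);
            arg = read_word(line, pos);
        }
        if (name == "ifdef" || name == "ifndef") {
            bool is_defined = defined.count(arg) != 0;
            Frame f = {active(), name == "ifdef" ? is_defined : !is_defined, false};
            stack.push_back(f);
        } else if (name == "else") {
            if (!stack.empty()) stack.back().in_else = true;
        } else if (name == "endif") {
            if (!stack.empty()) stack.pop_back();
        } else if (!active()) {
            on_skipped(line_no, line);  // this is the "skipped" notification
        } else if (name == "define") {
            defined.insert(arg);
        }
    }
}
```

A caller that wants the false part as a string would simply append each reported line to a buffer inside the callback. Whether a real hook in the library should work per token, per line or per region is an open design question.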

> *I am also interested in the value and position
> of unexpanded macros

Undefined macros?

> *extracting the C/C++ statement or expression
> that the unexpanded macro (and expanded) is
> a part of.

This conceptually isn't possible at the preprocessor level because it has no
notion of a C++ statement/expression.

> *extract the value and position of all C and
> C++ comments

This one is easy. Just enable the preserve comments mode and all the comment
tokens will be part of the generated output token sequence.

> Which work should I base my work upon and which
> data structures should I reimplement and for what? I have
> looked at the documentation and the code, and it seems easy
> to do some parts but others are not obvious to me at this point.

Generally you should look at the existing preprocessing hooks and check
whether these can provide you with sufficient information. It should be quite
straightforward to add additional hooks to the library, so any suggestions
are welcome.

HTH
Regards Hartmut

>
> Thanks,
> Andreas Saebjoernsen
>

