
Boost Users :

From: Andreas Sæbjørnsen (andreas.saebjoernsen_at_[hidden])
Date: 2005-11-01 17:50:28


On 11/1/05, Hartmut Kaiser <hartmut.kaiser_at_[hidden]> wrote:
>
> Let me elaborate a bit. Wave is built as a layered iterator. At the bottom
> (on top of the iterators of the input stream) there is the lexer component
> which constructs the C++ tokens from the input. The tokens you are looking
> for (#ifdef etc.) are contained in the token sequence generated by the
> lexer iterators.
> On top of the lexer we have the preprocessing component which does the
> actual preprocessing (as you might have expected). The token sequence
> produced by the preprocessing component obviously doesn't contain these
> tokens anymore.
>
> What you could do in this situation is
> - build your own lexer intercepting the tokens you're interested in and
> storing the information you need somewhere else. This makes it very
> difficult to track down the (virtual) position of these tokens in the
> preprocessed token stream.
> - adding some additional hooks to the library allowing to get notified on
> these tokens. I'm not sure of the implications, though, and how much this
> is different from the first bullet :-P
> - Perhaps you have another idea on this?

I was hoping to avoid modifying the lexer itself, so I am more inclined
towards the approach of adding hooks to the library. I was thinking of doing
something similar to what struct default_preprocessing_hooks (in
preprocessing_hooks.hpp) does for macros, since the user can reimplement it
for the instantiation of the template<> class context. Another option is to
add the hooks to template<> class context itself, much like it is currently
done for macros. What leads me to favour the struct
default_preprocessing_hooks solution over modifying template<> class context
is that it already handles similar problems, and you could also argue that
these hooks do not fit in template<> class context. So my two options are:
- add more hooks to struct default_preprocessing_hooks
- add more member functions which work as hooks within template<> class
context
The hooks must be provided with all the information necessary for extracting
the preprocessor grammar and evaluating the preprocessor conditionals. Which
do you think is the better solution, how feasible is it, and how do you think
it would fit into the Wave preprocessor library?
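
To make the first option a bit more concrete, here is a rough, standalone
sketch of the policy idea. It does not use Wave at all: default_hooks_sketch,
context_sketch, recording_hooks and the evaluated_conditional hook are names I
made up for illustration, and the real hook signatures would have to be
decided separately.

```cpp
#include <string>
#include <vector>

// Hypothetical hook policy, loosely modelled on the idea behind
// default_preprocessing_hooks: every hook is a no-op by default.
struct default_hooks_sketch {
    // Hypothetical hook: called once per evaluated conditional directive.
    void evaluated_conditional(const std::string& /*directive*/, bool /*result*/) {}
};

// Hypothetical stand-in for the context class: the hook policy is a template
// parameter, and events are forwarded to the user-supplied policy object.
template <typename Hooks = default_hooks_sketch>
class context_sketch {
public:
    explicit context_sketch(Hooks& hooks) : hooks_(hooks) {}

    // Pretend the preprocessor has just evaluated a conditional directive.
    void report(const std::string& directive, bool result) {
        hooks_.evaluated_conditional(directive, result);
    }

private:
    Hooks& hooks_;
};

// User-supplied policy: replaces only the hook it cares about.
struct recording_hooks : default_hooks_sketch {
    std::vector<std::string> log;

    void evaluated_conditional(const std::string& directive, bool result) {
        log.push_back(directive + (result ? " -> true" : " -> false"));
    }
};
```

With this layout the user instantiates context_sketch<recording_hooks> and the
class itself needs no new member functions, which is the main reason I lean
towards the first option.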

> *evaluate the preprocessor conditionals
>
> What do you have in mind? Do you mean the (macro-)expanded conditional
> expression?

Yes, so that I can extract the unexpanded preprocessor conditional
expression and, once this expression is (macro-)expanded, whether it
evaluates to true or false.
For instance in the example
#define BAR
#ifdef BAR
int x;
#ifdef FOO
int y;
#endif
#endif
I would be interested in extracting '#ifdef BAR' and also that it is
evaluated as true. I would also be interested in extracting '#ifdef FOO' and
that it is evaluated as false.
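
As an illustration of the information I am after, here is a small standalone
sketch (not Wave code) that walks such input line by line and records each
#ifdef together with its result. The function name trace_ifdefs is made up,
and it only understands "#define NAME", "#ifdef NAME" and "#endif"; a
conditional nested inside a skipped block is not recorded, on the assumption
that the preprocessor never evaluates it.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Records every evaluated #ifdef together with the result of the evaluation.
std::vector<std::pair<std::string, bool>> trace_ifdefs(const std::string& source) {
    std::set<std::string> defined;
    std::vector<bool> active;  // per open #ifdef: is the code inside it live?
    std::vector<std::pair<std::string, bool>> trace;

    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string directive, name;
        words >> directive >> name;

        // A line is live only if the innermost enclosing #ifdef is live.
        const bool live = active.empty() || active.back();

        if (directive == "#define" && live) {
            defined.insert(name);
        } else if (directive == "#ifdef") {
            const bool result = defined.count(name) > 0;
            if (live) trace.emplace_back("#ifdef " + name, result);
            active.push_back(live && result);
        } else if (directive == "#endif" && !active.empty()) {
            active.pop_back();
        }
    }
    return trace;
}
```

For the example above this yields the two entries ("#ifdef BAR", true) and
("#ifdef FOO", false), which is the kind of result I would like a hook to
report.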

> *after evaluating the preprocessor conditional,
> > extract the portion which was evaluated as false as a string
> >
> > #define positive
> > #ifdef int positive
> > #endif /*Extract this false part as string*/ int x; #endif
>
> Hmmm. This one is tough. The preprocessor is designed to skip this
> information, so I'll have to look at the code base how to best access the
> corresponding code fragments. Perhaps a special hook could be introduced
> to get called for every skipped token.

That would be great! :) Do you think this can be done through struct
default_preprocessing_hooks?
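
Roughly, the behaviour I would hope to get from such a hook looks like the
standalone sketch below. It is not Wave code: collect_skipped and frame are
made-up names, it works on whole lines rather than tokens, and it handles only
"#define NAME", "#ifdef NAME", "#else" and "#endif". Directive lines
themselves are used for nesting but are not added to the result.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// State of one open conditional.
struct frame {
    bool parent_live;  // was the enclosing code live?
    bool taken;        // is the current branch of this conditional selected?
};

// Returns the text of all non-directive lines inside branches evaluated as false.
std::string collect_skipped(const std::string& source) {
    std::set<std::string> defined;
    std::vector<frame> stack;
    std::string skipped;

    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string directive, name;
        words >> directive >> name;

        const bool live =
            stack.empty() || (stack.back().parent_live && stack.back().taken);

        if (directive == "#ifdef") {
            stack.push_back({live, defined.count(name) > 0});
        } else if (directive == "#else" && !stack.empty()) {
            stack.back().taken = !stack.back().taken;
        } else if (directive == "#endif" && !stack.empty()) {
            stack.pop_back();
        } else if (!live) {
            skipped += line + "\n";  // stand-in for the hypothetical hook call
        } else if (directive == "#define") {
            defined.insert(name);
        }
    }
    return skipped;
}
```

A token-based hook in the library would of course be finer grained than this,
but the string I want to end up with is the same.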

> *I am also interested in the value and position
> > of unexpanded macros
>
> Undefined macros?

I am only interested in defined macros. To be more specific, I am interested
in the point where the preprocessor recognises a macro. For each macro I
would like to extract its value and its position in the file. The macro can
be found in two forms: the one before macro expansion and the one after. Both
forms are interesting. But from what I have seen, this is already handled in
struct default_preprocessing_hooks.
1: #define FOO int x;
2: FOO
On line 2 in this example code the macro FOO is found. This macro can be
expanded to 'int x;', which to the preprocessor is equivalent to the
unexpanded macro FOO defined on line 1.

> *extracting the C/C++ statement or expression
> > that the unexpanded macro (and expanded) is
> > a part of.
>
> This conceptually isn't possible at the preprocessor level because it has
> no notion of a C++ statement/expression.

I do not want to use Wave as a C/C++ parser, only to understand a subset of
its grammar. Let me elaborate on why I think it is doable and why the
information necessary to do this is already easily available. What I was
thinking was that
- since brackets ('{' and '}') and semicolons are found among the tokens
from the lexer, you should as far as I can see be able to define the grammar
necessary to recognize what can be an expression or statement. For instance,
a variable declaration statement in C/C++ always ends with a ';'.
- a function definition statement has a basic block (body) which is always
delimited by brackets ({...}).
- reference expressions also tend to end with a ';', for instance a function
reference expression such as " foo(); ".
Therefore I would argue that this should be doable: I do not think you need
a full understanding of C/C++ syntax, only (hopefully) a fairly limited view
of the C/C++ grammar, and the information for this is in the token stream
returned from the lexer. I think it may be a little difficult, though, so I
have to draw on your expertise here. Do you have any ideas for this?
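
Here is a minimal sketch of the grouping I have in mind, written against a
plain vector of token strings rather than Wave's token type. The function
name split_statements is made up, and the rule is deliberately naive: a chunk
ends at a ';' outside any brackets, or at the '}' that closes an outermost
'{'.

```cpp
#include <string>
#include <vector>

// Groups a flat token sequence into statement-like chunks.
std::vector<std::string> split_statements(const std::vector<std::string>& tokens) {
    std::vector<std::string> chunks;
    std::string current;
    int depth = 0;  // current '{' nesting depth

    for (const std::string& tok : tokens) {
        if (!current.empty()) current += ' ';
        current += tok;

        if (tok == "{") {
            ++depth;
        } else if (tok == "}") {
            if (depth > 0) --depth;
            if (depth == 0) {  // closed an outermost block
                chunks.push_back(current);
                current.clear();
            }
        } else if (tok == ";" && depth == 0) {  // end of a top-level statement
            chunks.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) chunks.push_back(current);  // trailing incomplete chunk
    return chunks;
}
```

I am aware that this rule breaks down for things like a class definition
followed by ';' or a brace-enclosed initializer, where the closing '}' is not
the end of the statement, so a real solution would need a few more rules than
this.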

> *extract the value and position of all C and
> > C++ comments
>
> This one is easy. Just enable the preserve comments mode and all the
> comment tokens will be part of the generated output token sequence.

Great. :) What about also making a hook for this within struct
default_preprocessing_hooks?

> Which work should I base my work upon and which
> > data structures should I reimplement and for what? I have
> > looked at the documentation and the code, and it seems easy
> > to do some parts but other are not obvious to me at this point.
>
> Generally you should look at the existing preprocessing hooks and if these
> can provide you with sufficient information. It should be quite straight
> forward to add additional hooks to the library, so any suggestions are
> welcome.
>

It would be very interesting to do some work on this, and it would be useful
to hear what you think about adding the additional hooks we have been
talking about. Maybe these hooks should be better specified.

Regards
Andreas


