

Subject: Re: [Boost-users] Boost.Wave getting raw input tokens for code transformation
From: Hartmut Kaiser (hartmut.kaiser_at_[hidden])
Date: 2012-11-28 07:20:07


> I'm trying to utilize Wave for simple code transformations. My starting
> point is to do an identity mapping through Wave: read in a source file,
> tokenize it, and output it again. The output shall be the same as the
> input on a byte-by-byte basis. I modified the cpp_lexer example to just
> output the token.value() text. From that I discovered that the whitespace
> in "# define" is dropped. OK, I fixed this by modifying the Wave source code.
>
> Now my main problem is line continuations using a backslash, as in
> multiline macro definitions. Those get processed at a very low level, I
> guess. The backslash will never be reported as a token, but the adjacent
> characters will be fused into one token. Is there a way to define a "raw"
> token mode that would report the backslash and keep the adjacent
> characters as distinct tokens? I don't understand the tokenizer level at
> all.

Backslash/EOL character sequences are handled below the tokenizer, before
the lexer even processes the input stream. If you want to disable that,
modify the function is_backslash()
(boost/libs/wave/src/cpplexer/re2clex/cpp_re.cpp, line 182) to always return
'false'. You may also have to adjust the code starting at line 295 so that it
no longer checks for backslashes. I'm not sure, though, what consequences
this change might have for the rest of the library.

If you come up with a general solution that can be controlled by a flag or
similar, I'd be happy to accept a patch.
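
To make the idea concrete, here is a rough, standalone sketch of what a
flag-controlled splicing step could look like. It is not Wave code: the
function name preprocess_continuations and the boolean parameter splice are
made up for illustration, and the real logic in cpp_re.cpp works on its own
scanner buffers rather than on std::string. With the flag on, a backslash
directly followed by a newline is removed, so the neighbouring characters
fuse; with the flag off, the input passes through byte for byte.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, NOT part of Boost.Wave. It models the line-splicing
// step that runs before tokenization.
//   splice == true : drop every backslash that is immediately followed by
//                    "\n" or "\r\n", fusing the adjacent characters.
//   splice == false: return the input unchanged ("raw" mode).
std::string preprocess_continuations(const std::string& in, bool splice)
{
    if (!splice)
        return in;

    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\\') {
            // Backslash + LF: skip both characters.
            if (i + 1 < in.size() && in[i + 1] == '\n') {
                ++i;
                continue;
            }
            // Backslash + CR LF: skip all three characters.
            if (i + 2 < in.size() && in[i + 1] == '\r' && in[i + 2] == '\n') {
                i += 2;
                continue;
            }
        }
        out.push_back(in[i]);
    }
    return out;
}

// Small usage example: the same input with and without splicing.
inline void example_usage()
{
    const std::string src = "#define A 1 \\\n+ 2\n";
    assert(preprocess_continuations(src, true) == "#define A 1 + 2\n");
    assert(preprocess_continuations(src, false) == src);
}
```

In raw mode the backslash and the newline survive, so a lexer running on the
result would still need a rule that turns them into tokens of their own. The
sketch only covers the part below the tokenizer.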

Regards Hartmut
---------------
http://boost-spirit.com
http://stellar.cct.lsu.edu

