Subject: Re: [boost] [preprocessor] Sequences vs. All Other Data Structures
From: Paul Mensonides (pmenso57_at_[hidden])
Date: 2012-04-26 20:55:16


On 4/25/2012 5:01 AM, Paul Mensonides wrote:

> However, it is also not *nearly* as complex as the core language--and
> therefore not nearly as difficult to do correctly. In fact, in the last
> 24 hours, I implemented a macro expansion algorithm that, AFAICT, works
> flawlessly--even in all the corner cases. (Grain of salt... I haven't
> tested it very well, however, and it doesn't have a lexer or parser
> attached to it, so it is not a "preprocessor"--it just takes a sequence
> of preprocessing tokens and a set of existing macro definitions, and
> does the macro replacement therein.)

Implementation of the above attached--probably lots of room for
improvement and optimization. Apologies for the mega function (in
particular--I didn't feel like refactoring) and the (non-portable) shell
color-coding (I wanted to see blue paint). I built it with g++ 4.7.0,
but I believe 4.6+ should also work.

$ g++ -std=c++11 -I $CHAOS_ROOT 1.cpp

If this wasn't just a toy, I'd probably replace the symbol table with a
trie (radix tree) populated during lexical analysis--especially with how
many common prefixes you get in C++. Also, I'd avoid the various string
comparisons and just compare iterators--which would be faster and
improve locality. Also, for output to a tty (i.e. preprocess only) there
needs to be a state machine to judiciously insert whitespace to prevent
erroneous re-tokenization by a later tool. Such generated whitespace can
no longer affect the semantics of the program (whereas earlier, during
macro replacement, it can, thanks to stringizing).
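
To make the whitespace point concrete, here is a rough sketch of the kind of
check such an output stage might perform. It is not part of the attached
code: the names is_word, is_op, needs_space, and render are made up for
illustration, and it is a stateless pairwise test rather than a full state
machine. It is deliberately conservative (it may emit a space that is not
strictly needed) and it assumes each token is already a single spelled
preprocessing token.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Characters that can continue an identifier or a pp-number.
static bool is_word(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Characters that can take part in a multi-character punctuator
// or open a comment ("//", "/*").
static bool is_op(char c) {
    return std::string("+-*/%<>=!&|^:#.").find(c) != std::string::npos;
}

// Conservative test (hypothetical helper): could writing `a` directly
// followed by `b` be lexed as something other than the two tokens a, b?
bool needs_space(const std::string& a, const std::string& b) {
    if (a.empty() || b.empty()) return false;
    const char l = a.back(), r = b.front();
    const bool a_is_number =
        std::isdigit(static_cast<unsigned char>(a[0])) ||
        (a.size() > 1 && a[0] == '.' &&
         std::isdigit(static_cast<unsigned char>(a[1])));
    // x y -> xy, 1 2 -> 12, L "s" -> L"s"
    if (is_word(l) && (is_word(r) || r == '"' || r == '\'')) return true;
    // A pp-number can absorb a following '.', or a sign after e/E/p/P.
    if (a_is_number &&
        (r == '.' || ((r == '+' || r == '-') &&
                      std::string("eEpP").find(l) != std::string::npos)))
        return true;
    // + + -> ++, - > -> ->, / / -> comment, and so on.
    if (is_op(l) && is_op(r)) return true;
    // . 5 -> .5
    if (l == '.' && std::isdigit(static_cast<unsigned char>(r))) return true;
    // "s" _x -> "s"_x (literal suffix)
    if ((l == '"' || l == '\'') && is_word(r)) return true;
    return false;
}

// Join tokens, inserting a space only where needs_space asks for one.
std::string render(const std::vector<std::string>& toks) {
    std::string out;
    for (std::size_t i = 0; i < toks.size(); ++i) {
        if (i > 0 && needs_space(toks[i - 1], toks[i])) out += ' ';
        out += toks[i];
    }
    return out;
}
```

With this sketch, the tokens P ( 1 , 2 ) come out as P(1,2), while two
adjacent + tokens come out as "+ +" instead of fusing into ++.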

As one can see from this code, a recursive call to the macro replacement
scanner only occurs in one spot: when preparing an actual argument for a
formal argument that is used in the replacement "in the open". Aside
from that, the blue paint, and context changes (implemented here via
virtual tokens), this is a classical stream editor. Macro invocations
are found in the stream, replaced (in the stream) by their replacement
lists (without macro replacement), and scanning resumes at the first
token from the replacement list.
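
For readers without the attachment, the stream-editor idea can be sketched
in a few lines. This is a simplified illustration, not the attached
implementation: it handles object-like macros only (no arguments, no # or
##), and the names Token, Macros, expand, and show are invented for the
example. A virtual token marks where a macro's replacement list ends, and
an identifier found while its own macro is disabled gets painted.

```cpp
#include <list>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical token type for this sketch.
struct Token {
    std::string text;
    bool painted;   // "blue paint": this identifier must never be expanded
    bool end_ctx;   // virtual token closing the context of macro `text`
};

// Object-like macros only: name -> replacement list.
using Macros = std::map<std::string, std::vector<std::string>>;

// Edit the token stream in place: replace each macro name by its
// replacement list (unexpanded) and resume scanning at the first token
// of that list.
std::vector<Token> expand(const std::vector<std::string>& input,
                          const Macros& macros) {
    std::list<Token> stream;
    for (const std::string& t : input)
        stream.push_back(Token{t, false, false});

    std::set<std::string> disabled;  // macros whose context is open
    std::vector<Token> out;
    auto it = stream.begin();
    while (it != stream.end()) {
        if (it->end_ctx) {                       // leaving a replacement list
            disabled.erase(it->text);
            it = stream.erase(it);
            continue;
        }
        auto m = macros.find(it->text);
        if (m == macros.end() || it->painted) {  // ordinary or painted token
            out.push_back(*it++);
            continue;
        }
        if (disabled.count(it->text)) {          // found inside its own context
            it->painted = true;
            out.push_back(*it++);
            continue;
        }
        const std::string name = it->text;
        it = stream.erase(it);                   // drop the macro name
        // Insert the virtual token, then the replacement list in front of it.
        auto resume = stream.insert(it, Token{name, false, true});
        for (auto r = m->second.rbegin(); r != m->second.rend(); ++r)
            resume = stream.insert(resume, Token{*r, false, false});
        disabled.insert(name);
        it = resume;                             // rescan from first new token
    }
    return out;
}

// Render the result, marking painted tokens with a trailing '*'.
std::string show(const std::vector<Token>& toks) {
    std::string s;
    for (const Token& t : toks) {
        if (!s.empty()) s += ' ';
        s += t.text;
        if (t.painted) s += '*';
    }
    return s;
}
```

Given the definitions "#define A A + B" and "#define B A 1", the input
A B comes out of this sketch as A* + A* 1 A* + B* 1, where * marks a
painted token. Function-like macros additionally need argument collection
and the single recursive call described above.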

The input is:

#define O 0 +X ## Y A
#define A() A() B
#define B() B() A
#define C(x, y) x ## y
#define D(...) D D ## __VA_ARGS__ __VA_ARGS__ #__VA_ARGS__
#define ID(...) __VA_ARGS__
#define P(p, x) p ## x(P

O()
A()()()
C(C,)
D(D(0, 1))
ID (
     1
)
P(,ID)(1,2),P(1,2)))

Given no lexer/parser, the above is entered by hand in the code. The
output is:

0 <space> + XY <space> A ( ) <space> B <newline>
A ( ) <space> B ( ) <space> A ( ) <space> B <newline>
C <newline>
D <space> DD ( 0 , <space> 1 ) <space> D <space> D0 , <space> 1 <space> 0 , <space> 1 <space> "0, 1" <space> "D(0, 1)" <newline>
<space> <tab> 1 <space> <newline>
P ( 1 , 2 ) , 12 ( P ) <newline>

which is correct.

g++ outputs:

0 +XY A() B
A() B() A() B
C
D DD(0, 1) D D0, 1 0, 1 "0, 1" "D(0, 1)"
1
P(1,2),12(P)

which is correct.

cl outputs:

0 +XY A() B
A() B() A() B
C
D DD D0, 1 0, 1 "0, 1" D D0, 1 0, 1 "0, 1" "D(0, 1)"
1
12(P,12(P)

where the 4th and 6th lines are wrong.

wave outputs:

0 +XY A() B
A() B() A() B
C
D DD(0, 1) D D0, 1 0, 1 "0, 1" "D(0, 1)"
1
error: improperly terminated macro invocation or replacement-list
terminates in partial macro expansion (not supported yet): missing ')'

Regards,
Paul Mensonides



