Boost logo

Boost :

From: Paul Mensonides (pmenso57_at_[hidden])
Date: 2006-01-17 09:06:54


> -----Original Message-----
> From: boost-bounces_at_[hidden]
> [mailto:boost-bounces_at_[hidden]] On Behalf Of Bradley Smith

Okay, this is going to be a very long reply...

The process of macro expansion is best viewed as the process of scanning for
macro expansions (rather than the process of a single macro expansion). When
the preprocessor encounters a sequence of preprocessing tokens (and whitespace
separations) that needs to be scanned for macro expansion, it has to perform a
number of steps. For example,

#define OBJECT 123

+ X OBJECT +

The preprocessor must scan the sequence of preprocessing tokens (and whitespace
separations) between directives looking for macros to expand. For the sake of
simplicity, say the preprocessor's current position is on the first +. Also for
the sake of simplicity, I'm going to be implicitly skipping whitespace--which
the preprocessor can't actually do--and referring to "preprocessing tokens" as
"tokens"--though there is a difference.

+ X OBJECT +
^

The preprocessor sees that this token is not an identifier (and therefore can't
be a macro name) and outputs it (or gives it to the parser--IOW, it is done with
that token). So, scanning moves on to the next token.

+ X OBJECT +
  ^

The preprocessor sees that this token is an identifier (and therefore might be a
macro name). The first thing it must do is check to see if the particular token
itself is marked as disabled (often called "painted blue" or just "painted").
If it is painted, it outputs the token and moves on. If it is not painted,
which is the case here with X, it looks it up in the symbol table (of the
preprocessor) and finds that X is not a macro name, so it outputs it and moves
on to the next token.

+ X OBJECT +
    ^

Again the preprocessor sees that this token is an identifier, so it checks for
blue paint on the token (which isn't present) and then looks it up in the symbol
table and finds that OBJECT is a macro name. Next, the preprocessor checks if a
disabling context on the OBJECT macro exists (not on a particular token). If
that disabling context does exist, it paints the particular token, outputs the
token, and moves on. If that disabling context doesn't exist, which is the case
here, the preprocessor checks to see what type of macro it is--object-like or
function-like. In this case, it sees that OBJECT is an object-like macro, so it
begins the replacement process. The first step in that process (for an
object-like macro) is retrieving the replacement list of the macro definition
(stored in the symbol table). In this case, that's just 123. The second step
is performing any token-pasting operations inside the replacement list--which is
a no-op in this case. Note that there is no stringizing operator in object-like
macros. (Note also that the preprocessor has to do more with function-like
macros, but I'll get to those later.) After token-pasting, it replaces the
invocation (which consists only of a single token in the object-like macro
scenario) with the token-pasted version of the replacement list--again only 123.
It also creates a geographic disabling context on the macro name (in this case
OBJECT). E.g.

+ X 123 +
   |^ |
   |___|
     |
   OBJECT disabling context (DC)

It then resumes scanning at the first token of the replacement list, and we're
back to the beginning of the process. Everything else is a no-op, so the entire
result is:

+ X 123 +

It is important to note that the replacement list is not scanned for macro
expansion prior to being substituted in place of the invocation. Specifically,
there is no "stack of macro expansions" nor does macro expansion form a
functional hierarchy. Most people think that in a situation like this...

#define A() B()
#define B() 123

A()

...the top level calls A, A calls B, B returns 123 to A, and A returns 123 to
the top level. That is not the way that it works. Rather, the top-level scan
replaces A() with B(), the resumption of the top-level scan replaces B() with
123. More simply put, it is an iterative inlining process, not a recursive
functional application. (This is why I put "returns" in quotes in my previous
reply.)

Continuing, consider a new example (and I'll just diagram the expansion
process)...

#define A B
#define B A

A *
^

 B *
|^|
|_|
 |
 A DC

  A *
||^||
||_||
| | |
| B DC
|___|
  |
  A DC

A' *
   ^

Here, the result is ultimately A' (where the apostrophe indicates blue paint),
because the invocation of B is geographically nested within the disabling
context of A.

It is also important to note that disabling contexts only exist through a single
scan for macro expansion. Thus, it is a closing context as scanning moves
forward. For example:

#define A + + +

A *
^

 + + + *
|^ |
|_____|
   |
   A DC

+ + + *
 |^ |
 |___|
   |
   A DC

+ + + *
   |^|
   |_|
    |
    A DC

+ + + *
      ^

(This distinction does matter, though it may not seem like it.) On to
function-like macros...

Macro expansion of function-like macros behaves in exactly the same way except
that the preprocessor does a few more things.

#define F() 123

F() *
^

The preprocessor sees that the current token is an identifier. As before (and
with any identifier token that it encounters) it checks to see if the token is
painted. If it is, it outputs the token and moves on. In this case, the F
token isn't painted, so the preprocessor looks up F in the symbol table and
finds that it is a macro name. As before, it then checks to see if a disabling
context exists on that macro. If that disabling context does exist, it paints
the token and moves on. If it doesn't exist, which is the case here, it checks
the type (object-like or function-like) of the macro. If it is defined as
object-like, everything proceeds as above. If it is defined as function-like,
which is the case here, then the preprocessor looks at the next token to see if
it is a left parenthesis. If it isn't, the preprocessor outputs the token an
moves on.

[aside... Note that a particular token that refers to a function-like macro
still gets painted if it is found inside a corresponding disabling context even
if there is no attempt (i.e. parentheses) to invoke it.]

If there is a left parenthesis, then invocation must occur or it is an error
(e.g. if there is no closing parenthesis). Given that there is a left
parenthesis, which is the case here, the preprocessor checks the arity of the
macro definition. In this case, it is nullary (i.e. no arguments), so the
preprocessor makes sure there are no tokens between the left and right
parenthesis of the invocation. If there are, it is a "too many arguments"
error. (I'll get to function-like macros with arguments later; I'm trying to
gradually introduce complexity.) Assuming that everything is good, the
preprocessor begins the replacement process. As with object-like macros, the
first step of that process is retrieving the replacement list of the macro
definition. In this case, that's again just 123. The next step is performing
any token-pasting operations and stringizing operations in the replacement
list--which is a no-op here. (Note that this is a function-like macro, so the
stringizing operator exists. Unlike the token-pasting operator, the stringizing
operator must always be applied to a parameter. If it is applied to anything
else, it is an error. In the unique case of a nullary function-like macro,
there are no parameters, so any use of the stringizing operator will result in
an error at this point.) After token-pasting and stringizing (the order between
the two is unspecified), the preprocessor replaces the invocation (which
consists of the macro name token and the parentheses) with the token-pasted
(etc.) version of the replacement list--just 123. As with object-like macros,
it creates a geographic disabling context on the macro name. E.g.

 123 *
|^ |
|___|
  |
  F DC

The preprocessor then resumes scanning at the first token of the replacement
list, and we're again back to the beginning of the process. Everything else is
a no-op, so the entire result is:

123 *

In essence, it is pretty much the same process for nullary function-like macros
as with object-like macros but with a few extra steps. Before moving on to
non-nullary function-like macros, we need to examine what it means for an
invocation to be nested in a disabling context. Consider

#define A() B()
#define B() A()

A() *
^

 A() *
|^ |
|___|
  |
  A invocation (INV)

 B() *
|^ |
|___|
  |
  A DC

  B() *
||^ ||
||___||
| | |
| B INV
|_____|
   |
   A DC

  A() *
||^ ||
||___||
| | |
| B DC
|_____|
   |
   A DC

  A'() *
|| ^ ||
||____||
| | |
| B DC
|______|
   |
   A DC

  A'() *
|| ^||
||____||
| | |
| B DC
|______|
   |
   A DC

A'() *
     ^

This one is pretty obvious, as the invocation of B is clearly nested within the
disabling context of A. However, things are different when an invocation is
*interleaved* with a disabling context:

#define A() B
#define B() A

A()()() *
^

 A() ()() *
|^ |
|___|
  |
  A INV

 B ()() *
|^|
|_|
 |
 A DC

  B () () *
||^| |
||_|___|
| | |
| | B INV
|__|
  |
  A DC

 A () *
|^|
|_|
 |
 B DC

  A () *
||^| |
||_|___|
| | |
| | A INV
|__|
  |
  B DC

 B *
|^|
|_|
 |
 A DC

B *
  ^

IOW, an invocation that is interleaved with a disabling context is not
geographically nested within that context. When scanning resumes after
replacement, that context is no longer active. Thus, in a situation like the
above, A and B will continue to expand (back and forth) indefinitely until a
left-parenthesis is not found as the next token.

(This is the fundamental reason why the REPEAT_FROM_TO example works, but the
CONCAT example does not.)

On to non-nullary function-like macros...

Again, the process is basically the same except that a few more steps are
involved. Everything proceeds exactly the same as will nullary invocations
until the preprocessor checks the arity of the macro--which it finds to be N.
It then takes the delineates the arguments in the argument list and verifies
that there are N arguments. If there are more or less, than you its a "too many
arguments" or "too few arguments" error. Assuming that everything is good, the
preprocessor begins the replacement process. The first step, again, is
retrieving the replacement list of the macro definition. It then replaces
formal parameters in the replacement list with the actual parameters provided in
the invocation. However, it does this in a variety of ways depending on how the
formal parameter appears in the replacement list. If a particular instance of a
formal parameter in the replacement list is an operand of the token-pasting or
stringizing operator (more on this later), the formal parameter is directly
replaced by the actual argument. However, if it is not an operand of one of
either operator, it is scanned on its own for macro expansion. (This scanning
for macro expansion in an argument is the only true instance of recursion in the
macro expansion process.) Though it is scanned independently, any disabling
contexts still exist (more on this later too).

[aside--Note also that an argument is scanned for macro expansion if and *only*
if at least one instance of a the corresponding formal parameter exists in the
replacement list without being an operand of one of the operators. Thus, in
something like

#define EAT(x)
#define BINARY(a, b) // ...

EAT( BINARY() )

This is not an error, even though it would be if any attempt is made to expand
BINARY() with too few arguments. Because there is no occurrence of x in the
replacement list of EAT, the argument should never undergo a scan for macro
expansion. It only undergoes being delineated as an argument.]

After argument substitution, token-pasting and stringizing are performed.
Again, the order is unspecified, as is the associativity of token-pasting.
However, the operands to the token-pasting operator are single tokens--the ones
immediately to the left and right of it. The operand to the stringizing
operator, OTOH, is the entire argument (recall that it must be applied to a
formal parameter). Thus, even though the order of evaluation between
token-pasting and stringizing is unspecified, you can think of a use of the
stringizing operator as being replaced with a third "stringized" version of the
actual argument (rather than substituting the argument in the replacement list,
marking the beginning and ending of it in some way, and then performing
token-pasting and stringizing). Of course, you can't rely on any order
particular of evaluation. Regarding token-pasting... Even though the
token-pasting operator operates on exactly two tokens (rather than entire
arguments), it does affect entire arguments in the sense that it prevents the
entire argument from being scanned for macro expansion prior to substitution.

After token-pasting and stringizing, the preprocessor replaces the invocation
(which consists of the macro name token, the left parenthesis, all of the
arguments, and the right parenthesis) with the
argument-substituted-token-pasted-and-stringized version of the replacement list
and creates a geographic disabling context on the macro name. Note that the
disabling context is created *after* arguments have been expanded (if
necessary), which is why stuff like A(A(1)) expands both A's.

For example (note that it is hard to diagram the recursive scan of the
parameters--so apologies in advance),

#define A(x) B(x, x + 1)
#define B(p, q) p + q

A(7) *
^

 A(7) *
|^ |
|____|
   |
   A INV

 A(7) *
|^ |
|____|
  |
  A INV

    [recursive scan of argument #1]

    7
    ^

    [result = 7]

  B(7, 7 + 1) *
||^ ||
||___________||
| | |
| B INV |
|_____________|
     |
     A DC

    [recursive scan of argument #1]

        7
        ^
   _ _ ___ _ _
        |
        A DC

    [result = 7]

    [recursive scan of argument #2]

        7 + 1
        ^
    _ _ _____ _ _
          |
          A DC

    (etc.)

    [result = 7 + 1]

  7 + 7 + 1 *
||^ ||
||_________||
| | |
| B DC |
|___________|
     |
     A DC

  7 + 7 + 1 *
|| ^ ||
||_________||
| | |
| B DC |
|___________|
     |
     A DC

(etc.)

7 + 7 + 1 *
          ^

Okay, now things get interesting... As with nullary function-like macro
invocations, an interleaved invocation is not nested, but when arguments are
involved, an invocation inside an argument might be nested in a disabling
context even though the invocation to which it is an argument is not nested in
that context. (Very very few preprocessors get this completely right, so it is
dangerous to mess with.) E.g.

#define A(m) m( B(f)
#define B(x) A(x)

#define C(x) < x >

A(C) ) *
^

 A(C) ) *
|^ |
|____|
  |
  A INV

  // recursive scan of argument #1 results in C

 C( B(f) ) *
|^ |
|_______|
    |
    A DC

  C( B(f) ) *
||^ | |
||_______|_|
| | |
| C INV
|________|
    |
    A DC

    // recursive scan of argument #1...

        B(f)
        ^ |
    _ _ ____|
          |
          A DC

        B(f)
       |^ ||
       |____||
          | |
          B INV
    _ _ _____|
         |
         A DC

        // recursive scan of argument #1 (to B) results in f

        A(f)
       |^ ||
       |____||
          | |
          B DC
    _ _ _____|
          |
          A DC

        A'(f)
       | ^ ||
       |_____||
          | |
          B DC
    _ _ ______|
          |
          A DC

    (etc.)

    // result = A'(f)

 < A'(f) > *
|^ |
|_________|
     |
     C DC

 < A'(f) > *
| ^ |
|_________|
     |
     C DC

 < A'(f) > *
| ^ |
|_________|
     |
     C DC

(etc.)

< A'(f) > *
          ^

Phew. As you can see, things can get complex in scenarios like this.

In any case, that's a relatively detailed overview of macro expansion. Applying
it to the CONCAT example...

#define CONCAT_1(a, b) CONCAT_1_D(a, b)
#define CONCAT_1_D(a, b) a ## b

#define CONCAT_2(a, b) CONCAT_2_D(a, b)
#define CONCAT_2_D(a, b) a ## b

#define AB(c, x, y) CONCAT_ ## c(x, y)
#define CONCAT CONCAT_1

CONCAT(A, B(1, p, q))
^

 CONCAT (A, B(1, p, q))
|^ |
|______|
   |
   CONCAT INV

 CONCAT_1 (A, B(1, p, q))
|^ |
|________|
    |
    CONCAT DC

  CONCAT_1 (A, B(1, p, q))
||^ | |
||________|_______________|
| | |
|_________| CONCAT_1 INV
     |
     CONCAT DC

    // recursive scan of argument #1 results in: A
    // recursive scan of argument #2 results in: B(1, p, q)

 CONCAT_1_D(A, B(1, p, q))
|^ |
|_________________________|
             |
             CONCAT_1 DC

  CONCAT_1_D(A, B(1, p, q))
||^ ||
||_________________________||
| | |
| CONCAT_1_D INV |
|___________________________|
             |
             CONCAT_1 DC

    // no recursive scan of argument #1 (only used as ## operand)
    // no recursive scan of argument #2 (only used as ## operand)

    // token-pasting:

    -> A ## B(1, p, q)
    -> AB(1, p, q)

  AB(1, p, q)
||^ ||
||___________||
| | |
| CONCAT_1_D DC
|_____________|
       |
       CONCAT_1 DC

   AB(1, p, q)
|||^ |||
|||___________|||
|| | ||
|| AB INV ||
||_____________||
| | |
| CONCAT_1_D DC
|_______________|
        |
        CONCAT_1 DC

    // no recursive scan of argument #1 (only used as ## operand)
    // recursive scan of argument #2 results in: p
    // recursive scan of argument #3 results in: q

    // token-pasting:

    -> CONCAT_ ## 1(p, q)
    -> CONCAT_1(p, q)

   CONCAT_1(p, q)
|||^ |||
|||______________|||
|| | ||
|| AB DC ||
||________________||
| | |
| CONCAT_1_D DC
|__________________|
          |
          CONCAT_1 DC

   CONCAT_1'(p, q)
||| ^ |||
|||_______________|||
|| | ||
|| AB DC ||
||_________________||
| | |
| CONCAT_1_D DC
|___________________|
          |
          CONCAT_1 DC

(etc.)

CONCAT_1'(p, q)
               ^
--------------------

Note that the disabling context on the object-like macro CONCAT is long gone,
but that doesn't really help because we still get the disabling context on
CONCAT_1.

--------------------

CONCAT(A, B(2, p, q))
^

 CONCAT (A, B(2, p, q))
|^ |
|______|
   |
   CONCAT INV

 CONCAT_1 (A, B(2, p, q))
|^ |
|________|
    |
    CONCAT DC

  CONCAT_1 (A, B(2, p, q))
||^ | |
||________|_______________|
| | |
|_________| CONCAT_1 INV
     |
     CONCAT DC

    // recursive scan of argument #1 results in: A
    // recursive scan of argument #2 results in: B(2, p, q)

 CONCAT_1_D(A, B(2, p, q))
|^ |
|_________________________|
             |
             CONCAT_1 DC

  CONCAT_1_D(A, B(2, p, q))
||^ ||
||_________________________||
| | |
| CONCAT_1_D INV |
|___________________________|
             |
             CONCAT_1 DC

    // no recursive scan of argument #1 (only used as ## operand)
    // no recursive scan of argument #2 (only used as ## operand)

    // token-pasting:

    -> A ## B(2, p, q)
    -> AB(2, p, q)

  AB(2, p, q)
||^ ||
||___________||
| | |
| CONCAT_1_D DC
|_____________|
       |
       CONCAT_1 DC

   AB(2, p, q)
|||^ |||
|||___________|||
|| | ||
|| AB INV ||
||_____________||
| | |
| CONCAT_1_D DC
|_______________|
        |
        CONCAT_1 DC

    // no recursive scan of argument #1 (only used as ## operand)
    // recursive scan of argument #2 results in: p
    // recursive scan of argument #3 results in: q

    // token-pasting:

    -> CONCAT_ ## 2(p, q)
    -> CONCAT_2(p, q)

   CONCAT_2(p, q)
|||^ |||
|||______________|||
|| | ||
|| AB DC ||
||________________||
| | |
| CONCAT_1_D DC
|__________________|
          |
          CONCAT_1 DC

    CONCAT_2(p, q)
||||^ ||||
||||______________||||
||| | |||
||| CONCAT_2 INV
|||________________|||
|| | ||
|| AB DC ||
||__________________||
| | |
| CONCAT_1_D DC
|____________________|
           |
           CONCAT_1 DC

    // recursive scan of argument #1 results in: p
    // recursive scan of argument #2 results in: q

    CONCAT_2_D(p, q)
||||^ ||||
||||________________||||
||| | |||
||| CONCAT_2 DC
|||__________________|||
|| | ||
|| AB DC ||
||____________________||
| | |
| CONCAT_1_D DC
|______________________|
            |
            CONCAT_1 DC

     CONCAT_2_D(p, q)
|||||^ |||||
|||||________________|||||
|||| | ||||
|||| CONCAT_2_D INV
||||__________________||||
||| | |||
||| CONCAT_2 DC |
|||____________________|||
|| | ||
|| AB DC ||
||______________________||
| | |
| CONCAT_1_D DC
|________________________|
             |
             CONCAT_1 DC

    // no recursive scan of argument #1 (only used as ## operand)
    // no recursive scan of argument #2 (only used as ## operand)

    // token-pasting:

    -> p ## q
    -> pq

     pq
|||||^ |||||
|||||__|||||
|||| | ||||
|||| CONCAT_2_D DC
||||____||||
||| | |||
||| CONCAT_2 DC
|||______|||
|| | ||
|| AB DC
||________||
| | |
| CONCAT_1_D DC
|__________|
      |
      CONCAT_1 DC

pq
   ^

Okay, now we'll get a little fancy:

#define INTERNAL_CAT(a, b) INTERNAL_PRIMITIVE_CAT(a, b)
#define INTERNAL_PRIMITIVE_CAT(a, b) a ## b

#define EMPTY()

// ----- //

#define CAT_1(a, b) PRIMITIVE_CAT_1(a, b)
#define CAT_1_ID() CAT_1
#define PRIMITIVE_CAT_1(a, b) a ## b

#define CAT_2(a, b) PRIMITIVE_CAT_2(a, b)
#define CAT_2_ID() CAT_2
#define PRIMITIVE_CAT_2(a, b) a ## b

#define CAT_3(a, b) PRIMITIVE_CAT_3(a, b)
#define CAT_3_ID() CAT_3
#define PRIMITIVE_CAT_3(a, b) a ## b

#define CAT_4(a, b) PRIMITIVE_CAT_4(a, b)
#define CAT_4_ID() CAT_4
#define PRIMITIVE_CAT_4(a, b) a ## b

// ----- //

#define CAT TRY_1()

#define TRY_1() INTERNAL_CAT(TRY_1_, CAT_1(0, 0))()
#define TRY_2() INTERNAL_CAT(TRY_2_, CAT_2(0, 0))()
#define TRY_3() INTERNAL_CAT(TRY_3_, CAT_3(0, 0))()
#define TRY_4() INTERNAL_CAT(TRY_4_, CAT_4(0, 0))()

#define TRY_1_00 CAT_1_ID
#define TRY_2_00 CAT_2_ID
#define TRY_3_00 CAT_3_ID
#define TRY_4_00 CAT_4_ID

#define TRY_1_CAT_1(a, b) TRY_2
#define TRY_2_CAT_2(a, b) TRY_3
#define TRY_3_CAT_3(a, b) TRY_4
#define TRY_4_CAT_4(a, b) error: ran out of CAT depth! EMPTY

// ----- //

CAT(C, AT(C, AT(1, 2))) // 12

-------------------------

To check you're understanding, you might try to diagram this expansion. I'll
leave that as an exercise, though, this reply is already very long.

This is an example of how one might detect the next available CAT
implementation. It does a linear search. It tries to expand CAT_1. If that
succeeds, CAT_1 is "returned". If not, it tries CAT_2, and so on. This is a
less than ideal way to do it (both the linear search and macro replication), but
it's how you might do it from scratch. The pp-lib uses a binary search rather
than a linear search, and encapsulates as much as possible:

#include <boost/preprocessor/detail/auto_rec.hpp>

#define INTERNAL_CAT(a, b) INTERNAL_PRIMITIVE_CAT(a, b)
#define INTERNAL_PRIMITIVE_CAT(a, b) a ## b

// ----- //

#define CAT_1(a, b) PRIMITIVE_CAT_1(a, b)
#define PRIMITIVE_CAT_1(a, b) a ## b

#define CAT_2(a, b) PRIMITIVE_CAT_2(a, b)
#define PRIMITIVE_CAT_2(a, b) a ## b

#define CAT_3(a, b) PRIMITIVE_CAT_3(a, b)
#define PRIMITIVE_CAT_3(a, b) a ## b

#define CAT_4(a, b) PRIMITIVE_CAT_4(a, b)
#define PRIMITIVE_CAT_4(a, b) a ## b

// ----- //

#define CAT INTERNAL_CAT(CAT_, BOOST_PP_AUTO_REC(CAT_P, 4))
#define CAT_P(n) INTERNAL_CAT(CAT_, CAT_ ## n(0, 0))

#define CAT_00 1
#define CAT_CAT_1(a, b) 0
#define CAT_CAT_2(a, b) 0
#define CAT_CAT_3(a, b) 0
#define CAT_CAT_4(a, b) 0

CAT(C, AT(C, AT(1, 2))) // 12

-------------------------

Of course, we can do better than that:

#include <boost/preprocessor/detail/auto_rec.hpp>
#include <boost/preprocessor/detail/is_nullary.hpp>

#define INTERNAL_CAT(a, b) INTERNAL_PRIMITIVE_CAT(a, b)
#define INTERNAL_PRIMITIVE_CAT(a, b) a ## b

// ----- //

#define CAT_1(a, b) PRIMITIVE_CAT_1(a, b)
#define PRIMITIVE_CAT_1(a, b) a ## b

#define CAT_2(a, b) PRIMITIVE_CAT_2(a, b)
#define PRIMITIVE_CAT_2(a, b) a ## b

#define CAT_3(a, b) PRIMITIVE_CAT_3(a, b)
#define PRIMITIVE_CAT_3(a, b) a ## b

#define CAT_4(a, b) PRIMITIVE_CAT_4(a, b)
#define PRIMITIVE_CAT_4(a, b) a ## b

// ----- //

#define CAT INTERNAL_CAT(CAT_, BOOST_PP_AUTO_REC(CAT_P, 4))
#define CAT_P(n) BOOST_PP_IS_NULLARY(INTERNAL_CAT(CAT_, CAT_ ## n(0, 0)))

#define CAT_00 ()

// ----- //

CAT(C, AT(C, AT(C, AT(1, 2)))) // 12

-------------------------

These examples are just trying to give you an idea of what REPEAT_FROM_TO is
doing to make it appear that it is recursive.

It might make things more concrete with a possible implementation strategy. The
single most design-affecting part of this whole process is the disabling
contexts--everything else is pretty straightforward--so I'll start with that.
Recall that disabling contexts can be viewed as closing contexts because they
are only active during a single scan for macro expansion. (I said before that
this matters, but I wasn't referring to possible implementation strategies, but
rather what one can do because of that behavior. I won't get into that here,
however.) Because disabling contexts are only active during a single scan for
macro expansion, they can be implemented using flags in the symbol table and
virtual tokens. I.e. instead of maintaining an expensive structure similar to
my diagrams, at the point when an invocation is replaced by the replacement
list, the preprocessor can set a "disabled" flag in the symbol table on the
macro definition. It can then put a virtual token at the end of the replacement
list (diagramed as <MACRO>) that when encountered resets that flag (and is
deleted). E.g.

#define A() ...

 A() A() *
|^ |
|___|
  |
  A INV

(set flag on A in symbol table)

... <A> A() *
^

... <A> A() *
     ^ (reset flag on A)

... A() *
   |^ |
   |___|
     |
     A INV

(set flag on A)

... ... <A> *
    ^

... ... <A> *
         ^ (reset flag on A)

... ... *
        ^

This is pretty straightforward. It is a little more complex when parameters are
involved. For any given formal parameter, how it is handled during substitution
is a property of the formal parameter itself. It is known at the time of
definition what variations of a particular formal parameter are needed. I.e. if
the parameter is unused no variation of it is needed, if it is used without
being an operand of # or ## then a macro-expanded variation is needed, if it is
used as an operand of # or ## then a non-macro-expanded variation is needed.
Note that both of the last two variations might be needed for a given parameter
(i.e. they aren't mutually exclusive). Starting with the easy case--when no
variation of the parameter is needed... This case is easy because the actual
parameter is just going to be discarded. However, prior to being discarded, the
argument has to be scanned in order to execute any virtual tokens that it might
contain. After it does that, it can be discarded, and the preprocessor is done
with that parameter. In the case where only the macro-expanded variation of the
parameter is required, it just needs to be scanned for macro replacement as
normal. In the case where only the non-macro-expanded variation is required, it
needs to be scanned 1) to paint tokens and 2) to execute any virtual tokens that
it might contain (deleting them after it executes them). The only hard case is
when both a macro-expanded variation of the parameter is required and a
non-macro-expanded variation of the same parameter is required. In that case,
the preprocessor has to do both of the last two cases. The semi-hard part is
that it has to start both scenarios with the same state--meaning all the same
flags must be set, etc.. This can be accomplished by doing the
non-macro-expanded variation first, but keeping track of every virtual token
that gets executed so that all of the resets can be undone before the
preprocessor starts the recursive scan-for-macro-expansion on the argument.

> Thanks for your time Paul. I tried responding personally but
> your mail server considers me to be a spammer ;-)

Hmm. I don't have any filters set up on the server itself, so that's strange.

Regards,
Paul Mensonides


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk