Boost :

From: Paul Mensonides (pmenso57_at_[hidden])
Date: 2004-11-23 20:52:55


> -----Original Message-----
> From: boost-bounces_at_[hidden]
> [mailto:boost-bounces_at_[hidden]] On Behalf Of Arias Brent-P96059

> > Have you seen
> http://www.cuj.com/documents/s=8470/cujboost0306besser/?
>
>
> Yes, I have seen that before. I did not use it for several
> reasons: (1) I think the source for Besser's ENUM was not
> easily available, (2) my employer gets nervous when code
> "looks" too C++ oriented and gets nervous when macros become
> too "involved" - which means "lengthy" (3) Besser's ENUMX
> macro requires that you specify the number of parameters
> (e.g. ENUM3 or ENUM23).

I believe that he got rid of that (really annoying) issue. At first, that code
was written more like the pp-lib itself is written--repetition of macros. That
almost defeats the purpose of using the pp-lib at all. The pp-lib's strength is
that it provides basic interfaces that enable you to build more complex
interfaces or solutions. That basic interface provides, among other things, an
abstraction layer over manual repetition of macros--so clients don't have to do
it. Later on, I think that the code changed to using sequences.

> Reason #3 was a real hurdle for me - because I'm trying to
> provide this macro to other developers who might "rebel" if
> using the macro requires from them non-intuitive new concepts
> or exercises. That is, if they didn't have to count the
> number of parameters in the enumeration before - they may
> well reject the ENUM macro if it starts to require them to
> count the entries.

I definitely understand that.

> So, I set out to make an ENUM macro that (1) didn't need its
> entries counted by the developer and (2) had a short
> definition (which makes it less daunting for "management" buy-in).

I understand that as well. :(

> You said:
>
> >there are still basic notions of computation, etc., such as control
> >flow--which is nothing but a conditional "invocation"
> >of its accompanying code. Those kinds of things still apply at the
> >preprocessor level. What you really needed was a way to
> conditionally
> >enumerate some sequence elements, not enumerate a
> conditional sequence.
>
> My implementation was "backwards" because I was unable to
> make the code compile otherwise. That is, depending on the
> value of ENUMS, the following code did not compile:

Sorry, I'm getting way too used to variadics. If SEQ_ENUM generates more than
one comma-separated element, then the comma will interfere with the internals of
EXPR_IIF. E.g.

#define CAT(a, b) PRIMITIVE_CAT(a, b)
#define PRIMITIVE_CAT(a, b) a ## b

#define EXPR_IIF(bit, expr) PRIMITIVE_CAT(EXPR_IIF_, bit)(expr)
#define EXPR_IIF_0(expr)
#define EXPR_IIF_1(expr) expr

#define COMMA() ,

EXPR_IIF(0, a COMMA() b)
EXPR_IIF(1, a COMMA() b)

Both of these invocations will (i.e. "should") fail. For 'expr', EXPR_IIF gets
the argument "a COMMA() b". That is fine because it is a single argument.
However, macro arguments (by default) are expanded _before_ they replace the
formal parameter instances in the replacement list. The argument "a COMMA() b"
expands to "a , b". With that, the expansion proceeds as follows:

EXPR_IIF(0, a COMMA() b)
    0 -> 0
    a COMMA() b -> a , b

-> PRIMITIVE_CAT(EXPR_IIF_, 0)(a , b)
-> EXPR_IIF_0(a , b) // error: too many arguments

The same thing occurs if 'bit' is 1, and that is what is occurring with EXPR_IIF
and SEQ_ENUM.

BTW, an argument that expands to include open commas (i.e. it becomes more than
one argument when passed again) is called (by me) an "intermediate". Dealing
with intermediates is one of the major reasons why the pp-lib source is as
hacked up as it is because many preprocessors don't do the expansion order
correctly. Using intermediates requires that a single argument become some
known number of arguments at a fairly particular time.

BTW2, for anyone that may be interested, placemarkers (which are, in essence,
what C99 uses to deal with empty arguments) can be used to prevent argument
expansion pretty much indefinitely. E.g.

#define A(x) B(x)
#define B(x) C(x)
#define C(x) D(x)
#define D(x) x

A(x)

Here 'x' gets scanned for macro expansion once on entry to A, once on entry to
B, once on entry to C, once on entry to D, and once again during the rescan of
D's replacement list. It gets scanned for macro expansion a total of five
times. However,

#define A(p, x) B(, p ## x)
#define B(p, x) C(, p ## x)
#define C(p, x) D(, p ## x)
#define D(p, x) p ## x

A(, x)

Here 'x' gets scanned once--during the rescan of D's replacement list. (You
can't get rid of that one.) The point of this is optimization. Scanning
preprocessing tokens for expansion can become expensive (especially when the
scan is a no-op--which the preprocessor cannot assume). For example, the
following code...

#define A(x) B(B(B(B(B(B(x))))))
#define B(x) C(C(C(C(C(C(x))))))
#define C(x) D(D(D(D(D(D(x))))))
#define D(x) x

...has more computational power than the pp-lib. The 'x' argument gets scanned
260 times. If there were nine macros (instead of four), the scan count would be
in the millions. With thirteen, the scan count would be in the billions. It is
a sum of powers plus one scan on entry to A. If 'n' is the number of macros, and
'b' is the parametric base (i.e. the number of "M(M(M(M(M(M(..." in the replacement
lists), then the scan count is 1 + the summation from k=0 to n-1 of b^k -- which
can be rewritten as 1+(1-b^n)/(1 - b).
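
As a quick sanity check of the formula above, here is a minimal sketch in C that
evaluates the sum directly. The function name scan_count is hypothetical (it is
not part of the pp-lib), and the sketch assumes the result fits in an unsigned
long long.

```c
/* Hypothetical helper, not part of the pp-lib: computes
   1 + sum_{k=0}^{n-1} b^k, i.e. the scan count for n chained macros
   whose replacement lists each nest the next macro b times. */
static unsigned long long scan_count(unsigned n, unsigned long long b)
{
    unsigned long long total = 1; /* the extra scan on entry to A */
    unsigned long long power = 1; /* b^k, starting at k = 0 */
    unsigned k;
    for (k = 0; k < n; ++k) {
        total += power;
        power *= b;
    }
    return total;
}
```

With b = 6, scan_count(4, 6) gives 260, scan_count(9, 6) gives 2015540, and
scan_count(13, 6) gives 2612138804, which agrees with the figures quoted above.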

[If anyone actually wants to see this in action, try this on a compliant
preprocessor (such as gcc)...

    // file.c
    #define CAT(a, b) PRIMITIVE_CAT(a, b)
    #define PRIMITIVE_CAT(a, b) a ## b

    #define SPLIT(i, im) PRIMITIVE_CAT(SPLIT_, i)(im)
    #define SPLIT_0(a, b) a
    #define SPLIT_1(a, b) b

    #define IS_NULLARY(x) \
        SPLIT(0, CAT(IS_NULLARY_R_, IS_NULLARY_C x)) \
        /**/
    #define IS_NULLARY_C() 1
    #define IS_NULLARY_R_1 1, ~
    #define IS_NULLARY_R_IS_NULLARY_C 0, ~

    #define EMPTY()

    #define EAT(x)

    #define IIF(bit) PRIMITIVE_CAT(IIF_, bit)
    #define IIF_0(t, f) f
    #define IIF_1(t, f) t

    #define A(x) B(B(B(B(B(B(x))))))
    #define B(x) C(C(C(C(C(C(x))))))
    #define C(x) D(D(D(D(D(D(x))))))
    #define D(x) E(E(E(E(E(E(x))))))
    #define E(x) x

    #define KILL(x) x
    #define IS_ALIVE() IS_NULLARY(KILL(()))

    #define TEXT(t) \
        IIF(IS_ALIVE())(TEXT_A, EAT)(t) \
        /**/
    #define TEXT_A(t) t TEXT_B EMPTY()()(t)
    #define TEXT_B() TEXT

    KILL( A( TEXT( Boost ) ) )

gcc --pedantic -x c++ -std=c++98 -E -P file.c

...this generates 'Boost' more than 1500 times.

A word of warning: don't actually do this in real code, especially with
something that generates lots and lots of scans because even though 't' (called,
by me, an "active argument") eventually stops expanding (referred to, by me, as
"reaching a terminal state") the preprocessor still performs the scans
(etc.)--even though the scans might be no-ops. Scanning, for example, the
single preprocessing token + billions of times (and the invocations required to
generate those scans) is no small operation--in fact, it takes a *really*
*really* *really* long time to generate a single + if you don't run out of
memory first (which you probably will).]

Obviously, scanning itself can become expensive. It is interesting that the
most important benefit of placemarkers may well be that they provide a means to
prevent argument expansion rather than a means to pass empty arguments (which
is also very useful).
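
One way to actually observe the delayed expansion described above is to
stringize the argument at the end of each chain. The following is only a sketch:
all macro names in it (X, EA..ED, PA..PD) are made up for illustration, and it
assumes a preprocessor that accepts empty macro arguments (C99 or later).

```c
/* Illustrative names only; requires C99 for empty macro arguments. */
#define X 1

/* Ordinary chain: the argument is macro-expanded on entry to EA,
   so ED receives 1, not X. */
#define EA(x) EB(x)
#define EB(x) EC(x)
#define EC(x) ED(x)
#define ED(x) #x

/* Placemarker chain: x is always an operand of ##, so it is passed
   down without being expanded. */
#define PA(p, x) PB(, p ## x)
#define PB(p, x) PC(, p ## x)
#define PC(p, x) PD(, p ## x)
#define PD(p, x) #x

static const char eager[]   = EA(X);   /* "1": X was expanded on the way down */
static const char delayed[] = PA(, X); /* "X": X reached PD unexpanded */
```

The stringizing in ED and PD is only there to make the difference visible; in
the original chains the last macro yields x itself, which is then expanded
during the final rescan.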

> > #define ENUM(name, start, entries) \
> > BOOST_PP_EXPR_IIF(ENUMS)( \
> > BOOST_PP_SEQ_ENUM( \
> > (typedef enum { BOOST_PP_SEQ_HEAD(entries) = start) \
> > BOOST_PP_SEQ_TAIL(entries) \
> > (LAST_ ## name } name;) \
> > ) \
> > ) \

> You said:
>
> >You can do it, but EXPR_IIF does not lazily evaluate its
> argument (the
> >syntax above is what EXPR_IIF would look like if it *did* lazily
> >evaluate its argument).
> >In other words, when ENUMS is 0, SEQ_ENUM will still be invoked, but
> >its results will be discarded.
>
> I'm not sure what you are suggesting. Depending on the value
> of ENUMS, the code snippet I showed above would not compile -
> lazy evaluation or not. If it won't compile, then there's
> nothing for me to ponder (except to use a different approach
> - which happily you taught me with the TUPLE_EAT macro). :)

Apologies. I was thinking about EXPR_IIF with variadics. With variadics, you
can do it, a la:

#define EXPR_IIF(bit, ...) PRIMITIVE_CAT(EXPR_IIF_, bit)(__VA_ARGS__)
#define EXPR_IIF_0(...)
#define EXPR_IIF_1(...) __VA_ARGS__

Here, the number of arguments doesn't matter. If, on the other hand, EXPR_IIF
did lazy evaluation, then you could still do it without variadics:

#define EXPR_IIF(bit) PRIMITIVE_CAT(EXPR_IIF_, bit)
#define EXPR_IIF_0(expr)
#define EXPR_IIF_1(expr) expr

syntax: EXPR_IIF(bit)(expr)

Here, 'expr' doesn't get expanded at all if 'bit' is 0. The side-effect of lazy
evaluation is that the argument is not passed on (internally) to another macro,
which means that the situation that occurs in the pp-lib-style EXPR_IIF (the
first one above) cannot occur.
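
To make the two alternatives concrete, here is a small sketch that puts them
side by side with an argument that expands to contain an open comma. The names
EXPR_IIF_V and EXPR_IIF_L are hypothetical renames (so both forms can coexist
in one file); neither is the pp-lib's actual BOOST_PP_EXPR_IIF. The variadic
form assumes C99.

```c
#define PRIMITIVE_CAT(a, b) a ## b
#define COMMA() ,

/* Variadic form (C99): any extra commas are absorbed by __VA_ARGS__. */
#define EXPR_IIF_V(bit, ...) PRIMITIVE_CAT(EXPR_IIF_V_, bit)(__VA_ARGS__)
#define EXPR_IIF_V_0(...)
#define EXPR_IIF_V_1(...) __VA_ARGS__

/* Lazy form: EXPR_IIF_L(bit) only selects a macro name. 'expr' goes
   straight to that macro and is never passed on as an argument again. */
#define EXPR_IIF_L(bit) PRIMITIVE_CAT(EXPR_IIF_L_, bit)
#define EXPR_IIF_L_0(expr)
#define EXPR_IIF_L_1(expr) expr

/* The leading 0 in the "off" cases avoids an empty initializer list. */
static const int v_on[]  = { EXPR_IIF_V(1, 10 COMMA() 20) };  /* { 10 , 20 } */
static const int v_off[] = { 0 EXPR_IIF_V(0, COMMA() 10) };   /* { 0 } */
static const int l_on[]  = { EXPR_IIF_L(1)(10 COMMA() 20) };  /* { 10 , 20 } */
static const int l_off[] = { 0 EXPR_IIF_L(0)(COMMA() 10) };   /* { 0 } */
```

In both forms the comma produced by COMMA() ends up in the output when 'bit' is
1 and is discarded when 'bit' is 0, without triggering a "too many arguments"
error.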

I could redesign EXPR_IIF (etc.) using the "EXPR_IIF(bit)(expr)" form. That
would allow the code to work. I still couldn't guarantee lazy evaluation, of
course, because preprocessors are terrible in general. However, if I do so, I
open an encapsulation hole in the workarounds for various preprocessors.
Granted, there are many of those already--the pp-lib cannot actually "fix"
broken preprocessors after all--but client exposure is minimized.

Regards,
Paul Mensonides


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk