Boost logo

Boost :

From: Vesa Karvonen (vesa.karvonen_at_[hidden])
Date: 2001-05-19 09:16:05


Hi,

I'm thinking of submitting a small CPP or C Pre Processor meta programming
library into Boost. For this reason I have uploaded a trivial demo subset of
the library into the Files section/Vault. You can find the files under the
CPP folder. The rest of this message contains a motivated example of the use
of a particular feature of the library.

Please Note That
================

I'm fully aware of the traditional problems of using CPP. This library does
not in any way aim to promote bad use of CPP such as
- defining constants using macros when it is possible to use constants or
enumerations
- using macros instead of inline functions
- polluting the "global namespace" using macros.

The code snippets here are slightly different (in particular, they lack
BOOST_ prefixes and boost:: qualifications) from the library code in order
to make the code slightly more readable.

The code snippets in this message have not passed through a compiler, which
means that there can be bugs in the examples.

Some macro idioms are not presented here, because I assume that they are
self explanatory to the readers.

Problem
=======

The C++ function and template argument/parameter lists are special syntactic
constructs and it is impossible to directly manipulate or generate them
using C++ constructs. This leads to unnecessary code repetition.

Consider the implementation of the is_function<> meta function in boost.
Part of the implementation is an overloaded is_function_tester() function
that is used for testing if a type is convertible to pointer to a function.
Because of the special treatment of argument lists, it is not possible to
directly match a function with an arbitrary argument list. Instead, the
is_function_tester() must be overloaded for every distinct number of
arguments that is to be supported. Example:

    template <class R>
    yes_type is_function_tester(R (*)(void));
    template <class R, class A0>
    yes_type is_function_tester(R (*)(A0));
    template <class R, class A0, class A1>
    yes_type is_function_tester(R (*)(A0, A1));
    template <class R, class A0, class A1, class A2>
    yes_type is_function_tester(R (*)(A0, A1, A2));

    // ...

The need for this kind of repetition occurs particularly frequently while
implementing generic components or meta programming facilities, but the need
also manifests itself in many far simpler situations.

Typical Solutions
=================

Typically code monkeys do the repetition manually. Manual code repetition is
highly unproductive, but is sometimes more readable to the untrained eye.

Another solution is to write an external program for generating the repeated
code or use some other extra linquistic means such as a smart editor.
Unfortunately, using external code generators has many disadvantages:
- writing the generator takes time (this could be helped by using a standard
generator)
- it is no longer productive to manipulate C++ code directly
- invoking the generator may be difficult
- automating the invocation of the generator can be difficult in certain
environments (automatic invocation is desirable for active libraries)

What About CPP?
===============

Because C++ comes with a preprocessor, one would assume that it would
support these kind of needs directly. Using CPP in this case is highly
desirable because:
- CPP is highly portably
- CPP is automatically invoked as part of the compiling process
- CPP meta code can be directly embedded into the C++ source code
- Compilers generally allow to view or output the preprocessed code, which
can be used for debugging or to copy-paste the generated code.

Most unfortunately, the CPP is a very low level preprocessor that
specifically does not support loop staments or recursive macros.

Solution: Generating Lists Using The CPP
========================================

Fortunately, the CPP expands macros recursively until it can not match any
macro (or runs into a macro that has already been expanded in an earlier
recursion). This way it is possible to implement fixed length recursion as
long as each recursion uses a distinct macro:

    #define REPEAT_1 first
    #define REPEAT_2 REPEAT_1 second
    #define REPEAT_3 REPEAT_2 third
    #define REPEAT_4 REPEAT_3 fourth

    REPEAT_4 now expands to: first second third fourth

In order to make this into something that can be reused, the repeated
statement must be given as parameter. This can be done by passing a function
style macro as a parameter that is invoked on each recursion and given the
recursion depth as an argument:

    #define REPEAT(cnt, FUN) REPEAT_##cnt(FUN)
    #define REPEAT_0(FUN)
    #define REPEAT_1(FUN) FUN(1)
    #define REPEAT_2(FUN) REPEAT_1 FUN(2)
    #define REPEAT_3(FUN) REPEAT_2 FUN(3)
    #define REPEAT_4(FUN) REPEAT_3 FUN(4)

    #define DEMO(i) demo_##i

    REPEAT(4, DEMO) now expands to: demo_1 demo_2 demo_3 demo_4

The above REPEAT() macro still suffers from a few problems:
- the repetition always starts from 1 - it is sometimes useful to start from
0
- the repetion count must be a numeric literal - macros are not allowed
- it is not directly possible to generate comma separated lists like
p1,p2,p3

The solution to all the above problems is to implement a robust CPP meta
function for generating lists:

    // S = SEPARATOR (1,2,...), E = ELEMENT (0,1,...)
    #define LIST(n,S,E) CAT(LIST_,n)(S, E)
    #define LIST_0(S,E)
    #define LIST_1(S,E) E(0)
    #define LIST_2(S,E) LIST_1(S,E) S(1) E(1)
    #define LIST_3(S,E) LIST_2(S,E) S(2) E(2)
    // ...

The LIST CPP meta function can now be used to generate lists of over 100
elements (obviously depending on compiler limits and macro complexity). For
example:

    #define EMPTY_FUN(i)
    #define COMMA_FUN(i) ,
    #define CLASS_FUN(i) , class A##i
    #define ARG_FUN(i) A##i

    template< class R LIST(3, EMPTY_FUN, CLASS_FUN) >
    yes_type is_function_helper( R (*)( LIST(3, COMMA_FUN, ARG_FUN) ) );

    // ALWAYS CLEAN UP AFTER YOURSELF!
    #undef EMPTY_FUN
    #undef COMMA_FUN
    #undef CLASS_FUN
    #undef ARG_FUN

Expands to (something like this):

    template <class R, class A0, class A1, class A2>
    yes_type is_function_tester(R (*)(A0, A1, A2));

Unfortunately, it is not possible to use LIST in the SEPARATOR or ELEMENT
macros. This can be solved by defining two distinct LIST primitives:

    // S = SEPARATOR (1,2,...), E = ELEMENT (0,1,...)
    #define LST0(n,S,E) CAT(LST0_,n)(S, E)
    #define LST0_0(S,E)
    #define LST0_1(S,E) E(0)
    #define LST0_2(S,E) LST0_1(S,E) S(1) E(1)
    #define LST0_3(S,E) LST0_2(S,E) S(2) E(2)
    // ...

    // S = SEPARATOR (1,2,...), E = ELEMENT (0,1,...)
    #define LST1(n,S,E) CAT(LST1_,n)(S, E)
    #define LST1_0(S,E)
    #define LST1_1(S,E) E(0)
    #define LST1_2(S,E) LST1_1(S,E) S(1) E(1)
    #define LST1_3(S,E) LST1_2(S,E) S(2) E(2)
    // ...

This makes it possible to implement all the is_function_overloads()
(excluding the ellipsis version) like this:

    #ifndef MAX_ARG_CNT_PLUS_1
    #define MAX_ARG_CNT_PLUS_1 31
    #endif

    #define EMPTY_FUN(i)
    #define COMMA_FUN(i) ,
    #define CLASS_FUN(i) , class A##i
    #define ARG_FUN(i) A##i

    #define IS_FUNCTION_FUN(n) \
        template< class R LST0(n, EMPTY_FUN, CLASS_FUN) > \
        yes_type is_function_helper( R (*)( LST0(n, COMMA_FUN, ARG_FUN) ) );

    LST1(MAX_ARG_CNT_PLUS_1, EMPTY_FUN, IS_FUNCTION_FUN)

    // ALWAYS CLEAN UP AFTER YOURSELF!
    #undef EMPTY_FUN
    #undef COMMA_FUN
    #undef CLASS_FUN
    #undef ARG_FUN
    #undef IS_FUNCTION_FUN

As you can see, the user can now customize the library, by defining the
MAX_ARG_CNT macro, without modifying the library source code.

History and Discussion
======================

I independently discovered this CPP technique, while analyzing Blitz++
source code in late 1998 / early 1999. Blitz++ defined overloaded
constructors for a vector type which seemed particularly ugly. I then
developed this technique to avoid the repetition. Of course, this technique
has probably originally been discovered before C++ existed.

I have been using these kind of macros for over a year now. Aside from
occasionally running into compiler limits, this technique has proven quite
robust and useful.

Two list primitives make it possible for O(1) tokens (excluding the O(n)
list generation macros) to expand to O(n*n) tokens, which can save a hell of
a lot of typing. On current hardware/software, this technique can also save
compiling times, because the CPP can be faster than IO.

Code readability is obviously not optimal, but in my opinnion, lengthy
repetition is definitely not more readable and is less desirable for a few
technical reasons.

The list primitive is not just a fancy technique for saving a few
keystrokes. It does, in fact, enable making libraries more flexible.

Use of this technique does not pollute the "global namespace" with macros.
The user is encouraged to undefine all local macros immediately after use.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk