Subject: Re: [boost] Review Request: Variadic Macro Data library
From: Paul Mensonides (pmenso57_at_[hidden])
Date: 2011-02-20 07:09:45


On Sat, 19 Feb 2011 16:50:03 -0500, Edward Diener wrote:

> On 2/19/2011 3:42 PM, Gordon Woodhull wrote:

>> IMO you should ask Paul (directly off-list) before the review to see if
>> this is an option. True it couldn't be added without his permission.
>
> if during the review the majority of others feel that the library should
> be part of Boost PP, I am certainly willing to do that. But I really do
> not see the purpose of doing that beforehand.

Besides the difficulties with compilers, there are two root issues with
variadics (and placemarkers) as far as the pp-lib is concerned.

First, adding proper support requires breaking changes (argument
orderings throughout the library, for example). If this is to happen, I'd
prefer it to happen all at once in one major upgrade.

Second, the way that the pp-lib would use variadics is not the same as
what Edward's library does. AFAIK, Edward's library treats variadic
content as a standalone data structure. By "variadic content," I'm
referring to a comma-separated list of preprocessing token sequences such
as

    a, b, c

as opposed to variadic tuples or sequences such as

    (a, b, c)
    (a)(b, c)(d, e, f)

In a general sense, I consider treating variadic content as a data
structure as going in the wrong direction. There are far better ways to
utilize variadics than as input data structures. In particular, data
structures get passed into algorithms (otherwise, they're pointless).
However, a given interface can only have one variadic "argument". It is
far more useful to spend that variadic argument on the auxiliary
arguments to the algorithm.
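
To make the distinction concrete, here is a minimal sketch, assuming a
conforming C++11 preprocessor. SK_NARGS is a hypothetical counting helper
written only for this illustration; it is not a pp-lib or Chaos macro. It
shows that bare variadic content spreads across the argument list (and so
consumes the one variadic parameter), while a tuple or a sequence is a
single ordinary argument that leaves the variadic parameter free:

```cpp
// Hypothetical helper: counts 1 to 5 macro arguments.
#define SK_NARGS(...) SK_NARGS_I(__VA_ARGS__, 5, 4, 3, 2, 1, ~)
#define SK_NARGS_I(a, b, c, d, e, n, ...) n

// Variadic content: three separate arguments, all absorbed by "...".
static_assert(SK_NARGS(a, b, c) == 3, "bare content is three arguments");

// A variadic tuple is one argument, whatever it contains.
static_assert(SK_NARGS((a, b, c)) == 1, "a tuple is one argument");

// A variadic sequence is likewise one argument.
static_assert(SK_NARGS((a)(b, c)(d, e, f)) == 1, "a sequence is one argument");
```

Because the tuple and the sequence each fit in a named parameter, an
interface that takes its data that way still has "..." available for
something else.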

By "auxiliary arguments," I'm referring to additional arguments that get
forwarded by higher-order algorithms to the user-supplied macros that
they invoke. Take a FOR_EACH algorithm as an example. A FOR_EACH
algorithm requires a sequence, a macro to invoke for each element of that
sequence, and auxiliary data to be passed through the algorithm to the
macro invoked for each element. What frequently happens now, with all
such algorithms in the pp-lib, is that more than one piece of auxiliary
data needs to get passed through the algorithm, so it gets encoded in
another data structure. Each of those pieces of auxiliary data then needs
to be extracted later--which leads to massive clutter and inefficiency.

In reality, it comes down to two choices for interface:

1) FOR_EACH(macro, auxiliary_data, ...) where __VA_ARGS__ is the data
structure.

This scenario leads to the following (slightly simplified):

    #define M(E, D) \
        /* excessive unpacking with TUPLE_ELEM, */ \
        /* or equivalent, goes here */ \
        /**/

    FOR_EACH(M, (D1, D2, D3), E1, E2, E3)
   
2) FOR_EACH(macro, data_structure, ...) where __VA_ARGS__ is the
auxiliary data.

    #define M(E, D1, D2, D3) \
        /* no unpacking required */ \
        /**/

    FOR_EACH(M, (E1, E2, E3), D1, D2, D3)

The latter case is also extensible to scenarios where the elements of the
data structure are non-unary. For example,

    #define M(E, F, D1, D2, D3) // ...

    FOR_EACH(M, (E1, F1)(E2, F2)(E3, F3), D1, D2, D3)

The only time you really need to unpack is when the data structure is
truly variadic (i.e. elements have different arity) such as:

    #define M(EF, D1, D2, D3) // possibly unpack EF

    VARIADIC_FOR_EACH(M, (a)(b, c)(d, e, f), D1, D2, D3)

(This scenario happens, but is comparatively rare. It happens in fancier
scenarios, and it happens with sequences of types, e.g. std::pair<int,
double>.)
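
A minimal sketch of what unpacking inside the user macro can look like,
assuming a conforming C++11 preprocessor. All SK_* names are hypothetical
helpers for this illustration (not pp-lib or Chaos macros), and the user
macro is invoked directly here rather than through a VARIADIC_FOR_EACH.
Each element is (name, type tokens...), so an element whose type contains
a comma has a different arity as far as the preprocessor is concerned:

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helpers for this sketch only.
#define SK_CAT(a, b) SK_CAT_I(a, b)
#define SK_CAT_I(a, b) a##b
#define SK_HEAD(x, ...) x            // first item of an unpacked element
#define SK_TAIL(x, ...) __VA_ARGS__  // everything after the first item

// M(EF, D1): EF is one parenthesized element of any arity (2 or more),
// D1 is auxiliary data (here, a prefix for the generated alias name).
#define SK_ALIAS(EF, prefix) typedef SK_TAIL EF SK_CAT(prefix, SK_HEAD EF);

SK_ALIAS((id, int), my_)                        // typedef int my_id;
SK_ALIAS((entry, std::pair<int, double>), my_)  // typedef std::pair<int, double> my_entry;

static_assert(std::is_same<my_id, int>::value, "binary element");
static_assert(std::is_same<my_entry, std::pair<int, double> >::value,
              "ternary element: the comma in the type adds an argument");
```

The auxiliary data still arrives as an ordinary parameter; only the
element itself has to be taken apart.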

IMO, the second interface option is far superior to the first.
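
For illustration, the second interface can be sketched in a few lines.
This is only a sketch under stated assumptions: a conforming C++11
preprocessor, tuples of one to four elements, and at least one piece of
auxiliary data. Every SK_* name is a hypothetical helper invented here;
none of this is the pp-lib or Chaos implementation, both of which are far
more general.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers for this sketch only.
#define SK_CAT(a, b) SK_CAT_I(a, b)
#define SK_CAT_I(a, b) a##b
#define SK_REM(...) __VA_ARGS__  // strips one layer of parentheses

// Counts 1 to 4 arguments.
#define SK_NARGS(...) SK_NARGS_I(__VA_ARGS__, 4, 3, 2, 1, ~)
#define SK_NARGS_I(a, b, c, d, n, ...) n

// Invokes m(e, aux...). The extra layers let SK_REM expand before m
// collects its arguments.
#define SK_CALL(m, e, aux) SK_CALL_I(m, e, SK_REM aux)
#define SK_CALL_I(...) SK_CALL_II(__VA_ARGS__)
#define SK_CALL_II(m, ...) m(__VA_ARGS__)

// One unrolled expansion per supported tuple size.
#define SK_FE_1(m, aux, a) SK_CALL(m, a, aux)
#define SK_FE_2(m, aux, a, b) SK_FE_1(m, aux, a) SK_CALL(m, b, aux)
#define SK_FE_3(m, aux, a, b, c) SK_FE_2(m, aux, a, b) SK_CALL(m, c, aux)
#define SK_FE_4(m, aux, a, b, c, d) SK_FE_3(m, aux, a, b, c) SK_CALL(m, d, aux)

// SK_FOR_EACH(macro, data_structure, ...): __VA_ARGS__ is the auxiliary data.
#define SK_FOR_EACH(m, tuple, ...) \
    SK_FOR_EACH_I(SK_CAT(SK_FE_, SK_NARGS tuple), m, (__VA_ARGS__), SK_REM tuple)
#define SK_FOR_EACH_I(...) SK_FOR_EACH_II(__VA_ARGS__)
#define SK_FOR_EACH_II(fe, m, aux, ...) fe(m, aux, __VA_ARGS__)

// User macro M(E, D1, D2, D3): no unpacking required.
#define APPEND(elem, out, prefix, suffix) out += prefix #elem suffix;

inline std::string render() {
    std::string out;
    SK_FOR_EACH(APPEND, (alpha, beta, gamma), out, "<", ">")
    return out;
}

// Usage check: call from main() or a test runner.
inline void check_render() {
    assert(render() == "<alpha><beta><gamma>");
}
```

Internally the sketch still parenthesizes the auxiliary data while it
travels through the algorithm, but that bookkeeping stays inside the
library; the user-supplied macro receives D1, D2, D3 as plain parameters.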

As a concrete example, I recently had to generate some stuff for the
Greek alphabet. However, I didn't want to mess around with multi-byte
encodings directly. This isn't exactly what I needed, but it contains the
basic idea:

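#include <iostream>
#include <vector>
// Also requires the Chaos preprocessor library headers that define the
// CHAOS_PP_* macros used below.

template<class T> struct entry {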
    T id, lc, uc;
};

int main(int argc, char* argv[]) {
    std::vector<entry<const char*>> entries;
    #define _(s, id, lc, uc, type, enc) \
        CHAOS_PP_WALL( \
            entries.push_back(entry<type> { \
                enc(id), enc(lc), enc(uc) \
            }); \
        ) \
        /**/
    CHAOS_PP_EXPR(CHAOS_PP_SEQ_FOR_EACH(
        _,
        (alpha, α, Α)
        (beta, β, Β)
        (gamma, γ, Γ)
        (delta, δ, Δ)
        (epsilon, ε, Ε)
        (zeta, ζ, Ζ)
        (eta, η, Η)
        (theta, θ, Θ)
        (iota, ι, Ι)
        (kappa, κ, Κ)
        (lambda, λ, Λ)
        (mu, μ, Μ)
        (nu, ν, Ν)
        (xi, ξ, Ξ)
        (omicron, ο, Ο)
        (pi, π, Π)
        (rho, ρ, Ρ)
        (sigma, σ, Σ)
        (tau, τ, Τ)
        (upsilon, υ, Υ)
        (phi, φ, Φ)
        (chi, χ, Χ)
        (psi, ψ, Ψ)
        (omega, ω, Ω),
        const char*, CHAOS_PP_USTRINGIZE(8)
    ))
    #undef _
    for (auto i = entries.begin(); i != entries.end(); ++i) {
        std::cout << i->id << ": " << i->lc << ", " << i->uc << '\n';
    }
    return 0;
}

Regards,
Paul Mensonides

