
From: Paul Mensonides (pmenso57_at_[hidden])
Date: 2002-04-18 16:20:20


> "Paul Mensonides":
> [...]
> > Good. :) My conclusions are that the BPL should include functionality
> > like this for the purposes of debugging and error messages, etc..
> [...]
> > Ideally, you would like some things to be macro recursive (which always
> > ends up on one line) and some things to be file iterative (which always
> > ends up on different lines).
>
> If I understand correctly, then file iteration has the following benefits:
> - it can be faster on EDG based compilers.
> - it expands stuff on separate lines, which may make debugging easier.

Yes, more or less. I think that the usefulness of the output is somewhat
important. I was getting sick of seeing error messages like this:

error: something or other...
SOME_MACRO(SOME_LIMIT)
^

So I tried doing a preprocessor pass as a separate step. That doesn't help much
with large amounts of generated 'text' either--I just got thousands of
characters all on one line (which was wrapped) with an ambiguously located
circumflex.

If it weren't for that reason, I would consider the mechanism that I'm using a
hack, but the output is much cleaner.

> At the present, I don't know how readable the technique can be made, or
> how generic it is. I'd like to see the "chaos" library headers, that
> support the technique (I don't recall if they were available somewhere).
> Anyway, I think that this technique might be worth including in the
> preprocessor library or perhaps in some other library. If we decide to
> provide such support, then I'd certainly like Paul to compose a submission
> of some sort.

No, they aren't available--they are part of my personal library of all kinds
of stuff. I'll go through them and 'fix' some naming issues--for instance, the
primitives have no library prefix a la CHAOS_.

My preprocessor library itself is nowhere near the complexity of the Boost
preprocessor library that you made. The library was intended solely to produce
'lists' of stuff like parameters or whatever. At doing that it is excellent,
but it doesn't have all the arithmetic facilities. I must also note that the
library uses *both* recursive macro expansion and iterative file expansion.
The library also cheats slightly with regard to EDG based front-ends. There is
only one set of macro recursion macros, which is thoroughly unrolled for EDG
based compilers (i.e. it includes an alternate file rather than the *real*
implementation). For instance... (simplified)

#define CHAOS_LINEAR_1(name) name(1)
#define CHAOS_LINEAR_2(name) CHAOS_LINEAR_1(name) name(2)
#define CHAOS_LINEAR_3(name) CHAOS_LINEAR_2(name) name(3)

...that is the good way. The library detects EDG and includes an alternate
implementation of this for Comeau C++, etc....

#define CHAOS_LINEAR_1(name) name(1)
#define CHAOS_LINEAR_2(name) name(1) name(2)
#define CHAOS_LINEAR_3(name) name(1) name(2) name(3)

// etc.

...which I agree is *terrible*, but I generated it with another program, so it
isn't *that* bad.
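
For illustration, usage looks something like this (the PARAM macro here is just
a made-up example, not part of the library):

#define PARAM(n) class T##n,
CHAOS_LINEAR_3(PARAM) /* expands to: class T1, class T2, class T3, */
#undef PARAM

(The trailing comma comes from the PARAM macro itself, not from
CHAOS_LINEAR_3.)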

> ...
>
> It seems to me that file based iteration can only provide functionality
> similar to BOOST_PP_REPEAT(), but not functionality like BOOST_PP_FOR().
> BOOST_PP_REPEAT(COUNT,MACRO,DATA) repeats the given MACRO(INDEX,DATA) like
> this:
>
> MACRO(0,DATA) MACRO(1,DATA) ... MACRO(COUNT-1,DATA)

Actually, the file based inclusion mechanism can do *anything* and a lot more.
:) Simply because you can define macros that act as manifest parameters to the
file. Likewise, the file itself can use conditional compilation, etc.. My
implementation of it will perform file iteration to a depth of five, but I've
never needed anything that deep. It also abstracts the level of depth. For
instance, if in a file you want to iterate to another level, you have to set
the bounds of the iteration. That requires a file that *knows* what the names
of those macro 'parameters' are. The key part of this is that it *is* possible
to assign integer values! When you define a macro to be another macro, normally
that only acts as a reference (i.e. extremely lazy evaluation), so what you
need is a way to actually force a macro to be evaluated. Two things can do
that: conditional directives and include directives. The library uses this
approach to abstract the iteration depth:

#define NEW_BOUNDARY 100
#include SET_NEXT_UBOUND()

After this, NEW_BOUNDARY is no longer required.
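
To make the 'forcing evaluation' point concrete, here is a minimal sketch of
the idea (with made-up names--this is not the library's actual file). Whatever
expression BOUND currently refers to is evaluated by the conditional
directives, and the matching branch rebinds BOUND to the resulting literal:

/* evaluate_bound.h (hypothetical sketch) */
#if BOUND == 1
#  undef BOUND
#  define BOUND 1
#elif BOUND == 2
#  undef BOUND
#  define BOUND 2
#elif BOUND == 3
#  undef BOUND
#  define BOUND 3
/* ...one branch per value that the mechanism supports... */
#endif

So after something like #define BOUND UPPER + 1 followed by
#include "evaluate_bound.h", BOUND is a plain integer literal instead of a
reference to UPPER, and UPPER can be redefined or #undef'd without affecting
it--the same idea is what lets NEW_BOUNDARY be discarded after the inclusion
above.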

> As you can see, the DATA is passed, as is, to MACRO, but INDEX ranges from
> 0 to COUNT-1.
>
> On the other hand, BOOST_PP_FOR(STATE,PRED,OP,MACRO) makes it possible to
> evolve the STATE using the OP macro (the following is slightly
> simplified):
>
> MACRO( STATE )
> MACRO( OP(STATE) )
> MACRO( OP(OP(STATE)) )
> MACRO( OP(OP(OP(STATE))) )
> ...
>
> The repeated application of OP() to STATE is actually done incrementally.
> So, there will be only a linear amount of OP() applications. The
> incremental evolution of STATE allows many interesting repetitions to be
> performed in O(N) time instead of O(N*N) time. Of course, BOOST_PP_FOR()
> can do many things that are not possible with BOOST_PP_REPEAT().

My library doesn't directly have stuff like the application of OP()--probably
because I haven't needed it so far. :) But there is no reason that you can't
use both a macro recursive approach and a file iterative approach
simultaneously. In fact, I've found it to be much faster on Comeau to use some
of each rather than all of one or all of the other.
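
To give a rough picture of what mixing them looks like (hypothetical names
again--this is not the library's actual interface): a driver file defines N to
a literal (1, 2, 3, ...) and includes a file like this once per value, so each
result lands on its own line while the body of the line is generated by the
recursive CHAOS_LINEAR_ macros shown above:

/* signature.h (hypothetical sketch of one file-iteration step) */
#define CAT(a, b) CAT_I(a, b)
#define CAT_I(a, b) a ## b
#define ARG(i) class T ## i,

template<CAT(CHAOS_LINEAR_, N)(ARG) class R>
struct CAT(signature, N); /* e.g. N == 3: template<class T1, class T2, class T3, class R> struct signature3; */

#undef ARG
#undef CAT_I
#undef CAT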

> ...
>
> I investigated the arg_tuple_size.h header, which uses the preprocessor
> library, and found out that the reason why it is expanded so slowly, even
> with g++, was that BOOST_MPL_TEMPLATE_PARAMETERS() uses BOOST_PP_ADD(), in
> such a way that the time complexity of
> BOOST_MPL_TEMPLATE_PARAMETERS(first, last, param) is
> O((last-first)*(last-first)). This is because the time complexity of
> BOOST_PP_ADD(X,Y) is currently O(Y).

Unfortunately, I'm not really familiar with the Boost preprocessor library
specifically, though I am familiar with the underlying ideas. I have found that
EDG based compilers absolutely *hate* multiply recursive expansion.

> When BOOST_PYTHON_MAX_ARITY is increased significantly, the time taken by
> BOOST_PP_ADD() grows quickly. Because there are O(N*N) parameters, and the
> expansion of a single parameter takes O(N) time, it takes O(N*N*N) time to
> expand arg_tuple_size.h for large N.

As I said before, it takes about 5 seconds for my version with the upper bound
at 50. That is partially because some macro recursion is unrolled and partially
because it avoids multiply recursive macros. I just tested it again after
raising the upper bound to 100--that took 18 seconds on a paltry 600MHz P3,
which, for the totally unrealistic case of a function type with 100 arguments,
is not that bad. :)

> Currently, a simple way to reduce time taken by BOOST_PP_ADD(X,Y) is to
> make sure that Y < X. In this case it can be achieved simply by swapping
> the order of parameters to BOOST_PP_ADD() in the macro
> BOOST_MPL_AUX_TEMPLATE_PARAMETER(). This changes the time complexity of
> BOOST_MPL_TEMPLATE_PARAMETERS(first, last, param) to O(first *
> (last-first)), which, in this case, is much smaller than
> O((last-first)*(last-first)). On g++, preprocessing time was reduced from
> about 2 minutes to 8 seconds when BOOST_PYTHON_MAX_ARITY was 50.
>
> There are many ways in which BOOST_PP_ADD(X,Y) could be implemented in
> logarithmic time (without huge amounts of macro definitions, that is).
> Unfortunately, my earlier attempts on this suggest that it only speeds
> things up when the parameters X and Y become rather large (too large to be
> really useful). I think I'll spend some more time on this when I get some
> time.
>
> ...
>
> I think that a long term solution, to avoid performance problems like
> this, would be to drop BOOST_PP_REPEAT() and use BOOST_PP_FOR() based
> solutions. In most cases, BOOST_PP_FOR() makes it possible to perform
> repetition so that preprocessing time is O(N), where N is the number of
> tokens produced.
>
> For example, the time complexity of the recently added macro
> BOOST_PP_REPEAT_FROM_TO(FIRST,LAST,MACRO,DATA) is O(FIRST*(LAST-FIRST)).
> Using BOOST_PP_FOR(), it would be possible to provide a similar primitive
> whose time complexity would be the optimal O(LAST-FIRST).
>
> On the other hand, the use of BOOST_PP_FOR() means that the R parameter
> needs to be passed around explicitly by the user, which is a nuisance. I
> think I'll write up implementations of the BOOST_PP_ENUM and
> BOOST_PP_REPEAT macros using BOOST_PP_FOR and post them on the list for
> discussion.

I will clean up my 'linearization library' and post it somewhere for you, along
with some examples. I'll get back to you on this. Just to be clear about my
intent: I wrote my own library for this purpose before I knew there was a
preprocessor library--so it is not intended to replace it. I do think, however,
that some of the principles the library uses are solid. Speaking of
replacement, there is no way that it could replace the Boost library, because
it doesn't have even close to the number of facilities. My library produces
lists of whatever you want. The only 'math' stuff it has is increment and
decrement. But I personally think that it is *very* easy to use, the code using
it is clear, and the preprocessor output is actually somewhat readable. :)
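
For what it's worth, the increment and decrement are the usual table-lookup
trick--roughly along these lines (an illustrative sketch, not the library's
exact macros):

#define INC(n) INC_I(n)
#define INC_I(n) INC_ ## n /* indirection so a macro argument is expanded before pasting */
#define INC_0 1
#define INC_1 2
#define INC_2 3
/* ...one entry per supported value; DEC is the same table in reverse */

So INC(2) expands to 3, and INC(COUNT) also works when COUNT is itself a macro
that expands to a supported value.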

Paul Mensonides

