Subject: Re: [boost] [preprocessor] check if a token is a keyword (was "BOOST_PP_IS_UNARY()")
From: Paul Mensonides (pmenso57_at_[hidden])
Date: 2010-08-17 02:12:31


On 8/16/2010 9:21 PM, Lorenzo Caminiti wrote:

> Yes, I am aware of this "limitation". However, for my application it
> is not a problem to limit the argument of `IS_PUBLIC()` to
> pp-identifiers and pp-numbers with no decimal points (if interested,
> see "MY APPLICATION" below).
>
> 1) Out of curiosity, is there a way to implement `IS_PUBLIC()`
> (perhaps without using `BOOST_PP_CAT()`) so it does not have this
> limitation? (I could not think of any.)

The limitation is not BOOST_PP_CAT per se, but token-pasting in general.
The "good" part of using BOOST_PP_CAT in combination with
BOOST_PP_IS_NULLARY, et al, is that they have been "hacked" together for
preprocessors that are broken. Effectively, the detection macros work
by manipulating the operational syntax of macro expansion. For that to
work, stuff has to happen (namely, macros being expanded) at roughly the
correct time. The basic problem with VC++, for example, is that macros
are not expanded at the correct time, so the pp-lib works overtime to
attempt to _force_ expansions all over the library. Unfortunately, there
is a limit to what can be forced, particularly with more advanced
manipulations of the macro expansion process such as those used by
Chaos, where there is an analogy to the uncertainty principle (e.g. you
cannot force expansion in many contexts without changing the result,
just as you cannot measure particle velocity and position at the same
time). Even with those types of manipulations, however, there is no way
to do the above without "smashing the particles together and seeing what
comes out."

The limitation is caused by the ridiculous rule that token-pasting
arbitrary tokens together, where the result is not a single token,
results in undefined behavior. Even to detect this scenario, the
simplest implementation in a preprocessor is to simply juxtapose the
characters making up the tokens and re-tokenize them. If there is more
than one token, issue a diagnostic; otherwise insert the single token.
A better definition would be simply to insert the resulting sequence of
tokens.
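
To make the rule concrete, here is a minimal sketch of which pastes are well
defined. PASTE is a made-up name for this illustration (it is not a pp-lib
macro), and the two problematic cases are left commented out because their
behavior is undefined; GCC and Clang, for instance, reject them with a
diagnostic.

```cpp
// PASTE is a hypothetical helper for illustration only.
#define PASTE(a, b) a ## b

// Fine: identifier ## identifier gives one identifier token.
constexpr int PASTE(val, ue) = 7;   // declares `value`

// Fine: identifier ## pp-number without a decimal point gives one identifier.
constexpr int PASTE(x_, 12) = 12;   // declares `x_12`

// Undefined behavior, so left commented out:
// PASTE(x_, 1.5)   // "x_1.5" re-tokenizes into more than one token
// PASTE(x_, <)     // "x_<" re-tokenizes into more than one token

static_assert(value == 7, "val ## ue names the variable `value`");
static_assert(x_12 == 12, "x_ ## 12 names the variable `x_12`");
```

This is why restricting the argument to pp-identifiers and pp-numbers with no
decimal points keeps a prefix-pasting macro within defined behavior.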

> 2) Also, does the expansion of any of the following result in
> undefined behavior? (I don't think so...)
>
> IS_PUBLIC(public abc) // Expand to 1.
> IS_PUBLIC(public::) // Expand to 1.
> IS_PUBLIC(public(abc, ::)) // Expand to 1.
> IS_PUBLIC(public (abc) (yxz)) // Expand to 1.
>
> (My application relies on some of these expansions to work.)

All of those look fine. Basically, in the following

#define M(a) id ## a

the appearance of the formal parameter 'a' adjacent to the token-pasting
operator affects _which_ version of the actual parameter is substituted:
namely, the version which has _not_ had macros replaced in it. However,
the token-pasting operation doesn't occur until after that substitution,
and its operands are only the two _tokens_ immediately adjacent to it.
E.g.

#define A() 123
#define B(x) x id ## x

B(A())
=> 123 id ## A()
=> 123 idA()

I.e. the token-pasting operator affects the expansion of the actual
parameter (at least in that substitution context), but its operands are
only the tokens on either side after that substitution.
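
One way to check this on a conforming preprocessor is to stringize the result
of the expansion. The following is only a sketch: STR, STR_I, and same_tokens
are helper names invented for this illustration, and the comparison ignores
spaces because the exact spacing produced by stringizing is not the point.

```cpp
#define A() 123
#define B(x) x id ## x

// Hypothetical helpers: stringize the *result* of expanding the argument.
#define STR_I(...) #__VA_ARGS__
#define STR(...) STR_I(__VA_ARGS__)

// Hypothetical helper: compare two strings while skipping spaces.
constexpr bool same_tokens(const char* a, const char* b) {
    return *a == ' '  ? same_tokens(a + 1, b)
         : *b == ' '  ? same_tokens(a, b + 1)
         : *a != *b   ? false
         : *a == '\0' ? true
         : same_tokens(a + 1, b + 1);
}

// The first x is macro-expanded (A() -> 123); the x next to ## is not,
// and only its first token (A) takes part in the paste.
static_assert(same_tokens(STR(B(A())), "123 idA()"),
              "B(A()) should expand to: 123 idA()");
static_assert(!same_tokens(STR(B(A())), "123 id123"),
              "the operand of ## must not be macro-expanded first");
```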

Because of that, you're basically getting:

PREFIX_ ## public abc
PREFIX_ ## public ::
PREFIX_ ## public ( abc , :: )
PREFIX_ ## public ( abc ) ( yxz )

...all of which are okay.
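
For illustration, here is a self-contained sketch of such a detection macro.
It is not the pp-lib implementation: CAT, SECOND, PROBE, and STARTS_WITH_PAREN
are stand-ins invented here for what BOOST_PP_CAT and BOOST_PP_IS_UNARY do,
the IS_PUBLIC_public helper is an assumed spelling, and the sketch assumes a
conforming preprocessor with variadic macros.

```cpp
// All helper names below are hypothetical; they are not Boost.Preprocessor's.
#define CAT_I(a, b) a ## b
#define CAT(a, b) CAT_I(a, b)   // lets macro arguments expand before pasting

#define SECOND(a, b, ...) b
#define SECOND_X(...) SECOND(__VA_ARGS__)

// If x begins with a parenthesized group, PROBE consumes it and shifts a 1
// into the second position; otherwise the 0 stays in the second position.
#define PROBE(...) ~, 1,
#define STARTS_WITH_PAREN(x) SECOND_X(PROBE x, 0, ~)

// Only the first token of `tokens` is pasted onto the prefix.
#define IS_PUBLIC_public (1)
#define IS_PUBLIC(tokens) STARTS_WITH_PAREN(CAT(IS_PUBLIC_, tokens))

static_assert(IS_PUBLIC(public abc) == 1, "public abc");
static_assert(IS_PUBLIC(public::) == 1, "public::");
static_assert(IS_PUBLIC(public(abc, ::)) == 1, "public(abc, ::)");
static_assert(IS_PUBLIC(public (abc) (yxz)) == 1, "public (abc) (yxz)");
static_assert(IS_PUBLIC(private abc) == 0, "private abc");
```

In every case the paste itself only ever sees the prefix and the leading
identifier; the trailing tokens are simply carried along.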

> MY APPLICATION
>
> I am using `IS_PUBLIC()` and similar macros to program the
> preprocessor to *parse* a Boost.Preprocessor sequence of tokens that
> represents a function signature. For example:
>
> class c {
> public: void f(int x) const; // Usual function declaration.
> };
>
> class c {
> PARSE_FUNCTION_DECL( // Equivalent declaration using pp-sequences.
> (public) (void) (f)( (int)(x) ) (const)
> );
> };

What happens with stuff like pointers, or does that not matter for your
application? E.g. (public) (void) (f)( (int*)(x) ) (const) ?
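
To sketch why an element like (int*) need not be a problem for this kind of
detection: if the parser takes the head of the sequence and pastes a prefix
onto it, only the leading token `int` is pasted and the `*` just trails along.
Everything below is an assumption about how such a parser might be built;
SEQ_HEAD, IS_INT, and the other helpers are names invented for this example,
not Boost.Preprocessor's API and not necessarily what your macros do.

```cpp
// All names here are hypothetical and for illustration only.
#define CAT_I(a, b) a ## b
#define CAT(a, b) CAT_I(a, b)
#define FIRST(a, ...) a
#define FIRST_X(...) FIRST(__VA_ARGS__)
#define SECOND(a, b, ...) b
#define SECOND_X(...) SECOND(__VA_ARGS__)

// Head of a sequence: (a)(b)(c) -> a (elements must not contain commas).
#define HEAD_SPLIT(x) x, ~
#define SEQ_HEAD(seq) FIRST_X(HEAD_SPLIT seq)

// Expands to 1 if x begins with a parenthesized group, else 0.
#define PROBE(...) ~, 1,
#define STARTS_WITH_PAREN(x) SECOND_X(PROBE x, 0, ~)

#define IS_INT_int (1)
#define IS_INT(tokens) STARTS_WITH_PAREN(CAT(IS_INT_, tokens))

// Head is `int *`: IS_INT_ ## int is pasted, the `*` is left alone.
static_assert(IS_INT(SEQ_HEAD((int*)(x))) == 1, "(int*)(x) starts with int");
static_assert(IS_INT(SEQ_HEAD((void)(f))) == 0, "(void)(f) does not");

// An element that *begins* with a non-identifier, e.g. (*x) or (<), would
// paste IS_INT_ with `*` or `<`, which is undefined behavior.
```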

> The parser macro above can say "the signature sequence starts with
> `public` so this is a member function" at a preprocessor
> metaprogramming level and then expand to special code as a library
> might need to handle member functions. The parser macros can even do
> some basic syntax error checking -- for example, if `(const)` is
> specified as cv-qualifier at the end of the signature sequence of a
> non-member function, the parser macro can check that and expand to a
> compile-time error like `SYNTAX_ERROR_unexpected_cv_qualifier` (using
> `BOOST_MPL_ASSERT_MSG()`).
>
> Most of the tokens within C++ function signatures are composed of
> pp-identifiers such as the words `public`, `void`, `f`, etc. There are
> some exceptions like `,` to separate function parameters, `<`/`>` for
> templates, `:` for constructors' member initializers, etc. The grammar
> of my preprocessor parser macros requires the use of different tokens
> in these cases. For example, parentheses `(`/`)` are used for
> templates instead of `<`/`>`:
>
> template< typename T> f(T x); // Usual.
>
> PARSE_FUNCTION_DECL( // PP-sequence.
> (template)( (typename)(T) ) (f)( (T)(x) )
> );
>
> (Instead of `(template)(<) (typename) (T) (>) (f)( (T)(x) )` which
> would have caused the parser macro to fail when inspecting `(<)` via
> one of the `IS_XXX()` macros as per the limitation from using
> `BOOST_PP_CAT()` mentioned above.)
>
> The grammar of my preprocessor parser macros clearly documents that
> only pp-identifiers can be passed as tokens of the function signature
> sequence. Therefore, the "limitation" of `IS_PUBLIC()` indicated above
> is not a problem for my application.
>
>
> Thank you very much.

You're welcome. I don't know the ultimate purpose of this encoding, but
the encoding itself doesn't look too bad.

Regards,
Paul Mensonides

