
Subject: Re: [boost] Review Request: Variadic Macro Data library
From: Edward Diener (eldiener_at_[hidden])
Date: 2011-02-21 18:29:46


On 2/21/2011 2:37 PM, Paul Mensonides wrote:
> On Mon, 21 Feb 2011 12:57:05 -0500, Edward Diener wrote:
>
>> On 2/21/2011 3:57 AM, Paul Mensonides wrote:
>
>>> Another way to provide comfort is via education. Hardcore pp-
>>> metaprogramming knowledge is not required for this.
>>
>> Providing function-like syntax for invoking a macro with a variable
>> number of parameters, as an alternative to pp-lib data syntax, is
>> important to end-users and library developers if just for the sake of
>> familiarity and regularity. A programmer using a "call" syntax which may
>> be a macro or a function is not going to stop and say: this is a
>> function so I can call it as 'somefunction(a,b,c)', this is a macro and
>> therefore I must call it as 'somemacro((a,b,c))'. Instead he will ask
>> that the same syntax be applied to both. You seem to feel this is wrong
>> and that someone invoking a macro should realize that it is a macro (
>> and normally does because it is capital letters ) and therefore be
>> prepared to use a different syntax, but I think that regularity in this
>> respect is to be valued.
>
> Most C/C++ developers perceive macro expansion mechanics to be similar to
> function call mechanics. I.e. where a user "calls" a macro A, and that
> macro "calls" the macro B, the macro B "returns" something, which is, in
> turn "returned" by A. That is fundamentally *not* how macro expansion
> behaves. The perceived similarity, where there is none (going all the
> way back to way before preprocessor metaprogramming) is how developers
> have gotten into so much trouble on account of macros.

OTOH users of macros are not concerned, as developers should be, with
how the macro expands. They are just given a macro syntax to use, one
which the developer intends to feel natural to them.
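
As an aside, a minimal illustration of the expansion mechanics being
described, with throwaway names:

#define A(x) B(x)
#define B(x) A(x)

A(1) // expands to A(1): B's replacement list is rescanned in place,
     // and A is not expanded again because it is already under
     // expansion. There is no call stack and nothing is "returned";
     // tokens are only substituted and rescanned.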

>
> I take serious issue with anything that intentionally perpetuates this
> mentality. It is one thing if the syntax required is the same by
> coincidence. It's another thing altogether when something is done to
> intentionally make it so.

I really feel you are stretching your case for why you do not like
#define SOME_MACRO(...) as opposed to #define SOME_MACRO((...)). I do
understand your feeling that variadics can be more easily misused than
pp-lib data types. But to me that is a programmer problem and not your
problem.
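
For reference, the two interface styles at issue, with placeholder
names; the variadic form can simply forward to the tuple form
internally:

#define SOME_MACRO_T(tuple) // pp-lib style: SOME_MACRO_T((a, b, c))
#define SOME_MACRO_V(...) SOME_MACRO_T((__VA_ARGS__))
                            // variadic style: SOME_MACRO_V(a, b, c)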

>
>>> ----
>>>
>>> BOOST_VMD_DATA_ELEM(n, ...)
>>>
>>> The direct analog of this in Chaos is CHAOS_PP_VARIADIC_ELEM(n, ...).
>>> If I (or you) added it to the pp-lib, I would prefer it be called
>>> BOOST_PP_VARIADIC_SIZE(...).
>>
>> Did you mean BOOST_PP_VARIADIC_ELEM(n,...) ?
>
> Yes, sorry!
>
>>> ----
>>>
>>> BOOST_VMD_DATA_TO_PP_TUPLE(...)
>>>
>>> Chaos has no direct analog of this because (as you know), it's
>>> pointless (unless you're doing something internally for compiler
>>> workarounds).
>>
>> I do not think it is pointless. I am going variadics -> tuple. The
>> end-user wants regularity, even if it's a no-brainer to write '(
>> __VA_ARGS__ )'.
>>
>> I think this is where we may differ, not technically, but in our view of
>> what should be presented to an end-user. I really value regularity ( and
>> orthogonality ) even when something is trivial. My view is generally
>> that if it costs the library developer little and makes the end-user see
>> a design as "regular", it is worth implementing as long as it is part of
>> the design even if it is utterly trivial. You may feel I am cosseting an
>> end-user, and perhaps you are right. But as an end-user myself I
>> generally want things as regular as possible even if it causes some very
>> small extra compile time.
>
> It isn't the end of the world to provide it for the sake of symmetry.
>
>>> The primary reason that there is no direct conversion (other than
>>> (__VA_ARGS__)) to the other data types is that with variadic data there
>>> is no such thing as an empty sequence and there is no safe way to
>>> define a rogue value for it without artificially limiting what the data
>>> can contain.
>>
>> I understand this.
>>
>> My point of view is that although it's theoretically a user error to use
>> an empty sequence for a variadic macro even if a corresponding pp-lib
>> data type can be empty, in reality it should be allowed since there is
>> no way to detect it. I understand your point of view that you want to do
>> everything possible to eliminate user error, and I agree with it, but
>> sometimes nothing one can do in a language is going to work ( as you
>> have pointed out with variadics and an empty parameter ).
>
> That is not what I'm referring to. To clarify, using the STL as an
> example, a typical algorithm processes a finite sequence by progressively
> closing a range of iterators [i,j). Effectively, this iterator range is
> a "view" of a sequence of elements. (The underlying data structure also
> has its own natural view.) However, with the preprocessor, there is no
> indirection and there is no imperative functionality (i.e. assignment).
> Because of that, you cannot form views (in whatever form). Instead, you
> have to embed the entire data structure.
>
> At that point, you can do one of two things. Either use the data
> structure itself as your "view" and progressively make it smaller as you
> "iterate" or you can embed the data structure into another data structure
> which lets you add an arbitrary "terminal" state--the equivalent of the
> iterator range [j,j). For variadic content as a sequence, you cannot
> directly use the variadic content as the view because you cannot encode
> this terminal state. Instead, you'd have to go with option two, but then
> you pay the price for it elsewhere.

I understand what you are saying. The two basic pieces of functionality
( size and element access ) and the back and forth conversions are
certainly a very limited set of functionality. Anybody who wants to try
doing further tricks with variadic sequences, as opposed to your own
pp-lib data, is welcome to it, but my original goal was really just to
provide a syntax interface for variadics to pp-lib.
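
In rough outline ( throwaway names, and ignoring the VC++ workarounds
the real code needs ), the core of those two operations looks like
this, capped here at five elements:

#define VMD_SIZE(...) VMD_SIZE_IMPL(__VA_ARGS__, 5, 4, 3, 2, 1,)
#define VMD_SIZE_IMPL(e1, e2, e3, e4, e5, count, ...) count

#define VMD_ELEM(n, ...) VMD_ELEM_ ## n(__VA_ARGS__,)
#define VMD_ELEM_0(e0, ...) e0
#define VMD_ELEM_1(e0, e1, ...) e1
#define VMD_ELEM_2(e0, e1, e2, ...) e2

VMD_SIZE(a, b, c)    // 3
VMD_ELEM(1, a, b, c) // b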

>
>> Going the other way from a pp data type to variadics, I admit I did not
>> consider the case in my current library where the pp data type is empty.
>> Since this can be detected on the pp data type side, I think I can put
>> out a BOOST_PP_ASSERT_MSG(cond, msg) in that case, or I can choose to
>> ignore it if that is what I decide to do. But in either case it is a
>> detectable problem.
>
> For the most part, however, these macros already exist. They are named
> (e.g.) BOOST_PP_SEQ_ENUM. However, some do not exist such as
> BOOST_PP_ARRAY_ENUM, and others have different naming conventions such as
> BOOST_PP_TUPLE_REM_CTOR. For the sake of symmetry, BOOST_PP_ARRAY_ENUM
> and BOOST_PP_TUPLE_ENUM could be added. However, having them use the ENUM
> nomenclature is preferable for several reasons. First, because it
> expresses a distinction between a comma-separated list of arguments
> (variadic content) and a comma-separated list of elements which provides
> a definition that avoids the zero-element vs. single-empty-element
> problem. Second, is that if something like BOOST_PP_SEQ_ENUM is used to
> attempt to create a comma-separated list of _arguments_, the user is in
> for a world of hurt trying to write portable code.

Your names are fine with me, as I indicated further in my previous reply.
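
For anyone following along who is unfamiliar with the ENUM family, the
existing output macros behave like this:

#include <boost/preprocessor/seq/enum.hpp>
#include <boost/preprocessor/list/enum.hpp>

BOOST_PP_SEQ_ENUM((a)(b)(c))                    // expands to: a, b, c
BOOST_PP_LIST_ENUM((a, (b, (c, BOOST_PP_NIL)))) // expands to: a, b, c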

>
>
>>> BOOST_VMD_PP_TUPLE_ELEM(n, tuple)
>>>
>>> The direct analog of this in Chaos is CHAOS_PP_TUPLE_ELEM(n, tuple)
>>> where it ignores the 'n' if variadics are enabled.
>>
>> The 'n' tells which tuple element to return. How can it be ignored ?
>>
>>
>>> CHAOS_PP_TUPLE_ELEM(?, 0, (a, b, c)) => a
>>
>> OK, I see. The '?' is just a marker for the size of the tuple when there
>> are variadics and must be specified when there are not.
>
> Sorry, I was mixing up the arguments. Without variadics, you must have a
> size. With variadics, you don't need it, so you can leave it there for
> compatibility with the non-variadic scenario and ignore it and
> additionally provide an "overload" that doesn't have it at all. Except
> for compiler workarounds (which I'm sure you know how to solve in this
> case), detecting the difference between two and three arguments (where it
> must be either 2 or 3 arguments) is simple:
>
> #if VARIADICS
>
> #define TUPLE_ELEM(...) \
> CAT( \
> TUPLE_ELEM_, \
> TEST_23(__VA_ARGS__, 3, 2,) \
> )(__VA_ARGS__) \
> /**/
>
> #define TEST_23(_1, _2, _3, n, ...) n
>
> #define TUPLE_ELEM_2(n, tuple) // ...
> #define TUPLE_ELEM_3(size, n, tuple) TUPLE_ELEM_2(n, tuple)
>
> #else
>
> #define TUPLE_ELEM(size, n, tuple) // ...
>
> #endif
>
> This is a very fast dispatch.
>
>> That is an interesting technique below. Bravo ! As long as it is
>> documented, since it confused me until I took a few more looks and
>> realized what you are doing.
>
> It is just a dispatcher to emulate overloading on number of arguments.
> You'd actually do something that makes the dispatch as fast as possible
> (as above) which is easy with a small set of possibilities (like 1|2 or 2|
> 3).
>
>>> BOOST_VMD_PP_{ARRAY,LIST,SEQ}_TO_DATA(ds)
>>>
>>> I don't see the point of these--particularly with the pp-lib
>>> because of compiler issues. These already exist as BOOST_PP_LIST_ENUM
>>> and SEQ_ENUM. There isn't an ARRAY_ENUM currently, but it's easy to
>>> implement. The distinction between these names and the *_TO_DATA
>>> variety is that these are primary output macros.
>>
>> That's just a name. Names can always be changed. My macros are the same
>> as yours in that they are output macros which convert the pp-lib data
>> types to variadic data.
>
> The primary distinction is the perspective induced by the names. If
> users attempt to use these macros to produce macro argument lists, they
> are in for portability problems. Particularly:
>
> #define REM(...) __VA_ARGS__
>
> #define A(im) B(im) // im stands for "intermediate"
> // (chaos-pp nomenclature)
> #define B(x, y) x + y
>
> A(REM(1, 2)) // should work, most likely won't on many preprocessors

I understand your concerns. But I don't think you can do anything about
how programmers use things. You provide functionality because it has its
uses. If some of the uses lead to potential problems because of
programmer misunderstanding or compiler weakness, you warn the
programmer. That's the best you can do without removing decent
functionality just because of programmer misuse or compiler fallibility.
Of course good docs about pitfalls always help.

>
> My understanding is that you want to take a list of macro arguments and
> convert it to something that can be processed as a sequential data
> structure. That's one concept. Converting from a sequential data
> structure to a list of comma-separated values is another concept. But
> converting from a sequential data structure to a list of macro arguments
> is another concept altogether--one that is fraught with portability
> issues that cannot be encapsulated by the library.
>

I am more than aware of that.

>>> If an attempt is made by a user
>>> to use the result as macro arguments, all of the issues with compilers
>>> (e.g. VC++) will be pulled into the user's domain.
>>
>> The returned variadic data is no different from what the user will enter
>> himself when invoking a variadic data macro. The only compiler issue I
>> see is the empty variadic data one.
>
> I think the difference is conceptual. A list of comma-separated things
> (like function parameters, structure initializers, etc.) is conceptually
> different from a list of macro arguments. Going back to a list of
> arguments is where things go wrong.
>
>>> BOOST_VMD_DATA_TO_PP_TUPLE(...)
>>> -> (nothing, unless workarounds are necessary)
>>
>> I know its trivial but I still think it should exist.
>
> It is quite possible that workarounds need to be applied anyway to (e.g.)
> force VC++ to "let go" of the variadic arguments as a single entity.

I will look further into this issue. I did run into a couple of VC++
problems that required workarounds, which I was able to solve thanks to
your own previous cleverness in dealing with VC++.

>
>>> BOOST_VMD_DATA_TO_PP_ARRAY(...)
>>> -> BOOST_PP_TUPLE_TO_ARRAY((...))
>>> or BOOST_PP_TUPLE_TO_ARRAY(size, (...))
>>>
>>> BOOST_VMD_DATA_TO_PP_LIST(...)
>>> -> BOOST_PP_TUPLE_TO_LIST((...))
>>> or BOOST_PP_TUPLE_TO_LIST(size, (...))
>>>
>>> BOOST_VMD_DATA_TO_PP_SEQ(...)
>>> -> BOOST_PP_TUPLE_TO_SEQ((...))
>>> or BOOST_PP_TUPLE_TO_SEQ(size, (...))
>>
>> For the previous three, see above discussion about using
>> SOME_MACRO(a,b,c) vs. SOME_MACRO((a,b,c)). I do understand your reason
>> for this as a means of getting around the empty-variadics user error.
>
> It isn't that. I don't like interface bloat. That's like not being able
> to decide on size() versus length() so providing both.
>
> If the use case is something like what you mentioned before:
>
> #define MOC(...) /* yes, that's you, Qt */ \
> GENERATE_MOC_DATA(TUPLE_TO_SEQ((__VA_ARGS__))) \
> /**/
>
> Then why does the TUPLE_TO_SEQ((__VA_ARGS__)) part matter to the
> developer who invokes MOC?

Because I am not converting a tuple to a seq but a variadic sequence to
a seq, and I feel the syntax should support that idea.
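
In other words, internally it is something along these lines ( a
sketch, not the exact VMD code, and assuming the variadic size macro
discussed earlier ):

#define BOOST_VMD_DATA_TO_PP_SEQ(...) \
  BOOST_PP_TUPLE_TO_SEQ(BOOST_PP_VARIADIC_SIZE(__VA_ARGS__), (__VA_ARGS__)) \
/**/

BOOST_VMD_DATA_TO_PP_SEQ(a, b, c) // (a)(b)(c)

The conversion itself is the trivial tuple one, but the name and the
call syntax present it to the end-user as a conversion from a variadic
sequence to a seq.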

>
>> But I still feel that treating variadic data here as tuples is wrong
>> from the end-user point of view even though it elegantly solves the
>> empty variadic data problem. In my internal code I am solving the
>> problem in the exact same way, but I am keeping the syntax as
>> SOME_MACRO(a,b,c) as opposed to SOME_MACRO((a,b,c)).
>>
>> So I would say, please consider using the SOME_MACRO(a,b,c) instead as I
>> am doing.
>>
>> I would even say to change my names to:
>>
>> BOOST_PP_ENUM_TUPLE(...)
>> BOOST_PP_ENUM_ARRAY(...)
>> BOOST_PP_ENUM_LIST(...)
>> BOOST_PP_ENUM_SEQ(...)
>
> I'm not terribly opposed to just BOOST_PP_TO_TUPLE(...), etc..
>
> #define BOOST_PP_TO_TUPLE(...) (__VA_ARGS__)
> #define BOOST_PP_TO_ARRAY(...) \
> (BOOST_PP_VARIADIC_SIZE(__VA_ARGS__), BOOST_PP_TO_TUPLE(__VA_ARGS__)) \
> /**/
> // BTW, an "array" is a pointless data structure
> // when you have variadics, but whatever
> #define BOOST_PP_TO_LIST(...) \
> BOOST_PP_TUPLE_TO_LIST((__VA_ARGS__)) \
> /**/
> #define BOOST_PP_TO_SEQ(...) \
> BOOST_PP_TUPLE_TO_SEQ((__VA_ARGS__)) \
> /**/
>
> I'm a lot more opposed to going back from a proper data structure to an
> "argument list".

Then I will go back to an "element list" <g>. If the end-user uses it as
an "argument list" you can sue me, but not for too much because I am
poor. <g><g>

>
>> Again I value the orthogonality of the pp-data to variadic data idea in
>> common names. BOOST_PP_TUPLE_REM_CTOR does not suggest that to the
>> end-user. How about:
>>
>> #define BOOST_PP_TUPLE_ENUM(tuple) \
>> BOOST_PP_TUPLE_REM_CTOR(tuple)
>>
>> in order to mimic your three following names.
>
> Sure, but with a better definition:
>
> #define BOOST_PP_TUPLE_ENUM BOOST_PP_TUPLE_REM_CTOR

Yes, that's better.
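
So that, once the tuple macros are adapted to not require the size ( as
you suggest below ), the usage would simply be:

BOOST_PP_TUPLE_ENUM((a, b, c)) // a, b, c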

>
>>> BOOST_VMD_PP_ARRAY_TO_DATA(array)
>>> -> BOOST_PP_ARRAY_ENUM(array)
>>>
>>> BOOST_VMD_PP_LIST_TO_DATA(list)
>>> -> BOOST_PP_LIST_ENUM(list)
>>>
>>> BOOST_VMD_PP_SEQ_TO_DATA(seq)
>>> -> BOOST_PP_SEQ_ENUM(seq)
>>>
>>> also add:
>>> BOOST_PP_REM, BOOST_PP_EAT
>>
>> OK.
>
> These latter two (REM and EAT) having nothing to do with data structures
> per se, but they are extremely useful macros.

I agree.

>
>>> The basic gist is to add the low-level variadic stuff and adapt the
>>> existing tuple stuff to not require the size.
>>
>> I think our only real disagreements can be summed up as:
>>
>> I want the end-user to view variadic data as such from a perceptual
>> point of view, even with the
>> empty-variadics-is-an-error-which-can-not-be-caught problem. That is why
>> I supply the various conversions from variadic sequences to pp-lib types
>> and back explicitly, and I want some regularity in names reflecting that
>> although I do not insist on my own names.
>>
>> You feel that variadics as input for conversion should in general be
>> treated as a pp-lib tuple since creating a tuple from variadic macro
>> data is trivial.
>
> I don't like interface bloat, but if it is minor, it isn't the end of the
> world.
>
> The one thing that I really don't like is the blending of what I consider
> two different concepts: output and return value (even though I'm going
> against my own diatribe about macros != functions above by calling it
> "return value").
>
> Going from a data structure to a list of comma-separated values (like
> enumerators, function arguments, whatever) is output and is reflected by
> the name ENUM. Going from a data structure to a list of comma-separated
> macro arguments is return value (for input into other macros as disparate
> arguments). This latter use scenario is fraught with portability
> problems on the user end, and not necessarily ones that immediately show
> up.

Then let's use the ENUM name for output. And I will add a section to the
docs explaining the danger of using such output as arguments to other
macros.
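
Something like the following contrast, using BOOST_PP_SEQ_ENUM, is
probably what that section of the docs will show:

// Output use ( what the ENUM name is meant to suggest ):
enum color { BOOST_PP_SEQ_ENUM((red)(green)(blue)) };
// expands to: enum color { red, green, blue };

// Using the same result as a macro *argument list* requires an extra
// level of indirection, as in your A / B / REM example above, and that
// is exactly where the portability problems show up.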

