From: Tobias Schwinger (tschwinger_at_[hidden])
Date: 2006-02-15 15:28:48


Paul Mensonides wrote:
>>>The process of macro expansion is best viewed as the process of
>>>scanning for macro expansion (rather than the process of a single
>>>macro expansion alone).
>>>When the preprocessor encounters a sequence of preprocessing tokens
>>>and whitespace separations that needs to be scanned for macro
>>>expansion, it has to perform a number of steps. These steps are
>>>examined in this document in detail.
>>
>>Strike that paragraph. It uses terms not yet defined and
>>doesn't say much more than the title (assuming it's still
>>"how macro expansion works").
>
>
> The paragraph might not be in the right spot, but what the paragraph says is
> important. The process of macro expansion includes more than just expanding a
> single macro invocation. Rather, it is the process of scanning for macro
> expansions, and the whole thing is defined that way.
>

When reading the text, it becomes clear that there is more to macro expansion than expanding a single macro, and that multiple steps are required...

The paragraph seems to attempt an introduction but does a poor job of it, IMO.

>>The reader probably has no idea what "painted" means at this
>>point. Indicate the forward-declaration by "see below" or
>>something like that.
>
>
> I do in the very next sentence.
>

Yeah, but with too much text, IMO.

>>>
>>>[Locations]
>>>
>>>There are several points where the preprocessor must scan a sequence
>>>of tokens looking for macro invocations to expand.
>>>The most obvious of these is between preprocessing directives (subject
>>>to conditional compilation). For example,
>>
>>I had to read this sentence multiple times for me to make sense...
>
>
> What part was difficult to follow? It seems pretty straightforward to me (but
> then, I know what I'm looking for).
>

"Between preprocessing directives" -- what?!

Sure, it is correct. But it is written more from the viewpoint of the preprocessor than from where your reader is.

<snip>

>>>in undefined behavior. For example,
>>>
>>> #define MACRO(x) x
>>>
>>> MACRO(
>>> #include "file.h"
>>> )
>>
>>Indicate more clearly that this code is not OK.
>
>
> The next sentence says that it is undefined behavior. I'm not sure how to make
> it more clear than that.
>

With an obvious source code comment (e.g. in red).
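
For illustration, a minimal sketch of what such a comment could look like. The `#if 0` guard and the helpers `ok_value` and `check_macro_example` are assumptions added here, not part of the article; the guard only keeps the broken invocation out of a real build.

```c
#include <assert.h>
#include <limits.h>

#define MACRO(x) x

#if 0
/* NOT OK -- UNDEFINED BEHAVIOR:
   a directive appears inside the argument list of a macro invocation. */
MACRO(
#include "file.h"
)
#endif

/* OK: the directive stands on its own line, before the invocation. */
int ok_value(void) { return MACRO(INT_MAX); }

/* Hypothetical self-check for this sketch. */
void check_macro_example(void) { assert(ok_value() == INT_MAX); }
```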

>>>[Blue Paint]
>>>
>>>If the current token is an identifier that refers to a macro, the
>>>preprocessor must check to see if the token is painted.
>>>If it is painted, it outputs the token and moves on to the next.
>>>
>>>When an identifier token is painted, it means that the preprocessor
>>>will not attempt to expand it as a macro (which is why it outputs it
>>>and moves on). In other words, the token itself is flagged as
>>>disabled, and it behaves like an identifier that does not correspond
>>>to a macro. This disabled flag is commonly referred to as "blue
>>>paint," and if the disabled flag is set on a particular token, that
>>>token is called "painted." (The means by which an identifier token
>>>can become painted is described below.)
>>
>>Remove redundancy in the two paragraphs above.
>

I believe I was unclear, here:

The redundancy isn't the problem (redundancy is actually a good thing in documentation, when used right) but too much redundancy in one spot...

>>>In the running example, the current token is the identifier OBJECT,
>>>which _does_ correspond to a macro name. It is not painted, however,
>>>so the preprocessor moves on to the next step.
>>>
>>>[Disabling Contexts]
>>>
>>>If the current token is an identifier token that corresponds to a
>>>macro name, and the token is _not_ painted, the preprocessor must
>>>check to see if a disabling context that corresponds to the macro
>>>referred to by the identifier is active. If a corresponding
>>>disabling context is active, the preprocessor paints the identifier
>>>token, outputs it, and moves on to the next token.
>>>
>>>A "disabling context" corresponds to a specific macro and exists over
>>>a range of tokens during a single scan. If an identifier that refers
>>>to a macro is found inside a disabling context that corresponds to the
>>>same macro, it is painted.
>>>Disabling contexts apply to macros themselves over a given geographic
>>>sequence of tokens, while blue paint applies to particular identifier
>>>tokens. The former causes the latter, and the latter is what
>>>prevents "recursion" in macro expansion. (The means by which a
>>>disabling context comes into existence is discussed below.)
>>>
>>>In the running example, the current token is still the identifier
>>>OBJECT. It is not painted, and there is no active
>>>disabling context that would cause it to be
>>>painted. Therefore, the preprocessor moves on to the next step.
>>
>>The introduction of these terms feels structurally too
>>abrupt to me. Introduce these terms along the way, continuing
>>with the example.
>
>
> They appear at the first point where their definition must appear.
>

I believe it's useful to sustain it.

<snip>

>>>from the replacement list.
>>>
>>> + X OBJECT F() +
>>>   |          |
>>>   |__________|
>>>         |
>>>   OBJECT disabling context (DC)
>>
>><-- explain what a disabling context is, and then what blue
>>paint is, here
>
>
> Do you mean that they should be defined here for the first time, or that they
> should be defined here again (but maybe with less detail)?
>

I meant: introduce the terms here.
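
A minimal sketch of how the two terms could be introduced with a concrete case. The macro `OBJ` and the helpers `STR`, `STR_`, `painted_result` and `check_painting` are hypothetical names chosen for this sketch; they are not the article's running example.

```c
#include <assert.h>
#include <string.h>

#define STR_(x) #x        /* stringize the argument as written   */
#define STR(x)  STR_(x)   /* expand the argument, then stringize */

/* Hypothetical macro whose replacement list names itself. */
#define OBJ OBJ + 1

/* Scanning "OBJ" replaces it with "OBJ + 1" and opens a disabling
   context for OBJ over those replacement tokens. The inner OBJ is
   found inside that context, so it is painted and not expanded again. */
const char *painted_result(void) { return STR(OBJ); }

/* Hypothetical self-check: exactly one expansion took place. */
void check_painting(void) {
    assert(strcmp(painted_result(), "OBJ + 1") == 0);
}
```

The disabling context is what stops the second expansion here; the paint is the mark it leaves on the inner token.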

>>>function-like macro has no formal
>>>parameters, and therefore any use of the stringizing operator is
>>>automatically an error.) The result of token-pasting in F's
>>>replacement list is
>>
>>It's not clear to me why the stringizing operator leads to an
>>error rather than a '#' character. Probably too much of a
>>sidenote, anyway.
>
>
> I don't know the rationale for why it is the way it is.

In this case, "therefore" is a bit strange...
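
As far as I read the standard, the error follows from a constraint rather than from a rationale: in the replacement list of a function-like macro, every '#' must be followed by a parameter, and a macro without parameters has none to offer. A small sketch of the contrast with object-like macros; `HASH`, `HASH_FN`, `hash_text` and `check_hash` are names assumed for illustration only.

```c
#include <assert.h>
#include <string.h>

#define STR_(x) #x
#define STR(x)  STR_(x)

/* Object-like macro: '#' in the replacement list is an ordinary token. */
#define HASH #

/* Function-like macro without parameters: a conforming preprocessor
   rejects this, because the '#' is not followed by a parameter.
#define HASH_FN() #
*/

/* HASH expands to the single token '#', which is then stringized. */
const char *hash_text(void) { return STR(HASH); }

/* Hypothetical self-check for this sketch. */
void check_hash(void) { assert(strcmp(hash_text(), "#") == 0); }
```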

>>>[Interleaved Invocations]
>>>
>>>It is important to note that disabling contexts only exist during a
>>>_single_ scan. Moreover, when scanning passes the end of a disabling
>>>context, that disabling context no longer exists. In other words, the
>>>output of a scan results only in tokens and whitespace separations.
>>>Some of those tokens might be painted (and they remain painted), but
>>>disabling contexts are not part of the result of scanning.
>>
>>>(If they were, there would be no need for blue paint.)
>>
>>Misses (at least) a reference to 16.3.4-1 (the wording "with
>>the remaining tokens of the source" (or so) is quite nice
>>there, so consider using something similar).
>

I have to clarify: I'm missing a hint (in the text not the examples) that tokens from outside the replacement list can form a macro invocation together with expansion output. The sentence from 16.3.4-1 is actually quite good.
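
A minimal sketch of the case I mean, where the macro name comes from an expansion and the argument list comes from the source text that follows it. `TWICE`, `ALIAS`, `alias_value` and `check_alias` are hypothetical names.

```c
#include <assert.h>

#define TWICE(x) ((x) * 2)
#define ALIAS TWICE   /* replacement list is only the name, no parentheses */

/* ALIAS is replaced by TWICE. The result is rescanned together with the
   tokens that follow in the source, so TWICE and the "(21)" from outside
   the replacement list form a macro invocation. */
int alias_value(void) { return ALIAS(21); }

/* Hypothetical self-check for this sketch. */
void check_alias(void) { assert(alias_value() == 42); }
```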

>
>>I believe I wouldn't really understand what you are talking
>>about here without knowing that part of the standard. "A
>>single scan" -- the concept of rescanning was introduced too
>>peripherally to make much sense to someone unfamiliar with the topic.
>
>
> This all comes back to the beginning--the process is scanning a sequence of
> tokens for macros to expand (i.e. the first paragraph that you said I should
> strike). This entire process is recursively applied to arguments to macros
> (without being an operand...) and thus this entire scan for macros to expand can
> be applied to the same sequence of tokens more than once. It is vitally
> important that disabling contexts don't continue to exist beyond the scan in
> which they were created, but that blue paint does. As I mentioned, there would
> be no need for blue paint--what the standard calls "non-replaced macro name
> preprocessing tokens"--if the disabling contexts weren't transient.
>

Now for the "rescanning" part: you don't have to introduce that term. Still, I wouldn't have figured out what "a _single_ scan" is supposed to mean without already knowing it, so it feels to me that something is missing here.
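
A sketch of the kind of example that would have made "a _single_ scan" concrete for me: the same tokens are scanned twice, and only the paint survives from the first scan to the second. `ID`, `OBJ`, `two_scans` and `check_two_scans` are names assumed for this sketch.

```c
#include <assert.h>
#include <string.h>

#define STR_(x) #x
#define STR(x)  STR_(x)
#define ID(x)   x
#define OBJ     OBJ + 1   /* hypothetical self-referential macro */

/* Scan 1: the argument of ID is scanned on its own. OBJ becomes
   "OBJ + 1", the inner OBJ is painted, and the disabling context for
   OBJ ends together with that scan.
   Scan 2: ID's replacement list, now holding those tokens, is scanned
   again. No disabling context for OBJ is active any more; only the
   paint keeps OBJ from being expanded a second time. */
const char *two_scans(void) { return STR(ID(OBJ)); }

/* Hypothetical self-check: "OBJ + 1 + 1" would mean the paint was lost. */
void check_two_scans(void) {
    assert(strcmp(two_scans(), "OBJ + 1") == 0);
}
```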

> I'm pretty sure that I don't use the term "rescanning" anywhere in the whole
> article (yep, I checked, and I don't).
>

"Rescanning" comes from the standard, of course. I worked my way through chapter 16 because I wanted to know how things work before you posted this article.

>>>In C++, if any argument is empty or contains only whitespace
>>>separations, the behavior is undefined. In C, an empty argument is
>>>allowed, but gets special treatment. (That special treatment is
>>>described below.)
>>
>>It requires at least C99, right? If so, say it (it's likely
>>there are C compilers that don't support that version of the
>>language).
>
>
> As far as I am concerned, the 1999 C standard defines what C is until it is
> replaced by a newer standard. Likewise, the 1998 standard defines what C++ is until
> it is replaced by a newer standard. I.e. an unqualified C implies C99, and
> unqualified C++ implies C++98. If I wished to reference C90, I'd say C90.
> Luckily, I don't wish to reference C90 because I don't want to maintain an
> article that attempts to be backward compatible with all revisions of a
> language. This is precisely why I should have a note above that variadic macros
> are not part of C++, BTW, even though I know they will be part of C++0x.

The previous version of the language is still widely used and taught so disambiguation makes some sense, IMO.
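
To show what is at stake, a minimal sketch of empty arguments as C99 defines them (in C90 and C++98 the same invocations are not guaranteed to work). `CAT`, `STR_`, `cat_with_empty`, `str_of_empty` and `check_empty_args` are hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define STR_(x)   #x
#define CAT(a, b) a ## b

/* Empty second argument: in C99 the paste sees a placemarker in its
   place, so the result is just the first operand. */
int cat_with_empty(void) { int value = 7; return CAT(value, ); }

/* Empty argument to '#': in C99 the result is the empty string literal. */
const char *str_of_empty(void) { return STR_(); }

/* Hypothetical self-check for this sketch. */
void check_empty_args(void) {
    assert(cat_with_empty() == 7);
    assert(strcmp(str_of_empty(), "") == 0);
}
```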

> Furthermore, deficiencies of compilers (like not implementing the language as it
> is currently defined or not doing what they should during this process) are not a
> subject of this article.
>
> OTOH, it wouldn't hurt to mention in the "Conventions" section that, at the time
> of this writing, C is C99 and C++ is C++98.
>

It adds clutter no one wants to read -- adding the version number still seems the better solution to me ;-).

>>>[Virtual Tokens]

BTW, I would probably have called them "control tokens" in analogy to "control characters" -- "virtual" carries that association with dynamic polymorphism, especially for C++ programmers who are not native English speakers...

>>I hope it's of any use.
>
> Definitely. Thanks again.
>

You're welcome -- it's my way to thank you for your support.

--
Tobias
