From: Daryle Walker (darylew_at_[hidden])
Date: 2005-08-19 08:02:13

Wave is our C++ preprocessor, but preprocessing is the third phase of
translating a file. (Looking at section 2.1 in the standard). I have a gut
feeling that all the compilers out there mush the first three phases
together in parsing a file. Glancing over the Wave docs gives me the same
impression about it. Is either of these feelings accurate (this
requires a separate answer for each parser)? If the answer for Wave is
"yes", could we separate them, at least as an option? I feel that this is
important so we can gain a full understanding of each phase. It may be more
complicated[1], and most likely slower, but it could represent a clean
implementation. (BTW, which phases does Wave actually cover?)

The first two[2] phases are:

1. Native characters that match basic source characters are converted to
them (including line breaks). Trigraphs are replaced by the basic source
characters they stand for[3]. Other characters are turned into internal
Unicode expansions (i.e. they act like "\uXXXX" or "\UXXXXXXXX"[4]).
2. Backslash-newline soft line-break combinations are collapsed, folding
multiple native lines into one logical line. We should spit out an error if
the folding creates Unicode escapes. For non-empty files, we need to spit
out an error if the file does not end with a hard line-break: ending with
either a non-newline character or a backslash-newline combination is
forbidden.
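
To make the two phases above concrete, here is a minimal sketch of them as separate C++ functions. It is only an illustration under some assumptions, not how Wave does it: the names replace_trigraphs and splice_lines are made up, line endings are assumed to be plain '\n' already, the mapping of non-basic characters to Unicode expansions is left out, and so is the check for Unicode escapes created by folding.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Phase 1 (partial, hypothetical helper): replace the nine trigraph
// sequences with the basic source characters they stand for.
// Question marks that are not part of a trigraph are left unchanged.
inline std::string replace_trigraphs(const std::string& in) {
    // Third character of each trigraph and its replacement, index-aligned.
    static const std::string third = "=/'()!<>-";
    static const std::string repl  = "#\\^[]|{}~";
    assert(third.size() == repl.size());

    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '?' && i + 2 < in.size() && in[i + 1] == '?') {
            const std::size_t k = third.find(in[i + 2]);
            if (k != std::string::npos) {
                out += repl[k];
                i += 2;  // skip the rest of the trigraph
                continue;
            }
        }
        out += in[i];
    }
    return out;
}

// Phase 2 (partial, hypothetical helper): delete every backslash-newline
// pair, folding physical lines into logical lines. A non-empty input must
// end in a newline that is not preceded by a backslash.
inline std::string splice_lines(const std::string& in) {
    if (in.empty()) return in;
    if (in.back() != '\n')
        throw std::runtime_error("file does not end with a newline");
    if (in.size() >= 2 && in[in.size() - 2] == '\\')
        throw std::runtime_error("file ends with backslash-newline");

    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '\\' && i + 1 < in.size() && in[i + 1] == '\n') {
            ++i;  // drop both the backslash and the newline
            continue;
        }
        out += in[i];
    }
    return out;
}
```

Running the phases in order matters: splice_lines(replace_trigraphs(text)) lets a ??/ trigraph at the end of a line become a backslash and then take part in the folding, which the reverse order would miss.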

[1] Our "Wave-1" would convert the original text (iterators) into phase-1
tokens. Our "Wave-2" would convert phase-1 tokens (iterators) into phase-2
tokens, etc. Remember that any file-name and line/column positions will
have to be passed through each phase.
[2] I thought Wave just did phase-3, with phases 1 and 2 thrown in at the
same time. But now I'm not sure which phase Wave stops at. I don't think
it can go past phase-4, because doing phase-5 needs knowledge of the
destination platform.
[3] Only '?' characters that are part of a valid trigraph sequence are
converted; all others are left unchanged.
[4] But actual "\uXXXX" resolution doesn't happen until phase 5!
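
As a rough illustration of footnote [1], the sketch below shows one way positions could be carried from one phase to the next. Everything in it is hypothetical (the PosChar struct, annotate, and splice are invented names, and real code would work on iterators rather than vectors); the point is only that each stage copies the original line/column along with the character, so a later phase can still report where a character came from.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-character token: the character plus its origin.
struct PosChar {
    char ch;
    std::size_t line;    // 1-based physical line in the original file
    std::size_t column;  // 1-based column in the original file
};

// Attach original positions to the raw text before any phase runs.
inline std::vector<PosChar> annotate(const std::string& text) {
    std::vector<PosChar> out;
    out.reserve(text.size());
    std::size_t line = 1, column = 1;
    for (char c : text) {
        out.push_back({c, line, column});
        if (c == '\n') {
            ++line;
            column = 1;
        } else {
            ++column;
        }
    }
    return out;
}

// Phase-2-like stage: drop backslash-newline pairs. Surviving characters
// keep the positions they had in the original file.
inline std::vector<PosChar> splice(const std::vector<PosChar>& in) {
    std::vector<PosChar> out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i].ch == '\\' && i + 1 < in.size() && in[i + 1].ch == '\n') {
            ++i;  // skip the newline as well
            continue;
        }
        out.push_back(in[i]);
    }
    return out;
}
```

For an input of "ab", backslash, newline, "cd", newline, the spliced sequence holds the single logical line "abcd", yet the 'c' still reports line 2, column 1 of the original file.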

Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com
