From: Daryle Walker (darylew_at_[hidden])
Date: 2005-12-12 03:55:18


[Sorry if this has already been reported, fixed, and/or superseded.]

Looking at <http://www.boost.org/libs/wave/doc/token_ids.html>, I see
various kinds of tokens. There are tokens for the preprocessor that are
seen by the lexer and don't make it to the preprocessing iterator level.
The other sets of tokens do make it to that level, modulo any
transformations. The trigraphs are listed among the operator tokens.
However, trigraphs should not be there: they are replaced in the first
translation phase, before any tokenization, including that of preprocessor
directives. So there should be another level of lexer working here. As is,
it doesn't seem that you could use "??=", the trigraph for "#", to
introduce a preprocessor directive.

    ??=include <cstdio> // this should work
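
To make the "another level of lexer" idea concrete, here is a minimal sketch
of trigraph replacement as a separate pass that runs before any
tokenization. The names trigraph_replacement and replace_trigraphs are
hypothetical and are not part of Wave; the sketch assumes the nine trigraphs
of the C++ standard and plain char input.

```cpp
#include <string>

// Hypothetical helper: given the third character of a "??x" sequence,
// return the character the trigraph stands for, or '\0' if it is not one.
inline char trigraph_replacement(char c)
{
    switch (c) {
    case '=':  return '#';
    case '/':  return '\\';
    case '\'': return '^';
    case '(':  return '[';
    case ')':  return ']';
    case '!':  return '|';
    case '<':  return '{';
    case '>':  return '}';
    case '-':  return '~';
    default:   return '\0';
    }
}

// Hypothetical phase-1 pass: scan left to right and replace each trigraph
// with its single-character equivalent. Everything else is copied as-is.
inline std::string replace_trigraphs(std::string const &in)
{
    std::string out;
    out.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (i + 2 < in.size() && in[i] == '?' && in[i + 1] == '?') {
            char const r = trigraph_replacement(in[i + 2]);
            if (r != '\0') {
                out += r;   // emit the replacement character
                i += 2;     // skip the rest of the trigraph
                continue;
            }
        }
        out += in[i];
    }
    return out;
}
```

Feeding the line above through replace_trigraphs would yield
"#include <cstdio> // this should work", so a lexer that only sees the
output of this pass recognizes the directive without needing a trigraph
token of its own.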

On a related note, I thought maybe Wave should use a generator interface:

    template < typename Iterator, typename FileID >
    class phase1
    {
    public:
        phase1( Iterator b, Iterator e, FileID id );
        operator bool() const; // TRUE while not done
        cpp_p1_char_type operator ()();
    };

    template < typename Iterator, typename FileID >
    class phase2
    {
    public:
        explicit phase2( phase1<Iterator, FileID> const &p );
        operator bool() const; // TRUE while not done
        cpp_p2_line_string_type operator ()();
    };

    //...

You generally can't rewind, of course. The cpp_p1_char_type would contain
the expanded character's identity AND some indicator of its location
(starting iterator, file ID, line and column numbers, and offset from the
start of the file).
The cpp_p2_line_string_type would carry the locations for each character in
its string. Then the tokens of later phases would know the location of
their first characters.

-- 
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com
