From: Dan Nuffer (dnuffer_at_[hidden])
Date: 2002-02-26 13:35:56

Kevin S. Van Horn wrote:
> On a similar note, although the GNU version of lex (flex) does support
> C++, it doesn't support generic programming (it wants to have an iostream
> as input.) Again, does anyone know of a version of lex that supports
> generic programming?

The only generic C++ lexer I know of is one of the examples I wrote for
Spirit, called slex. It's not the most featureful lexer, but it does work.

Here is a summary of the interface:

template <typename IteratorT = const char*, typename TokenT = int,
             typename CallbackT = void(*)(const IteratorT&,
                                          const IteratorT&,
                                          const TokenT&,
class lexer
{
public:
         typedef CallbackT callback_t;
         typedef typename std::iterator_traits<IteratorT>::value_type
                 char_t;

         lexer(unsigned int states = 1);

         void register_regex(const std::basic_string<char_t>& regex,
                 const TokenT& id, const CallbackT& cb = CallbackT(),
                 unsigned int state = 0);

         TokenT next_token(IteratorT& first, IteratorT& last);

         void create_dfa();

         void set_case_insensitive(bool insensitive);

         bool load (std::ifstream &in, long unique_id = 0);
         bool save (std::ofstream &out, long unique_id = 0);
};

It works for both char and wchar_t input.
Unfortunately, it doesn't work with input iterators, because I haven't
implemented buffering yet. You need at least forward iterators.
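
To illustrate why forward iterators are needed without buffering, here is a
rough sketch of longest-match scanning. It is not slex code: next_number,
token, and is_digit are names made up for this example. The scanner has to
remember the last accepting position and back up to it, which relies on the
multi-pass guarantee that input iterators don't give you.

```cpp
#include <cassert>
#include <iterator>
#include <string>

// Hypothetical token ids for this sketch (not part of slex).
enum token { tok_error = 0, tok_int, tok_float };

// Works for char and wchar_t alike.
template <typename CharT>
bool is_digit(CharT c) { return c >= CharT('0') && c <= CharT('9'); }

// Matches the longest of: digits (tok_int) or digits '.' digits (tok_float).
// On return, first points just past the matched token.
template <typename ForwardIt>
token next_number(ForwardIt& first, ForwardIt last)
{
    ForwardIt cur = first;
    ForwardIt accept_pos = first;   // saved copy: needs multi-pass iterators
    token accepted = tok_error;

    while (cur != last && is_digit(*cur)) {
        ++cur;
        accepted = tok_int;
        accept_pos = cur;           // remember last accepting position
    }
    if (accepted == tok_int && cur != last && *cur == '.') {
        ++cur;                      // tentatively consume the '.'
        bool any = false;
        while (cur != last && is_digit(*cur)) { ++cur; any = true; }
        if (any) { accepted = tok_float; accept_pos = cur; }
    }
    first = accept_pos;             // back up if the longer match failed
    return accepted;
}
```

Given "12.x", the scanner reads past the '.', finds no digits, and has to
back up to just after "12". With a true input iterator the '.' would already
be gone, hence the need for a buffer.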

Another scanner generator you might check out is re2c. It
generates C code, but the generated code is easily customized via
macros, so it could probably be made to work in a generic C++ fashion.
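
As a rough idea of what I mean, the sketch below is a hand-written stand-in
for a scanner body, not actual re2c output. YYCTYPE, YYCURSOR, and YYLIMIT
are re2c's macro names; scan_digits and the way the macros are bound to
template parameters are assumptions for illustration only.

```cpp
#include <cassert>
#include <iterator>
#include <string>

// Hand-written stand-in for a re2c-style scanner body; NOT real re2c output.
// The macros are bound to the template's iterator type and arguments, so the
// same body works for any iterator whose value_type is a character type.
template <typename IteratorT>
unsigned scan_digits(IteratorT& first, IteratorT last)
{
#define YYCTYPE  typename std::iterator_traits<IteratorT>::value_type
#define YYCURSOR first
#define YYLIMIT  last
    unsigned count = 0;
    while (YYCURSOR != YYLIMIT) {
        YYCTYPE yych = *YYCURSOR;       // current character
        if (yych < '0' || yych > '9')
            break;                      // stop at the first non-digit
        ++YYCURSOR;
        ++count;
    }
#undef YYCTYPE
#undef YYCURSOR
#undef YYLIMIT
    return count;                       // number of digits consumed
}
```

Whether real generated code drops into a template this cleanly would need
checking against re2c itself.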

--Dan Nuffer