From: Dan Nuffer (dnuffer_at_[hidden])
Date: 2002-02-26 13:35:56
Kevin S. Van Horn wrote:
>
> On a similar note, although the GNU version of lex (flex) does support
> C++, it doesn't support generic programming (it wants to have an iostream
> as input.) Again, does anyone know of a version of lex that supports
> generic programming?
>
The only generic C++ lexer I know of is one of the examples I wrote for
Spirit, called slex. It's not the most featureful lexer, but it does work.
Here is a summary of the interface:
    template <typename IteratorT = const char*, typename TokenT = int,
              typename CallbackT = void(*)(const IteratorT&,
                                           IteratorT&,
                                           const IteratorT&,
                                           const TokenT&,
                                           lexer_control<TokenT>&)>
    class lexer
    {
    public:
        typedef CallbackT callback_t;
        typedef typename std::iterator_traits<IteratorT>::value_type
            char_t;

        lexer(unsigned int states = 1);

        void register_regex(const std::basic_string<char_t>& regex,
                            const TokenT& id, const CallbackT& cb = CallbackT(),
                            unsigned int state = 0);

        TokenT next_token(IteratorT& first, IteratorT& last);

        void create_dfa();
        void set_case_insensitive(bool insensitive);

        bool load (std::ifstream &in, long unique_id = 0);
        bool save (std::ofstream &out, long unique_id = 0);
    };
It works for both char and wchar_t input.
Unfortunately it doesn't work with input iterators, because I haven't
implemented buffering yet. You have to use forward iterators or better.
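To illustrate why forward iterators are needed, here is a minimal sketch of a
longest-match scanner templated on the iterator type. This is not slex code:
the token ids, is_ascii_digit(), the free function next_token() and
example_usage() are all made up for this example, and the two hard-coded
patterns stand in for a real DFA. The point is only that the scanner keeps a
copy of the iterator at the last accepting position and rewinds to it, which
an unbuffered input iterator cannot do.

```cpp
#include <cassert>
#include <forward_list>
#include <iterator>
#include <string>

// Hypothetical token ids for this sketch (not part of slex).
enum token_id { tok_eof, tok_error, tok_int, tok_float };

// Hypothetical helper: true for '0'..'9', for char or wchar_t input.
template <typename CharT>
bool is_ascii_digit(CharT c)
{
    return c >= CharT('0') && c <= CharT('9');
}

// Matches the longest prefix of [first, last) against
//   tok_int:   [0-9]+
//   tok_float: [0-9]+ '.' [0-9]+
// and advances `first` past the matched text.
template <typename IteratorT>
token_id next_token(IteratorT& first, const IteratorT& last)
{
    typedef typename std::iterator_traits<IteratorT>::value_type char_t;

    if (first == last)
        return tok_eof;

    IteratorT cur = first;
    if (!is_ascii_digit(*cur)) {
        ++first;                    // consume one unrecognized character
        return tok_error;
    }
    while (cur != last && is_ascii_digit(*cur))
        ++cur;

    // Accepting position for tok_int. Keeping this copy and resuming from
    // it later relies on the multi-pass guarantee of forward iterators.
    IteratorT accept = cur;
    token_id result = tok_int;

    // Speculatively read ahead to see whether the match extends to a float.
    if (cur != last && *cur == char_t('.')) {
        ++cur;
        if (cur != last && is_ascii_digit(*cur)) {
            while (cur != last && is_ascii_digit(*cur))
                ++cur;
            accept = cur;
            result = tok_float;
        }
        // Otherwise `accept` still marks the end of the integer, so the
        // '.' that was read ahead is left for the next call.
    }

    first = accept;                 // rewind to the last accepting position
    return result;
}

// Usage sketch: the same template scans narrow and wide input.
void example_usage()
{
    const std::string narrow("42.5");
    std::string::const_iterator first = narrow.begin(), last = narrow.end();
    assert(next_token(first, last) == tok_float);
    assert(first == last);

    // std::forward_list offers forward iterators only.
    std::forward_list<wchar_t> wide;
    wide.push_front(L'.');
    wide.push_front(L'7');          // the list now holds "7."
    std::forward_list<wchar_t>::iterator wfirst = wide.begin();
    std::forward_list<wchar_t>::iterator wlast = wide.end();
    assert(next_token(wfirst, wlast) == tok_int);
    assert(*wfirst == L'.');        // the '.' was not consumed
}
```

With an input iterator such as std::istream_iterator, the '.' read ahead in
the "7." case would already be gone by the time the scanner decided to return
tok_int, which is why buffering would be needed to support them.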
Another scanner generator you might check out is re2c. It
generates C code, but the generated code is easily customizable via
macros, so it could probably work in a generic C++ fashion.
http://www.tildeslash.org/re2c/
--Dan Nuffer
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk