From: christopher diggins (cdiggins_at_[hidden])
Date: 2004-12-27 20:37:15
I just wrote a quick and dirty comparison between YARD and Spirit, and YARD
runs roughly 10x faster as a toy C++ tokenizer. I know, Joel, I said I
wouldn't do any comparisons, but I couldn't resist, what with Dave's claim
to be outperforming Spirit by 50x!
YARD's better performance is due to the fact that YARD generates the parser
at compile time rather than at run time. Clearly I am not using an optimized
Spirit grammar; I opted instead to implement both grammars in a naive and
straightforward manner. Here is the full Spirit grammar I used:
    single_comment_p = str_p("//") >> *(~ch_p('\n')) >> ~ch_p('\n');
    full_comment_p = str_p("/*") >> anychar_p - str_p("*/");
    comment_p = single_comment_p | full_comment_p;
    ws = +(space_p | comment_p);
    escape_char_p = ch_p('\\') >> anychar_p;
    string_literal_p = ch_p('"') >> *(escape_char_p | ~ch_p('"')) >> ch_p('"');
    char_literal_p = ch_p('\'') >> (escape_char_p | ~ch_p('\'')) >> ch_p('\'');
    ident_p = (alpha_p | ch_p('_')) >> +(alnum_p | ch_p('_'));
    number_p = real_p;
    cpp_token = ws | char_literal_p | string_literal_p | number_p
              | ident_p[&inc_counter];
    tokens = *(cpp_token | anychar_p);
I would appreciate any suggestions on how to improve the Spirit grammar. The
YARD grammar is far more verbose; here is only a small snippet:
    struct MatchBeginFullComment : public
      re_and<
        MatchChar<'/'>,
        MatchChar<'*'>
      >
    { };

    struct MatchEndFullComment : public
      re_and<
        MatchChar<'*'>,
        MatchChar<'/'>
      >
    { };

    struct MatchFullComment : public
      re_and<
        MatchBeginFullComment,
        MatchEndFullComment
      >
    { };

    struct MatchComment : public
      re_or<
        MatchSingleLineComment,
        MatchFullComment
      >
    { };
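For anyone who has not seen this style before, here is a minimal,
self-contained sketch of how rules written as types can be composed at
compile time. These are not the actual YARD definitions: Iter, re_until,
matched_length and check_examples are hypothetical names, and the bodies of
MatchChar, re_and and re_or are simplified stand-ins written only for
illustration. The point is that every rule is a type with a static Match
function, so the whole parser is assembled by the compiler and no rule
objects are built at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins, NOT the real YARD API. Every rule has a static
// Match() that advances `it` on success and leaves it unchanged on failure.
typedef const char* Iter;

template<char C>
struct MatchChar {
    static bool Match(Iter& it, Iter end) {
        if (it != end && *it == C) { ++it; return true; }
        return false;
    }
};

// Sequence: A then B; the position is restored if either one fails.
template<typename A, typename B>
struct re_and {
    static bool Match(Iter& it, Iter end) {
        Iter start = it;
        if (A::Match(it, end) && B::Match(it, end)) return true;
        it = start;
        return false;
    }
};

// Ordered choice: try A first, then B.
template<typename A, typename B>
struct re_or {
    static bool Match(Iter& it, Iter end) {
        return A::Match(it, end) || B::Match(it, end);
    }
};

// Assumed helper: consume input up to and including the first match of End.
template<typename End>
struct re_until {
    static bool Match(Iter& it, Iter end) {
        Iter start = it;
        for (;;) {
            if (End::Match(it, end)) return true;
            if (it == end) { it = start; return false; }
            ++it;
        }
    }
};

struct MatchBeginFullComment : re_and<MatchChar<'/'>, MatchChar<'*'> > { };
struct MatchEndFullComment   : re_and<MatchChar<'*'>, MatchChar<'/'> > { };
struct MatchFullComment
    : re_and<MatchBeginFullComment, re_until<MatchEndFullComment> > { };
// A line comment here must end in '\n'; one cut off by end of input fails.
struct MatchSingleLineComment
    : re_and<re_and<MatchChar<'/'>, MatchChar<'/'> >,
             re_until<MatchChar<'\n'> > > { };
struct MatchComment : re_or<MatchSingleLineComment, MatchFullComment> { };

// Number of characters Rule consumes at the start of s, or 0 on failure.
template<typename Rule>
std::size_t matched_length(const std::string& s) {
    Iter it = s.data();
    Iter end = s.data() + s.size();
    return Rule::Match(it, end) ? static_cast<std::size_t>(it - s.data()) : 0;
}

// Usage examples for the rules above.
inline void check_examples() {
    assert(matched_length<MatchComment>("/* hi */ x") == 8);
    assert(matched_length<MatchComment>("// x\ny") == 5);
    assert(matched_length<MatchComment>("int x;") == 0);
}
```

Because each Match call is resolved statically, the compiler is free to
inline the whole rule tree into one function, which is the kind of
optimization a grammar assembled at run time makes harder.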
Anyway, you get the picture: YARD is verbose but quite fast. I will be
including the full source in the next YARD release.
Christopher Diggins
http://sourceforge.net/projects/yard-parser