From: christopher diggins (cdiggins_at_[hidden])
Date: 2004-12-27 20:37:15

I just wrote a quick and dirty comparison between YARD and Spirit, and YARD
performs roughly 10x faster as a toy C++ tokenizer. I know, Joel, I said I
wouldn't do any comparisons, but I couldn't resist, what with Dave's claim
to be outperforming Spirit by 50x!

YARD's better performance comes from generating the parser at compile time
rather than at run time. Clearly I am not using an optimized Spirit grammar;
I opted instead to implement both grammars in a naive and straightforward
manner. Here is the full Spirit grammar I used:

      single_comment_p = str_p("//") >> *(~ch_p('\n')) >> ~ch_p('\n');
      full_comment_p = str_p("/*") >> anychar_p - str_p("*/");
      comment_p = single_comment_p | full_comment_p;
      ws = +(space_p | comment_p);
      escape_char_p = ch_p('\\') >> anychar_p;
      string_literal_p = ch_p('"') >> *(escape_char_p | ~ch_p('"')) >>
      char_literal_p = ch_p('\'') >> (escape_char_p | ~ch_p('\'')) >>
      ident_p = (alpha_p | ch_p('_')) >> +(alnum_p | ch_p('_'));
      number_p = real_p;
      cpp_token = ws | char_literal_p | string_literal_p | number_p |
      tokens = *(cpp_token | anychar_p);

I would appreciate any suggestions on how to improve the Spirit grammar. The
YARD grammar is far more verbose, so here is only a small snippet:

  struct MatchBeginFullComment : public
  { };

  struct MatchEndFullComment : public
  { };

  struct MatchFullComment : public
  { };

  struct MatchComment : public
  { };

Anyway, you get the picture: YARD is verbose but quite fast. I will be
including the full source in the next YARD release.
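
For readers unfamiliar with the idea, the compile-time approach described
above can be sketched roughly as follows. This is only an illustration and
not YARD's actual code: CharIs, Seq, UntilPast, BeginBlockComment,
EndBlockComment, BlockComment and MatchedLength are all hypothetical names,
and they make no attempt to reconstruct the base classes of the snippet
above. Each grammar rule is a type, and combining rules is just template
instantiation, so the compiler sees the whole parser as plain static
function calls that it is free to inline.

```cpp
#include <cstddef>
#include <string>

// Hypothetical rule types for illustration only; not YARD's real classes.
// Every rule exposes a static Match that advances p on success and leaves
// p untouched on failure.

// Matches exactly one character C.
template <char C>
struct CharIs {
    static bool Match(const char*& p, const char* end) {
        if (p != end && *p == C) { ++p; return true; }
        return false;
    }
};

// Matches rule A followed by rule B, backtracking if either fails.
template <typename A, typename B>
struct Seq {
    static bool Match(const char*& p, const char* end) {
        const char* start = p;
        if (A::Match(p, end) && B::Match(p, end)) return true;
        p = start;
        return false;
    }
};

// Consumes input up to and including the first match of rule End.
template <typename End>
struct UntilPast {
    static bool Match(const char*& p, const char* end) {
        const char* start = p;
        while (p != end) {
            if (End::Match(p, end)) return true;
            ++p;
        }
        p = start;
        return false;
    }
};

// The grammar itself is a set of types, fixed at compile time.
struct BeginBlockComment : Seq<CharIs<'/'>, CharIs<'*'>> {};
struct EndBlockComment : Seq<CharIs<'*'>, CharIs<'/'>> {};
struct BlockComment : Seq<BeginBlockComment, UntilPast<EndBlockComment>> {};

// Returns how many leading characters of s the rule consumed, or 0 if the
// rule did not match.
template <typename Rule>
std::size_t MatchedLength(const std::string& s) {
    const char* p = s.data();
    const char* end = p + s.size();
    return Rule::Match(p, end) ? static_cast<std::size_t>(p - s.data()) : 0;
}
```

With these definitions, MatchedLength<BlockComment>("/* hi */x") returns 8,
and an unterminated comment such as "/* oops" yields 0. No rule objects are
built at run time; the grammar exists only in the type system.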

Christopher Diggins