From: Eric Niebler (eric_at_[hidden])
Date: 2005-09-16 00:57:45


Answers inline...

Darren Cook wrote:
> Do you think the library should be accepted as a Boost library? Yes, but
> conditional on having benchmarks showing a worthwhile speed improvement
> over boost.regex. (Or alternatively over spirit.) Without that there is
> no strong reason to have it in Boost along with both Spirit and boost.regex.
>

Results of performance benchmarks of static xpressive vs. dynamic
xpressive vs. Boost.Regex are in the Appendix of xpressive's
documentation. You must have missed it. See:

http://boost-sandbox.sf.net/libs/xpressive/doc/html/xpressive/perf.html

In short, xpressive comes out consistently ahead of Boost.Regex on short
matches, and roughly on par for longer matches (with wide variation).
Results are shown for both gcc 3.4 and vc7.1. The xpressive download
includes the code for the perf test, so you can run it yourself, if you
like.

> * user_s_guide.html
> As I read I assumed "sregex" meant static (compile-time) regex. I then
> thought compile() must be very clever and wondered why bother with the
> alternative ">>" syntax.
> So I think you need to make it clearer on this page that sregex means
> std::string regex, and that compile() is for a run-time regex, and the
> ">>" syntax is for a compile-time regex.

Agreed.

> * creating_a_regex_object.html
> 1. Either the meaning of Perl's /s modifier needs to be defined
> clearly, or the difference between "_" and "~_n" needs to be shown with
> an example (incidentally none of your examples at examples.html match
> strings with carriage-returns).

Agreed. FYI, "_" matches any one character, and ~_n matches any one
character that is not '\n'. I also need to describe _ln, which matches a
logical newline (e.g., "\n" or "\r" or "\r\n" or other line separators),
and ~_ln, which matches any one character that is not a line separator.
This all needs to be documented better.

> 2. I see I can use icase("Abc") but is there a way to say the whole
> regex should be case-insensitive? I.e. the equivalent of:
> "/match something/i"

You can just wrap the whole regex in icase(). I need to show an example
of that.

> * grammars_and_nested_matches.html
> In the example that starts:
> sregex parentheses;
> parens = '('
>
> should "parens" actually be "parentheses" ?

Yes. My bad.

> 2. In Filtering Nested Results, I wasn't clear what the purpose was. Is
> it to show all the name matches before all the id matches? If so,
> choosing a less regular example string would help, e.g. with more names
> than ids, names following names some of the time, etc.

I'm not at all sure of the utility of the nested results filter, and I
may just cut it. After matching a regex that contains nested regexes,
the match_results object contains nested results. Figuring out which
results correspond to which regex can be difficult. The filter lets you
see only those results corresponding to a particular nested regex. But
I've yet to need it in practice. *shrug*

> 3. "See the definition of output_nested_results in the Examples section."
> I think that function should be moved to
> grammars_and_nested_matches.html; it seemed out of place in examples.html.

You're right, it doesn't belong in Examples. But I didn't want to clutter
the user doc with what is really an implementation detail. I'll think
about it.

> * Other
> 1. I'd like to see some fuller examples, that show the I/O part as
> well. E.g. a full program that takes a list of email addresses on stdin,
> one per line, and spits out a list of the illegal ones.

Haha! Have you /seen/ the regex that matches email addresses? It's 5
pages long. But I get the idea -- examples are important. I'll see what
I can come up with.

> 2. Benchmarking. I wanted to see the relative speed of compile-time vs.
> run-time vs. boost::regex (and ideally vs. PCRE or a scripting language)
> on some realistic application.

It's in there.

-- 
Eric Niebler
Boost Consulting
www.boost-consulting.com
