From: Rob Stewart (stewart_at_[hidden])
Date: 2005-05-18 14:14:15
From: "Eric Niebler" <eric_at_[hidden]>
>
> Docs at:
> http://boost-sandbox.sf.net/libs/xpressive
I was reading through a portion of the docs and a few issues came
to mind.
This one applies to Boost.RegEx, too, but I'll ask you: Why have
both regex_match() and regex_search() when the latter can behave
like the former by adding two anchors?
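
To make the question concrete, here is a minimal sketch using std::regex, whose regex_match()/regex_search() pair follows the Boost.Regex design. That xpressive behaves the same way is an assumption, and the three helper names are hypothetical. It also shows one wrinkle: the two anchors have to wrap the whole pattern, or an alternation changes meaning.

```cpp
#include <cassert>
#include <regex>
#include <string>

// True only if the whole of `text` matches `pattern`.
bool whole_match(const std::string& text, const std::string& pattern) {
    return std::regex_match(text, std::regex(pattern));
}

// The same test spelled with regex_search(): the pattern is wrapped in a
// non-capturing group so that both anchors apply to every alternative.
bool anchored_search(const std::string& text, const std::string& pattern) {
    return std::regex_search(text, std::regex("^(?:" + pattern + ")$"));
}

// Naive anchoring without the group: "^a|b$" parses as "(^a)|(b$)".
bool naive_anchored_search(const std::string& text, const std::string& pattern) {
    return std::regex_search(text, std::regex("^" + pattern + "$"));
}
```

With the pattern "a|b" and the input "ab", whole_match() and anchored_search() both reject the input, while naive_anchored_search() accepts it because "^a" matches at the start.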
Why does the regex_token_iterator<> ctor use a magic number like
-1 to indicate behavior rather than a named value? (I just
clicked through to the reference and see that it takes a
regex_constants::match_flag_type, but
http://boost-sandbox.sourceforge.net/libs/xpressive/doc/html/xpressive/examples.html#examples.split_a_string_using_a_regex_as_a_delimiter
shows passing -1 -- with an explanatory comment -- instead. This
leads to confusion.)
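
As an illustration of the named-value idea, the sketch below uses std::sregex_token_iterator, where the fourth constructor argument is a submatch index and -1 selects the text between matches. That xpressive follows the same convention is an assumption based on the linked example. The constant text_between_matches and the function split() are hypothetical names, not part of any library.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical named constant standing in for the bare -1.
constexpr int text_between_matches = -1;

// Split `text` on every match of the regex `delimiter`.
std::vector<std::string> split(const std::string& text,
                               const std::string& delimiter) {
    const std::regex re(delimiter);
    // With submatch index -1 the iterator yields the pieces of the input
    // that lie between successive matches of `re`.
    std::sregex_token_iterator first(text.begin(), text.end(), re,
                                     text_between_matches);
    std::sregex_token_iterator last;  // default-constructed end iterator
    return std::vector<std::string>(first, last);
}
```

A call such as split("a, b,c", ",\\s*") then reads without needing a comment to explain the argument.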
The following items are from the "Perl syntax vs. Static
xpressive syntax" table in
http://boost-sandbox.sourceforge.net/libs/xpressive/doc/html/xpressive/creating_a_regex_object.html:
You seem to suggest that the xpressive equivalent of Perl's
"a|b" must be spelled "a | b" but as far as I can see, the
whitespace is irrelevant, so calling attention to it suggests
a difference that doesn't exist.
"bos" and "eos" are a little odd. First, it seems like
"sequence" should be "input." Second, I usually think of
SOF/EOF and SOL/EOL pairs rather than BOF/EOF and BOL/EOL.
Thus, I'd have gone with "soi" and "eoi" at the least.
Unfortunately, in an effort to keep them short, they aren't
terribly mnemonic. How about "start" and "end" (or "beg" and
"end" if you want to go with just three letters)?
"." appears twice in the table with two different equivalents.
It may be that the two are effectively the same, but they
aren't grouped and the "Meaning" column doesn't point out their
equivalence.
Considering how much you compare xpressive to Perl's REs, I'm
surprised you opted for ~_d instead of _D, for example. I'm
not saying that would be better, but the disconnect from Perl
didn't seem necessary in this case. (I do recognize that
you're using ~ to mean negation of the following subexpression
in many other cases, so perhaps you just determined that being
consistent in expressing negation was more important.)
For "[abc]," you show two different xpressive equivalents, each
in its own row of the table. Why not combine them into a
single row? (Same for any other cases like that.)
A tool that converts a Perl-style RE to xpressive (static
notation certainly, and dynamic if there are any differences)
would be quite helpful for those who already know Perl's REs.
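
As a rough idea of what such a tool might do, here is a toy sketch that turns a very small subset of Perl syntax into the text of a static xpressive expression. The function perl_to_xpressive() is hypothetical. The mappings (bos, eos, _, _d, ~_d, and so on) are taken from the static syntax as I understand it and should be checked against the table in the docs. Quantifiers, groups, character sets and alternation are rejected, not translated.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: emit xpressive static-syntax text for a tiny subset
// of Perl regex syntax. Terms are joined with the sequencing operator >>.
std::string perl_to_xpressive(const std::string& perl) {
    std::string out;
    auto emit = [&out](const std::string& term) {
        if (!out.empty()) out += " >> ";
        out += term;
    };
    const std::string unsupported = "|*+?()[]{}'";
    for (std::size_t i = 0; i < perl.size(); ++i) {
        const char c = perl[i];
        if (unsupported.find(c) != std::string::npos)
            throw std::invalid_argument("unsupported syntax in this sketch");
        switch (c) {
        case '^': emit("bos"); break;  // assumes no /m modifier
        case '$': emit("eos"); break;  // approximation of Perl's $
        case '.': emit("_"); break;    // assumes /s; ~_n would exclude newline
        case '\\':
            if (++i == perl.size())
                throw std::invalid_argument("trailing backslash");
            switch (perl[i]) {
            case 'd': emit("_d"); break;
            case 'D': emit("~_d"); break;
            case 'w': emit("_w"); break;
            case 'W': emit("~_w"); break;
            case 's': emit("_s"); break;
            case 'S': emit("~_s"); break;
            default:
                throw std::invalid_argument("unsupported escape in this sketch");
            }
            break;
        default:
            // Wrap literals so that two adjacent characters still form an
            // xpressive expression and not a plain integer shift.
            emit(std::string("as_xpr('") + c + "')");
        }
    }
    return out;
}
```

For example, perl_to_xpressive("^\\d.$") produces the text bos >> _d >> _ >> eos. A real converter would need a proper parser for grouping, alternation and quantifiers.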
--
Rob Stewart                           stewart_at_[hidden]
Software Engineer                     http://www.sig.com
Susquehanna International Group, LLP  using std::disclaimer;