From: Rob Stewart (stewart_at_[hidden])
Date: 2005-05-18 16:01:59
From: "Eric Niebler" <eric_at_[hidden]>
> Rob Stewart wrote:
> > Why does the regex_token_iterator<> ctor use a magic number like
> > -1 to indicate behavior rather than a named value? (I just
> > clicked through to the reference and see that it takes a
> > regex_constants::match_flag_type, but
> > http://boost-sandbox.sourceforge.net/libs/xpressive/doc/html/xpressive/examples.html#examples.split_a_string_using_a_regex_as_a_delimiter
> > shows passing -1 -- with an explanatory comment -- instead. This
> > leads to confusion.)
> Again, I'm just following the standard here, but providing a named
> constant would be a nice addition. The -1 is an optional 4th parameter,
> and the match_flag_type is an optional 5th parameter -- so there should
> be no confusion.
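
To make the parameter positions concrete, here is a minimal sketch using std::sregex_token_iterator from the C++ standard library as a stand-in, on the assumption that xpressive's regex_token_iterator follows the same constructor layout, as the reply above says. The constant name tokens_between_matches and the helper split() are hypothetical; neither is part of the standard or of xpressive.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical named constant standing in for the "magic" -1.
// It is NOT defined by the standard library or by xpressive.
constexpr int tokens_between_matches = -1;

// Split text on a delimiter regex.
std::vector<std::string> split(const std::string& text, const std::regex& delim) {
    // 4th argument: which sub-match to yield; -1 selects the text
    // between delimiter matches. The 5th argument (match_flag_type)
    // is left at its default here.
    std::sregex_token_iterator first(text.begin(), text.end(), delim,
                                     tokens_between_matches);
    std::sregex_token_iterator last;  // default-constructed end iterator
    return std::vector<std::string>(first, last);
}
```

Calling split("a, b,c", re) with re built from ",\\s*" should yield the three tokens "a", "b" and "c".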
Apparently, I can't count. I was matching the -1 with the
match_flag_type parameter. Whatever the type, it ought to use
named values. Perhaps there's time to improve the proposed
> > The following items are from the "Perl syntax vs. Static
> > xpressive syntax" table in
> > http://boost-sandbox.sourceforge.net/libs/xpressive/doc/html/xpressive/creating_a_regex_object.html:
> > You seem to suggest that the xpressive equivalent of Perl's
> > "a|b" must be spelled "a | b" but as far as I can see, the
> > whitespace is irrelevant, so calling attention to it suggests
> > a difference that doesn't exist.
> Naturally whitespace is irrelevant. That's how C++ works. I don't think
> this should be a source of confusion for people.
Of course. I was just pointing out that the Perl syntax was
shown without whitespace (necessary) and the C++ with (not
necessary). Many people writing C++ can be confused by matters
like this. Of course, if you show the xpressive version as "a|b",
such people won't think they can write "a | b". Doing so, however,
does avoid a gratuitous difference, don't you think?
Maybe add a note clarifying that while spaces are significant in a
Perl or, for that matter, a dynamic xpressive RE, they aren't
significant in a static xpressive RE other than in literals.
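
The string-pattern half of that note can be sketched with std::regex, used here only as a stand-in for a Perl or dynamic xpressive RE (an assumption; the helper matches_whole() is hypothetical). In a static xpressive RE the alternation is an ordinary C++ operator, so spacing around it is just token separation and cannot be demonstrated this way.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if the whole of text matches the pattern string.
// Spaces inside the pattern string are literal characters.
bool matches_whole(const std::string& text, const std::string& pattern) {
    return std::regex_match(text, std::regex(pattern));
}
```

With this helper, the pattern "a|b" accepts "a", while the pattern "a | b" does not; it accepts "a " (a followed by a space) or " b" (a space followed by b) instead.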
> > "bos" and "eos" are a little odd. First, it seems like
> > "sequence" should be "input." Second, I usually think of
> > SOF/EOF and SOL/EOL pairs rather than BOF/EOF and BOL/EOL.
> > Thus, I'd have gone with "soi" and "eoi" at the least.
> > Unfortunately, in an effort to keep them short, they aren't
> > terribly mnemonic. How about "start" and "end" (or "beg" and
> > "end" if you want to go with just three letters)?
> The regex std proposal has match flags match_not_bol and match_not_eol,
> so I'm reusing this terminology. Boost.Regex also has match_not_bob for
> "beginning of buffer". This is not proposed for standardization, and I
> don't think the term "buffer" is appropriate anyway. You like "input"
> but I prefer "sequence". I dislike "input" because it might suggest to
> people that input iterators are acceptable to the regex algorithms,
> whereas a bidirectional sequence is what is required.
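
For readers unfamiliar with the match flags mentioned above, here is a small sketch using the standard library's std::regex_constants::match_not_bol, which is where the "bol" terminology ended up. The two helper names are hypothetical, and this shows std::regex behavior, not xpressive's.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Search treating the first character as a real beginning of line.
bool search_as_sequence_start(const std::string& text, const std::regex& re) {
    return std::regex_search(text, re);
}

// Search treating the text as a continuation: with match_not_bol,
// '^' is not allowed to match at the first character.
bool search_as_continuation(const std::string& text, const std::regex& re) {
    return std::regex_search(text, re, std::regex_constants::match_not_bol);
}
```

Given a regex built from "^abc" and the text "abc", the first helper finds a match and the second does not.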
What about "beg" and "end"? I realize they aren't reusing the
proposed terminology, but they avoid the "sequence/buffer/input"
> > Considering how much you compare xpressive to Perl's REs, I'm
> > surprised you opted for ~_d instead of _D, for example. I'm
> > not saying that would be better, but the disconnect from Perl
> > didn't seem necessary in this case.
> It is necessary. _D is an illegal identifier, reserved to the
> implementation. All identifiers that begin with an underscore and a
> capital letter are illegal in user code. Even if that were not the case,
> ALL CAPS is reserved for macros by convention. That's how I ended up
> with ~_d.
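
As an illustration of how a complement operator can stand in for Perl's upper-case escapes, here is a minimal sketch. Everything in namespace sketch is hypothetical and is not xpressive's implementation; the names live in a namespace because identifiers with a leading underscore are also reserved at global scope.

```cpp
#include <cassert>
#include <cctype>

namespace sketch {

// A character class: a predicate plus a "negated" flag.
struct char_class {
    bool (*test)(unsigned char);
    bool negated;

    bool matches(char c) const {
        return test(static_cast<unsigned char>(c)) != negated;
    }
};

// operator~ flips the flag, playing the role of Perl's \D versus \d.
inline char_class operator~(char_class c) {
    return char_class{c.test, !c.negated};
}

inline bool is_digit(unsigned char c) { return std::isdigit(c) != 0; }

// Lower-case token; "_D" would be a reserved identifier.
const char_class _d{is_digit, false};

}  // namespace sketch
```

Here sketch::_d matches '7' but not 'x', and ~sketch::_d does the opposite.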
Doh! Where was my mind? Of course that's not a legal
identifier. Clearly I was doing too many things at once at that
(I'd hardly consider that all caps thus implying a macro,
-- 
Rob Stewart                           stewart_at_[hidden]
Software Engineer                     http://www.sig.com
Susquehanna International Group, LLP  using std::disclaimer;