Subject: Re: [Boost-users] Regex problem: cannot parse terms containing OR
From: OvermindDL1 (overminddl1_at_[hidden])
Date: 2009-09-26 22:40:08
On Sat, Sep 26, 2009 at 5:04 PM, Ramon F Herrera <ramon_at_[hidden]> wrote:
>
> I have been able to parse stuff more complicated than this, but now I am
> stuck with something seemingly simpler.
>
> The expression being parsed is the common sequence:
>
>   1,2-5,7,8-11
>
> Question 1: Notice my approach. I first match the whole expression, with
> "regex_match", to make sure that it is valid (that works great). Next, I use
> "regex_iterator" to break down the parts. Is that good practice? Am I being
> inefficient/redundant?
>
> Question 2: My code below only extracts "range terms" ("x-y"), for some
> reason I cannot extract "number terms".
>
> As a workaround, I can always feed my data like this:
>
> 1-1,2-5,7-7,8-11
>
> but, after a lot of tries, would love to learn how to do this properly.
>
> TIA,
>
> -Ramon
>
> -----------------------------------------------------
>
> #include <iostream>
> #include <boost/regex.hpp>
> using namespace std;
>
> bool
> term_callback(const boost::match_results<std::string::const_iterator>& what)
> {
>   for (unsigned int i = 0; i < what.size(); i++) {
>     cout << "what[" << i << "]: " << what[i].str() << endl;
>     cout << "---------" << endl;
>   }
>   return true;
> }
>
> int
> main(int argc, char *argv[])
> {
>   const char hyphen      = '-';
>   const char left_paren  = '(';
>   const char right_paren = ')';
>   const char bar         = '|';
>   const char comma       = ',';
>   const char star        = '*';
>
>   const string number   = "[0-9]+";
>   const string range    = number + hyphen + number;
>   const string term     = left_paren + number + bar + range + right_paren;
>   const string sequence = term + bar + left_paren + term + comma +
>                           right_paren + star + term;
>
>   boost::regex expression(sequence);
>   boost::regex piece(range);
>   boost::cmatch matches;
>
>   char argument[1024];
>   strcpy(argument, argv[1]);
>
>   if (!boost::regex_match(argument, matches, expression)) {
>     cerr << "There is no match" << endl;
>     return 1;
>   }
>
>   string text = argument;
>
>   boost::sregex_iterator m1(text.begin(), text.end(), piece);
>   boost::sregex_iterator m2;
>   for_each(m1, m2, &term_callback);
>
>   return 0;
> }
Do note that if you want to do something with your numbers, such as
converting them to integers and operating on them, there is a much
easier way: use Boost.Spirit2.1 instead of Boost.Regex. Your problem
is more of a parsing problem than a matching problem; regex is nice
for matching, and Spirit2.1 is better for parsing. If you are
interested, then I or someone else could whip up some code that does
the same thing in Spirit2.1, but will run a whole lot faster and be a
lot easier to use.
Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net