Boost Users
From: Christoph Duelli (duelli_at_[hidden])
Date: 2008-05-29 13:03:01
[I am using Boost 1.35, on Linux, running gcc 4.1.2]
I want to parse a string like the following:

---
ResA { opt1=>val1, opt2 => val2}, ResB
 ResC {opt3=> valllll }
ResD
---

I thought that it might be easier to do this using xpressive rather than Spirit (which I have used for more complicated stuff so far). I am happy to say that, with the very helpful docs, I was able to create an sregex that parses the above. Also, I found the solution quite short, so I am basically happy with xpressive.

As this was my first try, there is certainly (lots of) room for improvement: I have attached my program, and would welcome comments (and suggestions for enhancements).

#include <boost/xpressive/xpressive.hpp>
#include <boost/xpressive/regex_actions.hpp>
using namespace boost::xpressive;
using namespace std;
#include <boost/foreach.hpp>
#include <iostream>
#include <map>

int main(int, char **)
{
  string input = "ResA { opt1=>val1, opt2 => val2}, ResB\n"
                 " ResC {opt3=> valllll }\n\n\n"
                 "ResD";

  typedef map<string, string> options_t;
  typedef map<string, options_t> resources_t;
  resources_t result;

  // The name of the resource we are parsing now.
  string res_name;
  // Pointer to options (map) of the resource we are parsing now.
  options_t *optsref;

  // Match an option (name) and its value. Both are strings,
  // separated by =>. Then stuff the result into the
  // options map of the resource we are parsing now.
  sregex rx_opt = ( (s1= +_w) >> *_s >> "=>" >> *_s >> (s2= +_w) )
    [ (*ref(optsref)) [s1] = s2 ];

  // A resource has a name and, enclosed in {}, option=>value
  // pairs. We store the name of the resource and the address
  // of its options map.
  sregex rx_res = *_s >> (s1= +_w)[ref(res_name)=s1,
                                   ref(optsref)=&(ref(result)[s1])]
    >> *_s
    >> optional( '{' >> *_s >> rx_opt
                 >> *(*_s >> ',' >> *_s >> rx_opt)
                 >> *_s >> '}' >> *_s);

  // A line may contain comma-separated resource definitions.
  sregex rx_line = rx_res >> * (*_s >> ',' >> *_s >> rx_res);

  // A file consists of multiple lines.
  sregex rx_file = rx_line >> *(*_n >> rx_line) >> *_n;

  if(regex_match(input, rx_file))
  {
    // Output the parsed structure.
    cerr << "resname="<<res_name<<endl;
    BOOST_FOREACH(const resources_t::value_type &p, result)
    {
      cerr << p.first << " : " << endl;
      BOOST_FOREACH(const options_t::value_type &o, p.second)
        cerr << " " << o.first << " => " << o.second << endl;
    }
  }
  else
    cerr << "NO MATCH!" << endl;
  return 0;
}

In particular, I'd like to know:

1) Is it possible to avoid my (ugly) use of "*optsref"? I would have liked to write something like

  sregex rx_opt = ( (s1= +_w) >> *_s >> "=>" >> *_s >> (s2= +_w) )
    [ (ref(result[ref(res_name)])) [s1] = s2 ];

i.e. nest the maps directly. The way I tried it, it would not compile.

2) Can I tell xpressive to allow arbitrary whitespace around ">>"? I'd rather avoid cluttering the regexes with all that ">> *_s".

3) If the string does *not* match the sregex, can I find out where the failure occurred (i.e. what is the length of the prefix that could have been completed to a successful match)?

Thank you and best regards, keep up the good work,
Christoph
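
For comparison with questions 2) and 3), here is a minimal sketch of the same grammar written by hand with only the standard library. It does not use xpressive and is not meant as the xpressive answer. The names Cursor and parse_resources are hypothetical, invented for this illustration. It also assumes a simplified grammar: newlines count as ordinary whitespace, the comma between resources is optional, and a trailing comma is tolerated. Every token function skips leading whitespace itself, and the cursor position at the point of failure is the length of the prefix that was consumed successfully.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

typedef std::map<std::string, std::string> options_t;
typedef std::map<std::string, options_t> resources_t;

// Hypothetical helper: a cursor whose token functions skip leading
// whitespace themselves, so the grammar below never mentions it.
struct Cursor {
    const std::string &in;
    std::size_t pos;

    explicit Cursor(const std::string &s) : in(s), pos(0) {}

    void skip_ws() {
        while (pos < in.size() &&
               std::isspace(static_cast<unsigned char>(in[pos])))
            ++pos;
    }

    // Match one or more word characters (letters, digits, underscore).
    bool word(std::string &out) {
        skip_ws();
        std::size_t start = pos;
        while (pos < in.size() &&
               (std::isalnum(static_cast<unsigned char>(in[pos])) ||
                in[pos] == '_'))
            ++pos;
        out = in.substr(start, pos - start);
        return pos > start;
    }

    // Match a literal token such as "{" or "=>".
    bool lit(const std::string &tok) {
        skip_ws();
        if (in.compare(pos, tok.size(), tok) != 0)
            return false;
        pos += tok.size();
        return true;
    }
};

// Hypothetical entry point. Returns true if the whole input matches.
// On failure, fail_at is the offset of the first character that no
// rule could consume (whitespace before it is already skipped).
bool parse_resources(const std::string &input, resources_t &result,
                     std::size_t &fail_at) {
    Cursor c(input);
    std::string name;
    while (c.word(name)) {
        options_t &opts = result[name];  // nested map, no pointer needed
        if (c.lit("{")) {
            do {
                std::string key, val;
                if (!c.word(key) || !c.lit("=>") || !c.word(val)) {
                    fail_at = c.pos;
                    return false;
                }
                opts[key] = val;
            } while (c.lit(","));
            if (!c.lit("}")) {
                fail_at = c.pos;
                return false;
            }
        }
        c.lit(",");  // optional separator between resources (assumption)
    }
    c.skip_ws();
    fail_at = c.pos;
    return c.pos == input.size();
}
```

Called on the sample input from the program above, parse_resources fills result with the four resources ResA, ResB, ResC and ResD. For the malformed input "ResA { opt1=>val1, opt2 = val2}" it returns false and sets fail_at to 24, the offset of the stray "=".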