Subject: Re: [Boost-users] Can I use Boost.Regex with multiline text to be recognized?
From: OvermindDL1 (overminddl1_at_[hidden])
Date: 2009-09-14 21:02:22


On Mon, Sep 14, 2009 at 6:35 PM, Ramon F Herrera <ramon_at_[hidden]> wrote:
> I am trying to parse multiple files with the structure indicated below. I
> sort of got started, but I hate it if I am going to hit a wall.
>
> I guess I could start by defining a line like this:
>
> string variable   = "([A-Za-z0-9][\\w\\h\\(\\)\\-\\.,/&]*)";
> char equal_sign   = '=';
> string value      = "(.+)";
> string assignment = variable + equal_sign + value;
>
> string line       = assignment + eol;
>
> Any tips and hints are most appreciated and welcome...
>
> -Ramon
>
> ---------------------
>
> [Unique ID 1]
> Variable Name = Variable Value
> Variable Name = Variable Value
> Variable Name = Variable Value
>
> [Unique ID 2]
> Variable Name = Variable Value
> Variable Name = Variable Value
> Variable Name = Variable Value
>
> [Unique ID 3]
> Variable Name = Variable Value
> Variable Name = Variable Value
> Variable Name = Variable Value

You could do that using regex, but you would have to parse the matches
into a structure yourself. You did not say, so I will just assume that
the things in [] are section headings (à la an ini file) that have to
come first, and that the Variable Names can be duplicated within a
section and that their order matters. If so, it might just be easier to
use boost::spirit 2.1, as it can do the parsing and fill in your data
structure all in one step, and it should run a great deal faster than
regex. Something like this code would probably work:

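// Have not tested this code, writing it inside the email client itself...
// Needs <boost/spirit/include/qi.hpp> and <boost/fusion/include/std_pair.hpp>.

std::map< std::string, std::vector< std::pair<std::string,std::string> > >
    dataStuff;

using namespace boost::spirit::qi;
using boost::spirit::standard::print;
using boost::spirit::standard::blank;

// inputstream is assumed to be a std::string holding the whole file.
// parse() advances 'first', so it has to be a named iterator.
std::string::const_iterator first = inputstream.begin(),
                            last  = inputstream.end();
bool successful = parse(first, last,
    *(  '[' >> *(print - ']') >> ']' >> eol            // section heading
     >> *(  +(print - (*blank >> '=')) >> omit[*blank] // variable name
         >> '=' >> omit[*blank]
         >> +print >> eol                              // variable value
         )
     >> *eol                                           // blank separator lines
     )
    , dataStuff)
    && first == last;                                  // reject leftover input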

As always, I make no guarantees about the quality of the above code when
I am writing it 6 hours past when I should be sleeping.
You can also attach a semantic action to the first string match (the
section name) that sets _pass to false when the name has already been
seen, so the parse fails unless every section ([]) name is unique.
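
For comparison, here is a rough sketch of the regex route mentioned at the
top, where each line is matched and the structure is filled in by hand. It
uses std::regex rather than Boost.Regex only to stay free of external
dependencies. The names parseSections, Sections and Entries are made up for
this example, and the patterns are my own guess at the file format (they
are looser than the character class in the original question). A repeated
section heading is rejected, which is the same uniqueness check as above.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <regex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Section name -> ordered (name, value) pairs; duplicate names are allowed.
typedef std::vector<std::pair<std::string, std::string> > Entries;
typedef std::map<std::string, Entries> Sections;

// Reads the input line by line. Returns false on a malformed line, an
// assignment before the first heading, or a repeated section heading.
bool parseSections(std::istream& in, Sections& out)
{
    static const std::regex heading("\\[([^\\]]*)\\]\\s*");
    static const std::regex assignment(
        "\\s*([^=]*[^=\\s])\\s*=\\s*(.*\\S)\\s*");
    static const std::regex blank("\\s*");

    Entries* current = 0;   // entries of the section being filled
    std::string line;
    std::smatch m;
    while (std::getline(in, line)) {
        if (std::regex_match(line, blank)) {
            continue;                       // skip empty lines
        } else if (std::regex_match(line, m, heading)) {
            // insert() reports whether the key was new, which doubles
            // as the uniqueness check on section names.
            std::pair<Sections::iterator, bool> r =
                out.insert(std::make_pair(m[1].str(), Entries()));
            if (!r.second) return false;    // duplicate section name
            current = &r.first->second;
        } else if (current && std::regex_match(line, m, assignment)) {
            assert(m.size() == 3);          // whole match + two captures
            current->push_back(std::make_pair(m[1].str(), m[2].str()));
        } else {
            return false;                   // unrecognised line
        }
    }
    return true;
}
```

You would call it with a std::ifstream (or any std::istream) and an empty
Sections object. Only the first '=' on a line splits name from value, so a
value may itself contain '='.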

