Subject: Re: [Boost-users] [Spirit] Parser omits certain characters
From: beet (r.berlich_at_[hidden])
Date: 2013-11-04 19:04:46


Dear Tongari, dear all,

On 01.11.13 03:42, TONGARI J wrote:
> 2013/11/1 beet <r.berlich_at_gemfony.eu>
>
> Dear all,
>
> For all of the following, please see the attached test-case.
>
> I would like to specify certain characteristics of variables in the
> following way: d(MY_DPAR_01,-10.3,12.8,100). In this case there is a
> type identifier (d stands for "double") and then, in parentheses, a
> variable name (or alternatively an index), a lower and an upper boundary,
> and some integer parameter. What is contained in the list depends on the
> parameter type.
>
> It should then be possible to specify any number of variables, including
> their properties, in a comma-separated list.
>
> So I need to dissect a string like the following:
>
> "d(MY_DPAR_01,-10.3,12.8,100), d(0,-10.3,12.8,100), i(SOME_IPAR_17,
> 0,5), b(SOME_BPAR)"
>
> The first step, as I see it, is to separate the "outer" list and to
> store each variable description in a boost::tuple<char, std::string>,
> where the char holds the type identifier and the std::string the part
> inside the parentheses. Then, in the second step, I want to parse that
> string according to what is expected for this particular type.
>
> I have now run into the problem that the following rule in Boost.Spirit
> "swallows" all non-alphanumeric characters.
>
> qi::rule<std::string::const_iterator, std::string(), ascii::space_type>
> varSpec = +(alnum | '_' | ',' | '.' | '+' | '-');
>
>
> Note '_' is actually qi::lit('_'), which exposes *no* attribute, unlike
> qi::char_ (and qi::alnum, etc) which gives you a char.
> I'd suggest +qi::char_("a-zA-Z0-9_,.+-") for varSpec.

Thanks, this worked nicely.

I am now trying to refine my parser. I would like to distinguish the
following cases in a string:

0 --> should be parsed into an integer
SOME_VAR --> should be parsed into a string
SOME_VAR[0] --> should be parsed into a string and an integer

I have created the following rule:

    qi::rule<std::string::const_iterator, VARTYPE(), ascii::space_type>
        varReference =
            ( (attr(0) >> attr("empty") >> uint_)
            | (attr(1) >> identifier >> '[' >> uint_ >> ']')
            | (attr(2) >> identifier >> attr(0)) );

Here, VARTYPE is a typedef for
boost::tuple<std::size_t, std::string, std::size_t>, and "identifier"
stands for lexeme[+char_("0-9a-zA-Z_")] as per your suggestion.

The idea is to provide the user with a mode-variable to allow easy
distinction between all three cases, and to fill unused parts of the
tuple with some placeholder (such as "empty" or 0), with the help of
attr(). So the first entry of the tuple represents the mode, the second
the variable name and the third an optional index.
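
To make the intended use of the mode variable concrete, here is a minimal sketch of how calling code might dispatch on it. It assumes std::tuple in place of boost::tuple so that it compiles without Boost, and the describe() helper is purely hypothetical, not part of the attached test case.

```cpp
#include <cstddef>
#include <string>
#include <tuple>

// Assumption: std::tuple stands in for the boost::tuple behind VARTYPE.
// Layout: (mode, variable name, index).
using VarType = std::tuple<std::size_t, std::string, std::size_t>;

// Hypothetical consumer: picks the meaningful fields based on the mode.
std::string describe(const VarType& v) {
    switch (std::get<0>(v)) {
        case 0:  // bare index; the name is only the "empty" placeholder
            return "index " + std::to_string(std::get<2>(v));
        case 1:  // name plus index, as in SOME_VAR[0]
            return "name " + std::get<1>(v) + "[" +
                   std::to_string(std::get<2>(v)) + "]";
        case 2:  // bare name; the index is only the 0 placeholder
            return "name " + std::get<1>(v);
        default:
            return "unknown mode";
    }
}
```

With this layout the caller never has to inspect the placeholders themselves; the mode alone says which fields carry information.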

Now, parsing a string like "MY_DPAR_02[3]" yields "emptyMY_DPAR_02" for
the std::string component of the tuple, and "SOME_IPAR_17" results in
"emptySOME_IPAR_17SOME_IPAR_17". Parsing a single 0 yields the correct
result.

So the parser seems to try the alternatives of varReference in order
until it finds one that matches. However, instead of overwriting the
std::string with the string it has found, it appears to concatenate the
strings of all alternatives it has tried along the way.
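
The observation can be reproduced without Spirit. The following is a minimal hand-written sketch, not the library's implementation: it assumes that every alternative writes into the same tuple, that string parts are appended rather than assigned, and that a failed alternative leaves its partial output behind. All function names are hypothetical, and std::tuple again stands in for boost::tuple. parseShared() mimics the behaviour reported above, while parseHeld() lets each alternative work on a copy that is committed only on success.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>

using VarType = std::tuple<std::size_t, std::string, std::size_t>;

// Stand-in for uint_: reads decimal digits, advances pos only on success.
bool parseUint(const std::string& in, std::size_t& pos, std::size_t& out) {
    std::size_t p = pos, value = 0;
    while (p < in.size() && std::isdigit(static_cast<unsigned char>(in[p])))
        value = value * 10 + static_cast<std::size_t>(in[p++] - '0');
    if (p == pos) return false;
    out = value;
    pos = p;
    return true;
}

// Stand-in for identifier: APPENDS matched characters to out.
bool parseIdentifier(const std::string& in, std::size_t& pos, std::string& out) {
    std::size_t p = pos;
    while (p < in.size() &&
           (std::isalnum(static_cast<unsigned char>(in[p])) || in[p] == '_'))
        out += in[p++];
    if (p == pos) return false;
    pos = p;
    return true;
}

bool parseChar(const std::string& in, std::size_t& pos, char c) {
    if (pos < in.size() && in[pos] == c) { ++pos; return true; }
    return false;
}

// attr(0) >> attr("empty") >> uint_
bool alt1(const std::string& in, std::size_t& pos, VarType& v) {
    std::size_t p = pos;
    std::get<0>(v) = 0;
    std::get<1>(v) += "empty";  // appended even if the digits below fail
    if (!parseUint(in, p, std::get<2>(v))) return false;
    pos = p;
    return true;
}

// attr(1) >> identifier >> '[' >> uint_ >> ']'
bool alt2(const std::string& in, std::size_t& pos, VarType& v) {
    std::size_t p = pos;
    std::get<0>(v) = 1;
    if (!parseIdentifier(in, p, std::get<1>(v))) return false;
    if (!parseChar(in, p, '[')) return false;
    if (!parseUint(in, p, std::get<2>(v))) return false;
    if (!parseChar(in, p, ']')) return false;
    pos = p;
    return true;
}

// attr(2) >> identifier >> attr(0)
bool alt3(const std::string& in, std::size_t& pos, VarType& v) {
    std::size_t p = pos;
    std::get<0>(v) = 2;
    if (!parseIdentifier(in, p, std::get<1>(v))) return false;
    std::get<2>(v) = 0;
    pos = p;
    return true;
}

// One shared attribute: residue of failed alternatives accumulates.
bool parseShared(const std::string& in, VarType& v) {
    std::size_t pos = 0;
    return alt1(in, pos, v) || alt2(in, pos, v) || alt3(in, pos, v);
}

// Each alternative fills a copy; only a successful one is committed.
bool parseHeld(const std::string& in, VarType& v) {
    using Alt = bool (*)(const std::string&, std::size_t&, VarType&);
    const Alt alts[] = {alt1, alt2, alt3};
    for (Alt alt : alts) {
        std::size_t pos = 0;
        VarType copy = v;
        if (alt(in, pos, copy)) { v = std::move(copy); return true; }
    }
    return false;
}
```

Starting from an empty tuple, parseShared() gives "emptyMY_DPAR_02" for "MY_DPAR_02[3]" and "emptySOME_IPAR_17SOME_IPAR_17" for "SOME_IPAR_17", whereas parseHeld() gives "MY_DPAR_02" and "SOME_IPAR_17". As far as I know, Qi offers a hold[] directive that applies this copy-and-commit idea to a sub-parser; whether it fits the rule above unchanged is not verified here.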

This is a bit mysterious to me and I would appreciate your help.

I'm using Boost 1.54 (64 bit). The above happens on Ubuntu Linux 13.10
with g++ 4.8.1.

In any case thanks,
Beet

