Subject: Re: [Boost-users] [Spirit] Qi lexeme only taking the first word
From: Gavin Lambert (boost_at_[hidden])
Date: 2018-11-06 22:35:55


On 7/11/2018 11:01, Michael Powell wrote:
> I've got a couple of rules that are perplexing to me. First,
>
> rule<It, std::string(), St> id %= lexeme[qi::alpha >> *char_("A-Za-z0-9_")];
>
> In and of itself, id is working fine. Then I've got a "full id":
>
> rule<It, full_id_t(), St> full_id %= id >> *(char_('.') >> id);
>
> Where:
>
> struct full_id_t {
> std::string val;
> };
>
> full_id_t::val is quite intentional for reasons elsewhere in the grammar.
>
> The perplexity comes in, it seems lexeme is only shaving off the first
> word as the val.
>
> For instance, parsing "two.oranges.red.test", I receive back "two" in the AST.

Again, I don't really know anything about Spirit, but it's reasonable to
assume that "lexeme" groups its input sequence into a single output
token, which is why id on its own yields a single std::string.

Meanwhile in full_id you're specifying a sequence of input tokens, so it
will also output a sequence of tokens (which can presumably be captured
as a std::vector<std::string>, not simply a std::string).

Most likely (though again this is just a guess) given the input
"two.oranges.red.test" you should end up with std::vector<std::string> {
"two", "oranges", "red", "test" }.

This is probably what you want (as it will simplify later use of
subcomponents), especially if the language allows whitespace around the ".".
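
To make the "sequence of tokens" outcome concrete, here is a minimal hand-written sketch in plain C++. It does not use Spirit at all, and split_full_id is a hypothetical helper name, so treat it only as an illustration of the shape of the result I am guessing at, not as what Spirit actually does internally.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not a Spirit API): split a dotted name into its
// components. Every whitespace character is dropped, which is a
// simplification: it tolerates whitespace around the "." but would also
// silently accept whitespace inside a component.
std::vector<std::string> split_full_id(const std::string& input) {
    std::vector<std::string> parts;
    std::string current;
    for (char c : input) {
        if (c == '.') {
            // A dot closes the current component.
            parts.push_back(current);
            current.clear();
        } else if (!std::isspace(static_cast<unsigned char>(c))) {
            current += c;
        }
    }
    parts.push_back(current);  // the last component has no trailing dot
    return parts;
}
```

Calling split_full_id("two.oranges.red.test") gives the four components "two", "oranges", "red" and "test", and "two . oranges" gives "two" and "oranges".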

If you want to disallow whitespace around the "." and get it as a single
string token, then yes, you will probably have to make full_id call
lexeme. I don't know whether that will require extracting the inner
part of id to a separate rule so that lexeme only ends up being called
once or if you can "nest" uses of lexeme.
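
For comparison, the "single string, no whitespace" behaviour can be sketched by hand as below. Again this is plain C++ rather than Spirit; match_full_id and is_id_tail are hypothetical names, and the character classes are my reading of the quoted id rule (a letter followed by letters, digits or underscores), so it says nothing about how lexeme nests.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: may c continue an id (letter, digit or underscore)?
static bool is_id_tail(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Hypothetical helper (not a Spirit API): accept id ('.' id)* with no
// whitespace anywhere, and hand back the whole match as one string.
// The entire input must match; on failure `out` is left untouched.
bool match_full_id(const std::string& input, std::string& out) {
    const std::size_t n = input.size();
    std::size_t i = 0;
    while (true) {
        // Each id starts with a letter...
        if (i >= n || !std::isalpha(static_cast<unsigned char>(input[i]))) {
            return false;
        }
        ++i;
        // ...followed by any number of letters, digits or underscores.
        while (i < n && is_id_tail(input[i])) {
            ++i;
        }
        if (i == n) {
            break;  // consumed everything: success
        }
        if (input[i] != '.') {
            return false;  // whitespace or any other character is rejected
        }
        ++i;  // step over the dot; another id must follow
    }
    out = input;
    return true;
}
```

With this sketch, "two.oranges.red.test" is accepted and returned whole, while "two . oranges" and "two." are both rejected.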

