Subject: Re: [Boost-users] [Spirit] Qi lexeme only taking the first word
From: rmawatson rmawatson (rmawatson_at_[hidden])
Date: 2018-11-07 01:12:04


It's been a long while since I've used spirit::qi, but what appears to be happening in your setup is something like this:

When you have:

qi::rule<It, AST::full_id_t()> full_id;

the attribute is, as far as Fusion is concerned, a one-element sequence: vector<string>

When it matches

id >> *(char_('.') >> id)

this has an attribute of vector<string,vector<tuple<char,std::string>>> or something similar.

Spirit appears to compare your target attribute with the synthesised attribute of the parser. Any (trailing?) members
of the synthesised attribute that have no match in your attribute are marked as unused_type and are not assigned.

You can see which overload of assign_to is used in your example if you set a breakpoint at boost\spirit\home\qi\detail\assign_to.hpp line 399.

This appears to happen in boost\spirit\home\qi\operator\sequence_base.hpp line 74, where the predicate
traits::attribute_not_unused<Context, Iterator> is passed to spirit::any_if (boost\spirit\home\support\algorithm\any_if.hpp line 186).
It basically discards attributes where the LHS sequence is not matched with the RHS.

You can see this in your example by adding an additional member to full_id_t:

    struct full_id_t {
        std::string val;
        std::vector<std::string> others;
    };

    BOOST_FUSION_ADAPT_STRUCT(AST::full_id_t, val, others)

Your missing bits will appear in this std::vector, as they are no longer silently discarded.
http://coliru.stacked-crooked.com/a/51f16c6deff45309

I think the fundamental problem is that attribute propagation differs between a string target and a vector<string> target, as in your two examples.
The first kicks in whatever logic exists to flatten the LHS attribute into a string; the second takes the first element, assigns it,
and marks the rest as unused.

One thing you can do is use qi::as<std::string>()[ id >> *(char_('.') >> id) ] to force the conversion of the synthesised attribute to a string
before it is assigned to your attribute.
http://coliru.stacked-crooked.com/a/6a060343a390f037

I've only had a quick look and this is a pretty half-hearted analysis. You'll really have to dig deep to find out exactly what is going on, but I suspect
this is somewhat along the right lines.
________________________________
From: Boost-users <boost-users-bounces_at_[hidden]> on behalf of Michael Powell via Boost-users <boost-users_at_[hidden]>
Sent: 06 November 2018 23:03
To: boost-users_at_[hidden]
Cc: Michael Powell
Subject: Re: [Boost-users] [Spirit] Qi lexeme only taking the first word

On Tue, Nov 6, 2018 at 5:40 PM Michael Powell <mwpowellhtx_at_[hidden]> wrote:
>
> On Tue, Nov 6, 2018 at 5:01 PM Michael Powell <mwpowellhtx_at_[hidden]> wrote:
> >
> > Hello,
> >
> > I've got a couple of rules that are perplexing to me. First,
> >
> > rule<It, std::string(), St> id %= lexeme[qi::alpha >> *char_("A-Za-z0-9_")];
> >
> > In and of itself, id is working fine. Then I've got a "full id":
> >
> > rule<It, full_id_t(), St> full_id %= id >> *(char_('.') >> id);
> >
> > Where:
> >
> > struct full_id_t {
> > std::string val;
> > };
> >
> > full_id_t::val is quite intentional for reasons elsewhere in the grammar.
> >
> > The perplexity comes in, it seems lexeme is only shaving off the first
> > word as the val.
> >
> > For instance, parsing "two.oranges.red.test", I receive back "two" in the AST.
> >
> > Perhaps I should defer specifying the lexeme part of id until later?
>
> I elaborated a little on the "simple" full id sub-grammar, but I
> cannot repro using the GCC compiler. I'm wondering if this has
> anything to do with the VS2017 fpos issue?
>
> http://coliru.stacked-crooked.com/a/adeb42ce2f19b0fd
>
> Or there may be insufficient context in the web compiler to adequately demo.

I got a repro:

http://coliru.stacked-crooked.com/a/069a44296240be7e

Although the reasons as to why I do not know.

It is a difference in attribute synthesis. When full_id synthesizes a
std::string(), the conversion to full_id_t() "just works" magically.
I'm guessing by happy accident based on the std::string val being the
only member (adaptation, etc).

But when I change the synthesis to be its "true" type, that is,
AST::full_id_t(), suddenly I see the same behavior.

Really and truly, I do not know why. Everything else being equal why
would one approach be any different than the other?

Anyone with some Spirit, Fusion, AST, insights?

Thanks!

For now, I'll run with it as has been exposed here, but it's a bit
troubling to me not knowing the difference.

> > Thoughts? Suggestions?
> >
> > Thank you!
> >
> > Best regards,
> >
> > Michael Powell