Subject: Re: [Boost-users] [spirit] qi xml parser
From: Michael Powell (mwpowellhtx_at_[hidden])
Date: 2014-06-30 18:07:50


On Mon, Jun 30, 2014 at 4:44 PM, Michael Powell <mwpowellhtx_at_[hidden]> wrote:
> On Mon, Jun 30, 2014 at 3:14 PM, Michael Powell <mwpowellhtx_at_[hidden]> wrote:
>> Hello,
>>
>> I am building out a general use xml parser including attributes,
>> arbitrary number of elements, and so on.
>>
>> So far so good: parsing names and so forth makes sense. However, how
>> do you handle element content, which could be either a string or zero
>> or more other elements (basically of the same rule as the enclosing
>> element rule)?
>>
>> It would seem you need a terminus, the empty element tag, so that
>> parsing populates the parent (initial) element and its children (of
>> the same element kind).
>>
>> I'll be adapting structs to capture the results. I am also using a
>> couple of helpful references, for instance:
>>
>> http://www.w3.org/TR/xml11/
>> http://stackoverflow.com/questions/9473843/boost-spirit-how-to-extend-xml-parsing
>
> Having read the XML specification and some Boost tickets from several
> years ago, I don't see why the following couldn't represent content:
>
> content %= *(char_ - char_("<&")) | *(comment | child_element);
>
> Here comment is defined as expected, and child_element is where the
> recursion into the element grammar (the one that defines content)
> happens. Basically it is a member variable of the same type as the
> containing struct (the element grammar).
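
A side note on the shape of the content rule quoted above: Qi's `|` is an ordered choice and `*` matches zero or more times, so a first alternative of the form `*(...)` always succeeds and the second alternative is never tried. The sketch below illustrates this with hypothetical miniature combinators (`any_except`, `literal`, `star`, `alt`, `consumed`). They only mimic those semantics and are not Spirit code.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical miniature combinators with ordered-choice semantics.
// A parser advances 'pos' on success and leaves it alone on failure.
using parser = std::function<bool(const std::string&, std::size_t&)>;

// Matches one character that does not occur in 'excluded'.
parser any_except(const std::string& excluded) {
    return [excluded](const std::string& s, std::size_t& pos) -> bool {
        if (pos < s.size() && excluded.find(s[pos]) == std::string::npos) {
            ++pos;
            return true;
        }
        return false;
    };
}

// Matches the literal text 'lit'.
parser literal(const std::string& lit) {
    return [lit](const std::string& s, std::size_t& pos) -> bool {
        if (s.compare(pos, lit.size(), lit) == 0) {
            pos += lit.size();
            return true;
        }
        return false;
    };
}

// Kleene star: zero or more repetitions, so it always succeeds.
parser star(parser p) {
    return [p](const std::string& s, std::size_t& pos) -> bool {
        for (;;) {
            const std::size_t before = pos;
            if (!p(s, pos) || pos == before) break;  // stop on failure or no progress
        }
        return true;
    };
}

// Ordered choice: b is tried only when a fails.
parser alt(parser a, parser b) {
    return [a, b](const std::string& s, std::size_t& pos) -> bool {
        const std::size_t save = pos;
        if (a(s, pos)) return true;
        pos = save;
        return b(s, pos);
    };
}

// Number of characters consumed from the start of 'input'.
std::size_t consumed(const parser& p, const std::string& input) {
    std::size_t pos = 0;
    p(input, pos);
    return pos;
}
```

With these, `alt(star(text), star(child))` stops immediately on input that begins with a child element, whereas a single `star(alt(text, child))` consumes mixed text and children. Under the same assumptions, a Qi rule of the form `*(text | comment | child_element)` may be closer to what was intended.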

Indeed, I cooked up a simple(ish) example, and I get this error:

Error 3 error C2460:
'xml::xml_element_grammar<std::_String_const_iterator<std::_String_val<std::_Simple_types<char>>>,boost::spirit::ascii::space_type>::child_element'
: uses 'xml::xml_element_grammar<std::_String_const_iterator<std::_String_val<std::_Simple_types<char>>>,boost::spirit::ascii::space_type>',
which is being defined i:\source\kingdom
software\cppxml\xml\xiparser.h 187 1 xml
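
For context, C2460 is the diagnostic MSVC gives when a class declares a non-static data member of its own type by value, because the class is still incomplete at that point. A minimal sketch outside of Spirit, using the hypothetical names `node` and `depth`, shows the shape of the problem and the usual way around it, which is to route the recursion through an indirection:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// struct broken { broken child; };  // C2460: 'child' uses 'broken', which is being defined

// Hypothetical self-referential type: the recursion goes through pointers
// held in a vector, so 'node' has a finite, known size.
struct node {
    std::vector<std::unique_ptr<node>> children;
};

// Depth of the tree rooted at n (a leaf has depth 1).
std::size_t depth(const node& n) {
    std::size_t deepest = 0;
    for (const auto& c : n.children) {
        const std::size_t d = depth(*c);
        if (d > deepest) deepest = d;
    }
    return 1 + deepest;
}
```

In Qi the analogous indirection is `qi::rule`: a rule that appears in an expression is held by reference, so a recursive element is normally written as a rule that mentions itself (directly or via another rule) inside one grammar, rather than as a grammar member of the grammar's own type.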

Nothing fancy, fairly plain-old-Xml there:

    using boost::spirit::qi::phrase_parse;
    using boost::spirit::ascii::space;

    std::string txt = "<test><one /><two>2</two><three att=\"3\"/></test>";

    xml::xml_element_grammar<> g;
    xml::xelement element;

    bool result = phrase_parse(txt.cbegin(), txt.cend(), g, space, element);

How do you model the case where a parent also needs to look like a
child, depending on where it appears in the grammar? In other words,
the defining rule is a "parent", but once it has finished parsing, its
result could very well be a child of an enclosing parent.

>> I'm also not quite sure how to capture the adapted parts at the right
>> points in the rules.
>>
>> My domain model will look something like this, keeping it as simple as possible:
>>
>> struct xattribute {
>>     std::string name;
>>     std::string value;
>> };
>>
>> typedef std::vector<xattribute> xattribute_vector;
>>
>> struct xelement;
>>
>> typedef std::vector<xelement> xelement_vector;
>>
>> struct xelement {
>>     std::string name;
>>     std::string content;
>>     xattribute_vector attributes;
>>     xelement_vector children;
>> };
>>
>> Thanks...
>>
>> Best regards,
>>
>> Michael Powell

