Subject: Re: [boost] XML Digester
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2009-01-05 18:08:11


Themis Vassiliadis wrote:
> I'm finishing a preliminary version of a class that behaves like the
> Java Digester.
>
> In this class the developer defines triggers which will be fired when
> specific tags are found in the XML file.
>
> Let me explain better with an example:
>
> A simple approach is to set up the parsing rules on the Digester as
> follows, and then process an input file containing the document shown
> further down:
>
>
>
> digester = new boost::xml_digester::xml_digester((std::istream*)&st);
>
> digester->setValidating( false );
>
> digester->addObjectCreate<classFoo>("foo");
> digester->addObjectCreate<classBar>("foo/bar");
>
> digester->addCallMethod<classBar>("foo/bar/prop1", &classBar::setProp1);
> digester->addSetProperty<classBar>("foo/bar/prop2", &classBar::prop2);
>
> digester->addSetNext<classFoo>("foo/bar", &classFoo::addBar);
>
> digester->parse();
>
> ....
>
> class classBar {
>     std::string prop1;
> public:
>     std::string prop2;
>
>     void setProp1(std::string value) {
>         prop1 = value;
>     }
> };
>
> class classFoo {
>     std::vector<classBar*> obj_bars;
> public:
>     void addBar(void *instance) {
>         obj_bars.push_back((classBar*)instance);
>     }
> };
>
>
>
> XML:
>
> <?xml version='1.0' encoding='UTF-8'?>
> <foo>
>   <bar>
>     <prop1>xxxxxx</prop1>
>     <prop2>yyyyyy</prop2>
>   </bar>
>   <bar>
>     <prop1>zzzzzz</prop1>
>     <prop2>kkkkkk</prop2>
>   </bar>
> </foo>

Hi Themis,

Here is approximately how I would do that using RapidXML:

#include "rapidxml.hpp"
using namespace rapidxml;

xml_document<char> doc;
// (I'll skip the file access stuff. I tend to use mmap(), which just
// complicates things here. Assume doc.parse<0>(text) has already been
// called on a writable, zero-terminated buffer "text".)

// first_node() and next_sibling() return pointers; null means "no more".
for (xml_node<char>* foo_node = doc.first_node("foo");
     foo_node; foo_node = foo_node->next_sibling("foo")) {
   classFoo foo;
   for (xml_node<char>* bar_node = foo_node->first_node("bar");
        bar_node; bar_node = bar_node->next_sibling("bar")) {
     classBar* bar_p = new classBar;
     xml_node<char>* prop1_node = bar_node->first_node("prop1");
     if (prop1_node) {
       bar_p->setProp1(prop1_node->value());
     }
     xml_node<char>* prop2_node = bar_node->first_node("prop2");
     if (prop2_node) {
       bar_p->prop2 = prop2_node->value();
     }
     foo.addBar(bar_p);
   }
}

It's true that your code is more concise, but it's not *much* more
concise; on the other hand, it's another layer of stuff to learn and it
is inevitably less flexible than doing it "by hand".

Perhaps the problem is that your example is too trivial to demonstrate
the real advantage of the approach.
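
For comparison, the core of the trigger idea (a table that maps element
paths to callbacks, fired during a walk of the document) can be sketched
in standard C++ alone. This is only a rough sketch under my own
assumptions, not your class and not the Java Digester: Element,
RuleTable, on_begin(), on_end() and digest_sample() are names I made up,
and the input is a hand-built tree rather than parsed XML.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed XML element (not RapidXML's type).
struct Element {
    std::string name;
    std::string text;
    std::vector<Element> children;
};

// Hypothetical rule table: maps paths such as "foo/bar/prop1" to callbacks.
class RuleTable {
public:
    using Action = std::function<void(const std::string& text)>;

    void on_begin(const std::string& path, Action a) { begin_[path] = std::move(a); }
    void on_end(const std::string& path, Action a) { end_[path] = std::move(a); }

    // Depth-first walk: fires the begin rule for a path before visiting
    // the children, and the end rule after them.
    void run(const Element& e, const std::string& parent_path = "") const {
        assert(!e.name.empty());
        const std::string path =
            parent_path.empty() ? e.name : parent_path + "/" + e.name;
        fire(begin_, path, e.text);
        for (const Element& child : e.children) run(child, path);
        fire(end_, path, e.text);
    }

private:
    static void fire(const std::map<std::string, Action>& rules,
                     const std::string& path, const std::string& text) {
        auto it = rules.find(path);
        if (it != rules.end()) it->second(text);
    }

    std::map<std::string, Action> begin_, end_;
};

struct Bar { std::string prop1, prop2; };

// Builds the Bar objects for the sample document from the quoted message.
std::vector<Bar> digest_sample() {
    Element doc{"foo", "", {
        {"bar", "", {{"prop1", "xxxxxx", {}}, {"prop2", "yyyyyy", {}}}},
        {"bar", "", {{"prop1", "zzzzzz", {}}, {"prop2", "kkkkkk", {}}}}}};

    std::vector<Bar> bars;
    Bar current;
    RuleTable rules;
    rules.on_begin("foo/bar", [&](const std::string&) { current = Bar{}; });
    rules.on_begin("foo/bar/prop1", [&](const std::string& t) { current.prop1 = t; });
    rules.on_begin("foo/bar/prop2", [&](const std::string& t) { current.prop2 = t; });
    rules.on_end("foo/bar", [&](const std::string&) { bars.push_back(current); });
    rules.run(doc);
    return bars;
}
```

Calling digest_sample() yields one Bar per <bar> element. The rule table
itself is small; what the sketch leaves out (object creation by type,
member-pointer binding, a real parser underneath) is presumably where a
digester would have to earn its keep.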

I think that it would be worth investigating how Spirit or something
Spirit-like could be applied to this problem (PSEUDO-CODE):

rule_t prop1 = element("prop1");
rule_t prop2 = element("prop2");
rule_t bar = element("bar")(*(prop1|prop2));
rule_t foo = element("foo")(*bar);
rule_t doc = *foo;
doc.parse(input);

That's missing the semantic actions, which I have always considered
Spirit's weak point; I believe Spirit2 does better but I haven't investigated.
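
To make the idea slightly more concrete, here is a minimal sketch of
Spirit-like element rules with semantic actions attached, in plain
standard C++. It is emphatically not Spirit's API: Node, Rule,
element(), on_match() and parse_sample() are all names invented for
illustration, and the rules match an already-built tree rather than raw
text. Actions are attached with a named member rather than rule[action],
because rule[[&](...){...}] would start with "[[", which C++11 reserves
for attributes.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed XML element.
struct Node {
    std::string name, text;
    std::vector<Node> children;
};

// Hypothetical rule: matches one element by name, accepts any number of
// children in any order as long as each matches one of the child rules.
class Rule {
public:
    using Action = std::function<void(const Node&)>;

    explicit Rule(std::string name) : name_(std::move(name)) {
        assert(!name_.empty());
    }

    // rule({a, b}) plays the role of element(...)(*(a|b)) in the pseudo-code.
    Rule operator()(std::vector<Rule> kids) const {
        Rule r = *this;
        r.kids_ = std::move(kids);
        return r;
    }

    // Attach a semantic action, run after the element's children matched.
    Rule on_match(Action a) const {
        Rule r = *this;
        r.action_ = std::move(a);
        return r;
    }

    // Returns true when n and all of its descendants are accepted.
    // Actions that already ran are not undone if a later sibling fails.
    bool parse(const Node& n) const {
        if (n.name != name_) return false;
        for (const Node& c : n.children) {
            bool ok = false;
            for (const Rule& k : kids_) {
                if (k.parse(c)) { ok = true; break; }
            }
            if (!ok) return false;
        }
        if (action_) action_(n);
        return true;
    }

private:
    std::string name_;
    std::vector<Rule> kids_;
    Action action_;
};

Rule element(std::string name) { return Rule(std::move(name)); }

struct Bar { std::string prop1, prop2; };

// Applies the rules to the sample document from the quoted message.
std::vector<Bar> parse_sample() {
    Node doc{"foo", "", {
        {"bar", "", {{"prop1", "xxxxxx", {}}, {"prop2", "yyyyyy", {}}}},
        {"bar", "", {{"prop1", "zzzzzz", {}}, {"prop2", "kkkkkk", {}}}}}};

    std::vector<Bar> bars;
    Bar current;
    Rule prop1 = element("prop1").on_match([&](const Node& n) { current.prop1 = n.text; });
    Rule prop2 = element("prop2").on_match([&](const Node& n) { current.prop2 = n.text; });
    Rule bar = element("bar")({prop1, prop2}).on_match([&](const Node&) {
        bars.push_back(current);
        current = Bar{};
    });
    Rule foo = element("foo")({bar});
    if (!foo.parse(doc)) bars.clear();  // reject the whole document on mismatch
    return bars;
}
```

The grammar doubles as a (very loose) schema check: an unexpected child
element makes parse() return false. Whether this scales to attributes,
ordering constraints and streaming input is an open question that the
sketch does not try to answer.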

Perhaps a domain-specific language for writing DTDs is possible?

Cheers, Phil.

