From: Andreas Pokorny (andreas.pokorny_at_[hidden])
Date: 2004-11-12 13:46:20


Hi,
Nice!

I recently worked on an XML parser and generator library. I had to
support several different XML formats, and writing SAX code for each of
them looked like a stupid, repetitive process. Using a DOM parser did
not help either: there was still a lot of code that just forwarded the
parsed string contents to some method or data structure. So I started
working on a way to describe an XML format in C++ code and to generate
the SAX binding code for that format.
   To do that I also had to figure out how arbitrary objects of a given
type can be filled with the data in a string from the XML element. I
defined a 'value property' type for values and a 'container property'
for sequence containers (others might follow). These properties carry
the type information and the access path for reading and writing the
property. Usually the properties are grouped together in a so-called
property map, which maps certain key types to the property type (and
object).
  With this meta-information tool it was possible to build a format
description that is independent of the class interface and that
generates a SAX parser, using the expat library, which forwards XML
items directly to the structures.

I have some example code, which is a reduced version of a real format:
// two key types used for the property_map:
struct Data{};
struct Name{};

// Node is
struct Node
{
  private:
    std::vector<Node*> nodes;
    std::string name,data;
  public:
    typedef property_map<
        mpl::vector<
            con<Node>,               // adds the type con<Node,Node>: Node is both
                                     // key and data, so Node can be used to access
                                     // a container property that reflects a
                                     // sequence container of Nodes
            elem<Name, std::string>, // a std::string value property
            elem<Data, std::string>  // like above, but identified using Data
        >,
        Node
    > type_i;
    static type_i const& get_info();
};

I stripped the code in get_info, which initializes the property_map
structure, basically because the init code is pretty dense and needs a
lot of improvement.
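
To give a feel for what a map from key types to properties can look like, here is a minimal sketch. It is not the code behind get_info(): it uses C++11 variadic templates instead of mpl::vector, covers value properties only, and the names value_entry, lookup, node_info and make_node_info are invented for the illustration. Node is simplified to public members, and Data is an int here just to show that the entry carries the type.

```cpp
#include <cassert>
#include <string>

// Hypothetical entry: Key is an empty tag type, T the value type,
// and ptr the access path into Compound.
template <typename Key, typename T, typename Compound>
struct value_entry {
    T Compound::* ptr;
};

// The map derives from all of its entries, so finding an entry by key
// is just a conversion to the matching base class.
template <typename... Entries>
struct property_map : Entries... {
    explicit property_map(Entries... e) : Entries(e)... {}
};

// Only Key is given explicitly; T and Compound are deduced from the
// single base class whose first template argument is Key.
template <typename Key, typename T, typename Compound>
value_entry<Key, T, Compound> const&
lookup(value_entry<Key, T, Compound> const& entry) {
    return entry;
}

// Key types and a simplified Node (assumption: public members).
struct Name {};
struct Data {};

struct Node {
    std::string name;
    int data;
};

typedef property_map<
    value_entry<Name, std::string, Node>,
    value_entry<Data, int, Node>
> node_info;

// Stand-in for the stripped initialization code.
inline node_info make_node_info() {
    return node_info(value_entry<Name, std::string, Node>{&Node::name},
                     value_entry<Data, int, Node>{&Node::data});
}
```

With this, `n.*(lookup<Name>(info).ptr) = "x";` writes the name of a Node n through the map, without the caller naming the member directly.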

There is another structure called RootNode, which describes a similar
structure but without Data. An example file for that format could look
like this:
<?xml version="1.0"?>
<root_node name="example_tree">
 <node name="empty" data="0" />
 <node name="base_item1" data="124">
  <node name="triple_obj" data="22">
   <node name="hs1" data="9"/>
   <node name="hs2" data="13"/>
   <node name="hs3" data="10"/>
  </node>
  <node name="single" data="-120"/>
 </node>
</root_node>

With my library the format can be described like this:

  
boost::shared_ptr<Receiver> basic_node;
 // Receiver is a base class for all classes which get called by the expat sax code

 // We now define the 'node' tag:
basic_node = xml::gen_object_node(
         // we have to set the property map, and the tag name
      xml::sub_tag<Node>( Node::get_info(), "node")
         // now we add all attributes
      .attributes(
         xml::attribute.assign<Name>("name")
         | xml::attribute.assign<Data>("data")
         )
         // and a sub tag which points to basic_node
      .sub_tags(
         xml::link_tag<Node>(
            basic_node,
            "node"
            )
           ),
        Node::get_info() // the property map a second time.. :(
        );
  // now the root tag:
boost::shared_ptr<Receiver> root_node = xml::gen_root_node(
        xml::root_tag( RootNode::get_info(), "root_node")
        .attributes( xml::attribute.assign<Name>("name") )
        .sub_tags(
            xml::link_tag<Node>( // here we link to basic_node
                basic_node,
                "node"
                )
            )
        );

Parser p;
RootNode obj;
try{
    // parsing :
    p.parse( root_node, filename, &obj );

    // printing:
    root_node->print( &obj, file_stream );
}catch ( std::exception const& e){
  // ...
}

The XML library was written to handle lots of different formats and to
cope easily with changes to a format during the development of the
system. It was never intended to become the ultimate XML library; lots
of features are missing. But I think it could be a good part of a
bigger, more versatile XML library, or be put on top of the raw SAX
interface of such a library.
  I have to admit that my personal interests have moved: I am much more
interested in the property part, the defining of meta-information. I
plan to write my master's (Diplom) thesis on that topic, that is, on
defining type information in C++ structures and types, and then showing
how to use this information to simplify or automate library interfaces.
I plan to use the XML library described above and a simple database
library as a proof of concept, and maybe also a small GUI library based
on something like antigrain.
  The properties still have to be improved: their usage is still too
complicated, and some features are missing. The code is available at
http://svn.berlios.de/viewcvs/kant/trunk/source/src/util/ and
http://svn.berlios.de/viewcvs/kant/trunk/source/src/serialize/

I think about changing the code daily, but I have to finish some other
work at the university before I can focus on that code again.
Currently the value property consists of a get part and a set part,
which allow const and non-const access to a value:

template <typename T, typename Compound = mpl::void_>
struct value_property
{
    boost::shared_ptr< setter<T,Compound> > set;
    boost::shared_ptr< getter<T,Compound> > get;
};

getter and setter are base classes for lots of different kinds of
access. The getter, for example, has 6 different implementations that
handle access through direct memory access, a method pointer that
returns a const reference, a method pointer that returns a value, a
method that expects a reference parameter to which the value gets
assigned, ...
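
As an illustration of that idea, here is a minimal sketch of three such getter variants behind one base class. It is a reconstruction under assumptions, not the code from the repository: the class names member_getter, method_getter and ref_method_getter, the get() signature and the Item type are all made up for the example, and the default Compound = mpl::void_ is left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class: read access to a T inside a Compound.
template <typename T, typename Compound>
struct getter {
    virtual ~getter() = default;
    virtual T get(Compound const& obj) const = 0;
};

// Variant 1: direct access through a pointer to data member.
template <typename T, typename Compound>
struct member_getter : getter<T, Compound> {
    T Compound::* ptr;
    explicit member_getter(T Compound::* p) : ptr(p) {}
    T get(Compound const& obj) const override { return obj.*ptr; }
};

// Variant 2: a const method that returns the value by value.
template <typename T, typename Compound>
struct method_getter : getter<T, Compound> {
    T (Compound::*fn)() const;
    explicit method_getter(T (Compound::*f)() const) : fn(f) {}
    T get(Compound const& obj) const override { return (obj.*fn)(); }
};

// Variant 3: a const method that returns a const reference.
template <typename T, typename Compound>
struct ref_method_getter : getter<T, Compound> {
    T const& (Compound::*fn)() const;
    explicit ref_method_getter(T const& (Compound::*f)() const) : fn(f) {}
    T get(Compound const& obj) const override { return (obj.*fn)(); }
};

// Example type offering all three kinds of access (assumption).
struct Item {
    std::string name;
    int size() const { return static_cast<int>(name.size()); }
    std::string const& label() const { return name; }
};
```

Code that only holds a `getter<std::string, Item>` does not need to know which of the variants sits behind it, which is what lets the parser and printer stay independent of the class interface.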

I am now thinking about adding a feature to hook functionality into the
get or set part of the property, e.g. to lock a mutex, or to check the
data passed to the property (for example to ensure a certain string
format, and to throw on error), or to send a signal every time the
value changes ...
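
One way such hooks could be attached is a decorator around an existing setter. The following is only a sketch of that technique under assumptions: the set() signature, member_setter, hooked_setter and the two callbacks are invented here, and std::function and std::shared_ptr stand in for whatever the real code would use.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical base class: write access to a T inside a Compound.
template <typename T, typename Compound>
struct setter {
    virtual ~setter() = default;
    virtual void set(Compound& obj, T const& value) const = 0;
};

// Plain setter that writes through a pointer to data member.
template <typename T, typename Compound>
struct member_setter : setter<T, Compound> {
    T Compound::* ptr;
    explicit member_setter(T Compound::* p) : ptr(p) {}
    void set(Compound& obj, T const& value) const override { obj.*ptr = value; }
};

// Decorator: runs a check before the write and a notification after it.
template <typename T, typename Compound>
struct hooked_setter : setter<T, Compound> {
    std::shared_ptr<setter<T, Compound>> inner;
    std::function<void(T const&)> check;    // may throw to reject the value
    std::function<void(T const&)> changed;  // called after a successful write

    hooked_setter(std::shared_ptr<setter<T, Compound>> s,
                  std::function<void(T const&)> c,
                  std::function<void(T const&)> n)
        : inner(std::move(s)), check(std::move(c)), changed(std::move(n)) {}

    void set(Compound& obj, T const& value) const override {
        if (check) check(value);      // throws before anything is modified
        inner->set(obj, value);
        if (changed) changed(value);  // skipped if check or the write threw
    }
};

// Example type (assumption).
struct Item {
    std::string name;
};
```

A mutex lock would go in the same place as the check, for example as a lock guard held for the duration of set(). That alone does not make the property design thread safe.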

Apart from that, the property design needs a bigger change, because the
current design of the value_property completely fails when used by
multiple threads. I wish I had more time these days.

So I would like to work on a 'property' library or meta-type library,
but this functionality could overlap with a possible GUI library, the
boost::db ideas that were discussed here, and maybe also the
boost::python/langbinding libraries.

After that I would like to focus on using that library in a database
and/or GUI library environment.

Regards
Andreas Pokorny

On Sat, Nov 06, 2004 at 11:46:35AM +0100, Thorsten Ottosen <nesotto_at_[hidden]> wrote:
> Dear all,
>
> Following our discussion of the unicode library, would it not be a good idea
> to pursue such efforts more aggressively?
>
> I could imagine it would help bring forward libraries much faster. I think it
> would be reasonable that
> the boost community provided
>
> 1. project descriptions
> 2. help and guidelines throughout the 6-12 months of the project
>
> If we had small papers explaining potential projects, these can be sent to
> universities which can then in turn
> suggest them to their students.
>
> Off the top of my head, I can think of these projects
>
> 1. C++ database library
> 2. C++ statistics library
> 3. exact reals class
> 4. An XML parser and generator library
>
> I could probably be a co-author and contact person of (2).
>
> Any thoughts?
>
> -Thorsten



