Boost Users :

From: Robert Ramey (ramey_at_[hidden])
Date: 2007-10-09 12:00:07


A very interesting question.

I never wanted to spend any time making the XML archive.
I felt I had to do it in order to get the library accepted and that
it was insisted upon by people who never really understood
that it couldn't be a general XML solution by definition.

a) I was very concerned about having to write portable code for
character-code-independent XML parsing that would be hard to maintain.
b) The XML "standard" permits a lot of variation. I was concerned
that whichever form I used, a lot of people would find that it
wasn't the form they needed.
c) I had to spend time researching the XML standard, which for
me was not very interesting in itself.

I did make it, however, and it yielded a few surprises.

a) I was able to use spirit and its examples to generate
an xml parser. This resulted in a much more easily
maintainable and robust product than yet another hand
rolled xml parser would have been. Also, and this was a big
consideration, it was much more fun for me to gain experience
with spirit (and by extension DSL and expression templates)
than make yet another xml parser.

b) I chose a "least common denominator" version of the xml
standard to implement. This standard already had provision
for some standard attributes that I needed - e.g. object tags.
I did consider generating an XSD but concluded that it was
unnecessary for providing the minimal functionality needed to get off
the hook. I'm surprised how little flak I get on our XML
serialization given that there were a lot of things that I could
have chosen to do differently.

c) I believe that those who considered XML a requirement
were under the impression that this would permit one
to write a program which would automatically read
a wide variety of XML files. Serialization works in the opposite
direction: C++ => XML, not XML => C++. So
building a C++ structure which automatically generates
a pre-defined XML schema is sort of like finding the inverse
of an equation. In some cases it can be done, but it's
harder than it first appears. Surprisingly to me, this seems
to have worked out well for a number of people, even
though in general I wouldn't recommend it.

Now that I've gotten that off my chest - your question.

It would not be too difficult to make an archive which
generated an XSD schema which exactly described
the XML archive created. If it were me, I would go about it
in the following manner.

Create a new archive xsd_oarchive. This archive would be
unique in that it would be output only. The following
would create an XSD file for any C++ class:

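// xsd_oarchive is the proposed archive; it does not exist in the library
xsd_oarchive xoa(output_stream1); // create xsd archive
xml_oarchive xml(output_stream2); // create xml archive

const my_class t;
// XML archives require a name for each value, hence the NVP wrapper
xoa << BOOST_SERIALIZATION_NVP(t); // build schema
xml << BOOST_SERIALIZATION_NVP(t); // build xml archive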
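
To make the idea of an output-only schema archive more concrete, here is a minimal standalone sketch. It does not use Boost.Serialization at all: toy_nvp, make_toy_nvp, toy_xsd_oarchive, gps_position and describe_as_xsd are hypothetical names made up for illustration. It assumes serialize() is const and takes no version argument (unlike the library), maps only int, double and std::string to XSD types, and omits the enclosing xs:schema element.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical name-value pair, standing in for the library's NVP wrapper.
template<class T>
struct toy_nvp {
    const char * name;
    const T & value;
};

template<class T>
toy_nvp<T> make_toy_nvp(const char * name, const T & value) {
    assert(name != nullptr); // every element needs a name
    return toy_nvp<T>{name, value};
}

// Hypothetical output-only archive: writes one xs:element declaration
// per named member and ignores the member's value.
class toy_xsd_oarchive {
public:
    explicit toy_xsd_oarchive(std::ostream & os) : os_(os), depth_(0) {}

    template<class T>
    toy_xsd_oarchive & operator<<(const toy_nvp<T> & p) {
        emit(p.name, p.value);
        return *this;
    }
    template<class T>
    toy_xsd_oarchive & operator&(const toy_nvp<T> & p) {
        return *this << p;
    }

private:
    // Primitive members map to XSD built-in types.
    void emit(const char * name, const int &) { leaf(name, "xs:int"); }
    void emit(const char * name, const double &) { leaf(name, "xs:double"); }
    void emit(const char * name, const std::string &) { leaf(name, "xs:string"); }

    // Anything else is assumed to be a class with a serialize() member;
    // its members become a nested xs:sequence.
    template<class T>
    void emit(const char * name, const T & t) {
        line(std::string("<xs:element name=\"") + name + "\">");
        ++depth_;
        line("<xs:complexType><xs:sequence>");
        ++depth_;
        t.serialize(*this);
        --depth_;
        line("</xs:sequence></xs:complexType>");
        --depth_;
        line("</xs:element>");
    }
    void leaf(const char * name, const char * type) {
        line(std::string("<xs:element name=\"") + name +
             "\" type=\"" + type + "\"/>");
    }
    void line(const std::string & s) {
        os_ << std::string(2 * depth_, ' ') << s << '\n';
    }

    std::ostream & os_;
    int depth_;
};

// Example class; the same serialize() could feed a value-writing archive.
struct gps_position {
    int degrees;
    double seconds;
    std::string label;

    template<class Archive>
    void serialize(Archive & ar) const {
        ar & make_toy_nvp("degrees", degrees)
           & make_toy_nvp("seconds", seconds)
           & make_toy_nvp("label", label);
    }
};

// Convenience wrapper returning the generated declarations as a string.
template<class T>
std::string describe_as_xsd(const char * root, const T & t) {
    std::ostringstream os;
    toy_xsd_oarchive xoa(os);
    xoa << make_toy_nvp(root, t);
    return os.str();
}
```

The point of the sketch is only that the schema falls out of the same serialize() function that would produce the XML, so the two cannot drift apart. A real xsd_oarchive would also have to deal with pointers, object tracking, versioning and the attributes the XML archive emits.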

If one wants to get a little fancier, one could use these two archives
to create a combined XML/XSD archive, something like the following:

class xx_oarchive : public xml_oarchive, public xsd_oarchive
{
public:
    // forward each object to both base archives
    template<class T>
    xx_oarchive & operator<<(const T & t){
        static_cast<xml_oarchive &>(*this) << t;
        static_cast<xsd_oarchive &>(*this) << t;
        return *this;
    }
    ...
    xx_oarchive(ostream & xml_stream, ostream & xsd_stream, unsigned int flags) :
        xml_oarchive(xml_stream, flags),
        xsd_oarchive(xsd_stream, flags)
    {
        ...
    }
};
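
The forwarding pattern in the combined archive can be tried out in isolation. The following is a minimal standalone sketch under stated assumptions: it does not use Boost.Serialization, and named, make_named, toy_value_oarchive, toy_name_oarchive, toy_xx_oarchive and save_count are hypothetical stand-ins. One toy base writes values, the other writes only element names as a placeholder for schema output.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical name-value pair used by both toy archives.
template<class T>
struct named {
    const char * name;
    const T & value;
};

template<class T>
named<T> make_named(const char * name, const T & value) {
    assert(name != nullptr); // both archives rely on the name
    return named<T>{name, value};
}

// Toy stand-in for the XML archive: writes <name>value</name>.
class toy_value_oarchive {
public:
    explicit toy_value_oarchive(std::ostream & os) : os_(os) {}
    template<class T>
    toy_value_oarchive & operator<<(const named<T> & p) {
        os_ << '<' << p.name << '>' << p.value << "</" << p.name << ">\n";
        return *this;
    }
private:
    std::ostream & os_;
};

// Toy stand-in for the schema archive: writes the element name only.
class toy_name_oarchive {
public:
    explicit toy_name_oarchive(std::ostream & os) : os_(os) {}
    template<class T>
    toy_name_oarchive & operator<<(const named<T> & p) {
        os_ << p.name << '\n';
        return *this;
    }
private:
    std::ostream & os_;
};

// Combined archive: each base gets its own stream, and operator<<
// forwards every object to both bases in turn.
class toy_xx_oarchive : public toy_value_oarchive, public toy_name_oarchive {
public:
    toy_xx_oarchive(std::ostream & value_stream, std::ostream & name_stream) :
        toy_value_oarchive(value_stream),
        toy_name_oarchive(name_stream)
    {}
    template<class T>
    toy_xx_oarchive & operator<<(const named<T> & p) {
        static_cast<toy_value_oarchive &>(*this) << p;
        static_cast<toy_name_oarchive &>(*this) << p;
        return *this;
    }
};

// Usage: a single << feeds both streams.
inline void save_count(std::ostream & values, std::ostream & names, int count) {
    toy_xx_oarchive xx(values, names);
    xx << make_named("count", count);
}
```

The static_cast to a base reference is what selects each base's own operator<<, since the derived operator<< hides both of them.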

And if you really had nothing else to do you could make a combined
xx_iarchive which would read the archive while verifying that no one
had edited it beyond redemption.

Hope that is food for thought.

Robert Ramey

Juhasz, Zoltan (IT) wrote:
> Hi,
>
> I've got a pretty strange problem.
>
> We are using boost::serialization library to serialize arbitrary
> structs into XML. That is working pretty well.
>
> Our problems are:
>
> - we also need to generate schema files (XSD) for the archive file,
> since this XML archive file is going to be used by other programs
> (e.g. written in Java etc.), and
> - the deserialization part should be validated through the schema
> file.
>
>
> Currently, the boost::serialization library can serialize /
> deserialize our structs to XML just fine, but:
>
> How can we add XSD generation support?
> How can we 'force' boost::serialization to validate the XML file
> through a schema file (XSD)?
>
>
> PS: Unfortunately, the automatic generation of the XSD file is a
> strict requirement, that is why we cannot use regular XML <> C++
> object mapping tools, where the C++ codes are generated by an
> external tool from a schema (XSD) file.
>
> PS2: If you know other third party tools that can solve the above
> mentioned problem, please feel free to write.
>
>
> Thank you in advance!
>
> Best Regards,
> Zoltan Juhasz
> Morgan Stanley | Technology | Budapest
> Zoltan dot Juhasz at MorganStanley dot com

