Subject: Re: [boost] Standard c++ XML parser API (Boost.XML)
From: Bjorn Reese (breese_at_[hidden])
Date: 2014-03-20 04:34:18
On 03/18/2014 04:46 PM, Stefan Seefeld wrote:
> I don't see any reason why such an XML API wouldn't be usable by other
> Boost libraries.
It should be part of the GSoC project to verify this for the most
common use cases (XML serialization is the most obvious one.)
>> What is the purpose of the S template argument?
>
> To keep the concern for unicode or any other string type orthogonal from
> the XML library, i.e. to allow Boost.XML to interact with different
> Unicode implementations. In fact, in the existing demos I'm restricting
> content to ASCII, so I can in fact get away with using std::string, so
> this is a good example of the "modularity" design goal I mentioned
> above: Don't force anything on users they don't actually need.
I agree with the goal, but I am not sure that the S type solves the
problem. I must admit that I am having difficulty understanding exactly
how you envision it should work for other encodings, because std::string
is orthogonal to encoding (locale is usually attached to the I/O
stream.)
What encoding is used for std::string? ASCII, UTF-8, or "whatever the
XML library gives me"? This should be documented as part of the API
regardless of the answer.
Should I define a new string type if I want to use Latin-1 or another
encoding in my application? What if the rest of my application uses
std::string for Latin-1 text? (I am wondering how this will work with
the current convert trait specialization for std::string.)
How does the convert trait know the XML document encoding so that it
is able to convert between this and the application encoding?
I suggest that you adopt the libxml2 design decision to always use
UTF-8 for std::string (and UTF-16 for std::wstring if needed.) See
the design rationale here:
http://xmlsoft.org/encoding.html
Any backend that does not provide UTF-8 will have to be wrapped.
With such a design decision, the S template parameter becomes
superfluous (or should be changed to CharT if you wish to support
both std::string and std::wstring.)
Conversion between UTF-8 and application encodings would have to
be done explicitly in the application.
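To illustrate what "explicitly in the application" could mean for a
Latin-1 application, here is a minimal sketch. The function names
latin1_to_utf8 and utf8_to_latin1 are made up for this example and are
not part of any proposed Boost.XML API. It relies only on the fact that
every Latin-1 byte has the same value as its Unicode code point.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: encode Latin-1 (ISO-8859-1) bytes as UTF-8.
// Bytes below 0x80 are copied; the rest become two-byte sequences.
std::string latin1_to_utf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}

// Hypothetical helper: decode UTF-8 into Latin-1.
// Throws if the input is malformed or contains a code point above U+00FF.
std::string utf8_to_latin1(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < in.size()) {
            // Lead bytes 0xC2 and 0xC3 cover U+0080..U+00FF.
            unsigned char d = static_cast<unsigned char>(in[++i]);
            if ((d & 0xC0) != 0x80)
                throw std::runtime_error("malformed UTF-8 sequence");
            out.push_back(static_cast<char>(((c & 0x03) << 6) | (d & 0x3F)));
        } else {
            throw std::runtime_error("not representable in Latin-1");
        }
    }
    return out;
}
```

The application would call latin1_to_utf8() before handing text to the
XML library and utf8_to_latin1() on the way back, so the library itself
only ever sees UTF-8 in std::string.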
At any rate, encoding should be addressed in the GSoC project.
>> What is the purpose of the convert trait?
>
> To allow conversion between the backend's own string representation and
> the string type that is used with Boost.XML.
Ok. You should, however, make sure that the strings are converted
correctly:
http://xmlsoft.org/html/libxml-xmlstring.html
For instance, convert::in() does not take libxml2 custom allocators into
account:
http://xmlsoft.org/html/libxml-xmlmemory.html
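A rough sketch of the idea, assuming in() is handed a string that the
backend allocated for the caller (the actual ownership convention is
not stated in this thread, so treat that as an assumption). To keep the
example self-contained, backend_char, backend_strdup and backend_free
are hypothetical stand-ins; with libxml2 the corresponding roles are
played by xmlChar, xmlStrdup() and xmlFree. The name convert_sketch is
likewise invented for illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-ins for the backend's string type and allocator.
using backend_char = unsigned char;
static int live_allocations = 0;  // lets the example observe releases

backend_char* backend_strdup(const char* s)
{
    std::size_t n = std::strlen(s) + 1;
    void* p = std::malloc(n);
    if (!p) return nullptr;
    std::memcpy(p, s, n);
    ++live_allocations;
    return static_cast<backend_char*>(p);
}

void backend_free(void* p)
{
    if (p) {
        --live_allocations;
        std::free(p);
    }
}

// Deleter that always goes through the backend's deallocator, so a
// custom allocator installed in the backend is respected.
struct backend_deleter {
    void operator()(backend_char* p) const { backend_free(p); }
};
using backend_string = std::unique_ptr<backend_char, backend_deleter>;

// Sketch of a convert-style trait for std::string.
struct convert_sketch {
    // Takes ownership of a backend-allocated string, copies it into
    // std::string, and releases the original via backend_free.
    static std::string in(backend_char* raw)
    {
        backend_string owner(raw);
        if (!owner) return std::string();
        return std::string(reinterpret_cast<const char*>(owner.get()));
    }

    // Borrows the std::string's buffer; nothing is allocated or freed.
    static const backend_char* out(const std::string& s)
    {
        return reinterpret_cast<const backend_char*>(s.c_str());
    }
};
```

The point of the unique_ptr with a custom deleter is that the buffer is
returned to the backend's deallocator even if the std::string copy
throws, instead of being passed to free() or delete.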