Boost logo

Boost :

From: Stefan Seefeld (seefeld_at_[hidden])
Date: 2005-11-07 09:19:56


Anthony Williams wrote:

>>>That's one of my main criticisms of your suggested API --- it's too tightly
>>>bound to libxml, and doesn't really allow for substitution of another parser.
>>
>>Could you substantiate your claim ?
>
>
> Really, it was just a feeling from looking at the API docs. However:
>
> In order to use a particular external type, such as std::string, the user has
> to supply a specialization of converter<> for their type, which converts to
> and from the libxml xmlChar type.

Correct. That's the price to pay for not forcing any particular unicode library
on users who want to use the XML API.

> Also, there are lots of constructors around that take xmlNode pointers, or
> xmlDoc pointers, or similar. It may be that these are intended as private
> implementation details, but they show up in the documentation.

Ok, let's fix the documentation. :-)
(You are correct, these are all private constructors. As I already mentioned,
users never instantiate nodes explicitely, but use various factory methods for
that purpose.)

>>>My other criticism so far is the node::type() function. I really don't believe
>>>in such type tags; we should be using virtual function dispatch instead, using
>>>the Visitor pattern. Your traversal example could then ditch the
>>>traverse(node_ptr) overload, and instead be called with
>>>document->root.visit(traversal)
>>
>>Node types aren't (runtime-) polymorphic right now, but is that really a big deal ?
>>Polymorphism is important for extensibility. However here the set of node types
>>is well known (and rather limited).
>
>
> Polymorphism is not just important for extensibility of the polymorphic set,
> but also for type-safe, convenient handling of objects whose exact type is
> only known at runtime. Check-type-flag-and-cast is a nasty smell that I would
> rather avoid where possible.

I understand and agree with you...in principle. There are a number of situations
where a method call will return a node_ptr, and the user typically wants to cast
to the exact type. Examples include element child iteration, node_sets (resulting
from xpath lookup, etc.).
However, doing this with a visitor-like visit/accept pair of methods incures
two virtual method calls, just to get hold of the real type. That's a lot !

As my node implementations already know their type (in terms of an enum tag),
casting is a simple matter of rewrapping the implementation by a new proxy.

Using RTTI to represent the node's type is definitely possible. I'm just not
convinced of its advantages.

> One additional comment on re-reading the samples --- having to instantiate
> every template for the external string type seems rather awkward.
>
> One alternative is to accept and return an internal string type, and provide
> conversion functions to/from the user's external string type. This way, the
> library is not dependent on the string type, but it does add complexity to the
> interface.

Right, I considered that. One has to be careful with those string conversions,
though, to avoid unnecessary copies.

> Another alternative is to make the functions that accept or return the user's
> string type into templates, whilst leaving the enclosing class as a
> non-template, since there are no data members of the user's string
> type. Template parameter type deduction can be used to determine the type when
> it is given as a parameter, and explicit specification can be used when it is
> needed for a return type.

Actually the use I have in mind is really what I demonstrate in the examples:
Developers who decide to use boost::xml::dom fix the unicode binding by specifying
the string type, conversion, etc. in a single place (such as my 'string.hpp'),
such then all the rest of the code doesn't need to be aware of this mechanism.

In your case I couldn't encapsulate the binding in a single place, as you mention
yourself.

What would be possible, though, is to put all types into a single parametrized
struct:

template <typename S>
struct types
{
   typedef typename document<S> document_type;
   typedef typename node_ptr<element<S> > element_ptr;
   ...
};

and then let the user simply write:

typedef dom::types<my_unicode_string> types;
...

std::auto_ptr<types::document_type> doc = dom::parse_file("input.xml");
types::element_ptr root = doc->root();
...

Regards,
                Stefan


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk