From: Hamish Mackenzie (hamish_at_[hidden])
Date: 2003-06-13 09:37:40


On Fri, 2003-06-13 at 14:11, Stefan Seefeld wrote:
> Hmm, I see your point. Well, that would be possible, but that way you
> are unable to make nodes polymorphic. Neither with respect to the basic
> node types (Element, Attribute, Text, CData, etc.) nor later when
> implementing real DOM support on top of it.

How about this (I forget the name of this pattern, but it's in the GoF
book)...

class node_type_handler
{
public:
  virtual void do_something( node_proxy & node, int params_go_here )
    = 0;
  static node_type_handler * handler( int node_type );
};

class element_node_type_handler : public node_type_handler
{
public:
  virtual void do_something( node_proxy & node, int params_go_here )
  {
    element_proxy( node ).do_something( params_go_here );
  }
};

// No 'static' on the out-of-class definition, and the array size is
// deduced from the initializer list.
node_type_handler * node_type_handler::handler( int node_type )
{
  static node_type_handler * handlers[] =
  {
        new element_node_type_handler(),
        new attribute_node_type_handler()
        // ...
  };
  
  return handlers[ node_type ];
}

class node_proxy
{
public:
  void do_something( int params_go_here )
  {
    node_type_handler::handler( type() )->
      do_something( *this, params_go_here );
  }
};

class element_proxy : public node_proxy
{
public:
  explicit element_proxy( const node_proxy & node );
  void do_something( int params_go_here )
  {
    // Actually do_something
  }
};
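
For reference, a minimal sketch of the outline above, reordered so that it
compiles as a single file. The fake_node struct, the node_kind enum, the
attribute_proxy class and the log string are placeholders made up for
illustration (a real wrapper would read the type from the libxml2 node),
and the table holds static objects rather than using new.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the libxml2 node a proxy would point at.
enum node_kind { element_kind = 0, attribute_kind = 1 };
struct fake_node { node_kind kind; std::string log; };

class node_proxy;

class node_type_handler
{
public:
  virtual ~node_type_handler() {}
  virtual void do_something( node_proxy & node, int params_go_here ) = 0;
  static node_type_handler * handler( int node_type );
};

class node_proxy
{
public:
  explicit node_proxy( fake_node * node ) : node_( node ) {}
  int type() const { return node_->kind; }
  // Generic entry point: picks the handler for the runtime node type.
  void do_something( int params_go_here )
  {
    node_type_handler::handler( type() )->do_something( *this, params_go_here );
  }
protected:
  fake_node * node_;
};

class element_proxy : public node_proxy
{
public:
  explicit element_proxy( const node_proxy & node ) : node_proxy( node ) {}
  // Hides node_proxy::do_something; this is where the real work would go.
  void do_something( int ) { node_->log += "element;"; }
};

class attribute_proxy : public node_proxy
{
public:
  explicit attribute_proxy( const node_proxy & node ) : node_proxy( node ) {}
  void do_something( int ) { node_->log += "attribute;"; }
};

class element_node_type_handler : public node_type_handler
{
public:
  virtual void do_something( node_proxy & node, int params_go_here )
  {
    element_proxy( node ).do_something( params_go_here );
  }
};

class attribute_node_type_handler : public node_type_handler
{
public:
  virtual void do_something( node_proxy & node, int params_go_here )
  {
    attribute_proxy( node ).do_something( params_go_here );
  }
};

node_type_handler * node_type_handler::handler( int node_type )
{
  static element_node_type_handler element_handler;
  static attribute_node_type_handler attribute_handler;
  // Order must match node_kind.
  static node_type_handler * handlers[] = { &element_handler, &attribute_handler };
  return handlers[ node_type ];
}
```

Calling do_something( 0 ) on a node_proxy that points at an element node
ends up in element_proxy::do_something, and likewise for attributes.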

> > I have attached the wrappers I have written. They do not cover much of
> > libxml2 (just what I needed at the time). Feel free to borrow as much
> > or as little from it as you like.
>
> yeah, looks interesting, and even thinner than my wrapper. However,
> the thinner the wrapper gets the greater is indeed the danger of having
> the whole design tied to a particular implementation, as William pointed
> out earlier. I don't think that this is a problem with my wrapper lib,
> but with your implementation you get dangerously close...:-)

Making it thicker won't make it any easier to apply to other parsers,
especially if you rely on the existence of something like _private. If
the wrapper is a good, tight wrapper for libxml2 and something later needs
changing to be portable, we can wrap the wrapper with a separate
portability layer.

> > Looking up this node's parent node is thus simply
> >
> > > static_cast<Node *>(this->my_impl->parent->_private);
> >
> >
> > If there was a parent lookup in node_proxy it would be
> >
> > class node_proxy
> > {
> > public:
> > node_proxy parent() { return node_proxy( node_->parent ); }
> > ...
> > private:
> > xmlNodePtr node_;
> > };
>
> yes, and you could even make that an 'element_proxy' as you know that
> parent nodes are always elements. However, with a flat set of (libxml2)
> nodes that wouldn't work any more, so runtime polymorphism would be
> lost. Well, maybe there is no need for it either. I have to think over
> that...

True, and that would fit in nicely with the code I outlined above. If
parent() returned an element_proxy, then

node.first_child().do_something( 0 );

would go through the polymorphic lookup, and

node.parent().do_something( 0 );

would call the same code but without the lookup overhead.
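
A small sketch of that difference, assuming parent() returns an
element_proxy and first_child() a plain node_proxy. The tiny_node struct
and the lookups counter are made up for the example, every node is treated
as an element, and the counter stands in for the handler table lookup.

```cpp
#include <cassert>

// Hypothetical stand-in for an xmlNode: just links plus a call counter.
struct tiny_node { tiny_node * parent; tiny_node * first_child; int hits; };

static int lookups = 0;  // counts trips through the type lookup

class element_proxy;

class node_proxy
{
public:
  explicit node_proxy( tiny_node * node ) : node_( node ) {}
  node_proxy first_child() const { return node_proxy( node_->first_child ); }
  element_proxy parent() const;              // parents are always elements
  void do_something( int params_go_here );   // goes through the lookup
protected:
  tiny_node * node_;
};

class element_proxy : public node_proxy
{
public:
  explicit element_proxy( const node_proxy & node ) : node_proxy( node ) {}
  // Hides the base version, so a caller that already holds an
  // element_proxy skips the lookup entirely.
  void do_something( int ) { ++node_->hits; }
};

inline element_proxy node_proxy::parent() const
{
  return element_proxy( node_proxy( node_->parent ) );
}

inline void node_proxy::do_something( int params_go_here )
{
  ++lookups;   // stands in for node_type_handler::handler( type() )
  element_proxy( *this ).do_something( params_go_here );
}
```

With a node that has both a parent and a child, the first_child() call
increments lookups and the parent() call does not, while both reach
element_proxy::do_something.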

> well, but you could also make it such that
>
> xml::dom::document doc = xml::dom::parse_file("a.xml");
>
> works with parse_file being a function. That would mean the document is
> copied, but then following your philosophy xml::dom::document could be
> a proxy, too, so copying could be cheap...

I don't think xml::dom::document should be a proxy, as I see it as the
container and owner of all the nodes. But if I remember correctly, the
syntax above will work if parse_file is a class and document has a
constructor that takes it.
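
A rough sketch of what that could look like. The members shown here
(filename(), source_name()) are invented for illustration, nothing is
actually parsed, and document keeps its implicit copy constructor to keep
the example short.

```cpp
#include <cassert>
#include <string>

namespace xml { namespace dom {

// Hypothetical: holds only the arguments of the parse, not a parsed tree.
class parse_file
{
public:
  explicit parse_file( const std::string & filename ) : filename_( filename ) {}
  const std::string & filename() const { return filename_; }
private:
  std::string filename_;
};

class document
{
public:
  // Deliberately not explicit, so that "document doc = parse_file(...)"
  // is accepted as a conversion.
  document( const parse_file & source ) : source_name_( source.filename() )
  {
    // A real implementation would run the parser here and take
    // ownership of the resulting nodes.
  }
  const std::string & source_name() const { return source_name_; }
private:
  std::string source_name_;
};

} }
```

With these definitions the line

xml::dom::document doc = xml::dom::parse_file( "a.xml" );

compiles, and the document is constructed in place from the temporary
parse_file object.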

> I really don't like the idea of 'parse_file' being an object (whose
> state is a potentially already parsed document). It's unintuitive.
>
> > parse_stream would indeed be even better. As I recall there are
> > functions in libxml2 that allow you to write to the parser as well.
>
> erm, that's even more confusing, I think. A parser should remain
> just a parser, i.e. something that extracts tokens from an input stream.

It is still just a parser (but works as a state machine). Check out
xmlCreatePushParserCtxt and xmlParseChunk.

Say you need to receive lots of XML files over the internet and parse
them all at once. You could use a thread per connection and have the
parsers read from the stream, but that would require lots of threads.

With the "push" interface you can use async I/O to read from the sockets
and then write the data to the parsers as you get it. Because the state
of the parse is not stored on the stack, you do not need a separate
thread for each parser.
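
To illustrate the idea, here is a toy state machine. It is not libxml2
code: depth_counter and its push() member are invented for the example,
and it only tracks element nesting (no comments, CDATA, or '>' inside
attribute values). With libxml2 the corresponding calls would be
xmlCreatePushParserCtxt and xmlParseChunk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Toy push parser: all state lives in the object, none on the call stack,
// so push() can return after every chunk and be resumed later.
class depth_counter
{
public:
  depth_counter()
    : state_( in_text ), closing_( false ), last_slash_( false ),
      depth_( 0 ), max_depth_( 0 ) {}

  // Feed the next chunk; chunks may end anywhere, even inside a tag.
  void push( const char * data, std::size_t size )
  {
    for ( std::size_t i = 0; i < size; ++i )
      step( data[ i ] );
  }
  void push( const char * text ) { push( text, std::strlen( text ) ); }

  int depth() const { return depth_; }
  int max_depth() const { return max_depth_; }

private:
  enum state { in_text, tag_start, in_tag };

  void step( char c )
  {
    switch ( state_ )
    {
    case in_text:
      if ( c == '<' ) state_ = tag_start;
      break;
    case tag_start:
      closing_ = ( c == '/' );     // "</name>" closes an element
      last_slash_ = false;
      state_ = in_tag;
      break;
    case in_tag:
      if ( c != '>' ) { last_slash_ = ( c == '/' ); break; }
      if ( closing_ )
        --depth_;
      else if ( !last_slash_ )     // "<name/>" leaves the depth unchanged
      {
        ++depth_;
        if ( depth_ > max_depth_ ) max_depth_ = depth_;
      }
      state_ = in_text;
      break;
    }
  }

  state state_;
  bool closing_;
  bool last_slash_;
  int depth_;
  int max_depth_;
};
```

Two such parsers can be fed alternately from a single thread, each
picking up exactly where its previous chunk ended.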

-- 
Hamish Mackenzie
