From: Stefan Seefeld (seefeld_at_[hidden])
Date: 2003-06-13 10:11:01


Hamish Mackenzie wrote:
> On Fri, 2003-06-13 at 14:11, Stefan Seefeld wrote:
>
>>Hmm, I see your point. Well, that would be possible, but that way you
>>are unable to make nodes polymorphic. Neither with respect to the basic
>>node types (Element, Attribute, Text, CData, etc.) nor later when
>>implementing real DOM support on top of it.
>
>
> How about this (I forget the name of this pattern but it's in the GoF
> book)....

Well, it looks like a mix of things. What you are doing, essentially,
is wrapping a polymorphic 'do_something' method around a non-C++
type system, i.e. the real method dispatch is done with a 'type()'
discriminator.
Yes, I can see that working for the XML node types. But for that we
need nothing more than a single 'node_proxy' class (with a 'type()'
method returning an enum).
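
A minimal sketch of the single-proxy idea described above, assuming a
hypothetical 'raw_node' record standing in for the underlying C library's
node. The enum values mirror the node types named earlier in the thread;
'raw_node' and 'name()' are illustrative names only and do not come from
any posted code.

```cpp
#include <cassert>
#include <string>

// Node kinds named in the thread; the enum itself is an assumption.
enum class node_type { element, attribute, text, cdata };

// Hypothetical stand-in for the wrapped library's node record.
struct raw_node {
    node_type   type;
    std::string name;
};

// One non-polymorphic proxy class: the node kind is reported by type()
// rather than encoded in a C++ class hierarchy.
class node_proxy {
public:
    explicit node_proxy(raw_node* impl) : impl_(impl) {
        assert(impl_ != nullptr);  // a proxy must always refer to a node
    }

    node_type type() const { return impl_->type; }
    const std::string& name() const { return impl_->name; }

private:
    raw_node* impl_;  // not owned by the proxy
};
```

Callers would branch on 'type()' where a class hierarchy would have used
a virtual call.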

>>yeah, looks interesting, and even thinner than my wrapper. However,
>>the thinner the wrapper gets the greater is indeed the danger of having
>>the whole design tied to a particular implementation, as William pointed
>>out earlier. I don't think that this is a problem with my wrapper lib,
>>but with your implementation you get dangerously close...:-)
>
>
> Making it thicker won't make it any easier to apply to other parsers.
> Especially if you rely on the existence of something like _private.

Indeed, and I think this hits the nail on the head: we still only
provide ownership semantics implicitly by delegating to the
implementation.
I think it is crucial to define the right ownership semantics, and
then make sure it is implementable with different libs. In your case
you explicitly operate on node references ('proxies'), while I use
'raw pointers'. Neither of us owns the real nodes.

I think it is a good thing not to own them, but the semantics should
be clear.
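
One way the non-owning semantics could be made explicit, as a sketch under
stated assumptions: a hypothetical 'document' class is the sole owner of
its nodes, and 'node_ref' is a cheap, copyable handle that never frees
anything. All names here are illustrative; neither wrapper in this thread
is known to look like this.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the underlying library's node record.
struct raw_node { std::string name; };

// The document owns every node; destroying it destroys them all.
class document {
public:
    // Non-owning handle. Copying a node_ref copies the reference, not
    // the node, and a node_ref must not outlive its document.
    class node_ref {
    public:
        const std::string& name() const { return impl_->name; }
        void rename(const std::string& n) { impl_->name = n; }
        bool same_node(const node_ref& other) const {
            return impl_ == other.impl_;
        }

    private:
        friend class document;  // only a document can hand out references
        explicit node_ref(raw_node* impl) : impl_(impl) {
            assert(impl_ != nullptr);
        }
        raw_node* impl_;  // not owned
    };

    node_ref create(const std::string& name) {
        nodes_.push_back(std::make_unique<raw_node>(raw_node{name}));
        return node_ref(nodes_.back().get());
    }

    std::size_t size() const { return nodes_.size(); }

private:
    std::vector<std::unique_ptr<raw_node>> nodes_;
};
```

With this layout, two copies of a 'node_ref' observe the same node, and
the number of nodes changes only through the document.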

>>yes, and you could even make that an 'element_proxy' as you know that
>>parent nodes are always elements. However, with a flat set of (libxml2)
>>nodes that wouldn't work any more, so runtime polymorphism would be
>>lost. Well, maybe there is no need for it either. I have to think
>>that over...
>
>
> True and that would fit in nicely with the code I outlined above
>
> node.first_child().do_something( 0 );
>
> Would go through the polymorphic lookup and
>
> node.parent().do_something( 0 );
>
> Would call the same code but without the lookup overhead.

Indeed, though, on further thought, nodes themselves don't do anything,
so we could as well keep this polymorphism outside the node class, and
let nodes only provide their type as an enum.
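
A sketch of what keeping the polymorphism outside the node class could
look like: the node exposes only 'type()' and its data, and a free
function selects the behaviour. The 'node_ref' layout and the strings
returned are assumptions for illustration; 'do_something' is borrowed
from the quoted example only as a name.

```cpp
#include <cassert>
#include <string>

enum class node_type { element, attribute, text, cdata };

// Hypothetical node reference: it carries data and reports its kind,
// but has no behaviour of its own.
struct node_ref {
    node_type   type_;
    std::string value_;

    node_type type() const { return type_; }
    const std::string& value() const { return value_; }
};

// The per-type behaviour lives outside the node, selected by type().
std::string do_something(const node_ref& n) {
    switch (n.type()) {
        case node_type::element:   return "<" + n.value() + ">";
        case node_type::attribute: return "@" + n.value();
        case node_type::text:      return n.value();
        case node_type::cdata:     return "<![CDATA[" + n.value() + "]]>";
    }
    assert(false && "unknown node_type");  // not reached for valid enums
    return std::string();
}
```

New operations can then be added as further free functions without
touching the node class.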

I am starting to like your node reference class quite a lot... :-)

>>>parse_stream would indeed be even better. As I recall there are
>>>functions in libxml2 that allow you to write to the parser as well.
>>
>>erm, that's even more confusing, I think. A parser should remain
>>just a parser, i.e. something that extracts tokens from an input stream.
>
>
> It is still just a parser (but works as a state machine). Check out
> xmlCreatePushParserCtxt and xmlParseChunk.
>
> Say you need to receive lots of xml files over the internet and parse
> them all at once. You could use a thread per connection and have the
> parsers read from the stream but that would require lots of threads.

Are you suggesting that all the different XML files should be merged
into a single DOM document?

> With the "push" interface you can use async io to read from the sockets
> and then write the data to the parsers as you get it. Because the state
> of the parse is not stored on the stack you do not need a separate
> thread for each parser.

Yeah, a parser for asynchronous document creation may be interesting.
But I see that as a somewhat different beast. Simple (local,
synchronous) document creation from an XML file doesn't need to
go through a stateful parser object.
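
To illustrate the "push" style discussed above, here is a toy sketch in
which all parse state lives in an object rather than on the call stack, so
several inputs can be fed chunk by chunk from one thread. It does not call
libxml2's xmlCreatePushParserCtxt or xmlParseChunk; 'push_scanner' is a
hypothetical name, and it only counts '<tag>' and '</tag>' pairs in
well-nested input, with no attributes, comments or self-closing tags.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy push-style scanner (an assumption for illustration, not libxml2).
class push_scanner {
public:
    // Consume one chunk. A chunk may end in the middle of a tag; the
    // scanner resumes from the saved state on the next call.
    void feed(const std::string& chunk) {
        for (char c : chunk) step(c);
    }

    std::size_t elements_seen() const { return elements_; }

    // True when no tag is half-read and every opened tag was closed.
    bool balanced() const { return state_ == state::text && depth_ == 0; }

private:
    enum class state { text, tag_start, open_tag, close_tag };

    void step(char c) {
        switch (state_) {
            case state::text:
                if (c == '<') state_ = state::tag_start;
                break;
            case state::tag_start:
                state_ = (c == '/') ? state::close_tag : state::open_tag;
                break;
            case state::open_tag:
                if (c == '>') { ++depth_; ++elements_; state_ = state::text; }
                break;
            case state::close_tag:
                if (c == '>') {
                    assert(depth_ > 0);  // toy: input must be well nested
                    --depth_;
                    state_ = state::text;
                }
                break;
        }
    }

    state       state_    = state::text;
    int         depth_    = 0;
    std::size_t elements_ = 0;
};
```

Two scanners can be fed alternately as data arrives on two connections,
with neither needing its own thread.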

Regards,
                Stefan

