From: Stefan Seefeld (seefeld_at_[hidden])
Date: 2005-11-04 11:37:33
Anthony Williams wrote:
> It is far easier to write a parser that calls user code (push model) than
> write a parser that can be continued (pull model), since in the pull model you
> have to save all the internal state in order to return to the user with each
> token; you basically have to write a "continuations" mechanism.
Fair enough. But here we are (or should be) focused on the API, i.e. on the
user. The question is whether the parser or the application should control
the data flow. While the latter is harder to implement, it is also
far more convenient for users.
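
To make the trade-off concrete, here is a minimal sketch of both models over a toy tokenizer. It splits on spaces instead of parsing XML, and every name in it (token, push_parse, pull_parser, first_tokens) is hypothetical, not part of any proposed API. The point is only where the loop lives and what state has to be kept between tokens.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an XML token: a space-separated word.
using token = std::string;

// Push model: the parser owns the loop and calls back into user code.
// All parsing state lives in local variables, which makes this easy to write.
inline void push_parse(const std::string& input,
                       const std::function<void(const token&)>& on_token)
{
    std::size_t pos = 0;
    while (true) {
        pos = input.find_first_not_of(' ', pos);
        if (pos == std::string::npos) return;
        std::size_t end = input.find(' ', pos);
        if (end == std::string::npos) end = input.size();
        on_token(input.substr(pos, end - pos));
        pos = end;
    }
}

// Pull model: the application owns the loop. The parser has to save its
// position in a member so that it can resume on the next call.
class pull_parser {
public:
    explicit pull_parser(std::string input) : input_(std::move(input)) {}

    // Stores the next token in 'out' and returns true,
    // or returns false once the input is exhausted.
    bool next(token& out)
    {
        pos_ = input_.find_first_not_of(' ', pos_);
        if (pos_ == std::string::npos) {
            pos_ = input_.size();  // stay at the end for repeated calls
            return false;
        }
        std::size_t end = input_.find(' ', pos_);
        if (end == std::string::npos) end = input_.size();
        out = input_.substr(pos_, end - pos_);
        pos_ = end;
        return true;
    }

private:
    std::string input_;
    std::size_t pos_ = 0;
};

// Push usage: the callback sees every token; the caller cannot stop early
// without extra machinery such as a flag or an exception.
inline std::vector<token> collect_push(const std::string& input)
{
    std::vector<token> result;
    push_parse(input, [&result](const token& t) { result.push_back(t); });
    return result;
}

// Pull usage: the caller simply stops asking once it has enough tokens.
inline std::vector<token> first_tokens(const std::string& input, std::size_t n)
{
    std::vector<token> result;
    pull_parser parser(input);
    token t;
    while (result.size() < n && parser.next(t)) result.push_back(t);
    return result;
}
```

In this toy case the saved state is a single index. A real XML parser would have to save considerably more (open elements, namespace scopes, partially read input), which is the implementation cost Anthony describes.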
>>As it happens, the implementation I have in mind uses libxml2, a C
>>library. As such between the application calling 'parse()' and the
>>callbacks are two language boundaries (C++ -> C and C -> C++), so
>>you couldn't even throw exceptions from inside the callbacks and
>>catch them in the main application.
>
>
> That's one of my main criticisms of your suggested API --- it's too tightly
> bound to libxml, and doesn't really allow for substitution of another parser.
Could you substantiate your claim?
> My other criticism so far is the node::type() function. I really don't believe
> in such type tags; we should be using virtual function dispatch instead, using
> the Visitor pattern. Your traversal example could then ditch the
> traverse(node_ptr) overload, and instead be called with
> document->root.visit(traversal)
Node types aren't (runtime-)polymorphic right now, but is that really a big deal?
Polymorphism is important for extensibility. Here, however, the set of node types
is well known (and rather limited).
Making nodes polymorphic would imply that the library allocates nodes on the heap
instead of on the stack (as it does now). That could well hurt performance, though
I'm not sure how much of an issue that is.
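
For comparison, here is a rough sketch of the two dispatch styles under discussion. All types and names (tagged_node, node_type, element_node, text_node, visitor, describer) are hypothetical illustrations and do not reflect the actual API in either proposal. Variant A keeps nodes as plain value types with a tag; variant B uses virtual dispatch, so a tree of mixed node kinds would typically have to hold its nodes through base-class pointers.

```cpp
#include <cassert>
#include <string>
#include <utility>

// --- Variant A: value-type node carrying a type tag ---
enum class node_type { element, text, comment };

struct tagged_node {
    node_type kind;
    std::string content;
    node_type type() const { return kind; }
};

// The caller switches on the tag; the set of cases is closed and small.
inline std::string describe(const tagged_node& n)
{
    switch (n.type()) {
    case node_type::element: return "<" + n.content + ">";
    case node_type::text:    return "text:" + n.content;
    case node_type::comment: return "comment:" + n.content;
    }
    assert(false && "unhandled node_type");
    return "unknown";
}

// --- Variant B: polymorphic nodes dispatched through a visitor ---
struct element_node;
struct text_node;

struct visitor {
    virtual ~visitor() = default;
    virtual void visit(const element_node&) = 0;
    virtual void visit(const text_node&) = 0;
};

struct node {
    virtual ~node() = default;
    virtual void accept(visitor& v) const = 0;
};

struct element_node : node {
    std::string name;
    explicit element_node(std::string n) : name(std::move(n)) {}
    void accept(visitor& v) const override { v.visit(*this); }
};

struct text_node : node {
    std::string value;
    explicit text_node(std::string s) : value(std::move(s)) {}
    void accept(visitor& v) const override { v.visit(*this); }
};

// The visitor replaces the switch: each node kind selects its own overload.
struct describer : visitor {
    std::string result;
    void visit(const element_node& e) override { result = "<" + e.name + ">"; }
    void visit(const text_node& t) override { result = "text:" + t.value; }
};

inline std::string describe(const node& n)
{
    describer d;
    n.accept(d);
    return d.result;
}
```

Both variants produce the same result for the same input. The difference is in what a new node kind costs: variant A needs a new enumerator and a new case in every switch, while variant B needs a new class and a new visit() overload in every visitor.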
Regards,
Stefan