Subject: Re: [Boost-users] Simple fast parsing
From: Christopher Jefferson (chris_at_[hidden])
Date: 2009-06-22 11:25:32


On 22 Jun 2009, at 16:11, Joel de Guzman wrote:

> Alan M. Carroll wrote:
>> At 05:49 PM 6/21/2009, you wrote:
>>> Christopher Jefferson wrote:
>>>> Looking at the documentation, Boost::Spirit seems like a very big
>>>> hammer to crack this quite small nut, and it is unclear to me how
>>>> well it would fit into an existing recursive descent parser. Has
>>>> anyone ever used it as such? Is there a simple alternative?
>>> Spirit is well tuned for small parsing tasks like this. It is a
>>> modular RD parser. What you need is what you pay for. The code is
>>> as tight as it can be.
>> I don't think this solves his problem.
>
> What is the problem? Speed, right? Then I don't see why it does
> not solve the problem.
>
>> Note that he got a 10X speed
>> up by changing to a buffer with his existing parser, so the _parsing_
>> code isn't the bottleneck. It's better I/O that's needed. I suspect
>
> And Spirit uses a better I/O through generic Forward Iterators.
>
>> it's the locking done by streams on each operation, so he's basically
>> doing a lock on every character. I think he'd be better off using the
>> forward iterator idea and then either writing a small wrapper class
>> on streams to block read or use the lower level read buffer
>> interface.
>
> I think you are under-estimating the complexity of writing a
> humble number parser from a Forward Iterator. It's not as simple
> as it looks and gets pretty hairy when you get past the toy
> examples.

Yes, this is my concern. As a proof of principle I wrote my own number
parser, but it's only passing about 30% of our internal tests. This
is enough to convince me, possibly falsely, that it works "in principle"
and could be fixed without too much loss of speed, but I worry about
how much work that might be.

(For those curious, a quick glance suggests the two main problems are
not handling negative numbers and not recognising overflow. Both are
easily fixable, but I don't feel that is necessary for an experiment.)
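
For illustration only, here is a minimal sketch of what handling those
two cases might look like when parsing from a pair of forward
iterators. The name parse_int and its interface are hypothetical (this
is neither the experimental parser mentioned above nor anything from
Spirit), and it only covers plain decimal int values with an optional
sign.

```cpp
#include <cctype>
#include <limits>
#include <string>

// Hypothetical helper: parses an optionally signed decimal integer from
// [first, last). On success it stores the value in `out`, advances `first`
// past the last digit consumed and returns true. On failure (no digits,
// or a value that does not fit in int) it returns false and leaves
// `first` and `out` untouched.
template <typename ForwardIt>
bool parse_int(ForwardIt& first, ForwardIt last, int& out)
{
    ForwardIt it = first;  // work on a copy; commit only on success
    bool negative = false;
    if (it != last && (*it == '-' || *it == '+')) {
        negative = (*it == '-');
        ++it;
    }
    if (it == last || !std::isdigit(static_cast<unsigned char>(*it)))
        return false;  // a sign alone, or no digits at all

    // Accumulate as a negative number: the negative range of int is one
    // larger than the positive range, so the minimum value needs no
    // special case.
    const int min = std::numeric_limits<int>::min();
    const int max = std::numeric_limits<int>::max();
    int value = 0;
    for (; it != last && std::isdigit(static_cast<unsigned char>(*it)); ++it) {
        const int digit = *it - '0';
        // Check before multiplying, so signed overflow never happens.
        if (value < (min + digit) / 10)
            return false;
        value = value * 10 - digit;
    }
    if (!negative) {
        if (value < -max)
            return false;  // magnitude fits only as a negative number
        value = -value;
    }
    out = value;
    first = it;
    return true;
}

// Convenience wrapper (also hypothetical): succeeds only if the whole
// string is a single integer.
inline bool parse_int(const std::string& s, int& out)
{
    std::string::const_iterator first = s.begin();
    return parse_int(first, s.end(), out) && first == s.end();
}
```

The overflow test is done before each multiplication, so it costs one
comparison and one division by a constant per digit. Hex, leading
whitespace, locales and wider integer types are deliberately left out,
and covering those is where the extra work would start.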

Chris

