Boost logo

Boost :

From: Andreas Huber (ahd6974-spamgroupstrap_at_[hidden])
Date: 2005-03-12 06:28:46


Jonathan Turkanis wrote:
>> http://article.gmane.org/gmane.comp.lib.boost.devel/119246
>
> Sorry I missed this. I think I missed the whole message.
>
>> Let me know whether that's convincing or not.
>
> Okay. Let me see if I understand you correctly:
>
> Andreas Huber wrote:
>> ... What I want to say here
>> is that it is a bad idea to use this library for parsing, as it is so
>> general that performance tradeoffs are inevitable.
>
> This seems to be a restatement of the performance vs. scalability
> part of the rationale, correct?

Yes.

>> For parsing, one
>> is much better off with a parser framework, which is optimized for
>> the task.
>
>> The IMO important observation is the following: Parsers
>> like Spirit only need to return when they have finished processing
>> all "events" (characters). An FSM however must return after
>> processing each event. can exploit this fact and store state
>> with recursive calls, which is of course much faster (some stuff can
>> even be inlined) than any super-clever double-dispatch scheme.
>> There is often a hard limit for stack space which - I think - is the
>> reason why FSMs do make sense in parsers nevertheless.
>
> Here are you talking about any FSM framework or about your library in
> particular?

I'm talking about any FSM framework.

> At any rate, I don't understand why recursion is more
> efficient.

Recursion can be implemented with simple (non-virtual) function calls.
Moreover, if recursion depth is known at compile time, you can inline
such calls. I believe double-dispatch needs at least one indirection
(virtual call, function pointer, whatever). I might be wrong.

> In the case I mentioned -- non-blocking filters -- there is a strong
> resemblence with FSMs because the filter may be interrupted at any
> point during its processing if input is temporarily unavailable or if
> output buffers are full. Filter writers must categorize the various
> points at which the filtering algorithm might be interupted and save
> this information so that when processing begins again the algorithm
> can continue correctly. These categories correspond to states.
> Furthermore, there is often auxilliary information that must be
> stored; e.g., if a filter was interupted after it had consumed a
> character but before it could pass it downstream, the character must
> be stored together with the state. This auxilliary information varies
> from state to state; it would therefore seem to correspond to
> state-local storage.

Ok.

>> However, all
>> the FSM-employing parsers I know are of the dynamic variant (e.g.
>> regex) and these FSMs are not directly designed by humans but
>> computed by an algorithm.
>
> I don't see the relevance of this.

It's a real-world data point that supports (only weakly, admittedly) my
view that you not normally implement parsers-FSMs manually but you use a
tool to do that for you.

>> Last but not least I think it would be
>> tedious for a human to implement a parser on the state- level
>
> It's also tedious to write a non-blocking filter by hand; being able
> to use FSMs could help bring some order to the chaos.

I can't really comment on this, I'm not knowledgeable about these
things.

Regards,

-- 
Andreas Huber
When replying by private email, please remove the words spam and trap
from the address shown in the header. 

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk