Boost logo

Boost :

Subject: Re: [boost] [spirit] scan_keyword: scan an InputIterator for a keyword.
From: Joel de Guzman (joel_at_[hidden])
Date: 2011-11-29 19:06:20


On 11/30/2011 6:23 AM, Vicente J. Botet Escriba wrote:
> Le 29/11/11 00:47, Joel de Guzman a écrit :
>> On 11/29/2011 5:45 AM, Vicente J. Botet Escriba wrote:
>>> Hi,
>>>
>>> for chrono input I'm using a scan_keyword function that adapted from libc++ library (you
>>> can find the code in[1]).
>>> Next follows the interface:
>>>
>> [snip]
>>> What is the better way to do that (or something similar) with Spirit/Lexer?
>>> Can we hope that the Spirit solution could perform much better?
>>> What about if the strings to parse are know at compile-time?
>>>
>>> [1] https://svn.boost.org/svn/boost/trunk/boost/chrono/detail/scan_keyword.hpp
>> The symbol parser seems to be your friend here. http://tinyurl.com/7vue9nu
>> The search is O(log n+k) using a TST.
> Hi,
>
> thanks for the pointer. This seems quite simple, I will try it and compare the performances.
> Does the parser try to scan the largest symbol?

Yes, it parses as much as it can.

> Shouldn't Lexer be more adequate/efficient to this use case?

Yes, you can also use Spirit Lex.

> Sorry, what is a TST?

Ternary Search Trees are faster than hashing for many typical search
problems especially when the search interface is iterator based.
Searching for a string of length k in a ternary search tree with
n strings will require at most O(log n+k) character comparisons.
TSTs are many times faster than hash tables for unsuccessful searches
since mismatches are discovered earlier after examining only a
few characters. Hash tables always examine an entire key when
searching.

For details see http://www.cs.princeton.edu/~rs/strings/.

Regards,

-- 
Joel de Guzman
http://www.boostpro.com
http://boost-spirit.com

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk