Boost logo

Boost :

From: Thorsten Ottosen (thorsten.ottosen_at_[hidden])
Date: 2008-02-15 04:20:58


Phil Endecott skrev:
> Joe Gottman wrote:
>> I just discovered that the string algorithm predicate is_any_of has an
>> inefficient implementation. Internally it uses a std::set.
>
> It would be nice if things like this could be at least as efficient as
> the equivalent C functionality. I've just done some quick benchmarks
> of possible is_any_of implementations. My test splits my
> /usr/share/dict/words at every lower-case vowel; it's about a megabyte
> and splits into 300,000 pieces.
>
> First using std::set:
>
> struct stdset_is_any_of {
> std::set<char> s;
> stdset_is_any_of(string pat) {
> for(string::const_iterator i=pat.begin(); i!=pat.end(); ++i) {
> s.insert(*i);
> }
> }
> bool operator()(char c) {
> return s.find(c) != s.end();
> }
> };
>
> Relative performance is 0.6.

Fon something as light as char, I suspect std::set<char> or
tr1::unordered_set<char> is not an optimal choice.

Could you try this test with boost::interprocess::flat_set<char> ?
(It's in <boost/interprocess/containers/flat_set.hpp>)

-Thorsten


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk