Subject: Re: [Boost-users] regular expression too complex
From: Kraus Philipp (philipp.kraus_at_[hidden])
Date: 2011-05-24 04:58:32


On 24.05.2011 at 09:02, Viatcheslav.Sysoltsev_at_[hidden] wrote:

>>
>> Is there a way to implement a removal operation? I have a vector
>> of strings and must remove each of its elements from the texts, so
>> I build a regular expression that joins the words with "or" and
>> use a case-insensitive regex_replace to remove them.
>>
>
> If you care about performance, write your own matching routine. I'd
> build a tree/forest of chars from the words you want to match. One
> pointer walks through the original string while up to N pointers
> follow the matching tree. The original text is copied char by char
> in one pass; when one of the matching pointers reaches the end of
> a word, you have a match and move the output write pointer back.
> Details such as longest match or encoding support are up to you.
> This should be much faster than a general regexp.

Performance is not my primary concern. I would prefer to use an
existing component for this, because the removal runs only once. Is
there a Boost library I can use for it, such as a state machine
framework or something similar?
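
For a one-time removal, a plain loop over the word list may already be
enough and avoids building one large alternation. The following is only a
minimal sketch using the C++ standard library, not a Boost component; the
function name erase_all_icase is hypothetical, and the comparison assumes
single-byte characters in the current C locale.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: erase every case-insensitive occurrence of each
// word from text. Assumes single-byte characters (no UTF-8 case folding).
std::string erase_all_icase(std::string text,
                            const std::vector<std::string>& words) {
    // Compare two chars ignoring case.
    auto same = [](char a, char b) {
        return std::tolower(static_cast<unsigned char>(a)) ==
               std::tolower(static_cast<unsigned char>(b));
    };
    for (const std::string& w : words) {
        if (w.empty()) continue;  // an empty word would match everywhere
        auto it = text.begin();
        // Find the next occurrence of w and erase it, then continue
        // searching from the position right after the erased range.
        while ((it = std::search(it, text.end(), w.begin(), w.end(), same))
               != text.end()) {
            it = text.erase(it, it + static_cast<std::ptrdiff_t>(w.size()));
        }
    }
    return text;
}
```

Called as erase_all_icase(text, words), it returns a copy of the text with
the words removed. It scans the text once per word, which should be
acceptable when the removal runs only once.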

But the tree/forest idea is very nice.
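
The tree idea can be sketched roughly as follows. This is a simplified
variant under stated assumptions, not the exact routine described above:
instead of N simultaneous matching pointers and moving the output pointer
back, it restarts the tree walk at every input position and skips the
longest word found there. The names TrieNode, build_trie and remove_words
are hypothetical, matching is case-insensitive for single-byte characters
only, and the code needs C++14 for std::make_unique.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical tree node: one child per lower-cased character.
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> next;
    bool is_word = false;  // true if one of the words ends at this node
};

inline char lower(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Build the tree from the words to remove (stored lower-cased).
std::unique_ptr<TrieNode> build_trie(const std::vector<std::string>& words) {
    auto root = std::make_unique<TrieNode>();
    for (const std::string& w : words) {
        if (w.empty()) continue;
        TrieNode* node = root.get();
        for (char c : w) {
            auto& child = node->next[lower(c)];
            if (!child) child = std::make_unique<TrieNode>();
            node = child.get();
        }
        node->is_word = true;
    }
    return root;
}

// Copy text to the output, skipping the longest word that matches at
// each position.
std::string remove_words(const std::string& text, const TrieNode& root) {
    std::string out;
    out.reserve(text.size());
    std::size_t i = 0;
    while (i < text.size()) {
        const TrieNode* node = &root;
        std::size_t longest = 0;  // length of the longest match starting at i
        for (std::size_t j = i; j < text.size(); ++j) {
            auto it = node->next.find(lower(text[j]));
            if (it == node->next.end()) break;
            node = it->second.get();
            if (node->is_word) longest = j - i + 1;
        }
        if (longest > 0) {
            i += longest;              // drop the matched word
        } else {
            out.push_back(text[i++]);  // no match here, keep the char
        }
    }
    return out;
}
```

The tree is built once with build_trie(words) and then reused for every
text via remove_words(text, *trie). The cost per input position is bounded
by the length of the longest word, independent of how many words there are.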

Thx

Phil

