Boost Users :

Subject: Re: [Boost-users] regex_replace for removing words
From: Kraus Philipp (philipp.kraus_at_[hidden])
Date: 2011-05-22 15:42:12


On 22.05.2011 at 18:58, John Maddock wrote:

>> I have some texts from which I would like to remove words. The
>> words to remove are in a std::vector<std::string>, and a
>> std::string holds the separators between the words in the text.
>> My first idea is to use boost::split to split the text into words,
>> remove the unwanted words from the resulting vector, and rebuild
>> the text by concatenating the remaining elements. Another idea is
>> to create a regular expression from the words to remove and the
>> separators, and use regex_replace to remove them.
>> Which idea is better, or is there another way to remove words
>> from a text?
>
> I would create a regular expression of all the words you want to
> remove:
>
> \<(?:word1|word2|word3|word4)\>
>
> Then use regex_replace with "" as the replacement string.
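
A minimal sketch of that suggestion follows. It uses std::regex from the
standard library so that it compiles without Boost, which is an assumption of
this sketch and not part of the advice above. Boost.Regex offers the
corresponding boost::regex and boost::regex_replace. Because std::regex
has no \< and \> assertions, \b stands in for both. The helper name
remove_words is hypothetical, and the words are assumed to contain no regex
metacharacters.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: joins the words into \b(?:w1|w2|...)\b and
// replaces every match with the empty string.
// Assumption: no word contains regex metacharacters.
std::string remove_words(const std::string& text,
                         const std::vector<std::string>& words)
{
    if (words.empty())
        return text;  // an empty alternation would match at every position

    std::string pattern = "\\b(?:";
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (i != 0)
            pattern += '|';
        pattern += words[i];
    }
    pattern += ")\\b";

    const std::regex re(pattern);
    return std::regex_replace(text, re, "");
}
```

Note that only the words themselves are removed. The separators around them
stay in the text, so removing a word between two spaces leaves both spaces
behind.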

Thanks. I now have a problem when building the expression. The word
list is generated automatically, so I must escape each word correctly.
Do you have an idea how I can do this? Some words can be web content
containing < or >, and others contain characters like '.

Can I do a case-insensitive replace, or must I change the case of my
text?
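
One possible way to handle both points, sketched under assumptions: each
generated word is escaped by putting a backslash in front of the regex
metacharacters, and the expression is compiled with the icase flag so the text
itself does not need to be changed. The sketch uses std::regex so that it
compiles without Boost. Boost.Regex has the corresponding boost::regex,
boost::regex_replace and boost::regex::icase. The names escape_word,
is_word_char and remove_words_icase are hypothetical. A word boundary is only
added where a word starts or ends with a letter, digit or underscore, since \b
next to a character such as < would otherwise require a word character on the
other side of it.

```cpp
#include <cctype>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: backslash-escapes the ECMAScript metacharacters
// so that the word is matched literally.
std::string escape_word(const std::string& word)
{
    static const std::string meta = "\\^$.|?*+()[]{}";
    std::string out;
    for (char c : word) {
        if (meta.find(c) != std::string::npos)
            out += '\\';
        out += c;
    }
    return out;
}

// True for the characters that \b treats as part of a word.
bool is_word_char(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Hypothetical helper: removes every listed word, ignoring case.
std::string remove_words_icase(const std::string& text,
                               const std::vector<std::string>& words)
{
    std::string pattern;
    for (const std::string& w : words) {
        if (w.empty())
            continue;  // an empty alternative would match everywhere
        if (!pattern.empty())
            pattern += '|';
        if (is_word_char(w.front()))
            pattern += "\\b";
        pattern += escape_word(w);
        if (is_word_char(w.back()))
            pattern += "\\b";
    }
    if (pattern.empty())
        return text;

    const std::regex re(pattern, std::regex::ECMAScript | std::regex::icase);
    return std::regex_replace(text, re, "");
}
```

Characters such as <, > and ' are not metacharacters in this syntax, so the
sketch leaves them unescaped.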

Thanks

Phil

