Boost Users :

Subject: Re: [Boost-users] [regexp] Replace a substring with a regexp
From: Julian Gonggrijp (j.gonggrijp_at_[hidden])
Date: 2011-03-17 09:19:14


Olivier Tournaire wrote:

> I am not really familiar with regexp, and I am facing a
> problem. I have some strings containing unicode sequences
> (like "\u****"), and I would like to replace them with html
> sequences (such that "\u****" becomes "&#x****;"). I think I
> can do that with boost regexp, but I really do not know how.
> The major problem is that I do not know in advance what are
> the characters for "****". I however know that they are
> always 4 and alphanumeric. So, I have to detect them and
> also append after a ";".

I have no experience with Boost.Regex, but these are the
notations you need.

Search pattern: "\\u(\w{4})"

Replacement pattern: "\&#x\1;"

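Here, "\1" stands for "the text matched by the first group in
parentheses", so that's your four characters. Note that "\w"
also matches an underscore; if you only want to accept hex
digits, use "[0-9A-Fa-f]{4}" instead of "\w{4}". You'll have
to refer to the Boost.Regex manual to find out how to apply
these patterns.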
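
For what it's worth, here is a minimal sketch of how the two
patterns might be applied. It is untested against Boost.Regex
itself: it uses std::regex from the C++11 standard library,
whose regex_replace was modelled on Boost's, so I would expect
the same code to carry over after swapping the header and the
namespace, but please check the manual. The function name
unicode_escapes_to_html is made up for this example, and "$1"
is the ECMAScript spelling of the "\1" back-reference above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (name invented for this example): rewrites
// every "\uXXXX" sequence in the input as "&#xXXXX;".
std::string unicode_escapes_to_html(const std::string& input) {
    // Raw string literal, so the regex is exactly \\u(\w{4}):
    // a literal backslash, a 'u', then four word characters
    // captured as group 1.
    static const std::regex pattern(R"(\\u(\w{4}))");

    // In the ECMAScript format syntax used by std::regex_replace,
    // $1 expands to the text captured by group 1.
    return std::regex_replace(input, pattern, "&#x$1;");
}
```

Calling it on a string that contains a backslash, a 'u' and
"00e9" should give back the same string with "&#x00e9;" in that
place, and leave everything else untouched.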

HTH, Julian
