From: John Maddock (John_Maddock_at_[hidden])
Date: 2001-11-03 07:46:00


>I'm looking at stripping leading and trailing spaces off a string.
Why does the expression;

std::string file(" hello ");
boost::cmatch whatis;
boost::regex_match(file, whatis, boost::regex("^\\s*(.*?)\\s*$"));

match the leading spaces - i.e. it results in " hello", rather
than "hello" ?
<

It's the difference between Perl and POSIX matching rules. POSIX follows
the "leftmost longest" rule: in this case the match given for $1 starts
further to the left, and is longer, than the alternative (the one that Perl
finds). You will also see this behaviour in sed, for example:

sed 's/[ ]*\(.+\)/\1/g' << EOF
    hello
EOF

prints:
    hello

To get the behaviour you want, either:

1) mark the leading \\s* as a sub-expression (that is, enclose it in
parentheses) so that it is considered as part of the "leftmost longest"
rule.
2) use a more precise expression: "^\\s*(\\S.*?)\\s*$" will do the same
thing (except that no match will be found if the text is empty or contains
only spaces), and is both more precise (in what it matches) and more
efficient (in how it does it). The advantage of getting used to this kind
of expression is that it works the same under both Perl rules and those
that POSIX defines.
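
As a rough illustration of option 2, here is a minimal sketch. Note the assumptions: it uses std::regex with its default ECMAScript (Perl-like) grammar instead of Boost.Regex, purely so that it compiles without Boost, and the helper name trim_with_regex is hypothetical. It only demonstrates what the more precise expression captures; it does not show the POSIX behaviour described above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original post). On success, stores the
// input with leading and trailing whitespace removed in `out`.
// Returns false when the input is empty or contains only whitespace,
// because \S must match at least one non-space character.
bool trim_with_regex(const std::string& text, std::string& out)
{
    // Same expression as in option 2 above.
    static const std::regex expr("^\\s*(\\S.*?)\\s*$");

    std::smatch what;  // smatch: match results over std::string iterators
    if (!std::regex_match(text, what, expr))
        return false;

    out = what[1].str();  // $1: the text between the surrounding spaces
    return true;
}
```

Calling trim_with_regex(" hello ", result) should return true and leave "hello" in result, while an input made up only of spaces should return false and leave result untouched.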

- John Maddock
http://ourworld.compuserve.com/homepages/john_maddock/

