Boost Users :

From: Jeff (jeff_j_dunlap_at_[hidden])
Date: 2007-04-08 03:47:41


Richard Dingwall <rdingwall <at> gmail.com> writes:

> Have you tried a simpler pattern:
> <a .*href.*/a>

Richard, thank you for responding. I tried your pattern above, but the match
continues beyond the first '</a>'.

The pattern I am now using, shown below, correctly matches everything from
'<a href=' through the closing '</a>'.

////////////////////////////////
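// sFileCont is a std::string holding the file contents
const char exp[] = "<a href(.*?)/a>";
boost::regex e(exp, boost::regex::normal | boost::regbase::icase);
// 4th argument: submatch 0 selects the whole match
boost::sregex_token_iterator i(sFileCont.begin(), sFileCont.end(), e, 0);
boost::sregex_token_iterator j;  // default-constructed end iterator
while(i != j)
  std::cout << *i++ << "\n";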
////////////////////////////////
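
To make the difference between the two patterns concrete, here is a minimal sketch that runs both against the same input. It assumes std::regex (ECMAScript syntax) in place of Boost.Regex so that it builds without Boost; the helper name find_all and the sample text are made up for illustration and are not from my real file.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect every whole match (submatch 0) of `pattern`
// in `text`, ignoring case. std::regex stands in for boost::regex here.
std::vector<std::string> find_all(const std::string& text,
                                  const std::string& pattern) {
    std::regex re(pattern, std::regex::ECMAScript | std::regex::icase);
    std::vector<std::string> out;
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, 0), end;
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

With two links on one line, for example `<A HREF="one.html">One</A> text <a href="two.html">Two</a> tail`, the pattern `<a .*href.*/a>` should give a single match running from the first `<A` to the last `</a>`, while `<a href(.*?)/a>` should give one match per link.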

Please note that sregex_token_iterator's 4th parameter is set to submatch = 0 in
my code above. This leaves me with 2 questions:

1. Although I have specified submatch = 0, I am still creating a marked
sub-expression, (.*?). I don't understand why the sub-expression is required,
or whether there is a better way that I don't know about.

2. Why is the ? required in the sub-expression above?
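
A small experiment along these lines might help probe both questions. It is only a sketch: it assumes std::regex (ECMAScript syntax) rather than Boost.Regex, and the helper name extract and the sample input are hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: return the chosen submatch of every match.
// submatch 0 is the whole match; submatch 1 is the first marked group.
std::vector<std::string> extract(const std::string& text,
                                 const std::string& pattern,
                                 int submatch) {
    std::regex re(pattern, std::regex::ECMAScript | std::regex::icase);
    std::vector<std::string> out;
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, submatch), end;
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

Calling extract with `<a href(.*?)/a>` and submatch 1 should return only the text between `<a href` and `/a>`. Calling it with the unparenthesised `<a href.*?/a>` and submatch 0 should return the same whole links as the parenthesised pattern, which would suggest the parentheses matter only when submatch 1 is wanted.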

Thanks again

> (Infinite wild match before the href in case they decide to put the
> target attribute first or something).
>
> Richard


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net