Boost Users :

From: yg-boost-users_at_[hidden]
Date: 2003-02-19 14:38:33


>>>>> "Edward" == Edward Diener <eddielee_at_[hidden]> writes:
Edward>
Edward> Once you find your IMG match and pick up your ALT sub-match
Edward> from it, you can use regex_format to change your IMG match to
Edward> whatever you like based on the string you find in your ALT
Edward> sub-match.

My post was poorly written; my point was buried under a lot of text.
My apologies for that, and let me try to be clearer.

Consider these image tags, and how to submatch on the phrase
"alternate text":

1) <img SRC="x.gif" ALT="alternate text">

2) <img SRC="x.gif" BORDER="0" ALIGN="left" ALT="alternate text">

3) <img SRC="x.gif" whocares="nobody" another_attribute="why?"
   ALT="alternate text">

I think I can write one that submatches on 1. For 2 and 3, I would
like part of my regular expression to match anything except the
sequence whitespace, ALT, =, ". I can write a regular expression that
matches anything but one character, or anything but any of a set of
characters, but how do I write one that matches anything but a word?

It seems to me that the best way would be to make a regular expression
for the word, and negate it in the regular expression you actually
use:

static const boost::regex start_of_alt("
   \\s+ /* at least one whitespace */
   alt\\s* /* alt followed by 0 or more whitespace */
   =\\s* /* = followed by 0 or more whitespace */
   \" /* a quote */
   ", boost::regbase::normal | boost::regbase::icase);

static const boost::regex img_tag_with_alt_submatch("
    <\\s* /* a <, followed by 0 or more whitespace */
    img\\s+ /* IMG, followed by at least one whitespace */
    ^@start_of_alt /* anything that doesn't match the previously
                      defined regular expression start_of_alt */
    @start_of_alt /* the start_of_alt regular expression defined
                     above */
    ([^\"]*) /* 0 or many instances of not-a-quote, in the
                sub-matching parens */
    [^>]*> /* anything except a >, and then the > */
    ", boost::regbase::normal | boost::regbase::icase);

Is there a way to refer to previous regular expressions within another
regular expression, as I did in the lines that use "@start_of_alt" ?
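
A minimal sketch of one way to get this effect, under stated
assumptions. The "@name" reference is not used here; the sub-pattern is
kept as an ordinary string and spliced into the larger pattern text
before the regex is compiled. "Anything that does not match
start_of_alt" is approximated with a Perl-style negative lookahead,
(?!...), applied before each consumed character, so this assumes an
engine that supports lookahead. std::regex is swapped in for
boost::regex only so the sketch is self-contained, and the helper names
make_img_alt_regex and alt_text are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: builds the IMG pattern by splicing a reusable
// sub-pattern into the larger pattern as plain text.
std::regex make_img_alt_regex() {
    // Start of the ALT attribute: at least one whitespace, "alt",
    // optional whitespace, '=', optional whitespace, opening quote.
    const std::string start_of_alt = R"re(\s+alt\s*=\s*")re";

    const std::string pattern =
        std::string(R"re(<\s*img)re")           // '<', optional whitespace, IMG
        + "(?:(?!" + start_of_alt + ")[^>])*"   // chars other than '>' at positions
                                                // where start_of_alt does not begin
        + start_of_alt                          // the ALT opener itself
        + R"re(([^"]*)")re"                     // sub-match 1: ALT text, then closing quote
        + R"re([^>]*>)re";                      // rest of the tag, up to and including '>'

    // icase makes IMG/img and ALT/alt equivalent.
    return std::regex(pattern, std::regex::ECMAScript | std::regex::icase);
}

// Hypothetical helper: returns the ALT text of the first IMG tag found
// in the input, or an empty string if no tag with an ALT attribute matches.
std::string alt_text(const std::string& html) {
    static const std::regex re = make_img_alt_regex();
    std::smatch m;
    return std::regex_search(html, m, re) ? m[1].str() : std::string();
}
```

Calling alt_text on each of the three example tags above should yield
the ALT value in sub-match 1, wherever the ALT attribute sits in the
tag. The sketch does not try to handle a quoted attribute value that
itself contains something resembling an ALT attribute.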

--Rob

