Boost Users :

From: Simon J Turner (s_j_turner_at_[hidden])
Date: 2003-02-20 06:30:58


 --- yg-boost-users_at_[hidden] wrote:
>
> >>>>> "Edward" == Edward Diener <eddielee_at_[hidden]> writes:
> Edward>
> Edward> Once you find your IMG match and pick up your ALT sub-match
> Edward> from it, you can use regex_format to change your IMG match to
> Edward> whatever you like based on the string you find in your ALT
> Edward> sub-match.
>
> My post was poorly written; my point was buried under a lot of text.
> My apologies for that, and let me try to be clearer.
>
> Consider these image tags, and how to submatch on the phrase
> "alternate text":
>
> 1) <img SRC="x.gif" ALT="alternate text">
>
> 2) <img SRC="x.gif" BORDER="0" ALIGN="left" ALT="alternate text">
>
> 3) <img SRC="x.gif" whocares="nobody" another_attribute="why?"
> ALT="alternate text">
>
> I think I can write one that submatches on 1. For two and three, I
> would like to have a part of my regular expression that matched
> anything except whitespace, ALT, =, ". I can write a regular
> expression that matches anything but one character, or anything but a
> number of characters, but how do I write one that would match anything
> but a word ?

If you use an alternative, you might write something like:

  (alt_subexpr|non-alt_subexpr)*

although you probably don't want the whole thing to match, so maybe

  (?:alt_subexpr|non-alt_subexpr)*

If you substitute your own expressions into this, you get:

  (?: /* this subexpr matches once per attribute, */
                                /* but we discard the match: */

     \\s+alt\\s*=\"([^\"]*)\" /* if the alt expression matches, */
                                /* we spit out a sub_match with the text */

     | /* OTHERWISE */

     \\s+[a-z_]+\\s*=\"[^\"]*\" /* we match any other attribute, */
                                 /* and discard it */

  )* /* we can have many attributes */

This should try the first alternative first, i.e. try to match an ALT="..."
attribute (and produce a sub_match for the quoted text), and otherwise match
and discard any other attribute XYZ="...".

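I haven't tested this, by the way, but it feels right. It assumes the first
alternative (the alt one) is matched if possible, and the more general one is
tried only if that fails. Since the expression spells "alt" in lower case, it
also has to be compiled case-insensitively to match ALT.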
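
For anyone who wants to try the idea, here is a minimal sketch of the
alternation approach. It makes several assumptions that are not in the
expression above: it uses std::regex (ECMAScript grammar) rather than
Boost.Regex, so that it compiles without Boost; the helper name extract_alt
is hypothetical; the leading "<img", the trailing "\s*>", and the optional
whitespace after "=" are my additions; and the attribute-name class is
written as [A-Za-z_]+ so that another_attribute matches. All three sample
tags put ALT last. If other attributes followed it, whether the sub_match
survives the later iterations depends on the regex engine, so that case is
not covered here.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): finds the first
// <img ...> tag in `html` and stores its ALT text in `alt_text`.
// Returns false if no tag matches or the matched tag has no ALT attribute.
bool extract_alt(const std::string& html, std::string& alt_text)
{
    // A raw string literal avoids the doubled backslashes and \" escapes.
    // First alternative: the ALT attribute, capturing its quoted text.
    // Second alternative: any other attribute, matched and discarded.
    static const std::regex img_re(
        R"re(<img(?:\s+alt\s*=\s*"([^"]*)"|\s+[A-Za-z_]+\s*=\s*"[^"]*")*\s*>)re",
        std::regex::ECMAScript | std::regex::icase);

    std::smatch m;
    if (!std::regex_search(html, m, img_re) || !m[1].matched)
        return false;

    alt_text = m[1].str();
    return true;
}
```

Called on any of the three sample tags, extract_alt should return true and
set alt_text to the quoted ALT string. A tag with no ALT attribute still
matches the overall expression, but sub-match 1 is unmatched, so the helper
returns false.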


