
Boost-Build :

From: Vladimir Prus (ghost_at_[hidden])
Date: 2004-12-01 03:33:10


On Wednesday 01 December 2004 01:11, David Abrahams wrote:
> Hi. I'm trying to build a new docutils tool for processing
> RestructuredText. I'm trying to write a scanner:
>
> class rst-scanner : common-scanner
> {
> rule pattern ( )
> {
> return "^\\w*\\.\\.\\w+(include|image|figure)::\w+(.*)" ;
> }
> }
>
> The problem is that the 2nd parenthesized group identifies the filename;
> the first one should be ignored, and there's no way to express that
> nicely. I'm guessing that this hack might work:
>
> class rst-scanner : common-scanner
> {
> rule pattern ( )
> {
> return "^\\w*\\.\\.\\w+include::\w+(.*)"
> "^\\w*\\.\\.\\w+image::\w+(.*)"
> "^\\w*\\.\\.\\w+figure::\w+(.*)"
> ;
> }
> }
>
> But really, there ought to be a more general mechanism, shouldn't there?

I think the problem is that

1. We can't mark a specific group in a regexp as "unimportant". In Perl/Python,
a regexp can have "non-capturing parentheses":

>>> import re
>>> r = re.compile("(?:foo|bar)(.*)")
>>> m = r.match("foo10")
>>> m.group(1)
'10'
>>>

but I don't think bjam's regexps support that.

2. Bjam passes only the first matched parenthesised group to the scanner.

Looks like your hack is the simplest solution.
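
To make the two points concrete, here is a rough Python sketch (not bjam code) of a scanner that, like bjam, only ever looks at the first group of each pattern. The names PATTERNS, first_group, COMBINED and first_group_combined are made up for illustration, and the patterns are written with Python's \s for whitespace rather than copied from the expressions quoted above.

```python
import re

# Illustrative Python patterns, not bjam syntax: one pattern per directive,
# each with exactly one capturing group that holds the filename.
PATTERNS = [
    re.compile(r"^\s*\.\.\s+include::\s+(.*)"),
    re.compile(r"^\s*\.\.\s+image::\s+(.*)"),
    re.compile(r"^\s*\.\.\s+figure::\s+(.*)"),
]


def first_group(line):
    """Return group 1 of the first pattern that matches, or None."""
    for pattern in PATTERNS:
        m = pattern.match(line)
        if m:
            return m.group(1)
    return None


# Single-pattern equivalent, usable only where the regexp engine
# supports non-capturing groups (as Python's does).
COMBINED = re.compile(r"^\s*\.\.\s+(?:include|image|figure)::\s+(.*)")


def first_group_combined(line):
    """Return group 1 of the combined pattern, or None if no match."""
    m = COMBINED.match(line)
    return m.group(1) if m else None
```

Both functions give the same filename for a directive line; the list of three patterns just spells out the alternation by hand so that the filename is always group 1.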

- Volodya

 

