From: Eric Niebler (eric_at_[hidden])
Date: 2005-07-01 01:49:21


David Abrahams wrote:
> "Eric Niebler" <eric_at_[hidden]> writes:
>>
>>OK, I'll hold off on filing a DR until we have some
>>recommendations.
>
> Suggestion: don't keep this under your hat. At least alert the LWG
> and say that you will have recommendations in a few days.
>

That's probably good advice. I've sent a DR to comp.std.c++.

Also, I may have found another issue, closely related to the one under
discussion. It concerns case-insensitive matching of named character
classes. regex_traits<> provides two functions for working with named
char classes: lookup_classname and isctype. To match a char class such
as [[:alpha:]], you pass "alpha" to lookup_classname and get back a
bitmask. Later, you pass a char and the bitmask to isctype and get a
bool yes/no answer.
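
For illustration, the two-step lookup might look roughly like the sketch below. It uses std::regex_traits<char> from C++11 <regex> as a stand-in, on the assumption that it mirrors the lookup_classname/isctype pair described above; the helper names class_mask and in_class are made up for this example.

```cpp
#include <regex>
#include <string>

// Step 1 (pattern-compile time): turn a class name such as "alpha"
// into the traits' opaque bitmask type.
std::regex_traits<char>::char_class_type class_mask(const std::string& name) {
    std::regex_traits<char> traits;
    return traits.lookup_classname(name.begin(), name.end());
}

// Step 2 (match time): ask the traits whether a character belongs to
// the class described by the bitmask.
bool in_class(char c, std::regex_traits<char>::char_class_type mask) {
    std::regex_traits<char> traits;
    return traits.isctype(c, mask);
}
```

With the default "C" locale, in_class('a', class_mask("alpha")) should be true and in_class('1', class_mask("alpha")) false.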

But how does case-insensitivity work in this scenario? Suppose we're
doing a case-insensitive match on [[:lower:]]. It should behave as if it
were [[:lower:][:upper:]], right? But there doesn't seem to be enough
smarts in the regex_traits interface to do this.

Imagine I write a traits class which recognizes [[:fubar:]], and the
"fubar" char class happens to be case-sensitive. How is the regex engine
to know that? And how should it do a case-insensitive match of a
character against the [[:fubar:]] char class? John, can you confirm this
is a legitimate problem?
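
To make the gap concrete, here is a rough sketch of what an engine can do with only the two functions above. It again borrows C++11 std::regex_traits<char> as a stand-in and assumes the default "C" locale; the helper naive_icase_in_class is hypothetical. (The C++11 lookup_classname does accept an optional icase flag; it is deliberately left at its default here to mimic the interface under discussion.)

```cpp
#include <regex>
#include <string>

// Hypothetical engine helper: case-fold the input character with
// translate_nocase, then test it against a class mask that was looked
// up with no case information at all.
bool naive_icase_in_class(char c, const std::string& name) {
    std::regex_traits<char> traits;
    // Only the name is passed, so the mask means exactly "lower" or
    // exactly "upper"; nothing tells the traits the match is icase.
    auto mask = traits.lookup_classname(name.begin(), name.end());
    return traits.isctype(traits.translate_nocase(c), mask);
}
```

Folding the character happens to work for [[:lower:]], but for [[:upper:]] the folded character is never uppercase, so the class can no longer match any letter. And for a user-defined class like "fubar" the engine cannot tell whether folding is even appropriate.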

I see two options:

1) Add a bool icase parameter to lookup_classname. Then,
lookup_classname( "upper", true ) will know to return lower|upper
instead of just upper.

2) Add an isctype_nocase function.

I prefer (1) because the extra computation happens at the time the
pattern is compiled rather than when it is executed.
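
The two options can be sketched side by side with a toy traits class. Everything here is hypothetical (toy_traits, its mask values, and the exact signatures are invented for the example and are not the Boost or TR1 interface); it only shows where the extra work lands in each case.

```cpp
#include <cctype>
#include <string>

// Hypothetical toy traits class, for illustration only.
struct toy_traits {
    enum : unsigned { lower = 1u, upper = 2u, digit = 4u };

    // Option 1: the icase flag is handled once, when the pattern is
    // compiled. A case-sensitive class is widened to lower|upper.
    static unsigned lookup_classname(const std::string& name, bool icase) {
        unsigned mask = 0;
        if (name == "lower") mask = lower;
        else if (name == "upper") mask = upper;
        else if (name == "digit") mask = digit;
        if (icase && (mask & (lower | upper))) mask |= lower | upper;
        return mask;
    }

    // Plain membership test, unchanged by option 1.
    static bool isctype(char c, unsigned mask) {
        const unsigned char uc = static_cast<unsigned char>(c);
        return ((mask & lower) && std::islower(uc)) ||
               ((mask & upper) && std::isupper(uc)) ||
               ((mask & digit) && std::isdigit(uc));
    }

    // Option 2: the mask stays as looked up, and every match-time test
    // tries both case variants of the character.
    static bool isctype_nocase(char c, unsigned mask) {
        const unsigned char uc = static_cast<unsigned char>(c);
        return isctype(static_cast<char>(std::tolower(uc)), mask) ||
               isctype(static_cast<char>(std::toupper(uc)), mask);
    }
};
```

With option 1, isctype('a', lookup_classname("upper", true)) succeeds at the cost of one extra branch during compilation; with option 2 the same answer costs two isctype calls per character tested.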

-- 
Eric Niebler
Boost Consulting
www.boost-consulting.com
