From: John Maddock (john_at_[hidden])
Date: 2005-06-24 07:00:14
>> currently in most cases a tolower will do the job,
>> and even for fairly strict Unicode
>> conformance a tolower(toupper(c)) will work.
>
> You do know that toupper(c) is conceptually broken, don't you?
>
> There is at least one character (U+00DF Latin Small Letter Sharp S), whose
> upper case equivalent in German is TWO characters. Of course, in the case
> of regular expressions, it might well be acceptable or even best to treat
> scharfes-es as equivalent to "ss" and then - however I always get nervous
> when I see people using tolower/toupper in I18N contexts.
I know, and I mentioned this in my last post. However, note that we don't
even have an API that can handle this, certainly not std::ctype!
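To make the limitation concrete, here is a rough C++11 sketch. The static_assert only restates what the standard interface already says: std::ctype<charT>::toupper(charT) returns a single charT, so it cannot produce "SS". The helper to_upper_full is a hypothetical name of my own, not part of the standard library or Boost, and it special-cases only U+00DF; a real implementation would need a full table of one-to-many mappings.

```cpp
#include <locale>
#include <string>
#include <type_traits>
#include <utility>

using wide_ctype = std::ctype<wchar_t>;

// The single-character overload returns exactly one charT, so a
// one-to-two mapping such as U+00DF -> "SS" cannot be expressed.
static_assert(
    std::is_same<decltype(std::declval<const wide_ctype&>().toupper(L'a')),
                 wchar_t>::value,
    "ctype::toupper(charT) yields a single charT");

// Hypothetical helper (an assumption for illustration only): upper-case a
// whole string and let the result grow where one character maps to several.
// Only U+00DF is handled specially here.
std::wstring to_upper_full(const std::wstring& in,
                           const std::locale& loc = std::locale::classic())
{
    const wide_ctype& ct = std::use_facet<wide_ctype>(loc);
    std::wstring out;
    out.reserve(in.size());
    for (wchar_t c : in) {
        if (c == L'\u00DF')
            out += L"SS";          // one character in, two characters out
        else
            out += ct.toupper(c);  // ordinary one-to-one mapping
    }
    return out;
}
```

The point of the design is that the conversion has to work on strings rather than on individual characters, because the output length can differ from the input length.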
> (Note that tolower is worse, to convert a string to lowercase in German you
> need a dictionary to decide whether to use "ss" or scharfes-es - and I think
> that there is a case where you need to interpret the text to decide which
> word it is! Oh, and finally, this all assumes that the proposed German
> spelling reforms don't happen because I think they largely eliminate the
> scharfes-es :-)
Nice to know that English isn't the only "illogical" language out there ;-)
John.