Boost :
From: Martin Bonner (martin.bonner_at_[hidden])
Date: 2005-06-24 03:16:43
John Maddock wrote:
> currently in most cases a tolower will do the job,
> and even for fairly strict Unicode
> conformance a tolower(toupper(c)) will work.
You do know that toupper(c) is conceptually broken, don't you?
There is at least one character (U+00DF Latin Small Letter Sharp S) whose
upper case equivalent in German is TWO characters ("SS"). Of course, in the
case of regular expressions, it might well be acceptable, or even best, to
treat scharfes-es as equivalent to "ss". However, I always get nervous
when I see people using tolower/toupper in I18N contexts.
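
To make the problem concrete, here is a minimal sketch of the "treat scharfes-es as ss" idea. It is only an illustration, not what Boost.Regex does: fold_case and equal_ignoring_case are hypothetical names, and the folding covers nothing beyond ASCII letters and U+00DF. The point is that the mapping has to go from one character to a string, which a function shaped like toupper(c) cannot express.

```cpp
#include <string>

// Hypothetical helper: folds a UTF-32 string for caseless comparison.
// Assumption: only ASCII letters and U+00DF are handled; a real
// implementation would use the full Unicode case folding data.
std::u32string fold_case(const std::u32string& in)
{
    const char32_t sharp_s = 0x00DF; // Latin Small Letter Sharp S
    std::u32string out;
    for (char32_t c : in) {
        if (c == sharp_s) {
            // One input character becomes two output characters,
            // so the result cannot be computed one char at a time in place.
            out += U"ss";
        } else if (c >= U'A' && c <= U'Z') {
            out += static_cast<char32_t>(c - U'A' + U'a');
        } else {
            out += c;
        }
    }
    return out;
}

// Hypothetical helper: compares two strings after folding both sides.
bool equal_ignoring_case(const std::u32string& a, const std::u32string& b)
{
    return fold_case(a) == fold_case(b);
}
```

With this, equal_ignoring_case(U"Stra\u00DFe", U"STRASSE") holds, whereas a character-by-character tolower(toupper(c)) comparison cannot match the two because the strings differ in length.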
(Note that tolower is worse: to convert a string to lowercase in German you
need a dictionary to decide whether "SS" becomes "ss" or scharfes-es, and I
think that there is a case where you need to interpret the text to decide
which word it is! Oh, and finally, this all assumes that the proposed German
spelling reforms don't happen, because I think they largely eliminate the
scharfes-es :-)
-- Martin Bonner Martin.Bonner_at_[hidden] Pi Technology, Milton Hall, Ely Road, Milton, Cambridge, CB4 6WZ, ENGLAND Tel: +44 (0)1223 441434
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk