
Subject: Re: [boost] Boost.Unicode (was Re: Boost.Locale)
From: Artyom (artyomtnk_at_[hidden])
Date: 2010-12-16 06:32:21


> > 2. case conversion - is locale dependent - for example if the locale is
> > Turkish then upper("i")=="İ" while upper("i")=="I" for other languages.
>
> Simple case conversions are the easy 1:1 language- and context-agnostic
> mappings.
>
> I can't do the more complex conversions because they depend on specific
> languages and contexts.
>
> Thankfully case folding is not language- nor context-dependent, and is
> probably what most people want rather than case conversion.

Then don't do case conversion! Do just case folding.

For such "simple" and incorrect case conversion I don't need a sophisticated
Unicode library. I can use the standard operating system API and even
std::locale::ctype very successfully (which I do in Boost.Locale if the user
prefers a non-ICU backend).

Case conversion is:

- context dependent: the Greek letter "Σ" is converted to "σ" or to "ς",
  according to its position in the word.
- locale dependent: Turkish "i" goes to "İ".
- not 1-to-1: German "ß" goes to "SS" in upper case.

So if you don't do this right, just don't do it.

I'm not sure about case folding, but AFAIK it is not 1-to-1 either - but I
may be wrong.

> Yes, it definitely is; but you could still have a "general" collation that
> would work well enough for most languages.

For general collation that works "well" in most languages I can use
strcmp... I don't need a Unicode library for this.

Artyom
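
To make the three points above concrete, here is a minimal C++ sketch. It is not taken from Boost.Locale, ICU or the proposed Boost.Unicode. The function names `to_upper_sketch`, `to_lower_sketch` and `is_letter_sketch` are hypothetical, the `turkish` flag stands in for a real locale object, and the final-sigma test is a simplification of the actual Unicode rule. It only covers ASCII, "ß" and the basic Greek capitals, which is enough to show why a 1:1 code point mapping cannot express these conversions.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: true only for the letters this sketch knows about
// (ASCII letters and the basic Greek letters U+0391..U+03C9).
bool is_letter_sketch(char32_t c) {
    return (c >= U'a' && c <= U'z') || (c >= U'A' && c <= U'Z') ||
           (c >= U'\u0391' && c <= U'\u03C9');
}

// Upper-casing: shows the locale-dependent and the not-1-to-1 cases.
std::u32string to_upper_sketch(std::u32string_view s, bool turkish) {
    std::u32string out;
    for (char32_t c : s) {
        if (c == U'\u00DF') {
            out += U"SS";                          // one code point becomes two
        } else if (c == U'i') {
            out += turkish ? U'\u0130' : U'I';     // dotted capital I in Turkish
        } else if (c >= U'a' && c <= U'z') {
            out += static_cast<char32_t>(c - 32);  // plain ASCII mapping
        } else {
            out += c;                              // everything else untouched
        }
    }
    return out;
}

// Lower-casing: shows the context-dependent case (Greek capital sigma).
std::u32string to_lower_sketch(std::u32string_view s) {
    std::u32string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        char32_t c = s[i];
        if (c == U'\u03A3') {
            // Simplified rule: final form if preceded by a letter and
            // not followed by one.
            bool prev = i > 0 && is_letter_sketch(s[i - 1]);
            bool next = i + 1 < s.size() && is_letter_sketch(s[i + 1]);
            out += (prev && !next) ? U'\u03C2' : U'\u03C3';
        } else if ((c >= U'A' && c <= U'Z') ||
                   (c >= U'\u0391' && c <= U'\u03A9')) {
            out += static_cast<char32_t>(c + 32);  // ASCII and Greek capitals
        } else {
            out += c;
        }
    }
    return out;
}
```

With these definitions, `to_upper_sketch(U"stra\u00DFe", false)` yields `STRASSE`, and `to_lower_sketch` maps "ΣΑΣ" to "σας", with a different lower-case sigma at each end of the word. A real library would presumably take such rules from the Unicode SpecialCasing data rather than hard-coding them as done here.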

