From: Vladimir Prus (ghost_at_[hidden])
Date: 2004-10-20 02:17:42
Peter Dimov wrote:
>> Ultimately I feel that the operation of normalization (which involves
>> canonical decomposition) of unicode strings should be hidden from the
>> user completely and be performed automatically by the library where
>> that is needed. (Like on a call to the == operator.)
> It appears that there are two schools of thought when it comes to string
> design. One approach treats a string purely as a sequential container of
> values. The other tries to represent "string values" as a coherent whole.
> It doesn't help that in the simple case where the value_type is char the
> two approaches result in mostly identical semantics.
> My opinion is that the std::char_traits<> experiment failed
I agree with that.
> and conclusively demonstrated that the "string as a value" approach is a
> dead end.
How was it demonstrated?
There are two separate questions. The first is how many operations should be
methods of 'string' and how many should be external. Contrary to what
Exceptional C++ says, I believe having many methods in string is OK. As an
example, QString presents a huge but consistent interface, while in standard
C++ we have string, boost::format, boost::tokenizer and boost::string_algo,
and that is simply too many separate docs to look at.
The second question is whether operator==, operator< or 'find' should operate
on vector<char_XX> or on abstract characters, using Unicode rules, or whether
there should be two versions. I don't really understand why 'unicode-unaware'
semantics are ever needed, so we should have only the 'unicode-aware' one.
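
To make the second question concrete, here is a minimal sketch of the two
candidate meanings of operator==. It assumes UTF-32 storage in
std::u32string. The names toy_compose, code_point_equal and canonical_equal
are made up for illustration, and the "normalization" knows only a single
pair ('e' followed by U+0301 becomes U+00E9). A real library would use the
full Unicode normalization data instead.

```cpp
#include <cstddef>
#include <string>

// Code points used by the toy example below.
const char32_t kCombiningAcute = 0x0301; // COMBINING ACUTE ACCENT
const char32_t kEAcute         = 0x00E9; // LATIN SMALL LETTER E WITH ACUTE

// Toy canonical composition (hypothetical helper): handles only the
// sequence 'e' + U+0301 and replaces it with U+00E9. Everything else is
// copied unchanged. This is NOT real Unicode normalization.
std::u32string toy_compose(const std::u32string& s)
{
    std::u32string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == U'e' && i + 1 < s.size() && s[i + 1] == kCombiningAcute) {
            out.push_back(kEAcute);
            ++i; // skip the combining mark that was just consumed
        } else {
            out.push_back(s[i]);
        }
    }
    return out;
}

// 'unicode-unaware' equality: the string is a plain sequence of values.
bool code_point_equal(const std::u32string& a, const std::u32string& b)
{
    return a == b;
}

// 'unicode-aware' equality: normalize both sides, then compare.
bool canonical_equal(const std::u32string& a, const std::u32string& b)
{
    return toy_compose(a) == toy_compose(b);
}
```

With this sketch, U"caf\u00E9" and U"cafe\u0301" differ under
code_point_equal but compare equal under canonical_equal, although a user
sees the same text in both cases.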
> But I may be wrong. :-)