From: Vladimir Prus (ghost_at_[hidden])
Date: 2004-10-20 02:17:42


Peter Dimov wrote:

>> Ultimately I feel that the operation of normalization (which involves
>> canonical decomposition) of unicode strings should be hidden from the
>> user completely and be performed automatically by the library where
>> that is needed. (Like on a call to the == operator.)
>
> It appears that there are two schools of thought when it comes to string
> design. One approach treats a string purely as a sequential container of
> values. The other tries to represent "string values" as a coherent whole.
> It doesn't help that in the simple case where the value_type is char the
> two approaches result in mostly identical semantics.
>
> My opinion is that the std::char_traits<> experiment failed

I agree with that.

> and
> conclusively demonstrated that the "string as a value" approach is a dead
> end,

How was it demonstrated?

There are two separate questions. The first is how many operations should
be methods of 'string' and how many should be external. Contrary to what
Exceptional C++ says, I believe having many methods in string is OK. As an
example, QString presents a huge but consistent interface, while in standard
C++ we have string, boost::format, boost::tokenizer and boost::string_algo,
and that is simply too many separate docs to look at.

The second question is whether operator==, operator< or 'find' should
operate on vector<char_XX> or on abstract characters, using Unicode rules,
or whether there should be two versions. I don't really understand why
'unicode-unaware' semantics are ever needed, so I think we should have only
the 'unicode-aware' one.
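
To make the difference concrete, here is a minimal sketch of the two
meanings of operator== on the same pair of strings. Everything in it is
hypothetical: the names sequence_equal, toy_compose, toy_normalize and
toy_canonical_equal are made up for illustration, and the two-entry
composition table is a stand-in for real Unicode normalization, which needs
the full Unicode data tables.

```cpp
#include <cassert>
#include <string>

// Toy composition table (assumption: NOT real Unicode normalization).
// It knows only two base + combining-mark pairs, enough to show the idea.
inline char32_t toy_compose(char32_t base, char32_t mark)
{
    if (base == U'e' && mark == 0x0301) return 0x00E9; // e + combining acute
    if (base == U'a' && mark == 0x0308) return 0x00E4; // a + combining diaeresis
    return 0;                                          // no known composition
}

// Hypothetical helper: fold known base + mark pairs into precomposed form.
inline std::u32string toy_normalize(const std::u32string& s)
{
    std::u32string out;
    for (char32_t c : s) {
        const char32_t composed = out.empty() ? 0 : toy_compose(out.back(), c);
        if (composed != 0)
            out.back() = composed;   // replace the base with the composed form
        else
            out.push_back(c);
    }
    return out;
}

// "Sequence of values" semantics: compare code points one by one.
inline bool sequence_equal(const std::u32string& a, const std::u32string& b)
{
    return a == b;
}

// "Abstract character" semantics: compare after (toy) normalization.
inline bool toy_canonical_equal(const std::u32string& a, const std::u32string& b)
{
    return toy_normalize(a) == toy_normalize(b);
}

// Usage sketch: two spellings of "cafe" with an acute accent on the 'e'.
inline void demo()
{
    const std::u32string composed   = U"caf\u00E9";   // precomposed e-acute
    const std::u32string decomposed = U"cafe\u0301";  // 'e' + combining acute
    assert(!sequence_equal(composed, decomposed));    // differ as sequences
    assert(toy_canonical_equal(composed, decomposed)); // same text to a user
}
```

The user sees the same word in both cases, which is why I'd expect the
string's own operator== to behave like the second function.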

> But I may be wrong. :-)

Me too.

- Volodya

