From: Vladimir Prus (ghost_at_[hidden])
Date: 2004-10-20 02:00:40


Eric Niebler wrote:

> Erik Wien wrote:
>
>>
>> Ultimately I feel that the operation of normalization (which involves
>> canonical decomposition) of Unicode strings should be hidden from the
>> user completely and be performed automatically by the library where that
>> is needed. (Like on a call to the == operator.) I think that solution
>> would be satisfactory for most users as the normalization process is
>> somewhat intricate and really not something users should be forced to
>> understand.
>>
>> Are we at all on the same page now?
>>
>
> No. "Normalization" doesn't always mean canonical decomposition. There
> are several canonical forms, some of which *require* the use of
> composite characters. In fact, the XML standard requires such a
> canonical form. A Unicode library cannot hide the issue of
> canonicalization from the user, because users will care which canonical
> form is being used.

Why? If I want to compare two strings, I don't really care which normalized
form is used.
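
A minimal sketch of the point, assuming Python's standard unicodedata module
stands in for whatever the library would do internally. This is not a proposed
Boost interface, and the helper name equal_under is made up for illustration.
It compares a precomposed and a decomposed spelling of the same character
under two different normalization forms:

```python
import unicodedata


def equal_under(form, a, b):
    """Hypothetical helper: compare two strings after normalizing both to `form`."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


composed = "\u00e9"     # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"  # U+0065 followed by U+0301 COMBINING ACUTE ACCENT

# A raw code point comparison treats the two spellings as different strings.
raw_equal = composed == decomposed

# Comparing after normalization gives the same answer whether both sides
# are composed (NFC) or decomposed (NFD).
nfc_equal = equal_under("NFC", composed, decomposed)
nfd_equal = equal_under("NFD", composed, decomposed)

# The normalized strings themselves still differ between forms.
nfc_length = len(unicodedata.normalize("NFC", decomposed))
nfd_length = len(unicodedata.normalize("NFD", composed))

print(raw_equal, nfc_equal, nfd_equal, nfc_length, nfd_length)
```

The comparison result is the same under either form as long as both operands
go through the same one. Only the normalized strings themselves differ
between forms.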

- Volodya

