From: Eric Niebler (eric_at_[hidden])
Date: 2004-10-20 00:06:58

Erik Wien wrote:

> Ultimately I feel that the operation of normalization (which involves
> canonical decomposition) of unicode strings should be hidden from the user
> completely and be performed automatically by the library where that is
> needed. (Like on a call to the == operator.) I think that solution would be
> satisfactory for most users as the normalization process is somewhat
> intricate and really not something users should be forced to understand.
> Are we at all on the same page now?

No. "Normalization" doesn't always mean canonical decomposition. Unicode
defines several normalization forms (NFD, NFC, NFKD, NFKC), and some of
them, NFC for one, *require* the use of precomposed characters. In fact,
the XML standard requires such a canonical form. A Unicode library cannot
hide the issue of canonicalization from the user, because users will care
which canonical form is being used.
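
To make the distinction concrete, here is a minimal sketch. It is in Python
only because its standard `unicodedata` module exposes the Unicode
normalization forms directly; it is not a proposed interface for the library
under discussion, and the helper name `canonically_equal` is made up for
illustration.

```python
import unicodedata

# The same user-perceived character "e with acute accent", written two ways.
composed = "\u00e9"      # one precomposed code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# The raw code point sequences differ, so a naive comparison fails.
raw_equal = composed == decomposed

# NFD: canonical decomposition.
# NFC: canonical decomposition followed by canonical composition.
nfd = unicodedata.normalize("NFD", composed)
nfc = unicodedata.normalize("NFC", decomposed)


def canonically_equal(a, b):
    """Hypothetical helper: compare after normalizing both sides to one form."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


print(raw_equal)
print(len(nfd), len(nfc))
print(canonically_equal(composed, decomposed))
```

An == operator could pick either form internally, as long as both operands
get the same one. The choice stops being an implementation detail as soon
as the string leaves the library, for example when it is written into an
XML document.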

Eric Niebler
Boost Consulting
