From: Eric Niebler (eric_at_[hidden])
Date: 2004-10-20 00:06:58
Erik Wien wrote:
>
> Ultimately I feel that the operation of normalization (which involves
> canonical decomposition) of Unicode strings should be hidden from the user
> completely and be performed automatically by the library where that is
> needed. (Like on a call to the == operator.) I think that solution would be
> satisfactory for most users as the normalization process is somewhat
> intricate and really not something users should be forced to understand.
>
> Are we at all on the same page now?
>
No. "Normalization" doesn't always mean canonical decomposition. Unicode
defines several normalization forms, some of which (NFC, for example)
*require* the use of precomposed characters. In fact, the XML standard
requires such a form. A Unicode library cannot hide the issue of
normalization from the user, because users will care which normalization
form is being used.
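
To make the point concrete, here is a minimal sketch using Python's
standard unicodedata module (not the proposed Boost library). It shows
that the same text has a decomposed form and a composed form, so a
caller has to know which one the library hands back. The helper
canonically_equal is a hypothetical name introduced only for this
illustration.

```python
import unicodedata

# "e with acute" written two ways: one precomposed code point
# versus a base letter followed by a combining accent.
composed = "\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 + U+0301 COMBINING ACUTE ACCENT

# A raw code point comparison treats the two spellings as different.
raw_equal = composed == decomposed

# NFD yields the decomposed spelling; NFC decomposes and then recomposes,
# so it yields the precomposed spelling.
nfd = unicodedata.normalize("NFD", composed)
nfc = unicodedata.normalize("NFC", decomposed)


def canonically_equal(a, b, form="NFC"):
    """Hypothetical helper: compare two strings after normalizing both to `form`."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
```

An operator== could normalize internally, as canonically_equal does
here, but anything that exposes the stored code points (serialization,
length, indexing) still depends on the form that was chosen.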
--
Eric Niebler
Boost Consulting
www.boost-consulting.com