

Subject: [Boost-users] Boost.Locale: normalized concatenation?
From: Narcoleptic Electron (narcoleptic.electron_at_[hidden])
Date: 2012-04-11 12:56:38


I'm a happy user of Boost.Locale, but there is a use case that I can't
see a solution for. I would like to concatenate two long UTF-8 strings,
each of which is already canonically normalized (NFC). It seems that
the only way to do this currently is to call boost::locale::normalize
on the concatenated string. This is wasteful: it walks the entire
string, when only a well-defined span on each side of the join can
possibly require modification.

The ideal solution would be for Boost.Locale to expose something like
ICU's unorm2_normalizeSecondAndAppend, which takes advantage of
normalization guarantees in the Unicode standard to renormalize only
the text around the boundary, and only where that is required.
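
To make the idea concrete, here is a minimal sketch of a boundary-only
append, written against the C++ standard library alone. Everything in it
is hypothetical and not part of Boost.Locale or ICU: the name
append_normalized, the use of std::u32string (code points rather than
UTF-8), and the two callbacks. It assumes the caller supplies a real
normalizer and a predicate that is true only for code points that never
interact with preceding text under the chosen form.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical callback types; a real program would back these with a
// Unicode library rather than hand-written tables.
using Normalizer = std::function<std::u32string(const std::u32string&)>;
using BoundaryTest = std::function<bool(char32_t)>;

// Appends `second` to `first`, renormalizing only the span around the join.
// Assumes both inputs are already normalized, and that
// has_boundary_before(c) is true only if normalizing s1 + c + s2 always
// equals normalize(s1) + normalize(c + s2).
std::u32string append_normalized(const std::u32string& first,
                                 const std::u32string& second,
                                 const Normalizer& normalize,
                                 const BoundaryTest& has_boundary_before) {
    // Scan backwards for the last code point in `first` with a boundary
    // before it; everything ahead of it cannot be affected by the join.
    std::size_t i = first.size();
    while (i > 0) {
        --i;
        if (has_boundary_before(first[i])) break;
    }

    // Scan forwards for the first code point in `second` with a boundary
    // before it; everything from there on cannot be affected either.
    std::size_t j = 0;
    while (j < second.size() && !has_boundary_before(second[j])) ++j;

    // Only the middle span is passed to the normalizer.
    std::u32string result(first, 0, i);
    result += normalize(first.substr(i) + second.substr(0, j));
    result.append(second, j, std::u32string::npos);
    return result;
}
```

The cost is proportional to the length of the span around the join plus
the final copy, not to a full normalization pass. As far as I can tell,
ICU's C++ Normalizer2 class offers this directly through
normalizeSecondAndAppend and hasBoundaryBefore, though it works on
UTF-16 UnicodeString objects, so UTF-8 input would still need a
conversion.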

Does this capability already exist in Boost.Locale?

Thanks.

