On Sat, Oct 26, 2019, 12:41 AM Rainer Deyke via Boost-users <boost-users@lists.boost.org> wrote:
On 26.10.19 03:11, Zach Laine via Boost-users wrote:
> If you care about portable Unicode support, or even addressing the
> embarrassment of being the only major production language with next to no
> Unicode support, please have a look and provide feedback.

I can't see myself using the string layer at all.  My codebase is too
deeply linked to std::string, as is the standard library, and a fair
number of third-party libraries I am using.  Also, the primary advantage
of the string layer seems to be a narrower interface, which is not an
advantage at all to me as a user. 

It is also a place to experiment with things like ropes and string builders.  I would like to standardize both, and I need a string that actually interoperates with those to show how they might work.

std::string::find may be bad design,
but it doesn't hurt me, it just makes finding elements in a string
slightly more convenient.

But it does hurt newcomers to the language, who must learn a slightly different API for string and string_view, and for static_string and fixed_string if we get those.  It also burdens the standardization effort, which has to review all those APIs.  And you cannot use the std::string member search functions on spans and other ranges or views either.
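
To make the point about ranges concrete, here is a minimal sketch of a search written once against the standard algorithms, so that it works on string, string_view, vector<char>, and any other range of comparable elements.  The helper name find_offset is hypothetical and used only for illustration; it is not part of the proposed library or of the standard.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper, for illustration only: returns the offset of the
// first occurrence of `needle` in `haystack`, or the size of `haystack`
// when there is no match.
template <typename Haystack, typename Needle>
std::size_t find_offset(const Haystack& haystack, const Needle& needle)
{
    auto const first = std::begin(haystack);
    auto const last = std::end(haystack);
    // std::search only needs forward iterators, so any range qualifies.
    auto const it =
        std::search(first, last, std::begin(needle), std::end(needle));
    return static_cast<std::size_t>(std::distance(first, it));
}
```

The same call, find_offset(haystack, std::string_view("world")), compiles unchanged whether haystack is a std::string, a std::string_view, or a std::vector<char>, whereas the member find exists only on the first two.  Pass the needle as a string_view rather than a raw literal, since std::end on a char array would include the terminating null.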

Returning -1 instead of the end index is also pretty horrible.
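
A small sketch of the difference, using only std::string and std::find.  The two function names are hypothetical and exist only to put the two conventions side by side.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Member find: a miss is reported as std::string::npos, which is
// static_cast<std::size_t>(-1), not a valid position in the string.
inline bool member_find_misses(const std::string& s, char c)
{
    return s.find(c) == std::string::npos;
}

// std::find: a miss is reported as the end iterator, so the resulting
// offset is s.size(), which is still a valid argument to substr().
inline std::size_t algorithm_find_offset(const std::string& s, char c)
{
    return static_cast<std::size_t>(
        std::find(s.begin(), s.end(), c) - s.begin());
}
```

With the end-offset convention, s.substr(algorithm_find_offset(s, c)) is simply empty on a miss; with npos, s.substr(s.find(c)) throws std::out_of_range unless the caller checks first.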

If convenience is so paramount, why don't we add a member sort() to vector?  This is not a troll, I would really like to know.  I want to find something in a vector or sort a vector about as often as I want to find a character or subsequence within a string.  What, to you, is the difference?  If there isn't one, please explain that too.

I am very much interested in the Unicode layer.  I am currently using
ICU, and I'd really like to remove this dependency.  ICU is big, it's
difficult to build, and I'm stuck on an older version because of
compatibility issues.

As for the text layer, the fact that it uses FCC means that I probably
won't use it because I have standardized on NFD.

Completely understandable.

NFC, very close to FCC, is more popular, due to its compactness.  I picked the normalization form with the most readily available time and space optimizations, and then stuck to just that one -- the alternative is many text types with different normalizations having to interoperate, which sounds like hell.
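
To illustrate the compactness point, here is a minimal sketch comparing the UTF-8 encodings of "é" in composed (NFC) and decomposed (NFD) form.  The byte sequences are written out by hand as escapes; nothing here calls a normalization library, and the variable names are made up for the example.

```cpp
#include <string_view>

// NFC: the single code point U+00E9, encoded in UTF-8 as C3 A9.
constexpr std::string_view e_acute_nfc = "\xC3\xA9";

// NFD: U+0065 ('e') followed by U+0301 (combining acute accent),
// encoded in UTF-8 as 65 CC 81.
constexpr std::string_view e_acute_nfd = "e\xCC\x81";

static_assert(e_acute_nfc.size() == 2, "composed form takes two bytes");
static_assert(e_acute_nfd.size() == 3, "decomposed form takes three bytes");
```

The two strings render identically but differ byte for byte, which is why text types with different normalizations cannot be compared or concatenated without a conversion step.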

Zach