
Boost Users :

From: Zach Laine (whatwasthataddress_at_[hidden])
Date: 2019-10-26 16:41:48


On Sat, Oct 26, 2019, 12:41 AM Rainer Deyke via Boost-users <
boost-users_at_[hidden]> wrote:

> On 26.10.19 03:11, Zach Laine via Boost-users wrote:
> > If you care about portable Unicode support, or even addressing the
> > embarrassment of being the only major production language with next to no
> > Unicode support, please have a look and provide feedback.
>
> I can't see myself using the string layer at all. My codebase is too
> deeply linked to std::string, as is the standard library, and a fair
> number of third-party libraries I am using. Also, the primary advantage
> of the string layer seems to be a narrower interface, which is not an
> advantage at all to me as a user.

It is also a place to experiment with things like ropes and string
builders. I would like to standardize both, and I need a string that
actually interoperates with those to show how they might work.

> std::string::find may be bad design,
> but it doesn't hurt me, it just makes finding elements in a string
> slightly more convenient.
>

But it does hurt newcomers to the language, who must learn a slightly
different API for string and string_view, and static_string and
fixed_string if we get those. It also hurts the standardization effort to
review all those APIs. You cannot use the std::string search algorithms on
spans and other ranges or views either.
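
To make that concrete, here is a minimal sketch (C++17 assumed, for std::string_view). find_offset is a hypothetical helper I made up for illustration, not part of any library. Because it is built on std::search, the same code works for strings, views, and vectors, which the member find functions cannot do.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (illustration only): returns the offset of the first
// occurrence of `needle` in `haystack`, or the haystack's length if there
// is no match. Works for any two ranges with forward iterators.
template <typename Haystack, typename Needle>
std::size_t find_offset(const Haystack& haystack, const Needle& needle) {
    auto it = std::search(std::begin(haystack), std::end(haystack),
                          std::begin(needle), std::end(needle));
    return static_cast<std::size_t>(
        std::distance(std::begin(haystack), it));
}
```

Calling find_offset(std::string("hello world"), std::string_view("world")) and find_offset(std::vector<int>{1, 2, 3, 4}, std::vector<int>{3, 4}) both go through the same template. Pass the needle as a string_view rather than a raw literal, since std::end on a char array includes the terminating null.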

Returning -1 instead of the end index is also pretty horrible.
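
A small sketch of why, using two hypothetical functions (the names are mine) that truncate a string at the first occurrence of a character. With member find, a miss comes back as std::string::npos and has to be special-cased. With std::find, a miss comes back as end(), which composes with the erase call without any check.

```cpp
#include <algorithm>
#include <string>

// Member find: a miss is std::string::npos (the value -1 converts to),
// and s.erase(npos) would throw std::out_of_range, so a check is needed.
inline std::string truncate_at_member(std::string s, char c) {
    const std::string::size_type pos = s.find(c);
    if (pos != std::string::npos) {
        s.erase(pos);  // erase from pos to the end of the string
    }
    return s;
}

// Iterator-returning algorithm: a miss is s.end(), so the erased range is
// simply empty and no special case is needed.
inline std::string truncate_at_iter(std::string s, char c) {
    s.erase(std::find(s.begin(), s.end(), c), s.end());
    return s;
}
```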

If convenience is so paramount, why don't we add a member sort() to vector?
This is not a troll, I would really like to know. I want to find something
in a vector or sort a vector about as often as I want to find a character
or subsequence within a string. What, to you, is the difference? If there
isn't one, please explain that too.

> I am very much interested in the unicode layer. I am currently using
> ICU, and I'd really like to remove this dependency. ICU is big, it's
> difficult to build, and I'm stuck on an older version because of
> compatibility issues.
>
> As for the text layer, the fact that it uses FCC means that I probably
> won't use it because I have standardized on NFD.
>

Completely understandable.

NFC, very close to FCC, is more popular, due to its compactness. I picked
the normalization form with the most readily available time and space
optimizations, and then stuck to just that one -- the alternative is many
text types with different normalizations having to interoperate, which
sounds like hell.
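
As a rough illustration of the compactness point only, take "é": composed (as in NFC) it is the single code point U+00E9, decomposed (as in NFD) it is U+0065 followed by U+0301. The sketch below does no normalization at all. The code points are hard-coded, and utf8_size is a hypothetical helper that just counts the UTF-8 bytes needed for each spelling.

```cpp
#include <cstddef>
#include <string>

// Bytes UTF-8 needs for one code point (assumes a valid scalar value).
constexpr std::size_t utf8_length(char32_t cp) {
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// Hypothetical helper: total UTF-8 size of a sequence of code points.
inline std::size_t utf8_size(const std::u32string& code_points) {
    std::size_t total = 0;
    for (char32_t cp : code_points) {
        total += utf8_length(cp);
    }
    return total;
}

// "é" spelled both ways, hard-coded rather than computed.
inline std::u32string composed_e_acute()   { return U"\u00E9"; }
inline std::u32string decomposed_e_acute() { return U"e\u0301"; }
```

The composed spelling takes 2 bytes in UTF-8 and the decomposed one takes 3, and that gap repeats for every accented character in the text.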

Zach


