On 26.10.2019 03:11, Zach Laine via Boost-users wrote:
> About 14 months ago I posted the same thing. There was significant
> work that needed to be done to Boost.Text (the proposed library), and
> I was a bit burned out.
>
> Now I've managed to make the necessary changes, and I feel the library
> is ready for review, if there is interest.
>
> This library, in part, is something I want to standardize.
>
> It started as a better string library for namespace "std2", with
> minimal Unicode support. Though "std2" will almost certainly never
> happen now, those string types are still in there, and the library has
> grown to also include all the Unicode features most users will ever need.
>
> Github: https://github.com/tzlaine/text
> Online docs: https://tzlaine.github.io/text
>
> If you care about portable Unicode support, or even addressing the
> embarrassment of being the only major production language with next to
> no Unicode support, please have a look and provide feedback.
Putting the issue of standardization aside, I certainly would love to see
something like that included in Boost. After a quick read of your docs
(about an hour), I'm not sure I'm happy with all the choices you've made
(see some remarks below), but overall I see it as something I would use
in the future. As you wrote, Unicode is hard, even with a library like
this; nearly mission impossible without one.
A few remarks, for what they're worth:
- I've never seen std::string and thread (un)safety as an issue
Fair enough. As stated previously in this thread, the threadsafety feature is a side effect that comes from the copy-on-write semantics of rope. *That* is the reason rope is designed the way it is, not the threadsafety part. It just happens that the threadsafety part comes for free when you do the copy-on-write part.
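To make the "comes for free" point concrete, here is a minimal sketch of copy-on-write sharing. It is an assumption-laden illustration, not Boost.Text's rope: the class name cow_string and its members are hypothetical, and a real rope shares tree segments rather than one whole buffer. The idea it shows is only that shared data is never modified in place.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical illustration only; this is not Boost.Text's rope.
class cow_string {
public:
    explicit cow_string(std::string s)
        : data_(std::make_shared<const std::string>(std::move(s))) {}

    // Read access goes to the shared, immutable buffer.
    const std::string& str() const { return *data_; }

    // "Mutation" builds a new buffer and rebinds this object only;
    // every other copy keeps pointing at the old, untouched buffer.
    void append(const std::string& tail) {
        data_ = std::make_shared<const std::string>(*data_ + tail);
    }

    // True when both objects currently refer to the same buffer.
    bool shares_buffer_with(const cow_string& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<const std::string> data_;
};
```

Because the shared buffer is const and the reference count in std::shared_ptr is updated atomically, separate threads can each hold and read their own copy without extra locking. Writing to one and the same cow_string object from several threads would still need synchronization.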
- the pattern if (x == npos) is now so common that it is imho important to
preserve it
The std::string/std::string_view API is the only place in the STL where the algorithms do not return the end of the half-open input range on failure. That's really wonky. I don't care about preserving it.
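For comparison, the two failure conventions look roughly like this side by side. The helper names contains_via_npos and contains_via_end are made up for this sketch; only std::string::find, std::string::npos, and std::find come from the standard library.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// std::string::find reports failure with the index sentinel npos.
bool contains_via_npos(const std::string& s, char c) {
    return s.find(c) != std::string::npos;
}

// Generic algorithms such as std::find report failure by returning
// the end of the half-open input range.
template <typename Range, typename T>
bool contains_via_end(const Range& r, const T& value) {
    return std::find(r.begin(), r.end(), value) != r.end();
}
```

The second form works unchanged for std::string, std::vector<int>, or any other range with begin() and end(), which is the consistency argument being made above.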
- for the sake of completeness the normalization type used at the text
level ought to be a policy parameter; although I do understand your
arguments against it, I think it should be there even at the cost of
different text types not being interoperable without conversions
I disagree. Policy parameters are bad for reasoning. If I see a text::text, as things currently stand, I know that it is stored as a contiguous array of UTF-8, and that it is normalized FCC. If I add a template parameter to control the normalization, I change the invariants of the type. Types with different invariants should have different names. To do otherwise is a violation of the single responsibility principle.
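A rough sketch of the difference, with entirely hypothetical names (basic_text, normalization, and is_fcc are not Boost.Text's API, and the struct called text here only stands in for the idea of a single-invariant type):

```cpp
#include <string>
#include <type_traits>

// Hypothetical names for illustration; none of this is Boost.Text's API.
enum class normalization { nfc, nfd, fcc };

// Policy-parameter design: one template, invariant chosen per instantiation.
template <normalization N>
struct basic_text {
    std::string utf8; // assumed to be normalized according to N
};

// Different policy values yield unrelated types, so they do not
// interoperate without an explicit conversion.
static_assert(!std::is_same<basic_text<normalization::nfc>,
                            basic_text<normalization::fcc>>::value,
              "distinct instantiations are distinct types");

// Single-invariant design: one name, one guarantee.
struct text {
    std::string utf8; // assumed to be always normalized to FCC
};

// Code written against basic_text has to ask which form it was given;
// code written against text already knows the answer.
template <normalization N>
constexpr bool is_fcc(const basic_text<N>&) { return N == normalization::fcc; }
constexpr bool is_fcc(const text&) { return true; }
```

With the template, seeing "basic_text" in a signature tells you nothing about normalization until you also read the parameter; with the plain type, the name alone carries the invariant.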
- at the text level I'm not sure I'm willing to cope with different
fundamental text types; I just want to use boost::text::text, pretty
much the same as I use std::string as an alias for a much more complex
class template; heck, even at the string layer I'd probably prefer the
rope/contiguous concept to be a policy parameter of the same class template.
That would be like adding a template parameter to std::vector that makes it act like a std::deque for certain values of that parameter. Changing the space and time complexity of a type by changing a template parameter is the wrong answer.
- views should be introduced as views and not mixed with rope/contiguous
fundamental types
That does not sound like what I want either, but I don't know what this refers to. Could you be specific?
Zach