Boost logo

Boost :

From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2019-11-26 17:33:56


On Tue, Nov 26, 2019 at 8:31 AM Phil Endecott via Boost
<boost_at_[hidden]> wrote:
> I would much prefer static_string

Same here... I was foolish to rename it.

> - Much more of it should be constexpr.

I agree that it could be constexpr (at least some of it) but this can
be added as an improvement later.

Taking the opposite position, why should it be constexpr? The purpose
of this container is to serve as a performant substitute for
std::string when the maximum size is known ahead of time or can be
reasonably estimated. In which use-cases do you anticipate the need
for a constexpr static_string? It would certainly waste space, unless
the caller sizes it perfectly. But then what's the difference between
that and a constexpr string literal or string view?

In my opinion people are too quick to say "constexpr it" without
evaluating the necessity. I'm not saying we shouldn't do it (but see
below), however lets not pretend that there are legions of users eager
for a constexpr fixed capacity mutable string.

> There is also a question of whether all of the
> elements must be initialised (to 0) by the ctor to be constexpr,
> which could be costly for a large but sparse container, though

I think initializing everything to 0 is a non-starter for the
non-constexpr case, due to the performance penalty. And initializing
to 0 only for constexpr requires new language features I believe?
std::is_constexpr_evaluated or some such.

> it does mean that subsequent push_back()s don't need to store
> terminators;

It would still need to null-terminate, for the case where characters
are removed from the string and subsequently reinserted.

> - You can add a specialisation for size 0 that turns it into an
> empty struct.
>
> - You can add a string literal operator"".
>
> - You can add std::hash overloads.

I'm not seeing the value in these features. What's the use case? No
one will be using static_string as the key for a container, since it
wastes space. Why do we need a string literal operator, to produce a
string constant that takes up more space than necessary? And why are
we "optimizing" the case where size==0? Is this something that anyone
will notice or care about (aside from WG21 of course, which often
concerns itself with not-relevant things).

> There are various possibilities for what should happen if the
> capacity is exceeded, including throwing an exception, asserting,
> and allowing undefined behaviour. I'm not sure what I would
> really prefer though I think I'd be more likely to assert() than
> to throw. See P0843r2 again which discusses this.

Quoting some nonsense from p0843r2:

    1. Throw an exception.
    2. Make this a precondition violation.
    ...
   It could throw either std::out_of_bounds or std::logic_error but in any
    case the interface does not end up being equal to that of std::vector.
    ...
  The alternative is to make not exceeding the capacity a precondition on
  the static_vector's methods.

So basically Gonzalo (the paper's author) says that throwing an
exception of the wrong type makes the interface different from
std::vector, thus the solution is to make the interface even more
different from vector by requiring callers to check sizes. Gonzalo
lists existing practice of Boost.Container's static_vector, with
throws on overflow, as existing practice. Only in the bizarro world of
WG21 could someone write a paper which quotes existing practice, then
proceed to deviate from existing practice for illogical reasons.

We can return to sanity by recognizing what the purpose of static_string is for:

    "...a performant substitute for std::string when the maximum size is known
    ahead of time or can be reasonably estimated."

Throwing an exception is the obvious rational choice. It lets
static_string be mostly a drop-in replacement, without requiring call
sites to add extensive code to check for preconditions. It guarantees
that any successful modifications to the static_string will be exact
(no truncation). And most importantly it matches existing practice
(e.g. boost::container::static_vector and others like it).

> Regarding your deviations from std::string:
> operator+ has been omitted because you consider it's not possible
> to determine a reasonable return type; isn't it just size N+M ?
> Do you reject that just because it could end up rather large after
> a sequence of additions? I'd agree if it were N*M but not really for
> N+M. Have I missed something?

Yes. It is too easy for someone to change existing code that does
a+b+c+d where the operands are strings. Users could the get surprising
behavior of tens or hundreds of kilobytes of stack consumption. There
are other ways to implement operator+ but we prefer to the omit
interfaces that can have surprising behavior, or that have non-obvious
results (such as truncating).

> You've chosen to return a string_view from substr(). I think that's
> a bit dangerous; code that does "auto a = s.substr(...)" could end
> up with a dangling view when s is a fixed string; this makes it
> more difficult than it should be to migrate from one string type
> to another, or to write generic code.

Again we have to refer to the purpose of the container. People are
using this for performance, the very last thing they want from
substr() is to receive a copy. Users can opt-in to making a copy if
they want, by constructing a new static_string from the string view
returned by substr. If we go with your suggestion, no one can opt out
of copies.

> My initial thought was that static_string should be implemented
> on top of boost::container::static_vector<char>. My rationale...

No normal user is going at static_string and think "I would use this
if it required Boost.Container." Fewer dependencies is better. And,
this library is designed to work without the rest of Boost (the
"standalone" mode). I want people to be able to do "vcpkg install
static_string" and then be ready to boogie. They will still have to
type `boost::` to access it.

Regards


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk