Boost logo

Boost :

From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2019-11-26 16:30:58


Dear All,

Here is my review of the proposed Fixed String library.

In summary I think this would be a useful thing to have in Boost,
but there are a few issues with the code as it currently stands.

This is based on studying the code and docs and the previous
mailing list discussions in September; I haven't actually tried
to use the library.

Firstly regarding the name, I would much prefer static_string
for consistency with boost::container::static_vector and with
P0843. Note that static_vector was the preferred name in
an LEWG poll, according to P0843. It's true that "static"
is an ambiguous word that does not really convey what is
different about this string, but then neither does "fixed"
in my opinion; "fixed capacity" would, but just "fixed" sounds
more like "immutable" to me.

On the subject of P0843, it it worth reading as it contains
some useful links and rationale:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0843r2.html
Also a more recent r3 update and reference implementation:
https://github.com/gnzlbg/static_vector

Here are the features that I think are missing, in no particular
order; apologies if some of this does exist but I've missed it:

- Much more of it should be constexpr. See P0980R1. That mentions
that some of char_traits needs to be constexpr as a prerequisite,
but even without that things like size() and empty() can be
constexpr. There is also a question of whether all of the
elements must be initialised (to 0) by the ctor to be constexpr,
which could be costly for a large but sparse container, though it
does mean that subsequent push_back()s don't need to store
terminators; I don't understand all the issues involved here.
(I note a comment in the docs suggesting the lack of constexpr may
be temporary.)

- In the same vein, more of it should be noexcept - why aren't
empty() and size() ?

- You can add a specialisation for size 0 that turns it into an
empty struct.

- You can add a string literal operator"".

- You can add std::hash overloads.

There are various possibilities for what should happen if the
capacity is exceeded, including throwing an exception, asserting,
and allowing undefined behaviour. I'm not sure what I would
really prefer though I think I'd be more likely to assert() than
to throw. See P0843r2 again which discusses this.

Regarding your deviations from std::string:
operator+ has been omitted because you consider it's not possible
to determine a reasonable return type; isn't it just size N+M ?
Do you reject that just because it could end up rather large after
a sequence of additions? I'd agree if it were N*M but not really for
N+M. Have I missed something?
You've chosen to return a string_view from substr(). I think that's
a bit dangerous; code that does "auto a = s.substr(...)" could end
up with a dangling view when s is a fixed string; this makes it
more difficult than it should be to migrate from one string type
to another, or to write generic code.

Regarding the docs, it would be great if the reference could have
some highlighting (e.g. bright red?) for the few places where it
differs from std::string.

My own use for this sort of string has often been to store very
large numbers of small strings when total memory usage has been a
concern. (A good example would be an textual ID code of some sort,
like a postal code or an airport code etc.) For this case, the
8-byte overhead for the size_t length field is a concern; I'd like
a static_string<6> to need only 8 bytes total. I suggested changing
this to the minimum size needed to store values 0 to N in the
previous email discussion, but that hasn't been implemented.

My initial thought was that static_string should be implemented
on top of boost::container::static_vector<char>. My rationale
for that was that static_vector is well-established and doesn't
need to be re-implemented, and that a string_facade that
adapts a vector<char>-like interface into a string-like interface
would be a useful component in its own right. (I've tried to
write something like this myself, but it gets tedious very quickly
writing out all of those special string methods.) But then I
discovered that actually boost::container::static_vector is
not implemented as a simple size,data[] pair; apparently it
contains pointers to its own storage. So it isn't even trivially-
copyable. Now that I know that, I wonder if what we really need
is a low-storage-overhead static_vector "done properly" (like
the P0843 reference implementation), with a string_facade to
make static_string from static_vector<char>.

On balance, my view is that this proposal is promising but not
quite ready yet. I suggest that the authors should be asked to
come back in a few months with a revised version. My preferred
design would be an improved static_vector and a string_facade
that converts it into a static_string, but if they don't want
to do that then simply tidying up the constexpr, noexcept and
hash issues would be minimally sufficient.

Regards, Phil.


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk