Boost logo

Boost :

From: Vinnie Falco (vinnie.falco_at_[hidden])
Date: 2019-09-12 14:05:16


On Thu, Sep 12, 2019 at 6:19 AM Mike via Boost <boost_at_[hidden]> wrote:
> Not sure, if this counts as substantial...

Yes it does, pretty much anything that is not simply cosmetic (i.e.
naming) is substantial :)

> a) to make the string trivially copyable (simply copy the whole
> storage, not just the bytes actually containing the string).
> This might make things easier for the optimizer.

The purpose of fixed_capacity_string (see what I did there? LOL) is to
facilitate allocation-free string mutation when the algorithm can
reasonably impose an upper limit on the size of the resulting string.
For example, Beast imposes an generous upper limit on some of the
elements which make up the HTTP header. This allows the parser to use
a stack based string to perform calculations and avoid allocations.

To answer your question, no I have not considered the trivially
copyable attribute when I wrote the class. We can certainly make it
trivially copyable. Or not. I don't have a preference either way
because my code does not copy such strings, and I do not know what the
resulting performance characteristics will be for the use-cases where
copying is needed. I can say that my strings are on the order of
kilobytes, e.g. 1KB, 2KB, 4KB size strings. Copying all of a 4KB
string when only a few hundred bytes are used may not be ideal, but
absent experimentation it is hard to say if the tradeoff is worth it.

> b) to conditionally enable support for std::string_view in c++17
> (implicit conversion to and explicit construction from string_view)

Yes that is something that needs to be done. Beast kind of has support
for this library-wide:

<https://github.com/boostorg/beast/blob/b7230f12f16fe7a9f7a1ece5be1f607c8552448a/include/boost/beast/core/string_type.hpp#L24>

I removed this option when I copied fixed_string into its own
repository, because it does not feel right that EVERY library which
wants to traffick in string_view has to invent its own configuration
for deciding whether to use boost::string_view, std::string_view, or
both. I have plans for more new repositories which need string views
and it would be nice to have a Boost-wide solution - I'm very open to
suggestions here. We should also think about the error_code related
types from Boost.System, and the containers in Boost.SmartPtr for this
sort of treatment. In Beast I did this:

<https://github.com/boostorg/beast/blob/b7230f12f16fe7a9f7a1ece5be1f607c8552448a/include/boost/beast/core/error.hpp#L21>

In theory I could change these aliases to use their std equivalents,
but Asio gets in the way here since currently it can only use types
from Boost.System.

> Regarding the idea of making this less "templaty"
>
> - I don't think is important enough to introduce a virtual function
> It makes things for the optimizer just so much harder.
>
> - Given, that the full type has to be a template anyway,
> I'm very skeptical if any technique provides enough benefit
> in terms of compile times to warrant the additional overhead
> and maybe complexity that the necessary indirection would
> incur.

It can be done without the indirection:

    class fixed_capacity_string_base
    {
        char* begin_;
        std::size_t capacity_;
    protected:
        std::size_t size_;
        fixed_capacity_string_base(
            char* begin, std::size_t size, std::size_t capacity);
    public:
        //...
    };

    template<std::size_t Capacity>
    class fixed_capacity_string : public fixed_capacity_string_base
    {
        char buf_[Capacity + 1];
    public:
        //...
    };

The problem of course is that on 64-bit architectures,
sizeof(fixed_capacity_string_base)==24 which is quite a bit of
overhead. Personally, I don't mind, because it does not affect my
intended use-case as described above.

This year we made sweeping changes to Beast to convert function
templates into ordinary functions wherever possible. The HTTP parser
was one of the big wins. The result was faster compilation and reduced
resource requirements on Travis. For users to gain the benefit they
need simply include this file <boost/beast/src.hpp> in one of their
translation units:

<https://github.com/boostorg/beast/blob/b7230f12f16fe7a9f7a1ece5be1f607c8552448a/include/boost/beast/src.hpp>

As you can see there is quite a lot of function definition code that
is spared from repeated compilation when using this separate
compilation mode. I do believe that the benefit is worth it. One
common perception that ordinary programmers have about Boost is that
it is slow to compile on account of the heavy templating. While it is
possible to mitigate a lot of this burden, it requires discipline and
above-average skill at refactoring the physical structure of a
program. Skills which are unfortunately in short supply.

I think both the perception and performance of the Boost Library
Collection is made more favorable by investing in merciless
elimination of templates where possible, with attention paid to
reducing the compilation burden

There is a class of algorithms which operate on strings which require
an intermediate string as a working area to calculate their result. By
refactoring the fixed string implementation into a non-template base
class, it becomes possible for these algorithms to be written as
ordinary functions. Here is an example signature:

    string_view f (string_view input1, string_view input2,
fixed_capacity_string_base& temp);

> - The Traits template parameter might be a valid candidate for
> removal. I certainly never needed it, but my guess is that
> removing it makes only sense, if you also only support plain
> char (i.e. no wchar or char32_t) as a character type.

I can be perfectly happy supporting only `char`, but we could offer
wchar and char32_t variants using explicit instantiation in the
non-header-only configuration mode, and still get the benefits of
separate compilation.

> I like the to_fixed_string function btw.

Thanks! Composing integers as part of larger strings is a natural fit
for fixed_capacity_string, e.g.

    "Content-Length: 2048\r\n"

Regards


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk