Subject: Re: [boost] Variadic append for std::string
From: Richard Hodges (hodges.r_at_[hidden])
Date: 2017-01-24 08:00:22


It's almost looking like conversion to utf8, wide strings, strings or
string_views should be a filter rather than an option.

I agree that the factory hierarchy should be as static as possible, so that
it's trivially copyable.

Thinking on that, it seems to me that the act of joining or concatenating
is the result of applying an output filter (e.g. .str() or to_string()) to
a sequence of input filters.

Or put another way, executing a sequence of input filters with their
outputs set to some output filters.

While I don't like the idea of pipes, the following structure seems to
model the process:

auto s = some_string();
join(a, b, c) | to_utf8() | append(s);

Another way to express this is:

join(a, b, c).apply(to_utf8()).apply(append(s));
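
A minimal sketch of how such a chain could be wired up, supporting both the pipe and the .apply() spelling. Everything here is hypothetical and lives in an illustrative namespace `sketch`; none of it is an existing Boost API. It assumes that the inputs are Latin-1 encoded narrow strings (so that to_utf8 has real work to do), that join only stores string_views of its arguments, and that characters are pushed through the stages one at a time.

```cpp
#include <string>
#include <string_view>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical names, for illustration only

// Source: remembers the pieces and pushes their characters into a sink.
struct joined {
    std::vector<std::string_view> parts;  // views only; callers keep the data alive

    template <class Sink>
    void feed(Sink&& sink) const {
        for (std::string_view p : parts)
            for (char c : p) sink(c);
    }

    template <class Stage>
    decltype(auto) apply(Stage stage) const { return stage(*this); }
};

template <class... Ts>
joined join(const Ts&... ts) { return joined{{std::string_view(ts)...}}; }

// Filter: wraps a source and re-encodes each byte on the way through.
// Assumption: the upstream bytes are Latin-1.
template <class Source>
struct utf8_source {
    Source src;

    template <class Sink>
    void feed(Sink&& sink) const {
        src.feed([&](char c) {
            auto u = static_cast<unsigned char>(c);
            if (u < 0x80) {
                sink(c);  // ASCII passes through unchanged
            } else {      // U+0080..U+00FF become two UTF-8 bytes
                sink(static_cast<char>(0xC0 | (u >> 6)));
                sink(static_cast<char>(0x80 | (u & 0x3F)));
            }
        });
    }

    template <class Stage>
    decltype(auto) apply(Stage stage) const { return stage(*this); }
};

struct to_utf8 {
    template <class Source>
    utf8_source<Source> operator()(Source src) const { return {std::move(src)}; }
};

// Output filter: drains the chain into an existing string.
struct append {
    explicit append(std::string& t) : target(t) {}
    std::string& target;

    template <class Source>
    std::string& operator()(const Source& src) const {
        src.feed([&](char c) { target.push_back(c); });
        return target;
    }
};

// Pipe spelling; only participates when stage(src) is well-formed.
template <class Source, class Stage>
auto operator|(const Source& src, Stage stage) -> decltype(stage(src)) {
    return stage(src);
}

// Usage: both spellings append the same bytes to s.
inline std::string demo() {
    std::string s = "> ";
    std::string a = "caf\xE9", b = " ", c = "au lait";
    join(a, b, c) | to_utf8() | append(s);
    join(a, b, c).apply(to_utf8()).apply(append(s));
    return s;
}

}  // namespace sketch
```

Nothing is allocated until append runs; the intermediate objects only hold views and are cheap to copy, which is in keeping with a static factory hierarchy.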

This is not dissimilar to the architecture of the Boost.Iostreams library
(although that library is polymorphic, whereas join can be generic).

Other options spring to mind:

join(a, b, c) | widen() | prepend(some_wide_string);
auto s = separator(" : ") | join(a, b, as_hex(fixed(4), c), std::quoted(d))
| create_string();

example output might be:

foo : bar : 003a : "baz"

alternative syntax:

auto s = separator(" : ")
    .join(a, b, as_hex(fixed(4), c), std::quoted(d))
    .apply(create_string());
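
To check that the example output is what such a call would produce, here is a deliberately simplified sketch. The names separator, as_hex and fixed follow the snippet above but are hypothetical, placed in an illustrative namespace `sketch`. Unlike the lazy factory discussed in this thread, this version formats eagerly through a std::ostringstream and returns a std::string directly, so there is no create_string() stage. It also assumes as_hex is given a non-negative integer.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

namespace sketch {  // hypothetical names, for illustration only

// Tag carrying a fixed field width.
struct fixed {
    explicit fixed(int w) : width(w) {}
    int width;
};

// Value to be printed as zero-padded lower-case hex.
struct hex_value {
    int width;
    unsigned long long value;
};

template <class Int>
hex_value as_hex(fixed f, Int v) {
    return {f.width, static_cast<unsigned long long>(v)};
}

inline std::ostream& operator<<(std::ostream& os, const hex_value& h) {
    const std::ios_base::fmtflags flags = os.flags();  // restore state afterwards
    const char fill = os.fill();
    os << std::hex << std::setfill('0') << std::setw(h.width) << h.value;
    os.flags(flags);
    os.fill(fill);
    return os;
}

class separator {
public:
    explicit separator(std::string sep) : sep_(std::move(sep)) {}

    // Streams each part, writing the separator before every part but the first.
    template <class... Ts>
    std::string join(const Ts&... parts) const {
        std::ostringstream out;
        bool first = true;
        ((out << (first ? "" : sep_.c_str()) << parts, first = false), ...);
        return out.str();
    }

private:
    std::string sep_;
};

inline std::string demo() {
    std::string a = "foo", b = "bar", d = "baz";
    int c = 58;  // 0x3a
    return separator(" : ").join(a, b, as_hex(fixed(4), c), std::quoted(d));
    // yields: foo : bar : 003a : "baz"
}

}  // namespace sketch
```

The fold expression requires C++17. A lazy version would keep the parts and the separator in the returned object and defer the formatting to the output filter.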

Note that I am still resisting the idea of .str() as a member function. If
the joiner or concatenation object exports begin() and end(), it's
unnecessary, because the object returned by create_string() (or similar)
can use the iterators.

Having iterators also means that the attributed factory can be used as a
source in std::copy, std::transform, std::for_each etc.
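
A minimal sketch of a joiner that exports begin() and end() as input iterators, so that it can feed std::copy or a string constructor directly. All names are hypothetical and live in an illustrative namespace `sketch`; here create_string is written as a free function taking the range, which is one possible reading of the idea above. The joiner is assumed to store only string_views of its arguments.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical names, for illustration only

class joined {
public:
    // Input iterator that walks every character of every part in order.
    class iterator {
    public:
        using iterator_category = std::input_iterator_tag;
        using value_type = char;
        using difference_type = std::ptrdiff_t;
        using pointer = const char*;
        using reference = char;

        iterator(const std::vector<std::string_view>* parts, std::size_t part)
            : parts_(parts), part_(part) { skip_exhausted(); }

        char operator*() const { return (*parts_)[part_][pos_]; }
        iterator& operator++() { ++pos_; skip_exhausted(); return *this; }
        iterator operator++(int) { iterator old = *this; ++*this; return old; }

        friend bool operator==(const iterator& a, const iterator& b) {
            return a.part_ == b.part_ && a.pos_ == b.pos_;
        }
        friend bool operator!=(const iterator& a, const iterator& b) {
            return !(a == b);
        }

    private:
        // Step over parts that are empty or fully consumed.
        void skip_exhausted() {
            while (part_ < parts_->size() && pos_ == (*parts_)[part_].size()) {
                ++part_;
                pos_ = 0;
            }
        }

        const std::vector<std::string_view>* parts_;
        std::size_t part_;
        std::size_t pos_ = 0;
    };

    explicit joined(std::vector<std::string_view> parts) : parts_(std::move(parts)) {}

    iterator begin() const { return iterator(&parts_, 0); }
    iterator end() const { return iterator(&parts_, parts_.size()); }

private:
    std::vector<std::string_view> parts_;
};

template <class... Ts>
joined join(const Ts&... ts) {
    return joined(std::vector<std::string_view>{std::string_view(ts)...});
}

// Builds a string from any range exposing begin()/end(); no .str() member needed.
template <class Range>
std::string create_string(const Range& r) {
    return std::string(r.begin(), r.end());
}

// The same object also works as a source for standard algorithms.
template <class Range>
void append_to(const Range& r, std::string& target) {
    std::copy(r.begin(), r.end(), std::back_inserter(target));
}

}  // namespace sketch
```

The iterator state here is just a part index and an offset, which is all an input iterator needs. Random access would additionally require the total length or per-part offsets to be computed up front.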

Whether the factory should simply export input iterators or random-access
iterators will depend on how much state we'd want the factory to carry.

For now, I think input iterators are sufficient.

On 24 January 2017 at 13:26, Christof Donat <cd_at_[hidden]> wrote:

> Hi,
>
> On 24.01.2017 at 12:52, Richard Hodges wrote:
>
>> On 24 January 2017 at 12:17, Christof Donat <cd_at_[hidden]> wrote:
>>
>>> On 24.01.2017 at 11:29, Richard Hodges wrote:
>>>
>>>> Imagine a function:
>>>>
>>>> void foo(std::string_view s);
>>>>
>>>> which we then call with:
>>>>
>>>> foo(join("the answer to life, the universe and everything is: ",
>>>> hex(42)));
>>>>
>>>>
>>> Did you mean
>>>
>>> foo(concat(hex<int>,
>>>     "the answer to life, the universe and everything is: ", 42));
>>>
>>>
>> I probably meant:
>> foo(to_string_view(concat(
>>     "the answer to life, the universe and everything is: ", hex(42))));
>> or
>> foo(to_string_view(join(separator(' '),
>>     "the answer to life, the universe and everything is:", hex(42))));
>>
>>
>> The idea being to avoid the construction of any unnecessary string
>> objects. The generator already contains a buffer (or could) so it seems
>> wasteful to me to create a string temporary simply to view its buffer.
>>
>
> The idea was that the generator does not hold a buffer; instead, the
> function that actually executes it instantiates a std::string. When that
> is returned, the compiler either elides the copy or moves the string out.
> So in reality the waste is minimal, if not nonexistent.
>
>>>> The generator returned by join would stay alive until the end of the
>>>> function foo, so there would be no need to construct a string, only to
>>>> take a string_view of it. We could use the string_view implicit in the
>>>> joiner object. This saves us an allocation and a copy.
>>>>
>>>>
>>> I see. So the string would live inside the string factory as a member
>>> object, when we implicitly convert to a string_view. With C++17,
>>> std::string will have an implicit conversion operator to
>>> std::string_view. So this will be sufficient:
>>>
>>
>> A string, a string-like buffer, or a reference to a string. I feel that
>> the generator should be able to work on a supplied string reference so
>> that it can be used to extend an existing string without reallocations or
>> copies if required.
>>
>
> Yes, that is possible when the function that executes the generator is
> responsible for the buffer. Then you can have e.g.
>
> concat(...).str(); // allocate a string and return it
> concat(...).append_to(s); // append to an existing string;
> concat(...).replace(s); // write over an existing string, reusing its memory
>
> I think we all agree here that implicit conversion is not the way to go.
> So my current proposal is still .str() and the like. You propose free
> functions instead.
>
>>> Yes, but I don't get why you want wide string versions for UTF-8 support.
>>> Is it about converting wide strings to UTF-8? Like this:
>>>
>>> std::string{concat(my_wide_string)};
>>>
>>
>> Maybe to_utf8(concat(...)); would be better.
>>
>
> Uh, coming back to the formatting tags and my currently preferred syntax:
>
> concat(utf_8, ...).str();
>
> If you need additional options for utf_8, it can be a function call:
>
> concat(utf_8(more, options), ...).str();
>
>> It's again explicit and could be given options to control behaviour. It
>> also decouples the concept of UTF8 from the concept of concatenation.
>> This adheres more to the C++ way of only paying for what you need.
>>
>
> The functions that execute the string factories should, in my opinion,
> only care whether they have enough memory, and let the factories care
> about the content. Therefore I think that the question of character
> encoding should be dealt with in the factories.
>
> I don't see how the question of character encoding can be decoupled from
> the concept of converting arbitrary objects to strings. The converter has
> to have a way to encode its result.
>
>
> Christof

