Subject: Re: [boost] Heads up - string_ref landing
From: Andrey Semashev (andrey.semashev_at_[hidden])
Date: 2012-11-27 07:01:28


On Tue, Nov 27, 2012 at 2:36 PM, Rob Stewart <robertstewart_at_[hidden]> wrote:
> On Nov 26, 2012, at 6:56 AM, Andrey Semashev <andrey.semashev_at_[hidden]> wrote:
>
>> The problem with std::string is the same as with string_ref - it
>> doesn't support implicit construction from an arbitrary range, so my examples with custom string types would still not work.
>
> That's right. We have no universal string/range type for that purpose, so you use the standard string type.

My point was that, in my understanding, string_ref is intended to solve
this issue transparently, but the proposal lacks the necessary
interface. I would use string_ref to unify string-related
interfaces if it transparently supported multiple string types, not
just those defined in the STL (and Boost, if boost::string_ref is to
be implemented). Limiting it to particular types defeats its purpose.

>> It is possible, if the third-party strings follow the begin()/end()
>> protocol.
>
> Now you're changing the rules. TP strings don't all provide iterators.

Any reasonable string type will have some notion of iterators, be they
custom types, pointers, or a pointer and a size. As long as
this holds, the third-party string type can be adapted.

I understand that not all (nearly none?) third-party strings support
the begin()/end() protocol now, but I expect them to support it
eventually. Even if they don't, the necessary overloads can be provided
externally.
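
A minimal sketch of what "provided externally" could look like. The type
tp::legacy_string, its members c_ptr() and length(), and the helper
count_chars() are all invented for illustration; they do not come from any
real library. The only assumption is that the third-party type exposes a
pointer and a size.

```cpp
#include <cstddef>
#include <cstring>

namespace tp {

// Hypothetical third-party string: a pointer and a length, no iterators.
class legacy_string {
public:
    explicit legacy_string(const char* s) : ptr_(s), len_(std::strlen(s)) {}
    const char* c_ptr() const { return ptr_; }
    std::size_t length() const { return len_; }

private:
    const char* ptr_;
    std::size_t len_;
};

// Externally provided overloads. They live in the type's namespace so that
// unqualified begin(s)/end(s) calls find them through ADL.
inline const char* begin(const legacy_string& s) { return s.c_ptr(); }
inline const char* end(const legacy_string& s) { return s.c_ptr() + s.length(); }

} // namespace tp

// Generic code written only against the begin()/end() protocol.
template <typename Range>
std::size_t count_chars(const Range& r, char c) {
    std::size_t n = 0;
    for (auto it = begin(r); it != end(r); ++it)  // resolved via ADL
        if (*it == c) ++n;
    return n;
}
```

Neither the third-party type nor the generic function had to be modified;
only the two free functions were added.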

>> No, this is not needed. iterator_range has implicit constructor from a range, so the conversion will be hidden from both the user and the library developer.
>
> That only applies to types recognized as ranges. It isn't all string types. The same support should be part of string_ref, but an important distinction is that string_ref requires a contiguous range.

iterator_range doesn't check whether its constructor argument is a
range. If applying begin()/end() to it is a valid operation, the
conversion will succeed. I'd like string_ref to behave the same way.
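
To illustrate the behaviour I mean, here is a rough sketch of a view with an
unconstrained implicit constructor. The class char_range and the function
length_of() are hypothetical names, not the proposed string_ref and not
boost::iterator_range itself, and the sketch assumes the argument stores its
elements contiguously.

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical read-only view over contiguous chars; not the proposed class.
class char_range {
public:
    // Deliberately implicit and unconstrained: no "is this a range?" trait
    // is consulted. If begin(r)/end(r) compile, the conversion exists.
    // ASSUMPTION: the argument stores its elements contiguously.
    template <typename Range>
    char_range(const Range& r) : data_(nullptr), size_(0) {
        using std::begin;  // fall back to std::begin/std::end,
        using std::end;    // while still allowing ADL overloads
        auto first = begin(r);
        auto last = end(r);
        if (first != last) {
            data_ = &*first;
            size_ = static_cast<std::size_t>(last - first);
        }
    }

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    const char* data_;
    std::size_t size_;
};

// A library function takes the view by value; callers pass their own type.
inline std::size_t length_of(char_range r) { return r.size(); }
```

Both length_of(std::string("hello")) and length_of(std::vector<char>(3, 'x'))
compile without the caller or the library author writing a conversion. A
plain const char* argument does not compile with this constructor alone,
because std::begin has no overload for pointers.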

I see only one corner case: C strings. But I believe a solution is
possible. Either begin()/end() can be defined for const char*, or
string_ref can have the corresponding constructor. The latter is the
one (and only, AFAICS) reason to have a string_ref type in addition to
contiguous_range.
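
A sketch of the second option, under the assumption that string_ref is just
a contiguous range plus one extra constructor. The names char_range, str_ref
and length_of() are hypothetical stand-ins, and the generic range is reduced
to two constructors to keep the example short.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a generic contiguous range of const char.
class char_range {
public:
    char_range(const char* p, std::size_t n) : data_(p), size_(n) {}
    char_range(const std::string& s) : data_(s.data()), size_(s.size()) {}

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    const char* data_;
    std::size_t size_;
};

// Hypothetical string_ref: the same view plus the one string-specific
// constructor, which measures up to (not including) the terminating '\0'.
class str_ref : public char_range {
public:
    using char_range::char_range;  // inherit the range constructors (C++11)
    str_ref(const char* s) : char_range(s, std::strlen(s)) {}
};

inline std::size_t length_of(str_ref s) { return s.size(); }
```

With this, length_of("hello") and length_of(std::string("hi")) both work,
and the only thing str_ref adds over the generic range is the knowledge that
a bare const char* is zero-terminated.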

>> Extracting the termination policy to a template parameter is a possibility but it has drawbacks of its own. It makes it harder to provide a stable API/ABI for compiled libraries.
>
> You'd only use the terminated one in APIs in rare cases, so a separate class is simpler.

So I would not introduce it at all for that reason. Just use
std::string in such cases.

> There are semantic differences between a contiguous range of characters and a string, but a contiguous range type would be useful in and of itself.

The semantic difference is a matter of content and its interpretation.
You can store non-printable elements in std::string (and it is
sometimes more convenient and efficient than std::vector< char >) and
printable characters in std::vector< char >. The interface of
std::vector< char > and std::string is mostly the same when it comes
to string processing (not counting std::string members that can be
replaced with free algorithms). The same applies to string_ref and
contiguous_range< const char* >, the only notable difference being the
construction from const char*.
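
A small illustration of the "free algorithms" point. find_char() is a
hypothetical helper that stands in for std::string::find(char); it is written
once and works the same on both containers, whatever the bytes are meant to
represent.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Free algorithm replacing the std::string::find(char) member: works on any
// container exposing begin()/end() with random-access iterators.
template <typename Container>
std::size_t find_char(const Container& c, char ch) {
    auto it = std::find(c.begin(), c.end(), ch);
    // Index of the first match, or c.size() if ch does not occur.
    return static_cast<std::size_t>(it - c.begin());
}
```

The same call works for a std::string holding an embedded '\0' and for a
std::vector< char > holding printable text.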

