Subject: Re: [boost] Heads up - string_ref landing
From: Rob Stewart (robertstewart_at_[hidden])
Date: 2012-11-27 05:36:59


On Nov 26, 2012, at 6:56 AM, Andrey Semashev <andrey.semashev_at_[hidden]> wrote:

> On Mon, Nov 26, 2012 at 2:23 PM, Rob Stewart <robertstewart_at_[hidden]> wrote:
>> On Nov 25, 2012, at 7:30 AM, Andrey Semashev <andrey.semashev_at_[hidden]> wrote:
>>
>>> If I design my library interface (let's assume it doesn't use legacy APIs internally for now), what string type should I use?
>>
>> std::string is appropriate, unless you care about copies and free store allocations, in which case, we're suggesting string_ref.
>
> The problem with std::string is the same as with string_ref - it doesn't support implicit construction from an arbitrary range, so my examples with custom string types would still not work.

That's right. We have no universal string/range type for that purpose, so you use the standard string type.

>>> I want my library to be used with any string type and I don't want to provide overloads for all possible string types.
>>
>> That's an impossible order, unless you add compile-time dispatching to your code, and then "all possible string types" means as many as you care to support. boost::string_ref can be extended similarly, but that would never work for std::string_ref.
>
> It is possible, if the third-party strings follow the begin()/end() protocol.

Now you're changing the rules. TP strings don't all provide iterators.

> Ok, it's not all possible string types but it is at least extensible.

Sure. That's within the criteria I mentioned.
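
To make "compile-time dispatching" concrete, here is a minimal sketch of one way to do it, assuming a traits class that gets specialized once per supported string type. The names string_access, tp_string and count_spaces are made up for illustration and are not part of Boost or of any proposal. Note that it works for a third-party string with no iterators at all:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical extension point: one specialization per supported string type.
// The primary template is left undefined, so unsupported types fail to compile.
template <typename T>
struct string_access;

template <>
struct string_access<std::string> {
    static const char* data(const std::string& s) { return s.data(); }
    static std::size_t size(const std::string& s) { return s.size(); }
};

template <>
struct string_access<const char*> {
    static const char* data(const char* s) { return s; }
    static std::size_t size(const char* s) { return std::strlen(s); }
};

// A made-up third-party string that exposes no iterators.
struct tp_string {
    const char* buffer;
    std::size_t length;
};

template <>
struct string_access<tp_string> {
    static const char* data(const tp_string& s) { return s.buffer; }
    static std::size_t size(const tp_string& s) { return s.length; }
};

// Library function written once; the trait is selected at compile time.
template <typename String>
std::size_t count_spaces(const String& s) {
    const char* p = string_access<String>::data(s);
    const std::size_t n = string_access<String>::size(s);
    assert(p != nullptr || n == 0);
    std::size_t count = 0;
    for (std::size_t i = 0; i != n; ++i) {
        if (p[i] == ' ') ++count;
    }
    return count;
}
```

In this sketch "all possible string types" means exactly the set of types with a specialization, and a user adds support for another type by writing one more.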

>>> It would seem that string_ref is the answer, but I don't see any support for third-party string types in it. I will be able to do this:
>>>
>>> void foo(string_ref const&);
>>>
>>> foo("hello");
>>>
>>> string str;
>>> foo(str);
>>
>> It can also support iterator pairs and even std::vector<char>.
>
> How? Did I miss string_ref constructor from a range?

I haven't been able to examine the proposed string_ref yet. I'm speaking of possibilities and my own class.

>>> string_literal lit = "Hello";
>>> foo(lit);
>>
>> Why this type?
>>
>>> If string_ref is nothing more than a pair of iterators with a few additional member functions, I find iterator_range< const char* > far superior because it has the begin()/end() extension mechanism.
>>
>> That forces every call to extract or compute an iterator range, which is less convenient and more error prone.
>
> No, this is not needed. iterator_range has implicit constructor from a range, so the conversion will be hidden from both the user and the library developer.

That only applies to types recognized as ranges, and not all string types are. The same support should be part of string_ref, but an important distinction is that string_ref requires a contiguous range.
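
As a rough sketch of what that could look like (this is not the proposed string_ref, which I have not examined), assume a view class that converts implicitly from anything exposing data() and size(). The name char_view and the use of data()/size() as the marker for contiguity are assumptions made for this example only:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <list>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical view type for illustration; NOT the proposed string_ref.
class char_view {
public:
    char_view(const char* s) : ptr_(s), len_(std::strlen(s)) {}
    char_view(const char* s, std::size_t n) : ptr_(s), len_(n) {}

    // Implicit construction from any type with data() and size(); this
    // sketch treats that pair as its marker for contiguous storage.
    template <typename C,
              typename = decltype(std::declval<const C&>().data()),
              typename = decltype(std::declval<const C&>().size())>
    char_view(const C& c) : ptr_(c.data()), len_(c.size()) {}

    const char* begin() const { return ptr_; }
    const char* end() const { return ptr_ + len_; }
    std::size_t size() const { return len_; }
    char operator[](std::size_t i) const {
        assert(i < len_);
        return ptr_[i];
    }

private:
    const char* ptr_;
    std::size_t len_;
};

// A node-based container has no data(), so it is rejected at compile time.
static_assert(!std::is_convertible<std::list<char>, char_view>::value,
              "non-contiguous ranges must not convert");

// Library function written once against the view.
inline std::size_t length_of(char_view v) { return v.size(); }
```

With this, length_of("hello"), length_of(some_std_string) and length_of(some_vector_of_char) all compile without overloads, while a std::list<char> argument does not.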

>>> The member algorithms can easily be replaced with the general ones, so they don't really add any value to string_ref.
>>
>> I agree that member versus free is a matter of syntax except for subscripting. (There may be more exceptions, but that one occurred to me.) Subscripting isn't critical, but certainly is convenient and string-like.
>
> iterator_range has operator[] for random access iterators.

OK, I'll have to check for any other examples.

>>> And if you add yet another zstring_ref to that zoo, you're only making things worse.
>>
>> It's only for the times when null termination is required. The two types could even be the same class template with different termination policies.
>
> Extracting termination policy to a template parameter is a possibility but it has drawbacks of its own. It makes it harder to provide a stable API/ABI for compiled libraries.

You'd only use the terminated one in APIs in rare cases, so a separate class is simpler.
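
A minimal sketch of the separate-class approach, assuming a view that can only be built from sources known to be null terminated. The names zchar_view, legacy_length and call_legacy are hypothetical and are not taken from any proposal:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical null-terminated view, kept separate from a general view type.
// It deliberately has no (pointer, length) constructor, because an arbitrary
// substring is not guaranteed to end with a terminator.
class zchar_view {
public:
    zchar_view(const char* s) : ptr_(s), len_(std::strlen(s)) {}
    zchar_view(const std::string& s) : ptr_(s.c_str()), len_(s.size()) {}

    const char* c_str() const { return ptr_; }
    std::size_t size() const { return len_; }

    // Index size() is allowed: it reads the terminator itself.
    char operator[](std::size_t i) const {
        assert(i <= len_);
        return ptr_[i];
    }

private:
    const char* ptr_;
    std::size_t len_;
};

// Stand-in for a legacy API that requires a terminator.
inline std::size_t legacy_length(const char* s) { return std::strlen(s); }

// Wrapper that accepts the terminated view and forwards to the legacy API.
inline std::size_t call_legacy(zchar_view v) { return legacy_length(v.c_str()); }
```

Because it is an ordinary class rather than a template with a policy parameter, it can appear in the interface of a compiled library without exposing the policy in the signature.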

>>> If string_ref is to be proposed for inclusion (and yes, I would like it to follow the common protocol for new libraries and not be silently committed), the first thing I would like to know is how it is better than iterator_range< const char* > and what problems it solves that can't be solved with iterator_range. If there aren't any significant advantages I'd prefer not to introduce yet another string type.
>>
>> How'd I do?
>
> So far I can see only one significant difference between string_ref and an iterator_range: string_ref is assumed to refer to a contiguous range. I'm not sure the distinction is enough to create a new library rather than extend Boost.Range and Boost.Iterator to introduce a notion of a contiguous range and iterator thereof. You could call the new range type string_ref but that unnecessarily narrows the scope of the component. After all, why not have a contiguous range of ints, for example?

There are semantic differences between a contiguous range of characters and a string, but a contiguous range type would be useful in and of itself.
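
One such difference is the terminator. The sketch below assumes a generic contiguous_range<T> (a hypothetical name, not an existing Boost.Range component) and shows that the same char array has one more element when viewed as a range than it has characters when read as a string:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical generic view over contiguous elements of any type T.
template <typename T>
class contiguous_range {
public:
    contiguous_range(const T* first, std::size_t count)
        : first_(first), count_(count) {}

    // As a range, an array converts with every element included.
    template <std::size_t N>
    contiguous_range(const T (&array)[N]) : first_(array), count_(N) {}

    const T* begin() const { return first_; }
    const T* end() const { return first_ + count_; }
    std::size_t size() const { return count_; }
    const T& operator[](std::size_t i) const {
        assert(i < count_);
        return first_[i];
    }

private:
    const T* first_;
    std::size_t count_;
};

// String interpretation of a char array: stop at the terminator.
// Assumes the array really is null terminated.
template <std::size_t N>
std::size_t text_length(const char (&array)[N]) {
    return std::strlen(array);
}
```

For const char hello[] = "hello", contiguous_range<char>(hello).size() counts the trailing '\0' while text_length(hello) does not, and the same template works unchanged for a contiguous range of ints.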

___
Rob

