Subject: [boost] [c++TR2] N3334, Proposing array_ref<T> and string_ref
From: Beman Dawes (bdawes_at_[hidden])
Date: 2012-01-30 09:20:12


On Sat, Jan 28, 2012 at 8:12 PM, Mathias Gaunard
<mathias.gaunard_at_[hidden]> wrote:
> On 01/28/2012 05:46 PM, Beman Dawes wrote:
>>
>> Beman.github.com/string-interoperability/interop_white_paper.html
>> describes Boost components intended to ease string interoperability in
>> general and Unicode string interoperability in particular.
>>
>> These proposals are the Boost version of the TR2 proposals made in
>> N3336, Adapting Standard Library Strings and I/O to a Unicode World.
>> See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3336.html.
>>
>> I'm very interested in hearing comments about either the Boost or the
>> TR2 proposal. Are these useful additions? Is there a better way to
>> achieve the same easy interoperability goals?
>
>
> I think you should consider the points being made in N3334.

See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3334.html

While this proposal isn't from Boost, it impacts interests of Boost
developers enough that I think it is worth discussing here as a
separate topic.

Mathias continues:

> While that proposal is in my opinion not good enough, it raises an important
> issue that is often present with std::string-based or similar designs.
>
> A function that takes a std::string, or a boost::filesystem::path for that
> matter, necessarily causes the [caller] to copy the data into a heap-allocated
> buffer, even if there is no need to.

Some standard library string implementations avoid the heap allocation
for small strings, but even those implementations still make an
unnecessary copy. Your point is well taken, and I've often worried
about it with boost::filesystem::path.
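
To make the copy concrete, here is a minimal sketch. The type
string_ref_sketch and both helper functions are hypothetical names
invented for illustration; this is not the interface proposed in N3334,
only a pointer-plus-length view under that assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical, minimal non-owning view: a pointer plus a length.
// Illustration only; not the N3334 interface.
struct string_ref_sketch {
    const char* data;
    std::size_t size;

    // View over a NUL-terminated character array; nothing is copied.
    string_ref_sketch(const char* s) : data(s), size(std::strlen(s)) {
        assert(s != nullptr);
    }

    // View over an existing std::string; nothing is copied.
    string_ref_sketch(const std::string& s) : data(s.data()), size(s.size()) {}
};

// std::string parameter: a caller holding a character array must first
// materialize a temporary std::string, so the callee sees a copy.
inline bool same_buffer_via_string(const std::string& s, const char* original) {
    return s.data() == original;
}

// View parameter: the callee sees the caller's own buffer.
inline bool same_buffer_via_ref(string_ref_sketch s, const char* original) {
    return s.data == original;
}
```

Called with a plain character array, same_buffer_via_string reports a
different buffer (the temporary), while same_buffer_via_ref reports the
caller's original buffer. The view must not outlive the buffer it
refers to.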

> Use of the range concept would solve that issue, but then that requires
> making the function a template. A type-erased range would be possible, but
> that has significant performance overhead.
> A string_ref or path_ref is maybe the lesser evil.

One of my first reactions is that array_ref<T> and
basic_string_ref<charT, traits> are range generators, and I was a bit
surprised to see that the implementation is a pointer and a length
rather than two pointers or, better yet, two iterators or an explicit
range component. With iterators, a basic_string_ref could do encoding
conversions on the fly without needing temporary strings. But I have
no idea whether that is workable or actually better.
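
As a rough sketch of the iterator idea, the code below pairs two
iterators into a view and uses a converting iterator that widens
Latin-1 bytes to UTF-32 code points as they are read. Every name here
(range_ref_sketch, latin1_to_utf32_iterator, as_utf32, count_non_ascii)
is hypothetical, and Latin-1 is chosen only because each byte maps
directly to the code point with the same value.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical iterator-pair view; illustration only.
template <class Iterator>
struct range_ref_sketch {
    Iterator first;
    Iterator last;
    Iterator begin() const { return first; }
    Iterator end() const { return last; }
};

// Hypothetical converting iterator: walks Latin-1 bytes and yields
// UTF-32 code points on dereference, with no intermediate string.
struct latin1_to_utf32_iterator {
    const char* p;

    char32_t operator*() const {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        return static_cast<char32_t>(static_cast<unsigned char>(*p));
    }
    latin1_to_utf32_iterator& operator++() { ++p; return *this; }
    bool operator==(const latin1_to_utf32_iterator& o) const { return p == o.p; }
    bool operator!=(const latin1_to_utf32_iterator& o) const { return p != o.p; }
};

// Build a converting view over n Latin-1 bytes starting at s.
inline range_ref_sketch<latin1_to_utf32_iterator>
as_utf32(const char* s, std::size_t n) {
    assert(s != nullptr);
    latin1_to_utf32_iterator first = { s };
    latin1_to_utf32_iterator last = { s + n };
    range_ref_sketch<latin1_to_utf32_iterator> r = { first, last };
    return r;
}

// Example consumer: counts code points outside the ASCII range.
template <class Range>
std::size_t count_non_ascii(const Range& r) {
    std::size_t n = 0;
    for (auto it = r.begin(); it != r.end(); ++it) {
        if (*it > 0x7Fu) ++n;
    }
    return n;
}
```

Note that the consumer has to be a template over the range type, which
is the cost Mathias points out above; a pointer-and-length view avoids
that but cannot convert on the fly.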

What do other Boosters think?

--Beman



Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk