Subject: Re: [boost] Heads up - string_ref landing
From: Olaf van der Spek (ml_at_[hidden])
Date: 2012-11-17 05:26:04


On Fri, Nov 16, 2012 at 5:54 PM, Yanchenko Maxim
<maximyanchenko_at_[hidden]> wrote:
>> Those are C-style constructs. The C++-style equivalents are iterator-based.
>
> Those are high-performance constructs. We can only pray that a compiler will be smart enough to convert our iterator-based code to memcpy/memcmp/memset, and in my experience compilers are not nearly that smart once the code goes slightly beyond trivial cases.

AFAIK MSVC has library code to use memcpy for std::copy if possible.
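
A minimal sketch of that point, not MSVC's actual library code: for a
trivially copyable element type such as char, std::copy and memcpy produce
the same bytes, which is what allows a standard library to forward one to
the other. Whether a given implementation does so is an implementation
detail. The helper names (copy_with_algorithm, copy_with_memcpy,
same_result) are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// The property a library has to check before it may substitute memmove/memcpy.
static_assert(std::is_trivially_copyable<char>::value,
              "char can be copied bytewise");

// Iterator-style copy of n chars.
inline void copy_with_algorithm(const char* src, std::size_t n, char* dst) {
    std::copy(src, src + n, dst);
}

// C-style copy of n chars.
inline void copy_with_memcpy(const char* src, std::size_t n, char* dst) {
    std::memcpy(dst, src, n);
}

// Returns true when both routines write identical bytes.
// Returns false without copying if n does not fit the local buffers.
inline bool same_result(const char* src, std::size_t n) {
    char a[64] = {};
    char b[64] = {};
    if (n > sizeof a) return false;
    copy_with_algorithm(src, n, a);
    copy_with_memcpy(src, n, b);
    return std::memcmp(a, b, sizeof a) == 0;
}
```

The sketch only shows that the two spellings are interchangeable for char;
it says nothing about which one a particular compiler turns into faster code.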

> (char_range is an optimization technique so we aim for maximum speed. If you don't maximize speed you'd be happy with simple and safe std::string copies.)

Could you stop trying to say what I should be happy with please?
It's not just about performance. You can't pass a CString (MFC) or
QString (Qt) to a function taking a const string&.
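
A rough sketch of that interface argument, under stated assumptions:
other_string is a hypothetical stand-in for a non-std string class such as
CString or QString, and str_ref here is a hypothetical pointer-plus-length
view, not the class that is landing in Boost. A function taking the view
accepts every string type the view can be built from, without copying.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a non-std string class (think CString/QString).
class other_string {
public:
    explicit other_string(const char* s) : buf_(s) {}
    const char* data() const { return buf_.data(); }
    std::size_t size() const { return buf_.size(); }
private:
    std::string buf_;
};

// Hypothetical non-owning view: a pointer and a length, nothing else.
class str_ref {
public:
    str_ref(const char* s) : data_(s), size_(std::strlen(s)) {}
    str_ref(const std::string& s) : data_(s.data()), size_(s.size()) {}
    str_ref(const other_string& s) : data_(s.data()), size_(s.size()) {}
    const char* data() const { return data_; }
    std::size_t size() const { return size_; }
private:
    const char* data_;
    std::size_t size_;
};

// One signature serves all three argument types; no std::string is created.
inline std::size_t count_chars(str_ref s, char c) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        if (s.data()[i] == c) ++n;
    return n;
}
```

With a const std::string& parameter instead, the other_string caller would
first have to build a std::string copy of its contents.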

>> Suppose you have two pointers, 0xa0 (begin) and 0xb0 (end). The size
>> in bytes is 0x10.
>> Suppose you have one pointer (0xa0) and one size (0x10). Does this
>> point to the same memory?
> "this" means 0xa0+0x10? By construction - yes, they do.

No, it could mean 0xa0 + 0x40 if sizeof(value_type) == 4.
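
The arithmetic can be written out as a small sketch. The function names are
made up for illustration; addresses are modelled as plain integers so that
no real memory is touched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Two addresses fix the byte range on their own: no element type is needed.
inline std::size_t bytes_from_addresses(std::uintptr_t begin, std::uintptr_t end) {
    return static_cast<std::size_t>(end - begin);
}

// An element count only becomes a byte range once the element size is known.
inline std::size_t bytes_from_count(std::size_t count, std::size_t element_size) {
    return count * element_size;
}
```

bytes_from_addresses(0xa0, 0xb0) is 0x10 whatever the element type is, while
bytes_from_count(0x10, 1) is 0x10 and bytes_from_count(0x10, 4) is 0x40.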

>> Yes if sizeof(value_type) == 1, no
>> otherwise. You can't tell to what memory range it points without
>> knowing sizeof(value_type)
>
> Ah. The first pointer (0xa0) is typed, so we surely know value_type.

No, we don't always know the type.

> That's why your 0xa0 - 0xb0 works. They are not void*, they are value_type*.
>
>> Isn't that by definition for a reference? It applies to const string&
>> too. I don't think that's a good reason.
>
> It's not a reference to std::string, it's a reference to *internals* of std::string. Those internals are managed by std::string exclusively.
> I.e. if you have a reference to std::string and you expand the string, the reference will continue to work with no problem, while a reference to internals will be invalidated (the simplest example of a reference to internals is an iterator, which such an operation invalidates).
> But when you give away iterators, you do it explicitly via begin/end. Same way, if you give away a reference to std::string internals, you do it explicitly via data/c_str. This makes potentially dangerous code visible. Same should be done with char_range construction from std::string::data - it should be explicit.
>
> Btw, const references are not that harmless, consider this innocent-looking code:
>
> struct S {
> const std::string& ref_;
> S(const std::string& ref): ref_(ref) {}
> };
>
> S s1("foo");
> S s2(std::string("bar"));

I know. The danger is in storing a reference (or pointer) to something
that may die before the reference. It's not in passing a function
argument as reference.

> // std::vector<std::string> v; - too slow, upgrading to our new char_range!
> std::vector<char_range> v;
> v.push_back( "foo" );
> v.push_back( std::string("bar") ); // BOOM
>
> When pushing stuff to this vector, we want to be 100% sure that strings that gave away their char_ranges will live longer than the vector and live unchanged. And for this we need all the help a compiler can give us, namely - force us to explicitly declare the give-away and fail to compile otherwise.

I think if you need that kind of 'safety', C++ isn't the language for you.

>>> For the same reason we have explicit char_range::literal and char_range::from_array.
>> I'd like this to work:
>> void f(str_ref);
>> f("Olaf");
> f( char_range::literal("Olaf") );
> Explicit and with size known at compile-time (so compiler can utilize this knowledge).

Explicit and unclean.
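
For comparison, the two call styles side by side, as a sketch only. Both
classes are hypothetical reconstructions from this thread: str_ref has the
implicit constructor asked for above, and char_range offers only a named
factory, which is assumed here to take the literal by array reference so
that its size is a compile-time constant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical view with an implicit constructor: f("Olaf") compiles.
struct str_ref {
    const char* data;
    std::size_t size;
    str_ref(const char* s) : data(s), size(std::strlen(s)) {}  // length found at run time
};

// Hypothetical view that can only be created through a named factory.
class char_range {
public:
    template <std::size_t N>
    static char_range literal(const char (&s)[N]) {
        return char_range(s, N - 1);  // N includes the terminating '\0'
    }
    const char* data() const { return data_; }
    std::size_t size() const { return size_; }
private:
    char_range(const char* d, std::size_t n) : data_(d), size_(n) {}
    const char* data_;
    std::size_t size_;
};

inline std::size_t f(str_ref s) { return s.size; }
inline std::size_t g(char_range r) { return r.size(); }
```

f("Olaf") and g(char_range::literal("Olaf")) both see four characters; the
difference is what the caller has to type and where the length is computed.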

-- 
Olaf
