Subject: Re: [boost] [string_ref] string literal constructor
From: Antony Polukhin (antoshkka_at_[hidden])
Date: 2013-11-07 02:33:29


2013/11/7 Gavin Lambert <gavinl_at_[hidden]>

> On 7/11/2013 05:19, Quoth Marshall Clow:
>
>>
>> On Nov 2, 2013, at 10:49 PM, Antony Polukhin <antoshkka_at_[hidden]> wrote:
>>
>>> template< std::size_t N >
>>> basic_string_ref( const charT( &str )[N] )
>>> : basic_string_ref( str, std::min(N, strlen(str)) ) /* pseudo code,
>>> we'll need something like strlen_s */
>>> {}
>>>
>>> Such constructor won't change the current behavior
>>>
>>> string_ref ( "test\0test" ) // { ptr, 4 }
>>>
>>> but will also work for non-zero terminated fixed length arrays:
>>>
>>> const char s[] = {'0', '1', '2'};
>>> string_ref test(s); // {ptr, 3}
>>>
>>
>> No, actually, it won't, because the strlen will read past the end of
>> the array, looking for the terminating NULL. (and that's undefined
>> behavior)
>>
>
> I was thinking about pointing that out earlier but I assumed that
> resolving that issue was what he meant by the "strlen_s" comment.
>
<...>

Yes, I meant strlen_s. That function is not in the Standard, so the pseudocode
`std::min(N, strlen(str))` was provided only to describe the idea.
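
To make the idea concrete, here is a minimal sketch of such a bounded length
computation. The name `bounded_length` is hypothetical (it is not part of Boost
or the Standard); it only shows how the search for the terminator can be limited
to the N elements of the array:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper, for illustration only: returns the number of
// characters before the first '\0', but never inspects more than the
// N elements that the array actually has.
template <std::size_t N>
std::size_t bounded_length(const char (&str)[N]) {
    // std::memchr looks at no more than N bytes, so unlike strlen it
    // cannot read past the end of a non-terminated array.
    const void* nul = std::memchr(str, '\0', N);
    if (nul == nullptr) {
        return N;  // no terminator found: the whole array is the string
    }
    return static_cast<std::size_t>(static_cast<const char*>(nul) - str);
}
```

With this helper, `bounded_length("test\0test")` is 4 and, for
`const char s[] = {'0', '1', '2'};`, `bounded_length(s)` is 3.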

> While I mostly agree with this, it is probably just as common to have a
> char buffer on the stack that you *think* is null terminated, but might not
> be. (Particularly if the classic "strncpy" has been involved at some
> point.) That's where having this sort of constructor could be beneficial,
> to act as a backstop against such issues.
>
> Though personally I'm still inclined to leave it explicitly up to the
> application -- the app code needs to be much more aware of what it's doing
> if it's intentionally passing around (potentially) non-terminated strings;
> and if it's accidental then it's a bug that should be fixed immediately
> (eg. with a safe_strncpy) rather than being quietly resolved by a library.
>

This is a good point.

But from a usability standpoint, the user may assume that the library will
determine the size in this situation:

const char s[] = {'0', '1', '2'}; // the user knows this is not zero-terminated
string_ref test(s); // the array size is known here, so why not {ptr, 3}?

Questions about non-zero-terminated strings come up quite often.

On the other hand, if we apply strlen_s, users may be surprised by
the following:

const char s[] = {'0', '1', '2', '\0', '\1'}; // the user knows this has a fixed size
string_ref test(s); // the array size is known here, so why *do we get* {ptr, 3}?

It looks like there is no solution that will satisfy all users.
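
The conflict can be shown side by side with a small sketch. `view`,
`whole_array` and `up_to_nul` are hypothetical names used only for this
comparison; `view` is a stand-in for string_ref, not its real interface:

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for string_ref: just a pointer and a length.
struct view {
    const char* ptr;
    std::size_t len;
};

// Rule A: trust the array bound; every element belongs to the view.
template <std::size_t N>
view whole_array(const char (&a)[N]) {
    return view{a, N};
}

// Rule B (the strlen_s idea): stop at the first '\0', but never look
// past the array bound.
template <std::size_t N>
view up_to_nul(const char (&a)[N]) {
    const char* end = std::find(a, a + N, '\0');
    return view{a, static_cast<std::size_t>(end - a)};
}
```

For {'0', '1', '2'} both rules give a length of 3. For
{'0', '1', '2', '\0', '\1'} rule A gives 5 and rule B gives 3. For the literal
"test\0test" rule A gives 10 (it counts the embedded and the trailing '\0')
and rule B gives 4. Whichever rule the constructor picks, one of these
results will surprise somebody.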

-- 
Best regards,
Antony Polukhin
