From: Rob Stewart (stewart_at_[hidden])
Date: 2004-05-13 08:15:55


From: John Nagle <nagle_at_[hidden]>
> Reece Dunn wrote:
> > Rob Stewart wrote:
> >> From: John Nagle <nagle_at_[hidden]>
> >>
> >> If the string if (sic) fixed-capacity *and* there remains sufficient
> >> capacity, c_str() can null terminate the buffer. If there isn't
> >> sufficient remaining capacity, then throw an exception.
>
> Done that way, if you write
>
> char_string<4> s;
> strcat(s,"ABCDE"); // truncates at "ABCD", no null.
> printf("s=%s\n",s.c_str()); // c_str throws exception
>
> which is quite different from classic <string.h> semantics.
> You couldn't use that as a drop-in replacement for C strings.
> Nor is it compatible with STL basic_string semantics.
> I'd suggest consistent null-terminated semantics for char_string.

You're comparing apples to oranges. Either you mean for the
buffer to be exactly four bytes or you don't. If you do, then
there's no room for a null terminator: you're not dealing with a
"C string" (a null-terminated string) but with a four-byte
buffer. If you allocate four bytes for a "C string" and write
"ABCDE" to it, or even strcat() "ABCD" to it, you overrun the
buffer. Why would we want to provide consistent semantics for
that behavior?
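
To make the arithmetic concrete, here is a minimal sketch. The
helper fits_as_c_string() is hypothetical and exists only for
illustration; it is not part of any proposed char_string interface.
It shows that four characters plus a terminator need five bytes, so
a four-byte buffer cannot hold "ABCD" as a C string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// A string literal carries its terminator: "ABCD" occupies five bytes.
static_assert(sizeof("ABCD") == 5, "four characters plus the terminator");

// Hypothetical helper: does `text`, together with its null terminator,
// fit in a buffer of `buffer_size` bytes?
inline bool fits_as_c_string(const char* text, std::size_t buffer_size)
{
    return std::strlen(text) + 1 <= buffer_size;
}

// Usage: the four-byte case discussed above.
inline void check_four_byte_buffer()
{
    assert(!fits_as_c_string("ABCD", 4));  // would need five bytes
    assert(fits_as_c_string("ABC", 4));    // three characters plus '\0'
}
```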

> >> IOW, make it a runtime error to fix the capacity too small to
> >> permit null termination when calling c_str(). That still leaves
> >> room for things like the 4-character file signature to which you
> >> referred, and yet prevents buffer overrun, but doesn't require
> >> foregoing flexibility.
>
> If you want a 4-char file signature, you can use
> "boost::array<char,4>", which does that job. Is there any
> real need for that functionality in char_string?
> char_string might have some convenience template functions to
> interconvert "boost::array" and "boost::char_string".
>
> > That is a good idea. It will mean keeping track of the string length,
>
> Yes, that seems to be necessary.

The question is, which is better for that job: char_string<4> or
boost::array<char,4>? I suggest that the latter is better. In C
code, and similar C++ code, arrays of char serve both as
fixed-length buffers and as storage for strings. Code will be
clearer if one uses boost::array<char,N> for the former and
char_string<N> for the latter. Once you make that distinction of
purpose, null termination can be integral to char_string without
complication or unwanted overhead.

> What about the base class issue? There's a need to be able to
> write something like "char_string_base& s" when you want to
> pass around fixed-capacity strings of more than one capacity.

If that is a necessary feature, then the length can be stored in
the base class. However, such a base class means that the dtor
must be virtual. Is vtable overhead acceptable in such a class?
Perhaps two types are needed: one with the base class and one
without.
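
As a rough sketch of the base-class variant: the length lives in the
base, and each char_string<N> supplies the storage. Everything beyond
the names char_string_base and char_string is an assumption made for
illustration, including the buffer pointer and capacity kept in the
base, the truncating append(), and the helper append_suffix().
Copying is disabled only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical base class: lets one function accept any capacity.
class char_string_base
{
public:
    virtual ~char_string_base() {}  // the vtable cost questioned above

    // Copying is left out of this sketch.
    char_string_base(const char_string_base&) = delete;
    char_string_base& operator=(const char_string_base&) = delete;

    // Appends as much of `text` as fits (assumed truncation policy).
    void append(const char* text)
    {
        std::size_t room = capacity_ - length_;
        std::size_t count = std::strlen(text);
        if (count > room) count = room;
        std::memcpy(buffer_ + length_, text, count);
        length_ += count;
        buffer_[length_] = '\0';
    }

    const char* c_str() const { return buffer_; }
    std::size_t length() const { return length_; }
    std::size_t capacity() const { return capacity_; }

protected:
    char_string_base(char* buffer, std::size_t capacity)
        : buffer_(buffer), capacity_(capacity), length_(0) {}

private:
    char* buffer_;          // points at the derived class's storage
    std::size_t capacity_;
    std::size_t length_;    // stored once, in the base
};

template <std::size_t N>
class char_string : public char_string_base
{
public:
    char_string() : char_string_base(storage_, N) { storage_[0] = '\0'; }

private:
    char storage_[N + 1];   // N characters plus the terminator
};

// Works for strings of any capacity.
inline void append_suffix(char_string_base& s) { s.append(".txt"); }

// Usage: the same function applied to two different capacities.
inline void check_mixed_capacities()
{
    char_string<4> small_string;
    char_string<16> large_string;
    append_suffix(small_string);
    append_suffix(large_string);
    assert(std::strcmp(small_string.c_str(), ".txt") == 0);
    assert(std::strcmp(large_string.c_str(), ".txt") == 0);
}
```

Whether the virtual destructor can be avoided (for example with a
protected non-virtual one, when nothing is ever deleted through the
base) is a separate design question that this sketch does not settle.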

-- 
Rob Stewart                           stewart_at_[hidden]
Software Engineer                     http://www.sig.com
Susquehanna International Group, LLP  using std::disclaimer;
