
From: Reece Dunn (msclrhd_at_[hidden])
Date: 2004-05-13 10:34:56


Rob Stewart wrote:
>From: John Nagle > Reece Dunn wrote:
> > > Rob Stewart wrote:
> > >> From: John Nagle

>You're comparing apples to oranges. Either you mean for the
>buffer to be exactly four bytes or you don't. If you do, then
>there's no room for a null terminator. You're not dealing with a
>"C string" (a null terminated string), you're dealing with a four
>byte buffer. If you allocated four bytes for a "C string" and
>write "ABCDE" to it, or even strcat() "ABCD" to it, you overrun
>the buffer. Why would we want to provide consistent semantics
>for that behavior?

The idea is to prevent an overrun when this type of situation occurs. Thus, a
four-byte buffer would hold 3 characters, with the fourth byte reserved for
the null terminator.
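
To make that concrete, here is a minimal sketch of a four-byte buffer that
stores at most 3 characters plus the terminator. The name bounded_cstr and its
members are hypothetical, not the proposed char_string interface, and
truncating on overflow is my assumption here; reporting an error instead would
be an equally valid policy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical illustration only; not the proposed char_string interface.
template <std::size_t N>
class bounded_cstr {
    static_assert(N > 0, "need room for the null terminator");
    char buf_[N];        // N bytes in total: N - 1 characters plus '\0'
    std::size_t len_;    // number of characters currently stored
public:
    bounded_cstr() : len_(0) { buf_[0] = '\0'; }

    // Copies at most N - 1 characters from s and always null-terminates.
    // Returns the number of characters actually stored.
    // Assumption: excess input is truncated rather than reported as an error.
    std::size_t assign(const char* s) {
        std::size_t n = std::strlen(s);
        if (n > N - 1) n = N - 1;    // clamp instead of overrunning buf_
        std::memcpy(buf_, s, n);
        buf_[n] = '\0';
        len_ = n;
        return n;
    }

    const char* c_str() const { return buf_; }
    std::size_t length() const { return len_; }
    static std::size_t capacity() { return N - 1; }
};

// Usage: writing "ABCDE" into a 4-byte buffer keeps "ABC" and never overruns.
inline void bounded_cstr_demo() {
    bounded_cstr<4> s;
    assert(s.assign("ABCDE") == 3);
    assert(std::strcmp(s.c_str(), "ABC") == 0);
}
```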

> > >> IOW, make it a runtime error to fix the capacity too small to
> > >> permit null termination when calling c_str(). That still leaves
> > >> room for things like the 4-character file signature to which you
> > >> referred, and yet prevents buffer overrun, but doesn't require
> > >> forgoing flexibility.
> >
> > If you want a 4-char file signature, you can use
> > "boost::array<char,4>", which does that job. Is there any
> > real need for that functionality in char_string?

Isn't boost::array a generic array container? With a variant of char_string,
you can use functions that are optimized for string operations. That is the
main reason for having a special string class.

> > char_string might have some convenience template functions to
> > interconvert "boost::array" and "boost::char_string".
> >
> > > That is a good idea. It will mean keeping track of the string length,
> >
> > Yes, that seems to be necessary.
>
>The question is, which is better: char_string<4> or
>boost::array<char,4>? I suggest that the latter is better. In C
>code, and similar C++ code, arrays of char are used as buffers of
>fixed length and as memory for strings. Code will be clearer if
>one uses boost::array<char,N> for the former and char_string<N>
>for the latter. Once you make that distinction of purpose, null
>termination can be integral to char_string without complication
>or unwanted overhead.

That makes sense.
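
As a rough sketch of the "raw buffer" half of that distinction, a 4-character
file signature can live in a plain fixed-size array with no terminator. I use
std::array here as a stand-in for boost::array so the example needs no Boost
headers; the signature typedef and the has_signature helper are hypothetical
names for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// A fixed-size raw buffer: exactly four chars, no null terminator.
// std::array stands in for boost::array<char,4> in this sketch.
typedef std::array<char, 4> signature;

// Hypothetical helper: true if data starts with the four signature bytes.
inline bool has_signature(const char* data, std::size_t size,
                          const signature& sig) {
    return size >= sig.size() &&
           std::memcmp(data, sig.data(), sig.size()) == 0;
}

// Usage: the signature is compared as bytes, never treated as a C string.
inline void signature_demo() {
    const signature riff = {{'R', 'I', 'F', 'F'}};
    const char file[] = "RIFFxxxxWAVE";
    assert(has_signature(file, sizeof(file) - 1, riff));
}
```

Nothing in this code ever calls a str*() function on the signature, which is
why it needs no room for a terminator.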

> > What about the base class issue? There's a need to be able to
> > write something like "char_string_base& s" when you want to
> > pass around fixed-capacity strings of more than one capacity.
>
>If that is a necessary feature, then the length can be stored in
>the base class. However, such a base class means that the dtor
>must be virtual. Is vtable overhead acceptable in such a class?
>Perhaps two types are needed.

My design does not use a virtual base class, so that isn't an issue. John
Nagle's version does, so that is where the problem arises. I have several
issues with the use of a virtual base class:

[1] If you want a function that operates on strings of more than one capacity,
why not templatize the function on the capacity:
   template< int n >
   void myfn( boost::char_array< n > & s ){ ... }

[2] How do you deal with wide-character strings? My update generalizes to
support char and wchar_t based buffers, but with a virtual base class, you
are limited to char buffers.

[3] One of the reasons for having a virtual class is to supply custom string
operations, e.g. using Windows-specific string functions instead of the
standard library ones. This can also be solved with a policy template like
that found in basic_string. My current version uses this approach, improving
interoperability with basic_strings.
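
Points [1] to [3] can be sketched together as follows. This is only an
illustration under my own assumptions: fixed_string, remaining and to_std are
hypothetical names, not the interface of either implementation discussed in
this thread. The only standard pieces used are std::char_traits and
std::basic_string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>   // std::char_traits, std::basic_string

// Hypothetical sketch: character type, capacity and string operations are
// all template parameters, so no base class with virtual functions is needed.
template <typename CharT, std::size_t N,
          typename Traits = std::char_traits<CharT> >
class fixed_string {
    CharT buf_[N];       // N - 1 characters plus the terminator
    std::size_t len_;
public:
    fixed_string() : len_(0) { buf_[0] = CharT(); }

    // All character operations go through the Traits policy ([3]).
    // Assumption: input longer than N - 1 characters is truncated.
    void assign(const CharT* s) {
        std::size_t n = Traits::length(s);
        if (n > N - 1) n = N - 1;
        Traits::copy(buf_, s, n);
        Traits::assign(buf_[n], CharT());
        len_ = n;
    }

    const CharT* c_str() const { return buf_; }
    std::size_t length() const { return len_; }
};

// [1], [2]: one function template accepts any capacity and character type.
template <typename CharT, std::size_t N, typename Traits>
std::size_t remaining(const fixed_string<CharT, N, Traits>& s) {
    return (N - 1) - s.length();
}

// [3]: sharing the traits type makes conversion to basic_string direct.
template <typename CharT, std::size_t N, typename Traits>
std::basic_string<CharT, Traits>
to_std(const fixed_string<CharT, N, Traits>& s) {
    return std::basic_string<CharT, Traits>(s.c_str(), s.length());
}

// Usage: the same functions serve char and wchar_t buffers of any capacity.
inline void fixed_string_demo() {
    fixed_string<char, 8> a;
    a.assign("abc");
    fixed_string<wchar_t, 4> w;
    w.assign(L"wxyz!");
    assert(remaining(a) == 4);
    assert(to_std(w) == L"wxy");
}
```

The cost of this approach is that every function taking such a string must
itself be a template, which is the trade-off against a common base class.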

Regards,
Reece


