Boost :

From: Rob Stewart (stewart_at_[hidden])
Date: 2004-05-13 11:55:18


From: "Reece Dunn" <msclrhd_at_[hidden]>
> Rob Stewart wrote:
> >From: John Nagle > Reece Dunn wrote:
>
> >You're comparing apples to oranges. Either you mean for the
> >buffer to be exactly four bytes or you don't. If you do, then
> >there's no room for a null terminator. You're not dealing with a
> >"C string" (a null terminated string), you're dealing with a four
> >byte buffer. If you allocated four bytes for a "C string" and
> >write "ABCDE" to it, or even strcat() "ABCD" to it, you overrun
> >the buffer. Why would we want to provide consistent semantics
> >for that behavior?
>
> The idea is to prevent overrun when this type of situation occurs. Thus, it
> will be a 3 character buffer with the extra character being a
> null-terminator.

I think we're miscommunicating, and it's probably me
misinterpreting something.

The four byte matter was raised because someone wanted to be able
to peek into a buffer of data read from a file. The 5th byte
wasn't a null terminator and it shouldn't be changed, so I took
that to suggest effectively overlaying a char_string on that
file's contents in the buffer.

Another interpretation of that need is to pass the buffer to a
char_string<4> and expect that only the first four bytes be
copied.
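
To make that second interpretation concrete, here is a minimal
sketch. The class name char_string_sketch, its constructor
signature, and the choice to store N characters plus one extra
slot for the terminator are all assumptions made for
illustration; this is not the char_string design under
discussion.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a fixed-capacity string; not the proposed class.
template <std::size_t N>
class char_string_sketch {
public:
    // Copy at most N characters from buf, which need not be null terminated.
    // The source buffer is only read, never modified.
    char_string_sketch(const char* buf, std::size_t available) {
        len_ = available < N ? available : N;
        std::memcpy(data_, buf, len_);
        data_[len_] = '\0';  // the terminator lives in the extra slot
    }

    std::size_t size() const { return len_; }
    const char* c_str() const { return data_; }

private:
    char data_[N + 1];  // assumption: N characters plus room for a terminator
    std::size_t len_;
};
```

Constructing a char_string_sketch<4> from a longer file buffer
copies the first four bytes and leaves the fifth byte of the
source untouched, because the terminator is written into the
copy rather than into the file data.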

> > > >> IOW, make it a runtime error to fix the capacity too small to
> > > >> permit null termination when calling c_str(). That still leaves
> > > >> room for things like the 4-character file signature to which you
> > > >> referred, and yet prevents buffer overrun, but doesn't require
> > > >> foregoing flexibility.
> > >
> > > If you want a 4-char file signature, you can use
> > > "boost::array<char,4>", which does that job. Is there any
> > > real need for that functionality in char_string?
>
> Isn't boost::array specific to generic arrays, whereas using a variant of
> char_string, you can use string functions that will be optimized for string
> operations. This is the main reason for using a special string class.

You're right that using boost::array doesn't offer any string
facilities. Perhaps what we need is a string library that uses
all namespace scope functions and type generators to generalize
the notion of a string (this is not unlike Thorsten's
CollectionTraits library). Then, a boost::array<char,N>, a
std::string, even a C string can be treated generically as a
string. With such a facility, boost::array would work just fine
for the file signature example and char_string would be relieved
from needing to handle the "no terminator" case.
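
A rough sketch of what such namespace scope functions might look
like follows. The namespace strsketch and the names str_length,
str_data, and str_equal are invented for this example and are
not part of any Boost library; std::array is used as a stand-in
for boost::array so the code is self-contained.

```cpp
#include <array>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical namespace and function names; not an existing Boost API.
namespace strsketch {

// Length of a C string: scan for the terminator.
inline std::size_t str_length(const char* s) { return std::strlen(s); }

// Length of a std::string: it already knows its size.
inline std::size_t str_length(const std::string& s) { return s.size(); }

// Length of a fixed array: stop at the first null, or at N if there is none.
template <std::size_t N>
std::size_t str_length(const std::array<char, N>& a) {
    std::size_t n = 0;
    while (n < N && a[n] != '\0') ++n;
    return n;
}

// Pointer to the first character of each kind of string.
inline const char* str_data(const char* s) { return s; }
inline const char* str_data(const std::string& s) { return s.data(); }
template <std::size_t N>
const char* str_data(const std::array<char, N>& a) { return a.data(); }

// A generic operation written only in terms of the free functions above.
template <typename S, typename T>
bool str_equal(const S& lhs, const T& rhs) {
    const std::size_t n = str_length(lhs);
    return n == str_length(rhs) &&
           std::memcmp(str_data(lhs), str_data(rhs), n) == 0;
}

} // namespace strsketch
```

With these overloads, a four-character array holding a file
signature can be compared against a C string or a std::string
through str_equal, and the array itself never needs a
terminator.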

> > > What about the base class issue? There's a need to be able to
> > > write something like "char_string_base& s" when you want to
> > > pass around fixed-capacity strings of more than one capacity.
> >
> >If that is a necessary feature, then the length can be stored in
> >the base class. However, such a base class means that the dtor
> >must be virtual. Is vtable overhead acceptable in such a class?
> >Perhaps two types are needed.
>
> My design does not use a virtual base class, so that isn't an issue. John
> Nagle's version does, so that is where the problem arises. I have several
> issues with the use of a virtual base class:

A "virtual base class" or a base class with a virtual function?

> [1] If you want to operate on a variable length character string
> specifically, why not templatize the function:
> template< int n >
> void myfn( boost::char_array< n > & s ){ ... }

The issue had to do with being able to create collections of
variable sized string objects.
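
As an illustration of that use case, here is one possible shape
for the base class idea. The names char_string_base,
fixed_string, and total_size are hypothetical, and this is
neither John Nagle's nor Reece Dunn's code. It assumes objects
are never deleted through a base pointer, which is what lets the
destructor be protected and non-virtual; if that assumption does
not hold, the virtual destructor question remains.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical base: knows the buffer, capacity, and length, not the storage.
class char_string_base {
public:
    std::size_t size() const { return len_; }
    std::size_t capacity() const { return cap_; }
    const char* c_str() const { return buf_; }

    // Assign from a C string, truncating to the capacity to avoid overrun.
    void assign(const char* s) {
        const std::size_t n = std::strlen(s);
        len_ = n < cap_ ? n : cap_;
        std::memcpy(buf_, s, len_);
        buf_[len_] = '\0';
    }

    // Copying the base alone would alias another object's buffer.
    char_string_base(const char_string_base&) = delete;
    char_string_base& operator=(const char_string_base&) = delete;

protected:
    char_string_base(char* buf, std::size_t cap)
        : buf_(buf), cap_(cap), len_(0) {}
    ~char_string_base() {}  // protected, non-virtual: no delete via the base

private:
    char* buf_;
    std::size_t cap_;
    std::size_t len_;
};

// Hypothetical derived type: supplies N characters plus a terminator slot.
template <std::size_t N>
class fixed_string : public char_string_base {
public:
    fixed_string() : char_string_base(storage_, N) { storage_[0] = '\0'; }

private:
    char storage_[N + 1];
};

// One non-template function works with strings of every capacity.
inline std::size_t total_size(const std::vector<char_string_base*>& v) {
    std::size_t sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i) sum += v[i]->size();
    return sum;
}
```

A std::vector<char_string_base*> can then hold pointers to a
fixed_string<4> and a fixed_string<16> side by side, at the cost
of one pointer and two size fields per object instead of a
vtable pointer.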

> [2] How do you deal with wide-character strings? My update generalizes to
> support char and wchar_t based buffers, but with a virtual base class, you
> are limited to char buffers.

That's true only if the character type is encoded in the base
class. Why would it be?

> [3] One of the reasons for having a virtual class is to supply custom string
> operations, e.g. using Windows-specific string functions instead of the
> standard library ones. This can also be solved with a policy template like
> that found in basic_string. My current version uses this approach, improving
> interoperability with basic_strings.

You can certainly design an ABC with many pure virtual functions
that the derived types implement, but that was never the intent
of the base class idea.

Your policy approach will permit a lot of customization, but
perhaps the better approach is to externalize all operations.

-- 
Rob Stewart                           stewart_at_[hidden]
Software Engineer                     http://www.sig.com
Susquehanna International Group, LLP  using std::disclaimer;

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk