From: Reece Dunn (msclrhd_at_[hidden])
Date: 2004-05-13 17:14:43


Rob Stewart wrote:
>From: "Reece Dunn" <msclrhd_at_[hidden]>
> > Rob Stewart wrote:
> > >From: John Nagle > Reece Dunn wrote:
> > The idea is to prevent overrun when this type of situation occurs.
> > Thus, it will be a 3 character buffer with the extra character being
> > a null-terminator.
>
>I think we're miscommunicating, and it's probably me
>misinterpreting something.
>
>The four byte matter was raised because someone wanted to be able
>to peek into a buffer of data read from a file. The 5th byte

That was me :). I was suggesting that one possible application for this type
of class was in managing header files of binary data, e.g. tar files, doing
things like:

   struct tar_header
   {
      ...
      boost::char_string< 6 > magic;
   } hdr;

   if( hdr.magic == "ustar\x20\x20\0" ) // process GNU TAR file

>wasn't a null terminator and it shouldn't be changed, so I took
>that to suggest effectively overlaying a char_string on that
>file's contents in the buffer.

I have revised my initial idea of supporting both null-terminated and
non-terminated string buffers, in response to the points that (a)
boost::array< char, N > is a good candidate for the latter, and (b) adding a
boolean template parameter to select null termination, along with the
associated code, made the logic too complex.
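
To illustrate the always-terminated design, here is a minimal sketch of a
fixed-capacity buffer that stores N characters plus one extra slot for the
null terminator and truncates instead of overrunning. The name fixed_string
and its members are hypothetical stand-ins for illustration, not the actual
char_string interface:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical sketch: N usable characters, always null-terminated.
template< std::size_t N >
class fixed_string
{
      char        buf[ N + 1 ]; // one extra slot for the terminator
      std::size_t len;
   public:
      fixed_string(): len( 0 ) { buf[ 0 ] = '\0'; }
      fixed_string( const char * s ): len( 0 ) { assign( s ); }

      // Copy at most N characters; longer input is truncated, never overrun.
      void assign( const char * s )
      {
         len = std::strlen( s );
         if( len > N ) len = N;
         std::memcpy( buf, s, len );
         buf[ len ] = '\0';
      }

      const char * c_str() const    { return buf; }
      std::size_t  length() const   { return len; }
      std::size_t  capacity() const { return N; }
};
```

With this layout, fixed_string< 3 > s( "abcdef" ) would hold "abc", and
c_str() is always safe to pass to C string functions.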

> > > > >> IOW, make it a runtime error to fix the capacity too small to
> > > > >> permit null termination when calling c_str(). That still leaves
> > > > >> room for things like the 4-character file signature to which you
> > > > >> referred, and yet prevents buffer overrun, but doesn't require
> > > > >> foregoing flexibility.
> > > >
> > > > If you want a 4-char file signature, you can use
> > > > "boost::array<char,4>", which does that job. Is there any
> > > > real need for that functionality in char_string?
> >
> > Isn't boost::array specific to generic arrays, whereas using a
> > variant of char_string, you can use string functions that will be
> > optimized for string operations. This is the main reason for using
> > a special string class.
>
>You're right that using boost::array doesn't offer any string
>facilities. Perhaps what we need is a string library that uses
>all namespace scope functions and type generators to generalize
>the notion of a string (this is not unlike Thorsten's
>CollectionTraits library). Then, a boost::array<char,N>, a
>std::string, even a C string can be treated generically as a
>string. With such a facility, boost::array would work just fine
>for the file signature example and char_string would be relieved
>from needing to handle the "no terminator" case.

Doesn't the string algorithms library do just that?
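
One possible reading of the namespace-scope idea, as a rough sketch only: a
pair of small adapter overloads per representation, with the string
operations written once on top of them. Everything here (the strsketch
namespace, str_data, str_length, str_equal) is hypothetical, and std::array
is used as a stand-in for boost::array so the example needs only the
standard library:

```cpp
#include <array>
#include <cstddef>
#include <cstring>
#include <string>

namespace strsketch // hypothetical namespace, for illustration only
{
   // Adapters: one data/length pair per string representation.
   inline const char * str_data( const std::string & s ){ return s.data(); }
   inline std::size_t  str_length( const std::string & s ){ return s.size(); }

   inline const char * str_data( const char * s ){ return s; }
   inline std::size_t  str_length( const char * s ){ return std::strlen( s ); }

   // A fixed array has no terminator: its length is its full size.
   template< std::size_t N >
   const char * str_data( const std::array< char, N > & a ){ return a.data(); }
   template< std::size_t N >
   std::size_t str_length( const std::array< char, N > & ){ return N; }

   // An operation written once against the adapters.
   template< typename S1, typename S2 >
   bool str_equal( const S1 & a, const S2 & b )
   {
      return str_length( a ) == str_length( b )
          && std::memcmp( str_data( a ), str_data( b ), str_length( a )) == 0;
   }
}
```

Under these assumptions a 4-byte signature held in an array can be compared
directly with a literal, e.g. strsketch::str_equal( sig, "RIFF" ), without
the array needing a terminator.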

> > > > What about the base class issue? There's a need to be able to
> > > > write something like "char_string_base& s" when you want to
> > > > pass around fixed-capacity strings of more than one capacity.
> > >
> > >If that is a necessary feature, then the length can be stored in
> > >the base class. However, such a base class means that the dtor
> > >must be virtual. Is vtable overhead acceptable in such a class?
> > >Perhaps two types are needed.
> >
> > My design does not use a virtual base class, so that isn't an
> > issue. John Nagle's version does, so that is where the problem
> > arises. I have several issues with the use of a virtual base class:
>
>A "virtual base class" or a base class with a virtual function?

Base class with a set of virtual functions (my mistake).

> > [1] If you want to operate on a variable length character string
> > specifically, why not templatize the function:
> > template< int n >
> > void myfn( boost::char_array< n > & s ){ ... }
>
>The issue had to do with being able to create collections of
>variable sized string objects.

Hmmm. That could be tricky :(.

> > [2] How do you deal with wide-character strings? My update
> > generalizes to support char and wchar_t based buffers, but with a
> > virtual base class, you are limited to char buffers.
>
>That's true only if the character type is encoded in the base
>class. Why would it be?

You need to know the character type you are operating on in the base class.
This is the basic idea that John Nagle's approach takes:

   class char_string_base
   {
      public:
         virtual std::size_t length() const;
         virtual const char * c_str() const;
         virtual void copy( const char * );
   };

But I agree with John's comments that it may be necessary to have a
wchar_string_base to support wide characters.
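
To make the trade-off concrete, here is a rough sketch of how such a base
class allows strings of different capacities to share one collection. The
pure virtual functions, the virtual destructor, the derived template name
char_string_impl and the helper total_length are my assumptions for the
example, not John's actual code:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

class char_string_base
{
   public:
      virtual ~char_string_base() {}   // needed once the class is polymorphic
      virtual std::size_t length() const = 0;
      virtual const char * c_str() const = 0;
      virtual void copy( const char * ) = 0;
};

// Hypothetical derived type: the capacity lives only in the template.
template< std::size_t N >
class char_string_impl: public char_string_base
{
      char buf[ N + 1 ];
   public:
      char_string_impl() { buf[ 0 ] = '\0'; }
      std::size_t length() const { return std::strlen( buf ); }
      const char * c_str() const { return buf; }
      void copy( const char * s )
      {
         std::size_t i = 0;
         for( ; i < N && s[ i ] != '\0'; ++i ) buf[ i ] = s[ i ];
         buf[ i ] = '\0';              // truncate, always terminate
      }
};

// Works on any mix of capacities through the base interface.
inline std::size_t total_length( const std::vector< char_string_base * > & v )
{
   std::size_t n = 0;
   for( std::size_t i = 0; i < v.size(); ++i ) n += v[ i ]->length();
   return n;
}
```

Note that char is fixed in the base class signatures here, which is exactly
why a separate wide-character base would be needed.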

> > [3] One of the reasons for having a virtual class is to supply
> > custom string operations, e.g. using Windows-specific string
> > functions instead of the standard library ones. This can also be
> > solved with a policy template like that found in basic_string. My
> > current version uses this approach, improving interoperability with
> > basic_strings.
>
>You can certainly design an ABC with many pure virtual functions
>that the derived types implement, but that was never the intent
>of the base class idea.
>
>Your policy approach will permit a lot of customization, but
>perhaps the better approach is to externalize all operations.
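
For reference, the policy approach under discussion can be sketched roughly
as follows: the character type and a traits policy are template parameters,
in the style of basic_string, so the same code serves char and wchar_t and
the low-level operations can be swapped. The name basic_fixed_string and the
str() member are hypothetical; only std::char_traits and std::basic_string
are real library components here:

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch of a traits-parameterized fixed-capacity string.
template< std::size_t N, typename CharT = char,
          typename Traits = std::char_traits< CharT > >
class basic_fixed_string
{
      CharT       buf[ N + 1 ];
      std::size_t len;
   public:
      basic_fixed_string( const CharT * s )
      {
         len = Traits::length( s );    // policy supplies the operations
         if( len > N ) len = N;        // truncate to the fixed capacity
         Traits::copy( buf, s, len );
         buf[ len ] = CharT();
      }

      const CharT * c_str() const  { return buf; }
      std::size_t   length() const { return len; }

      // Interoperate with the basic_string that shares the same traits.
      std::basic_string< CharT, Traits > str() const
      {
         return std::basic_string< CharT, Traits >( buf, len );
      }
};
```

A platform-specific traits class could replace std::char_traits without any
virtual functions, and basic_fixed_string< 8, wchar_t > covers the
wide-character case.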

Can you expand on what you mean by "externalize"?

Regards,
Reece


