Subject: Re: [boost] [string] proposal
From: Artyom (artyomtnk_at_[hidden])
Date: 2011-01-29 10:25:40


> From: Dean Michael Berris <mikhailberis_at_[hidden]>
> On Sat, Jan 29, 2011 at 8:06 PM, Artyom <artyomtnk_at_[hidden]> wrote:
> >>
> >> But c_str() doesn't have to be part of the string's interface.
> >>
> >
> > What is better
> >
> > fd = creat(file.c_str(),0666)
> >
> > or
> >
> > fd = creat(boost::string::linearize(file),0666)
> >
> > Isn't it obvious?
> >
>
> No, it's not obvious. Here's why:
>
> fd = creat(file.c_str(), 0666);
>
> What does c_str() here imply? It implies that there's a buffer
> somewhere that is a `char const *` which is either created and
> returned and then held internally by whatever 'file' is.

It implies that a const file owns a const buffer that holds a
null-terminated string, which can be passed to a "char const *" API.
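
To make that contract concrete, here is a minimal sketch. The helper name
has_stable_c_str is hypothetical; it only checks what I rely on from a const
std::string: repeated c_str() calls return the same pointer, and the buffer
is null terminated at size().

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: checks the properties a "char const *" API relies on.
bool has_stable_c_str(std::string const &file)
{
    char const *p = file.c_str();
    // Same buffer on a second call, and a terminator at index size().
    return p == file.c_str()
        && p[file.size()] == '\0'
        && std::strlen(p) <= file.size();
}
```

As long as nobody mutates the string, the pointer stays valid and can be
handed to creat() or any other C API.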

> Now let's say
> what if file changed in a different thread, what happens to the buffer
> pointed to by the pointer returned by c_str()? Explain to me that
> because *it is possible and it can happen in real code*.

I'm sorry, but a string, like anything else with value semantics,
is:

   - safe for "const" access from multiple threads
   - safe for mutable access from a single thread

I don't see why a string should be different from
any other value type like "int", because the following

  x+=y + y

is not safe for an integer either if another thread modifies x or y.

The code I showed is **perfectly** safe with a
string that has value semantics (which std::string has).

>
> fd = creat(linearize(file), 0666); // rely on ADL
>
> This is also bad because linearize would be allocating the buffer for
> me which I might not be able to control the size of or know whether it
> will be a given length -- worse it might even throw.

Exactly!

   c_str() never throws: it is a "const" member function that only
   exposes the buffer the string already holds.

   So, for example, this code is fine with a const c_str():

   bool create_two_lock_files(string const &f1,string const &f2)
   {
     int fd1=creat(f1.c_str(),O_EXCL ...);
     if(fd1==-1)
         return false;
     int fd2=creat(f2.c_str(),O_EXCL ...);
     if(fd2==-1) {
         // roll back the first lock file
         unlink(f1.c_str());
         close(fd1);
         return false;
     }
     close(fd1);
     close(fd2);
     return true;
   }

It would not work with all the linearize stuff, because
it would not be exception safe and would require
me to create a temporary variable to store the linearized f1.
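
For completeness, a runnable sketch of the function above. It assumes
open() with O_CREAT | O_EXCL is an acceptable stand-in for the elided
creat() flags, and it borrows mode 0666 from the earlier examples; both are
my assumptions for illustration.

```cpp
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Sketch only: open() with O_CREAT | O_EXCL is assumed as the
// exclusive-create call; the mode 0666 is also an assumption.
bool create_two_lock_files(std::string const &f1, std::string const &f2)
{
    int fd1 = open(f1.c_str(), O_CREAT | O_EXCL | O_WRONLY, 0666);
    if (fd1 == -1)
        return false;
    int fd2 = open(f2.c_str(), O_CREAT | O_EXCL | O_WRONLY, 0666);
    if (fd2 == -1) {
        // c_str() does not throw, so this rollback always runs.
        unlink(f1.c_str());
        close(fd1);
        return false;
    }
    close(fd1);
    close(fd2);
    return true;
}
```

If the second open fails, the first lock file is removed again, and no
step between the two opens can throw and skip that cleanup.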

> It also may mean
> that the buffer is static, or I have to manage the pointer being
> returned.

I think both of us, and 95% of the C++ programmers who use the STL,
know what the semantics of std::string::c_str() are.

> I like this better:
>
> // I know I want 255 characters max
> char * filename = (char *)malloc(255 * sizeof(char));
> if (filename == NULL) {
> // deal with the error here
> }
> linearize(substr(file, 0, 255), filename);
> fd = creat(filename, 0666);
>

Sorry? Is this better than:

    fd=creat(filename.substr(0,256).c_str(),O_EXCL...)

Which, by the way, is 100% thread safe as well (but may still throw).
Even so, I can't see any reason to cut the name to 256 bytes before creat.

> It's explicit and I see all the memory that I need. I can even imagine
> a simple class that can do this. Oh wait, here it is:
>
> std::string filename = substr(file, 0, 255);
> fd = creat(filename.c_str(), 0666)

You don't need to create an explicit temporary filename;
C++ keeps the temporary alive until the end of the full expression,
that is, until creat has returned.
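
A small sketch of that lifetime rule. The function c_api_length is a
hypothetical stand-in for a C API such as creat(), and prefix_length is
likewise just an illustrative name.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API such as creat(): it only reads the path.
std::size_t c_api_length(char const *path)
{
    return std::strlen(path);
}

std::size_t prefix_length(std::string const &file, std::size_t n)
{
    // The temporary returned by substr() is destroyed at the end of this
    // full expression, so the pointer is valid for the whole call.
    return c_api_length(file.substr(0, n).c_str());
}
```

The pointer must not be stored and used after that statement, but for a
direct call it is fine.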
>
> All I have to do is to create a conversion operator to std::string and
> I'll be fine. Or, actually:
>
> fd = creat(static_cast<std::string>(substr(file, 0, 255)).c_str(), 0666);
>
> Now here file could be a view, could be a raw underlying string.
>

As above: it throws and is extremely verbose.

> There are many ways to skin a cat (in this context, cut a string) but
> having c_str() as part of the interface puts too much of a burden on
> the potential efficiency of an immutable string implementation.
>
> >>
> >> So then what's the point of making strings use a single contiguous
> >> memory chunk if it's not necessary?
> >>
> >
> > When you create large text chunks it makes sense, and then you
> > create a single string from it if you want.
> >
>
> And then the problem is not addressed then of the unnecessary contiguous
> buffer.
>

It is a good idea to have some non-linear data storage, but
it should be used only in very specific cases.

Also, what is a really large string for you, one that would gain a
performance advantage from not being stored linearly?

Can you give me numbers?
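
To be concrete about what I mean by non-linear storage with an explicit
linearize step, here is a minimal sketch. Both chunked_text and this
linearize() signature are hypothetical and written only for illustration;
they are not the interface proposed in this thread.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical non-contiguous string: the text lives in separate chunks.
struct chunked_text {
    std::vector<std::string> chunks;
};

// Hypothetical linearize(): copies the chunks into a caller-owned buffer,
// truncates to cap - 1 bytes and null-terminates. Returns bytes copied.
std::size_t linearize(chunked_text const &t, char *out, std::size_t cap)
{
    if (cap == 0)
        return 0;
    std::size_t used = 0;
    for (std::size_t i = 0; i < t.chunks.size() && used + 1 < cap; ++i) {
        std::string const &c = t.chunks[i];
        std::size_t room = cap - 1 - used;
        std::size_t n = c.size() < room ? c.size() : room;
        std::memcpy(out + used, c.data(), n);
        used += n;
    }
    out[used] = '\0';
    return used;
}
```

Every call to a C API then pays for a copy into a contiguous buffer, which
is why I would keep such storage for the specific cases that need it.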

> >
> > What you are suggesting has nothing to do with text,
> > and I don't understand how you fail to see this.
> >
>
> I don't know if you're not a native English speaker or whether you
> just really think strings are just for text.

There was another great answer to this by David Bergman,
which you have probably already read.

I can only say +1 to his answer; it can't be put better.

Artyom

      

