
Subject: Re: [boost] [string] proposal
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2011-01-29 07:33:23

On Sat, Jan 29, 2011 at 8:06 PM, Artyom <artyomtnk_at_[hidden]> wrote:
>> But c_str() doesn't have to be part of the string's  interface.
> What is better
>  fd = creat(file.c_str(),0666)
> or
>  fd = creat(boost::string::linearize(file),0666)
> Isn't it obvious?

No, it's not obvious. Here's why:

  fd = creat(file.c_str(), 0666);

What does c_str() imply here? It implies that there's a
`char const *` buffer somewhere, either one that already exists or
one that is created on demand, and that it is held internally by
whatever 'file' is. Now suppose file changes in a different thread:
what happens to the buffer that the pointer returned by c_str()
points to? Explain that to me, because *it is possible and it can
happen in real code*.
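
To make the concern concrete, here is a minimal single-threaded
sketch (not part of the original exchange) of why a pointer obtained
from c_str() is only as stable as the string behind it. The names
StorageCheck and grow_and_check are hypothetical. If a second thread
performed the append, the same reallocation would happen behind the
caller's back.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Result of the experiment below (hypothetical helper type).
struct StorageCheck {
    bool moved = false;  // true if the character buffer changed address
    std::string copy;    // caller-owned snapshot taken before the change
};

// Grows `file` and reports whether its c_str() buffer moved.
// Addresses are kept as integers so the stale pointer is never dereferenced.
StorageCheck grow_and_check(std::string& file) {
    StorageCheck result;
    result.copy = file;  // an independent copy is unaffected by later edits

    const auto before = reinterpret_cast<std::uintptr_t>(file.c_str());
    file.append(4096, 'x');  // for a short string this exceeds the capacity
    const auto after = reinterpret_cast<std::uintptr_t>(file.c_str());

    result.moved = (before != after);
    assert(file.size() == result.copy.size() + 4096);
    return result;
}
```

Any `char const *` taken before the append would now dangle, while
the snapshot copy is still safe to pass to a C API.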

  fd = creat(linearize(file), 0666); // rely on ADL

This is also bad because linearize would be allocating the buffer for
me: I might not be able to control its size or know its length in
advance and, worse, it might even throw. It may also mean that the
buffer is static, or that I have to manage the returned pointer. I
like this better:

  // I know I want 255 characters max, plus one for the terminating NUL
  char * filename = (char *)calloc(256, sizeof(char));
  if (filename == NULL) {
    // deal with the error here
  }
  linearize(substr(file, 0, 255), filename);
  fd = creat(filename, 0666);
  free(filename);

It's explicit and I see all the memory that I need. I can even imagine
a simple class that can do this. Oh wait, here it is:

  std::string filename = substr(file, 0, 255);
  fd = creat(filename.c_str(), 0666);

All I have to do is to create a conversion operator to std::string and
I'll be fine. Or, actually:

  fd = creat(static_cast<std::string>(substr(file, 0, 255)).c_str(), 0666);

Here file could be a view, or it could be the raw underlying string.

There are many ways to skin a cat (in this context, cut a string) but
having c_str() as part of the interface puts too much of a burden on
the potential efficiency of an immutable string implementation.
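
As a rough illustration of the alternatives above, here is a minimal
sketch of an immutable string stored as shared, non-contiguous
chunks, with linearization as a free function that writes into a
buffer the caller owns. The class chunked_string, its copy_to member
and this linearize signature are assumptions made for the example,
not an existing Boost interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Hypothetical immutable string built from shared chunks.
class chunked_string {
public:
    chunked_string() = default;
    explicit chunked_string(const std::string& s) {
        if (!s.empty()) chunks_.push_back(std::make_shared<const std::string>(s));
    }

    // Concatenation shares the existing chunks; no characters are copied.
    friend chunked_string operator+(const chunked_string& a, const chunked_string& b) {
        chunked_string r;
        r.chunks_ = a.chunks_;
        r.chunks_.insert(r.chunks_.end(), b.chunks_.begin(), b.chunks_.end());
        return r;
    }

    std::size_t size() const {
        std::size_t n = 0;
        for (const auto& c : chunks_) n += c->size();
        return n;
    }

    // Copies at most max_chars characters into out; returns the count copied.
    std::size_t copy_to(char* out, std::size_t max_chars) const {
        std::size_t written = 0;
        for (const auto& c : chunks_) {
            if (written == max_chars) break;
            const std::size_t n = std::min(c->size(), max_chars - written);
            std::memcpy(out + written, c->data(), n);
            written += n;
        }
        return written;
    }

    // Explicit conversion: the only place a contiguous copy is allocated.
    explicit operator std::string() const {
        std::string flat;
        flat.reserve(size());
        for (const auto& c : chunks_) flat += *c;
        return flat;
    }

private:
    std::vector<std::shared_ptr<const std::string>> chunks_;
};

// Writes at most capacity - 1 characters plus a terminating NUL into a
// caller-owned buffer and returns the number of characters written.
std::size_t linearize(const chunked_string& s, char* out, std::size_t capacity) {
    assert(out != nullptr && capacity > 0);
    const std::size_t n = s.copy_to(out, capacity - 1);
    out[n] = '\0';
    return n;
}
```

With this shape the string itself never has to keep a contiguous
buffer around. The caller decides when one is needed, how big it is,
and who frees it.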

>> So then  what's the point of making strings use a single contiguous
>> memory chunk if  it's not necessary?
> When you create large text chunks it makes sense and then
> create a single string from it if you want.

And then the problem of the unnecessary contiguous buffer is still
not addressed.

>> You can, there are  already libraries for that sort of thing if you
>> insist on using  std::string.
> If the tool causes bugs and problems in every second program,
> then it seems that there is a problem in the tool.


> You can't fix all programs but you can make tools better
> so less issues would arise.
> So basically what you say is that
>  "there is no problem with encodings in current C++ world"
> I'm sorry but you are wrong.

I NEVER SAID THIS! You're arguing a strawman here.

I said: encoding is largely external to a string data structure.
That's why there's a view of a string in a given encoding.

> You know what...
> I'd really like your data structure if you were not
> calling it string but rather bytes chunk or immutable
> bytes array.
> What you are suggesting has nothing to do with text,
> and I don't understand how you fail to see this.

I don't know if you're not a native English speaker or whether you
just really think strings are just for text.

Strings are a data structure (look it up). Encoding is a way of
representing or interpreting data. I fail to see why encoding has
anything to do with the data structure itself. If I have data in a
data structure, I should be able to apply an encoding to that data
structure and "view" it whichever way I want or need.
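
A small sketch of that separation, under the assumption that the
"string" is just a sequence of bytes and each encoding is a function
that interprets it. The names decode_utf8, decode_latin1 and
same_bytes_two_readings are hypothetical, and the UTF-8 decoder is
deliberately minimal (it does not reject overlong or otherwise
ill-formed sequences).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Interprets the bytes as UTF-8. Invalid lead bytes and truncated
// sequences are reported as U+FFFD; continuation bytes are not validated.
std::vector<char32_t> decode_utf8(const std::string& bytes) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t len = 1;
        char32_t cp = 0xFFFD;
        if (lead < 0x80)               { len = 1; cp = lead; }
        else if ((lead >> 5) == 0x06)  { len = 2; cp = lead & 0x1F; }
        else if ((lead >> 4) == 0x0E)  { len = 3; cp = lead & 0x0F; }
        else if ((lead >> 3) == 0x1E)  { len = 4; cp = lead & 0x07; }
        if (i + len > bytes.size()) {  // sequence cut off at end of data
            out.push_back(0xFFFD);
            break;
        }
        for (std::size_t k = 1; k < len; ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Interprets the same bytes as ISO-8859-1: one byte, one code point.
std::vector<char32_t> decode_latin1(const std::string& bytes) {
    std::vector<char32_t> out;
    for (const char c : bytes) out.push_back(static_cast<unsigned char>(c));
    return out;
}

// The stored data is identical; only the interpretation differs.
void same_bytes_two_readings() {
    const std::string bytes = "caf\xC3\xA9";
    assert(decode_utf8(bytes).size() == 4);    // c, a, f, U+00E9
    assert(decode_latin1(bytes).size() == 5);  // c, a, f, U+00C3, U+00A9
}
```

Neither function needs anything from the underlying storage beyond
read access to its bytes, which is the sense in which encoding sits
at a different level from the data structure.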

What I'm saying is, a string data structure should have clearly
defined semantics -- hence the document going into the immutability,
value semantics, etc. -- now encoding is largely a matter at a
different level operating on strings. Encoding is an interpretation of

*I* fail to see why *you* fail to understand this clear statement.

Dean Michael Berris
