Subject: Re: [boost] [string] proposal
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2011-01-29 07:33:23


On Sat, Jan 29, 2011 at 8:06 PM, Artyom <artyomtnk_at_[hidden]> wrote:
>>
>> But c_str() doesn't have to be part of the string's  interface.
>>
>
> What is better
>
>  fd = creat(file.c_str(),0666)
>
> or
>
>  fd = creat(boost::string::linearize(file),0666)
>
> Isn't it obvious?
>

No, it's not obvious. Here's why:

  fd = creat(file.c_str(), 0666);

What does c_str() imply here? It implies that there's a `char const *`
buffer somewhere that is either created on demand and returned, or
held internally by whatever 'file' is. Now what if file changes in a
different thread? What happens to the buffer pointed to by the pointer
that c_str() returned? Explain that to me, because *it is possible and
it can happen in real code*.
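
To make the lifetime concern concrete, here is a minimal single-threaded
sketch using plain std::string. The helper name snapshot is made up for
illustration and is not part of any proposal. It only shows that a copy
the caller owns stays valid after the source string is modified, whereas
a pointer obtained from c_str() before the modification may not.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copy the characters (including the terminating
// NUL) into storage owned by the caller.
std::vector<char> snapshot(const std::string& s) {
    const char* p = s.c_str();
    return std::vector<char>(p, p + s.size() + 1);
}

void demo_snapshot() {
    std::string file = "a.txt";
    std::vector<char> stable = snapshot(file);

    // A non-const operation such as append() may reallocate, which
    // invalidates any pointer previously returned by file.c_str().
    file.append(1000, 'x');

    // The snapshot does not depend on file's internal buffer.
    assert(std::strcmp(stable.data(), "a.txt") == 0);
}
```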

  fd = creat(linearize(file), 0666); // rely on ADL

This is also bad because linearize would be allocating the buffer for
me: I can't control its size or know its length up front, and worse,
it might even throw. It may also mean that the buffer is static, or
that I have to manage the returned pointer myself. I like this better:

  // I know I want 255 characters max, plus one for the terminating NUL
  char * filename = (char *)calloc(256, sizeof(char));
  if (filename == NULL) {
    // deal with the error here
  }
  linearize(substr(file, 0, 255), filename);
  fd = creat(filename, 0666);

It's explicit and I see all the memory that I need. I can even imagine
a simple class that can do this. Oh wait, here it is:

  std::string filename = substr(file, 0, 255);
  fd = creat(filename.c_str(), 0666);

All I have to do is to create a conversion operator to std::string and
I'll be fine. Or, actually:

  fd = creat(static_cast<std::string>(substr(file, 0, 255)).c_str(), 0666);

Here file could be a view, or it could be the raw underlying string.

There are many ways to skin a cat (in this context, cut a string) but
having c_str() as part of the interface puts too much of a burden on
the potential efficiency of an immutable string implementation.
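
As a rough sketch of what I mean, assume an immutable string stored as
shared, non-contiguous chunks. The class chunked_string and this
particular linearize signature are hypothetical, made up here for
illustration. Concatenation copies no bytes, and a contiguous
NUL-terminated buffer only comes into existence when the caller asks
for one and supplies the memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical immutable string made of shared chunks.
class chunked_string {
public:
    chunked_string() = default;
    explicit chunked_string(std::string s) {
        chunks_.push_back(std::make_shared<const std::string>(std::move(s)));
    }

    // Concatenation shares the existing chunks instead of copying bytes.
    friend chunked_string operator+(const chunked_string& a,
                                    const chunked_string& b) {
        chunked_string r;
        r.chunks_ = a.chunks_;
        r.chunks_.insert(r.chunks_.end(), b.chunks_.begin(), b.chunks_.end());
        return r;
    }

    // Copies at most cap - 1 bytes into out and NUL-terminates it.
    // Returns the number of bytes written, excluding the terminator.
    friend std::size_t linearize(const chunked_string& s, char* out,
                                 std::size_t cap) {
        if (cap == 0) return 0;
        std::size_t written = 0;
        for (const auto& c : s.chunks_) {
            std::size_t n = std::min(c->size(), cap - 1 - written);
            std::memcpy(out + written, c->data(), n);
            written += n;
            if (written == cap - 1) break;
        }
        out[written] = '\0';
        return written;
    }

private:
    std::vector<std::shared_ptr<const std::string>> chunks_;
};

void demo_linearize() {
    chunked_string file = chunked_string("/tmp/") + chunked_string("file.txt");
    char filename[256];  // the caller decides how much memory to spend
    std::size_t n = linearize(file, filename, sizeof filename);  // found via ADL
    assert(n == 13);
    assert(std::strcmp(filename, "/tmp/file.txt") == 0);
}
```

With c_str() in the interface, every instance would have to keep (or
lazily build and cache) such a buffer itself.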

>>
>> So then  what's the point of making strings use a single contiguous
>> memory chunk if  it's not necessary?
>>
>
> When you create large text chunks it makes sense and then
> create a single string from it if you want.
>

But then the problem of the unnecessary contiguous buffer is still not addressed.

>>
>> You can, there are  already libraries for that sort of thing if you
>> insist on using  std::string.
>>
>
> If the tool causes bugs and problems in every second problem,
> then seems that there is a problem in the tool.
>

Weh?

> You can't fix all programs but you can make tools better
> so less issues would arise.
>
> So basically what you say is that
>
>  "there is no problem with encodings in current C++ world"
>
> I'm sorry but you are wrong.
>

I NEVER SAID THIS! You're arguing a strawman here.

I said: encoding is largely external to a string data structure.
That's why there's a view of a string in a given encoding.

> You know what...
>
> I'd really like your data structure if you were not
> calling it string but rather bytes chunk or immutable
> bytes array.
>
> What you are suggesting has nothing to do with text,
> and I don't understand how you fail to see this.
>

I don't know if you're not a native English speaker or whether you
just really think strings are just for text.

Strings are a data structure (look it up). Encoding is a way of
representing or interpreting data in a certain way. I fail to see why
encoding has anything to do with a data structure. So if I have data
in a data structure, I should be able to apply an encoding on that
data structure and "view" it a given way I want/need.

What I'm saying is, a string data structure should have clearly
defined semantics -- hence the document going into the immutability,
value semantics, etc. -- now encoding is largely a matter at a
different level operating on strings. Encoding is an interpretation of
strings.
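
To illustrate, here is a minimal sketch of an encoding applied as a view
over string data. The class utf8_view is hypothetical, not something
from the document, and it assumes the bytes are well-formed UTF-8
without validating them. The string holds the data; the view only
decides how to interpret it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical non-owning view that interprets a string's bytes as UTF-8.
// The viewed string must outlive the view.
class utf8_view {
public:
    explicit utf8_view(const std::string& bytes) : bytes_(&bytes) {}

    // Number of code points: count every byte that is not a
    // continuation byte (bit pattern 10xxxxxx).
    std::size_t length() const {
        std::size_t n = 0;
        for (unsigned char b : *bytes_) {
            if ((b & 0xC0) != 0x80) ++n;
        }
        return n;
    }

private:
    const std::string* bytes_;
};

void demo_view() {
    std::string data = "caf\xC3\xA9";     // 5 bytes of data
    assert(data.size() == 5);             // the data structure's own length
    assert(utf8_view(data).length() == 4); // the same data, viewed as UTF-8
}
```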

*I* fail to see why *you* fail to understand this clear statement.

-- 
Dean Michael Berris
about.me/deanberris
