Subject: Re: [boost] [string] proposal
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2011-01-29 03:37:59

On Sat, Jan 29, 2011 at 3:02 PM, Artyom <artyomtnk_at_[hidden]> wrote:
>> > 1. "Contiguity"
>> >
>> >     Continuity and c_str() is one of the most important
>> >     properties of C++ string (that is BTW required by C++0x)
>> Eliminating c_str() doesn't mean there's no easy way to produce a
>> contiguous NTBS.
> Yes, just it can't be really "char const *c_str() CONST" or would require
> extra stuff like linearization.
> It would turn away 90% of users.

It might turn you away because you obviously love std::string.
Generalizing from that to 90% of users is a different matter, and an
unsupported number like that is not going to convince anybody.
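
To make the "linearization" point concrete, here is a minimal sketch
of a string stored in separate chunks that can still hand out a
contiguous NTBS on request. This is an illustration only, not the
design under discussion: the names chunked_string, append and
linearize are hypothetical, and std::deque is assumed as the chunk
container simply because push_back on it does not relocate existing
elements.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical type: a string stored as independently allocated chunks.
class chunked_string {
public:
    // Appending adds a new chunk; chunks already stored are not relocated.
    void append(const std::string& piece) {
        if (piece.empty()) return;
        chunks_.push_back(piece);
        size_ += piece.size();
    }

    std::size_t size() const { return size_; }
    std::size_t chunk_count() const { return chunks_.size(); }

    // Explicit linearization: one allocation and one copy, paid only by
    // callers that actually need contiguous storage.
    std::string linearize() const {
        std::string out;
        out.reserve(size_);
        for (const std::string& chunk : chunks_) out += chunk;
        assert(out.size() == size_);
        return out;
    }

private:
    std::deque<std::string> chunks_;
    std::size_t size_ = 0;
};
```

A caller that needs a char const* would write something like
"std::string flat = s.linearize();" and then use flat.c_str(). The
cost of contiguity is explicit at the call site instead of being
imposed on every string.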

>> > 3.  non-uniform-memory-architecture
>> >
>> >    Give me a break... Who uses NUMA for string processing?!
>> Anyone running on a multiprocessor system with AMD Opterons or Intel
>> Nehalem or Tukwila processors. You don't always get to choose the
>> kind of architecture your code will run on, and those systems are all
>> NUMA. But even when you do get to choose, some very large problems
>> that would be appropriate to NUMA involve lots of strings.
> 1. The locality of cache or private processor cache
>   does not makes them "NUMA"

What makes these systems NUMA is that each CPU has its own integrated
memory controller managing part of the system's memory. Because there
is no single memory controller in the system, a CPU's access to the
whole available memory is non-uniform: it has faster access to the
memory it controls itself, and it has to go through other CPUs to
reach memory that's controlled by another CPU.

From what you're saying, it looks like you don't know what NUMA is.
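
The non-uniform access described above can be modelled with a small
distance table. This is a toy model, not a benchmark: the two-node
layout, the names kDistance, access_cost and total_cost, and the
remote cost of 20 are all assumptions made for illustration. Only the
value 10 for local access follows the ACPI SLIT convention for
relative distances.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical two-socket machine: one memory node per CPU.
constexpr std::size_t kNodes = 2;

// Relative cost for a CPU on node `row` to reach memory on node `col`.
// 10 is the conventional local value; 20 for remote access is an
// assumed, illustrative figure and not a measurement.
constexpr std::array<std::array<int, kNodes>, kNodes> kDistance = {{
    {{10, 20}},
    {{20, 10}},
}};

inline int access_cost(std::size_t cpu_node, std::size_t mem_node) {
    assert(cpu_node < kNodes && mem_node < kNodes);
    return kDistance[cpu_node][mem_node];
}

// Total relative cost for one CPU to touch a sequence of buffers, each
// identified by the node whose memory controller owns it.
template <std::size_t N>
int total_cost(std::size_t cpu_node,
               const std::array<std::size_t, N>& buffer_nodes) {
    int sum = 0;
    for (std::size_t node : buffer_nodes) sum += access_cost(cpu_node, node);
    return sum;
}
```

In this model the same set of buffers costs a different amount
depending on which CPU touches them, which is the whole point of the
"non-uniform" in NUMA.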

> 2. In such case it would be even better to have non-shared
>   strings


>> > 4. About string builder. Most languages require is as they
>> >    don't have "reserve" also if you want efficient
>> >    string builder use std::ostream with nice stream buffer.
>> There's nothing efficient about std::ostream, no matter what buffer
>> you put on it.
> I beg your pardon? It is efficient as all functions
> are as efficient as memcpy with exceptions of overflow/underflow
> happens which require some virtual functions calls
> which are pretty fast as well...
> Also 99% of issues are just solved with reserve.
> (and I work with text parsing, combining and processing a lot)

And you obviously don't work with systems that have to do this
thousands of times per second, or you would know what the effects of
NUMA are and why allocating a contiguous block of memory is the
performance killer that it is.

>> This, I believe, is a persistent misunderstanding. IIUC, Dean is only
>> suggesting to avoid giving UTF-8 any special status in the string's
>> interface. He's not arguing against using UTF-8 storage in the
>> implementation.
> The entire "buzz" started with the fact that under windows
> we have problems with string encoding not being UTF-8

No, the entire buzz started when people like you suggested treating
std::string as UTF-8 by default, which, as I have already maintained,
is largely unnecessary from a design perspective.

>> > This is classic example of how trying to do something
>> > "cool" gives us theoretically interesting and cool things
>> > that are useless in real world where simple and straight
>> > forward things actually work a way better.
>> That may ultimately turn out to be true, but your reaction here seems
>> so over-the-top and premature as to make that conclusion very
>> unconvincing.
> This article written from wrong understanding of real
> problems - instead of solving a problem it suggests
> some idea for some cases not looking to the problem
> in hole.

I think you mean "in whole".

The article was written from the understanding that the real problem
stems from how std::string is broken, and it already identifies why
it's broken. It seems that you're more interested in attacking people
and the work they do than in solving problems.

If you disagree with what's being said, argue on the merits and
explain why. Mud-slinging, sitting on a high horse, and just saying
"blech, you're wrong" does not help solve any technical problems.

I've already pointed out some of the reasons why I think std::string
is broken. If you disagree with the things I pointed out, that's
fine, have it your way. I'm not here to please you; I'm here to make a
technical point and a contribution that others may very well welcome.

> Starting from "std::string is broken" statement...

You obviously love it the way it is, so I don't think I need to
convince you otherwise. Good luck with that.

Dean Michael Berris
