Subject: Re: [boost] [string] proposal
From: Dean Michael Berris (mikhailberis_at_[hidden])
Date: 2011-01-28 15:46:05


On Sat, Jan 29, 2011 at 4:26 AM, Artyom <artyomtnk_at_[hidden]> wrote:
>
>> From: Dean Michael Berris <mikhailberis_at_[hidden]>
>> On Fri, Jan 28, 2011 at 7:20 PM, Dean Michael Berris
>> <mikhailberis_at_[hidden]>  wrote:
>>
>> And I stopped before I wrote too much -- the initial version is
>> already up:
>> https://github.com/downloads/mikhailberis/cpp-string-theory/cpp-string-theory.pdf
>>
>> -- I'll give it more information and the actual interfaces and
>> implementation as soon as I get some Z's. :)
>>
>
> I'm sorry, but this "document" is full of mistakes
> and misses serious points:
>
> 1. "Contiguity"
>
>   Contiguity and c_str() are among the most important
>   properties of C++ strings (and are, BTW, required by C++0x)
>

It is the single most problematic feature bar none.

>   Reason: c_str() is a boundary to almost every library
>   existing in C++ and C world. So removing this "bad" feature
>   makes it useless for vast majority of string users.
>
>   Note: all strings in all languages are contiguous for
>   these reasons.
>

You didn't read the whole thing. I present an algorithm to linearize a
string. What else do you need?
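
To make the idea concrete, here is a minimal sketch of linearizing a
segmented string into one contiguous buffer. This is not the algorithm from
the PDF; `SegmentedString`, `append`, and `linearize` are hypothetical names
made up for illustration, and the chunk storage is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical segmented string: the text lives in separate chunks
// instead of one contiguous block.
class SegmentedString {
public:
    void append(std::string chunk) { chunks_.push_back(std::move(chunk)); }

    // Total length across all chunks.
    std::size_t size() const {
        std::size_t total = 0;
        for (const std::string& c : chunks_) total += c.size();
        return total;
    }

    // Copy every chunk, in order, into one contiguous buffer.
    std::string linearize() const {
        std::string flat;
        flat.reserve(size());  // size the buffer once, up front
        for (const std::string& c : chunks_) flat += c;
        return flat;
    }

private:
    std::vector<std::string> chunks_;
};
```

The result of `linearize()` is an ordinary std::string, so calling c_str()
on it gives the pointer that a C API boundary expects. The cost of
contiguity is paid only at that boundary.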

> 2. Efficiency - have you forgotten about std::string::reserve?
>

It's still contiguous. And then when it grows past the reserved size,
explain to me what happens.
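
As a rough illustration of that growth case (a sketch, not a benchmark):
once the string is filled to its capacity, any further append has to obtain
a larger buffer and copy the existing characters over. `GrowthReport` and
`append_past_reserve` are hypothetical helpers written for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical report type for this example.
struct GrowthReport {
    std::size_t capacity_before;
    std::size_t capacity_after;
    bool reallocated;
};

// Reserve space, fill the string exactly to its capacity, then append
// `extra` more characters and record whether the capacity had to grow.
GrowthReport append_past_reserve(std::size_t reserved, std::size_t extra) {
    std::string s;
    s.reserve(reserved);
    const std::size_t before = s.capacity();  // may exceed `reserved`
    s.append(before, 'x');                    // fits: no growth needed yet
    s.append(extra, 'y');                     // anything here exceeds capacity
    return GrowthReport{before, s.capacity(), s.capacity() > before};
}
```

With `extra == 0` nothing changes; with `extra >= 1` the capacity increases,
which for a contiguous string means a new allocation plus a copy of
everything already stored.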

> 3. non-uniform-memory-architecture
>
>   Give me a break... Who uses NUMA for string processing?!
>

If you're using a modern Intel Core i5/i7, you're using NUMA for
*everything*. Xeon 5400's are NUMA. AMD with HyperTransport is NUMA.
NUMA is an architecture, and if you're running your programs on a NUMA
machine, well, you're using NUMA.

Google NUMA and see what I mean.

> 4. About string builder. Most languages require it as they
>   don't have "reserve". Also, if you want an efficient
>   string builder, use std::ostream with a nice stream buffer.
>
>   Don't copy paradigms that do not belong to C++!
>

Who are you to say what paradigms don't belong to C++?

And std::ostream is not a string builder -- std::ostringstream is the
string builder as it is now but it uses a string buffer, which is also
contiguous and has the same problems as std::string.
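
For reference, this is what using std::ostringstream as a string builder
looks like today. The `join` helper is a hypothetical function written for
this example; only std::ostringstream and its str() member come from the
standard library.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Build one string from several parts using std::ostringstream.
std::string join(const std::vector<std::string>& parts,
                 const std::string& sep) {
    std::ostringstream out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) out << sep;  // separator between parts only
        out << parts[i];
    }
    return out.str();  // returns a copy of the buffered characters
}
```

Every `<<` appends to the stream's internal string buffer, and str() hands
back a copy of it, so the builder inherits the same growth behaviour as
std::string.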

> 5. By making all operations lazy you bring more segmentation
>   to memory, as it is not recycled; it also reduces
>   performance, as data does not have a "linear" location in cache.
>

Are you F'n kidding me? Lazy operations don't bring more
segmentation; they delay the application of operations until the
data is required.
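
A minimal sketch of what "delayed until the data is required" means, under
the assumption that an operation can be captured as a callable.
`LazyString` is a hypothetical wrapper invented for this example, not an
interface from the document.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical wrapper: stores a computation and runs it on first access.
class LazyString {
public:
    explicit LazyString(std::function<std::string()> compute)
        : compute_(std::move(compute)) {}

    // Runs the stored computation the first time it is called;
    // later calls reuse the cached result.
    const std::string& get() {
        if (!evaluated_) {
            value_ = compute_();
            evaluated_ = true;
        }
        return value_;
    }

    bool evaluated() const { return evaluated_; }

private:
    std::function<std::string()> compute_;
    std::string value_;
    bool evaluated_ = false;
};
```

Constructing a `LazyString` does no string work at all. The concatenation
(or whatever the callable does) happens once, on the first `get()`.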

> 6. Encoding is extrinsic to strings
>
>   ?!?!?!
>
>   The whole discussion started because we need UTF-8
>   in strings; now we are back to the beginning?
>

No, the discussion started because we need a UTF-8 view of data. You
missed the point I was making. And you didn't understand the document
I wrote.
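
One way to picture a "view" is a function that interprets stored bytes as
UTF-8 without the storage itself knowing the encoding. The sketch below is
only an illustration: `utf8_length` is a hypothetical helper, and it assumes
the bytes are already valid UTF-8 (it does no validation).

```cpp
#include <cstddef>
#include <string>

// Count UTF-8 code points in a byte sequence by skipping continuation
// bytes, which always have the bit pattern 10xxxxxx.
// Assumes `bytes` holds valid UTF-8; malformed input is not detected.
std::size_t utf8_length(const std::string& bytes) {
    std::size_t count = 0;
    for (unsigned char b : bytes) {
        if ((b & 0xC0) != 0x80) ++count;  // lead byte or ASCII
    }
    return count;
}
```

The same bytes report one length through size() and another through the
UTF-8 interpretation, which is the sense in which the encoding sits outside
the string.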

It's obvious you have an idea of what a string should be and I have a
different one. So I don't see any point in trying to convince you
otherwise when you've already made up your mind that std::string is
fine when the whole point of my document builds around why std::string
is broken.

>
> This is a classic example of how trying to do something
> "cool" gives us theoretically interesting and cool things
> that are useless in the real world, where simple and
> straightforward things actually work way better.
>

Simple and straightforward translates to naive and inefficient most
of the time.

This isn't meant to be "cool"; it is meant to address a problem that
has already been identified repeatedly.

> So you are welcome to propose an overcomplicated
> interface that tries to optimize some corner cases
> and finally makes it useless.
>

Thank you. I shall try to prove you wrong then -- or better yet, guess
what, I don't really care what *you* think because I don't think I
need to convince *you* that std::string is broken when you obviously
think it's fine.

> Sorry,
>

No need to apologize, it's apparent you missed the whole point anyway.

> But what you have written has nothing to do
> with reality. SGI had ropes... Where are they today?
>

Ropes are getting into TR2. Read the same document that was referred to
regarding COW strings becoming non-standard-compliant in C++0x.

-- 
Dean Michael Berris
about.me/deanberris
