From: Corrado Zoccolo (czoccolo_at_[hidden])
Date: 2006-09-09 16:54:42


On 9/3/06, Jeff Garland <jeff_at_[hidden]> wrote:
> <snip previous discussion>
> > This scheme addresses the performance problems noted in
> > http://www.sgi.com/tech/stl/string_discussion.html for reference
> > counted strings with unshareable state (like the g++ implementation),
> > because now shareable/unshareable state is assigned by the user at
> > compile time, and can be explicitly changed. In this way the user will
> > know what to expect from the performance point of view, and the
> > semantics will be (to my eyes) more clear.
>
> This seems like a reasonable approach for those that want to go the immutable
> route....care to develop it into a full blown proposal?

I developed my ideas into three concrete classes and uploaded them to
the Boost Vault as imm_string_and_builder.zip (under Strings - Text
Processing).
The tiny URL for the zip file is http://tinyurl.com/hznt4 .
I added a test file that compares the copy/access cost of my string
implementations with that of std::string and const char *.
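
As a rough illustration of the kind of comparison involved (this is not
the test file from the zip, only a sketch under my own assumptions), a
harness can time a call that takes its argument by value against one
that takes it by const reference, and label each line with
typeid(T).name(). The names bench_sketch, by_value, by_ref, seconds and
report are hypothetical, and the code uses C++11 facilities.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <string>
#include <typeinfo>

namespace bench_sketch {  // hypothetical namespace, for illustration only

// Volatile pointer: storing a parameter's address here keeps the call
// from being optimized away entirely.
const void* volatile sink = nullptr;

template <class T> void by_value(T x)      { sink = &x; }  // copies x
template <class T> void by_ref(const T& x) { sink = &x; }  // no copy

// Run f() iters times and return the elapsed wall-clock time in seconds.
template <class F>
double seconds(F f, long iters) {
    auto t0 = std::chrono::steady_clock::now();
    for (long i = 0; i < iters; ++i) f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

// Time both passing styles for one type and print one line per style.
// The printed type name is implementation-defined.
template <class T>
double report(const T& x, long iters) {
    double v = seconds([&] { by_value<T>(x); }, iters);
    double r = seconds([&] { by_ref<T>(x); }, iters);
    std::printf("by value: Took %g s on %s\n", v, typeid(T).name());
    std::printf("by ref:   Took %g s on %s\n", r, typeid(T).name());
    return v + r;
}

}  // namespace bench_sketch
```

Calling bench_sketch::report(std::string("hello"), 1000000) and
bench_sketch::report<const char*>("hello", 1000000) prints one pair of
lines per type; the timings depend entirely on the platform and
optimization level.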

From my tests, splitting the string abstraction into an immutable
string and a string builder allows a more efficient implementation.
My test results are:
1) compiling without threads configured (BOOST_HAS_THREADS undefined),
copying an immutable string has a small overhead over a plain
const char *
2) compiling with BOOST_USE_ASM_ATOMIC_H, imm_string and
string_builder always outperform std::string (I'm using the
implementation delivered with g++ 4.0.1). The overhead is large w.r.t.
const char * (due to the high cost of lock operations on Pentium IVs),
but passing a const & to imm_string is as fast as passing a
const char * (not true for std::string).
3) using move semantics on string builders achieves the same
performance as copying immutable strings
(see below for the actual numbers)
I'm interested in seeing the test results on other platforms/compilers.
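
To make the split concrete, here is a minimal sketch of the idea. It is
not the code in the zip: the namespace sketch, the members str(),
shares_with() and freeze(), and the use of std::shared_ptr and C++11
move semantics are all assumptions made for illustration. Copying the
immutable string only bumps a reference count, copying the builder
duplicates the characters, and moving the builder hands the buffer over.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

namespace sketch {  // hypothetical; not the classes from the zip file

// Immutable string: copies share one buffer through a reference count.
class imm_string {
    std::shared_ptr<const std::string> rep_;
public:
    explicit imm_string(std::string s)
        : rep_(std::make_shared<std::string>(std::move(s))) {}
    const std::string& str() const { return *rep_; }
    std::size_t size() const { return rep_->size(); }
    char operator[](std::size_t i) const { return (*rep_)[i]; }
    // True when both objects refer to the same underlying buffer.
    bool shares_with(const imm_string& o) const { return rep_ == o.rep_; }
};

// String builder: exclusively owns a mutable buffer, so a copy is a
// deep copy and a move transfers the buffer.
class string_builder {
    std::string buf_;
public:
    string_builder() = default;
    string_builder& append(const std::string& s) { buf_ += s; return *this; }
    char& operator[](std::size_t i) { return buf_[i]; }
    std::size_t size() const { return buf_.size(); }
    // Give up the buffer and turn it into an immutable string without
    // copying the characters; callable only on an rvalue builder.
    imm_string freeze() && { return imm_string(std::move(buf_)); }
};

}  // namespace sketch
```

A builder would be used as std::move(builder).freeze(); after that call
the builder should be treated as empty.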

It is interesting to note that with this split of the abstraction into
different classes, the thread safety properties become compatible
with POSIX rules (that was my original goal):
* accessing an object as read only from multiple threads is safe
* even writing to different areas of a string builder (without changing
its length) is MT-safe, just like writing to a preallocated char []
I think that, if the upcoming standard deals with threads, some
thread-safety issues in the standard library will need to be addressed.
I think that for strings, these abstractions will be a useful starting
point.

Corrado

Test results for BOOST_USE_ASM_ATOMIC_H, on a Pentium IV 2.8 GHz with HT:
Baseline (const char *, without atomic count)
Took 0.09 s on PKc
Baseline (const char *, with atomic count)
Took 0.94 s on PKc
Pass by value
Took 0.96 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
Took 2.45 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
Took 3.48 s on Ss // this is std::string
Pass by reference
Took 0.09 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
Took 0.08 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
Took 0.13 s on Ss // this is std::string
Modified, Pass by value
Took 1 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
Took 2.45 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
Took 3.78 s on Ss // this is std::string
Modified, Pass by reference
Took 0.09 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
Took 0.09 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
Took 0.13 s on Ss // this is std::string
Leaked, Pass by value // Leaked state is peculiar to the g++ string implementation
Took 1 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
Took 2.45 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
Took 2.56 s on Ss // this is std::string
Leaked, Pass by reference // Leaked state is peculiar to the g++ string implementation
Took 0.09 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
Took 0.09 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
Took 0.13 s on Ss // this is std::string
Temp String, with copy
Took 2.49 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
Took 2.49 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE
Took 4.39 s on Ss // this is std::string
Temp String, Move semantics
Took 1.25 s on N5boost7strings10imm_stringIcSt11char_traitsIcEEE
Took 1.03 s on N5boost7strings14string_builderIcSt11char_traitsIcEEE


-- 
__________________________________________________________________________
dott. Corrado Zoccolo                          mailto:czoccolo (at) gmail.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
