Subject: Re: [boost] [lexical_cast] efficiency
From: Domagoj Saric (dsaritz_at_[hidden])
Date: 2010-01-19 15:58:12


"Alexander Nasonov" <alnsn_at_[hidden]> wrote in message
news:22191263825453_at_webmail84.yandex.ru...

> First of all, which boost version do you use?

1.40

> If it's a recent version, a conversion from char const[4] to int should not
> create a stringstream object, only the std::locale object.

But it does (create a stringstream object).
(the appropriate lcast_streambuf_for_target<int> is not specialized and
'returns' true)

> If you always use the "C" locale, you can eliminate the construction of that
> object if you define BOOST_LEXICAL_CAST_ASSUME_C_LOCALE. Can you
> please run your test and report the number of calls to the new operator and
> the EnterCriticalSection function?

From briefly stepping through the lexical_cast<> code again it seems that
BOOST_LEXICAL_CAST_ASSUME_C_LOCALE has no influence on the example I gave. It
does however make a difference for a reverse cast from int to string and indeed
then (in the reverse case) no stream object is created.
OTOH the second/reverse case suffers from 'forced std::string
usage'...converting an int to a string representation should not require
dynamic memory allocation.

> Since std::streams is a part of C++ std library, I'd say it's a problem of a
> C++ vendor.

As much as I dislike 'reinventing the wheel' and generally bark at all the GUI
libraries out there that just "have to" rewrite the standard library along with
half of the known universe, I also think that the 'std library' is not some
holy cow that should be blindly worshiped.
The best example of something wrong with the standard library is precisely
std::streams...even the official "Technical Report on C++ Performance" has a
special section dedicated to the issue (
http://www.open-std.org/jtc1/sc22/WG21/docs/TR18015.pdf ... section 6 ) that
begins with "The Standard IOStreams library (§IS-27) has a well-earned
reputation of being inefficient.". Sure, it then goes on to prove that much of
this reputation is actually not well earned by offering an example of a ">more
efficient<" implementation (than those 'naive' ones) that uses things like
dynamic_casts with explicit try-catch blocks ... !? ... it surely does make you
want to std::scream ... and wonder just how bad then are those 'naive'
implementations...

It seems to me that std::screams constitute The Epic Failure of the standard
library and look more suited/designed for C# than C++. So much that the Boost
coding guidelines should probably warn against using them, in bold and in
italics.

"To cut the barking"...a particular vendor implementation of iostreams may be
particularly horrible (the Dinkumware one certainly does seem so) but iostreams
are simply bad by design, flawed in and of themselves. A very basic example:

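std::stringstream sss;  // needs <sstream>
sss << 321;             // formats into the stream's internal buffer
sss.str();              // returns a copy as a separate std::string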

this, among (many many many) other things and to my knowledge of the standard,
requires allocation of minimally two buffers (the stream internal buffer and
the std::string internal buffer) when it should of course require none as it
is ancient knowledge that a base-10 string representation of an int fits in a
stack allocated 33 chars large buffer... Sure some implementations will try to
alleviate the problem by using "small string optimizations" but that only
shifts the problem into more bloated code...
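
For illustration, a minimal sketch of the allocation-free alternative, assuming
C++11 (for std::snprintf). The helper name int_to_chars is hypothetical and is
not part of Boost or the standard library. Note that the 33 chars mentioned
above are more than enough: a 32-bit int in base 10 needs at most 12 (sign, 10
digits, terminator).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not a Boost or std function): writes the base-10
// representation of 'value' into a caller-supplied buffer. Returns the number
// of characters written, excluding the terminating '\0', or 0 on failure.
std::size_t int_to_chars(int value, char* buffer, std::size_t size)
{
    assert(buffer != nullptr && size > 0);
    int const written = std::snprintf(buffer, size, "%d", value);
    if (written < 0 || static_cast<std::size_t>(written) >= size)
        return 0; // encoding error, or the buffer was too small
    return static_cast<std::size_t>(written);
}
```

A caller would declare something like "char buffer[33];" on the stack and pass
it together with sizeof buffer; no heap allocation is involved on that path.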

> If you call lexical_cast from more than one DSO, it further increases the size
> through multiple instantiations in different DSOs. I'd personally move some
> functions to libboostconversion.{so,DLL} to reduce the size but I already hear
> complaints from header-only lovers ;-)

Well, as a small immediate space/bloat-wise improvement you could extract the
exception throwing code into a non template function and add a lexical_cast<>
specific (instead of the global BOOST_NO_TYPEID) configuration option for
turning off RTTI information in bad_lexical_cast...
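
A rough sketch of the first suggestion, with hypothetical names throughout
(sketch::bad_conversion, throw_bad_conversion and to_number are stand-ins, not
the actual lexical_cast internals). The point is only that the throw
expression lives in one non-template function instead of being repeated in
every template instantiation; range checking is deliberately left out.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>

namespace sketch {

// Hypothetical exception type standing in for bad_lexical_cast.
struct bad_conversion : std::runtime_error
{
    bad_conversion() : std::runtime_error("bad conversion") {}
};

// Non-template: the exception construction and throw are written once and
// shared by all instantiations of the template below.
inline void throw_bad_conversion()
{
    throw bad_conversion();
}

// Stand-in for a lexical_cast-like template, for integral targets only.
template <typename Target>
Target to_number(char const* text)
{
    assert(text != nullptr);
    char* end = nullptr;
    long const value = std::strtol(text, &end, 10);
    if (end == text || *end != '\0')
        throw_bad_conversion(); // only a call remains in the template
    return static_cast<Target>(value);
}

} // namespace sketch
```

Whether the helper actually stays out of line is up to the compiler; keeping
it in a separately compiled source file would guarantee that, at the cost of
the library no longer being header-only.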

>> As a start maybe this problem could be sufficiently "lessened" by providing
>> lexical_cast specializations/overloads that use C library functions (strtol,
>> itoa and the likes) as they suffer less from bloat/performance issues than
>> std::streams.
>
> C and C++ locales are not necessarily equal.

In what way? Wouldn't that imply that one of them is 'wrong'/'incorrect'?

>> Ideally, IMHO, lexical_cast<> should provide the
>> power/configurability of boost::numeric_cast (so that one can, for example,
>> say I do not need/want to use locales/assume ASCII strings and things like
>> that).
>
> Flexibility of numeric_cast is a separate project. Check the review schedule.

The only related thing I could find here
http://www.boost.org/community/review_schedule.html is
"String Convert" by Vladimir Batov...were you thinking of that library?

> I'm afraid it's too late to change lexical_cast, it's already in the next
> standard.

Why? As argued earlier 'the standard' should not be some untouchable god-like
entity. Besides:
- changing boost::lexical_cast<> does not change std::lexical_cast<>
(unfortunately :)
- replacing the 'screaming' ;) implementation with C functions, Spirit or
something else does not necessarily change the interface or the behaviour
- adding overloads that accept or return error codes instead of exceptions, or
fixed char buffers instead of std::strings also does not change the existing
interface...
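
As a sketch of what such an additional overload might look like, assuming a
hypothetical free function try_parse_int (not an existing Boost interface)
built on the C library's strtol. It reports failure through its return value
instead of an exception and creates no stream or std::string. Note that
strtol skips leading whitespace, so this is not a drop-in behavioural match
for lexical_cast.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical interface: returns true and stores the result in 'out' on
// success; returns false and leaves 'out' untouched on failure.
bool try_parse_int(char const* text, int& out)
{
    assert(text != nullptr);
    char* end = nullptr;
    errno = 0;
    long const value = std::strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return false; // no digits at all, or trailing characters
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return false; // does not fit in an int
    out = static_cast<int>(value);
    return true;
}
```

The existing throwing lexical_cast<> signature would be left as it is; a
function like this would sit next to it for callers who cannot afford
exceptions or allocations.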

> But there are two libraries in the review queue. I don't know if they give
> you enough flexibility, though.

Which ones please, I seem to be 'looking without seeing' ;)

> I hope this helps,
> Alex

Any effort is thanks worthy ;)

-- 
 "That men do not learn very much from the lessons of history is the most
important of all the lessons of history."
 Aldous Huxley
