Subject: Re: [boost] [lexical_cast] efficiency
From: OvermindDL1 (overminddl1_at_[hidden])
Date: 2010-01-18 06:12:00

On Mon, Jan 18, 2010 at 1:30 AM, Domagoj Saric <dsaritz_at_[hidden]> wrote:
> 'A while back', in the "error_code debate" I used a lexical_cast<> example for
> demonstrating certain concerns/aspects but the whole post was so long (as usual
> :) that it was probably read by less than 0.1% of people :)
> Anyways I'm extracting and reposting this bit now as I think it warrants
> attention.
> The first problem is the ('standard') "std::streams vs efficiency" issue:
>  - on msvc 9.0 sp1 the single line
>   boost::lexical_cast<int>( "321" );
> caused 14 calls to new and 26 calls to EnterCriticalSection() (not to mention
> vtable initializations, virtual function calls, usage of locales...) the
> first time it is called, and 3 calls to new and 15 calls to
> EnterCriticalSection() on subsequent calls...It also caused the binary (which
> does not otherwise use streams) to increase by 50 kB! ...which is IMNHO
> abhorrent...
> (with my usual 'put things in perspective' examples
> :)
> As a start maybe this problem could be sufficiently "lessened" by providing
> lexical_cast specializations/overloads that use C library functions (strtol,
> itoa and the likes) as they suffer less from bloat/performance issues than
> std::streams. Ideally, IMHO, lexical_cast<> should provide the
> power/configurability of boost::numeric_cast (so that one can, for example, say
> I do not need/want to use locales/assume ASCII strings and things like that).
> The second problem is the use of exceptions (or better of only using
> exceptions):
> if, for example, one uses lexical_cast<> deep in the bowels of some parser it
> is probably "natural" to use a bad_cast exception to report an error to
> the outside world but if one uses lexical_cast<> to convert a string entered
> into a widget by a user into an int it would mostly be simpler to have a simple
> error code if the user entered an invalid string and issue a warning and retry
> "at the face of the place" (without the need for try-catch blocks)... In other
> words maybe a dual/hybrid approach of using both exceptions and error codes
> (through different overloads) would be justified.
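
As a rough illustration of the error-code style suggested above, here is
a minimal sketch built on strtol alone. The name try_parse_int is made
up for this example and is not part of Boost or lexical_cast; it only
shows how a failed conversion could be reported through a return value
instead of an exception.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper, not a Boost API: converts a decimal string to int
// with strtol and reports failure through the return value.
// Like strtol itself, it accepts leading whitespace and an optional sign.
bool try_parse_int(const std::string& text, int& out) {
    if (text.empty()) return false;

    errno = 0;
    char* end = 0;
    const long value = std::strtol(text.c_str(), &end, 10);

    if (end == text.c_str()) return false;   // no digits were consumed
    if (*end != '\0') return false;          // trailing characters remain
    if (errno == ERANGE) return false;       // value does not fit in long
    if (value < INT_MIN || value > INT_MAX) return false;  // nor in int

    out = static_cast<int>(value);
    return true;
}
```

A caller validating user input can then test the result directly, for
example `if (!try_parse_int(field, n)) { /* warn and retry */ }`, with
no try-catch block and no stream or locale machinery involved.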

Boost.Lexical_cast is *slow* for sure. I have been writing my own
overloads for my own projects that use Boost.Spirit 2.1 behind it
instead of the stream. Boost.Spirit2 in the trunk (2.2, 2.3?), most
likely to be released in Boost 1.43, can build parsers based on the
return type, which would allow a very generic lexical_cast that would
be a *GREAT* deal faster. Even for your example of
boost::lexical_cast<int>( "321" ), Spirit2 could still parse that
faster than the native C function atoi. You can already do this pretty
easily with Spirit2 in the trunk, like this (not sure this code has the
proper identifier spelling, but close enough; it works like this):
  int result;
  std::string input = "321";
  boost::spirit::qi::gen_parse(input.begin(), input.end(), result);
And yes, as stated, that will execute faster than the native C
functions atoi/strtol/etc... Spirit includes a benchmark that you can
run yourself as proof. Plus, with Spirit, you can customize your
grammar inline too, like:
  tuple<int,double> result;
  std::string input = "[ 42, 3.14 ]";
  boost::spirit::qi::phrase_parse(input.begin(), input.end(),
      '[' >> int_ >> ',' >> double_ >> ']', blank, result);
Or even other more complicated things like:
  tuple<int,std::vector<double> > result;
  std::string input = "( 42, [3.14,1.2, 3.4] )";
  boost::spirit::qi::phrase_parse(input.begin(), input.end(),
      '(' >> int_ >> ',' >> '[' >> double_ % ',' >> ']' >> ')', blank, result);
  // pseudo-code: assert(result==make_tuple(42,std::vector<double>(3.14,1.2,3.4)));
Or the above using the generator parser:
  tuple<int,std::vector<double> > result;
  std::string input = "42,3.14,1.2,3.4";
  boost::spirit::qi::gen_phrase_parse(input.begin(), input.end(),
result, lit(','));
  // pseudo-code: assert(result==make_tuple(42,std::vector<double>(3.14,1.2,3.4)));

So if you want speed, use something else, like Boost.Spirit.
Boost.Lexical_cast is made for simplicity, not speed, although much of
it could certainly be sped up if Boost.Spirit became part of its
back-end. With some template magic it could even fall back to
stringstream if Spirit does not know how to parse the type directly
(that is, for types where you would otherwise need to supply a grammar).
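
The fallback idea can be sketched with plain template specialization.
This is only an illustration under assumptions: the names converter and
fast_cast are hypothetical, and strtol stands in for the Spirit fast
path so that the sketch needs nothing beyond the standard library. The
primary template goes through a stringstream; the int specialization
bypasses it.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical names throughout; this is not how lexical_cast is implemented.

// Generic fallback: parse through a stringstream and require that the
// whole input was consumed.
template <typename Target>
struct converter {
    static Target apply(const std::string& text) {
        std::istringstream in(text);
        Target value;
        if (!(in >> value) ||
            in.peek() != std::istringstream::traits_type::eof())
            throw std::invalid_argument("fast_cast: stream conversion failed");
        return value;
    }
};

// Fast path for int: no stream, no locale, no allocation.
// (strtol is a stand-in here for a Spirit-based parser.)
template <>
struct converter<int> {
    static int apply(const std::string& text) {
        errno = 0;
        char* end = 0;
        const long value = std::strtol(text.c_str(), &end, 10);
        if (text.empty() || end == text.c_str() || *end != '\0' ||
            errno == ERANGE || value < INT_MIN || value > INT_MAX)
            throw std::invalid_argument("fast_cast: not an int");
        return static_cast<int>(value);
    }
};

// Single entry point; the specialization is selected at compile time.
template <typename Target>
Target fast_cast(const std::string& text) {
    return converter<Target>::apply(text);
}
```

With this layout, fast_cast<int>("321") takes the strtol path, while
fast_cast<double>("3.5") falls back to the stream. Adding another fast
path means adding another specialization, without touching callers.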
