
From: Terje Slettebø (tslettebo_at_[hidden])
Date: 2002-05-27 06:27:48


>From: "Kevlin Henney" <kevlin_at_[hidden]>

Well, hello there, Kevlin. :)

In article <1a4901c20256$e51f1e20$60fb5dd5_at_pc>, Terje Slettebø
<tslettebo_at_[hidden]> writes
>Following a suggestion at the Boost User's list, I've updated the
>lexical_cast version in the Boost files section
>(http://groups.yahoo.com/group/boost/files/lexical_cast_proposition/) to use
>implicit conversion, where possible. This means it uses the fastest and most
>accurate conversion available, and only resorts to using stringstream when
>it has to.
>
>It uses boost::is_convertible for this. Even if that may not work on all
>compilers, if it doesn't work, it will just resort to using stringstream.

>This seems to roundly defeat the point of lexical_cast, and I'm afraid
>that these changes would not be acceptable.

Yes, this is what I've come to, too.

As I said in my later postings, I now have a better understanding of what
lexical_cast is supposed to do. As I understand it, it's supposed to use the
type's lexical representation (the stream operators). You say this here as
well. Anything else might instead be implemented as a separate component,
like your suggested interpret_cast.

However, what about the following cases (I've included the complete test
program for easy testing):

--- Start ---

#include <iostream>
#include <string>
#include "boost/lexical_cast.hpp"

template<class Target, class Source>
void test(Target target, Source source, int n)
{
  try
  {
    if(boost::lexical_cast<Target>(source) != target)
      std::cout << "Case " << n << " - Not equal\n";
  }
  catch(const std::exception &e)
  {
    std::cout << "Case " << n << " - " << e.what() << '\n';
  }
}

int main()
{
  // Case 1 - ' ' -> ' ' - Throws bad_lexical_cast
  test(' ', ' ', 1);

  // Case 2 - std::string(1,' ') -> std::string(1,' ') - Throws bad_lexical_cast
  test(std::string(1,' '), std::string(1,' '), 2);

  // Case 3 - ' ' -> std::string(1,' ') - Throws bad_lexical_cast
  test(std::string(1,' '), ' ', 3);

  // Case 4 - std::string(1,' ') -> ' ' - Throws bad_lexical_cast
  test(' ', std::string(1,' '), 4);

  // Case 5 - std::string("A string with whitespace") ->
  // std::string("A string with whitespace") - Throws bad_lexical_cast
  test(std::string("A string with whitespace"),
       std::string("A string with whitespace"), 5);
}

--- End ---

Some of this may be quite easily fixed, by turning off whitespace ignoring
in the stringstream, using code equivalent to:

interpreter << std::noskipws;
interpreter >> std::noskipws;

This fixes cases 1 and 4, but cases 2, 3 and 5 still throw an exception. This
is because the std::string extractor itself skips any leading whitespace, and
stops reading at the first whitespace following non-whitespace.

Then there's also the case of passing empty strings.

Another thing is the extension to handle other character types, such as wide
characters. As I've mentioned, this may be selected based on the arguments.

>For a start, is_convertible
>is too liberal in its behaviour. It uses purely language-based
>conversions rather than on the lexical representation of the type. It
>will give the wrong result in a number of easily identifiable and not so
>readily identifiable situations. The only so-called 'optimisations' that
>are unquestionably reasonable are identity conversions.

>>However, this does mean that the semantics is changed slightly. In the cases
>>where an implicit conversion exists, it may give a different result than if
>>not, in the case of char/wchar_t. Without implicit conversion, you get 1 ->
>>'1'. With implicit conversion, you get 1 -> 1 (ASCII 1). As far as I can
>>tell, this only affects conversions between char/wchar_t and other types,
>>though. If this is a problem, please let me know, and I can change it to
>>make an exception for char/wchar_t.

>You will end up creating a patchwork of special cases as each one is
>identified in turn: char <-> int, double -> int, pointer conversions,
>etc.

Yeah. Like you say, and others have said, implicit conversion is really a
different concept.

>>The reason this is optional, is that when enabled, it relies on having a
>>static stringstream object.

>These options were explored -- and dropped -- in the run-up to the
>original lexical_cast release. The static option is not an option.

I understand. Yes, lack of thread safety, for one thing, is not good.

>>- It may well make lexical_cast faster, as it avoids the creation and
>>destruction of a stringstream object, each time it's called.
>
>Convenience rather than performance is the primary reason that people
>will be using lexical_cast. Introducing new and subtle forms of
>incorrect behaviour is not really that attractive or convenient :->

I agree. :)

It's important that it's safe to use.

>>This way, one can implement all the features of the "wish list" in the
>>previous posting in this thread, such as:
>>
>>- wlexical_cast - A version that can use wide characters

>A separate cast function should not be required.

I agree. This was quoted from that list. What I meant was that this could be
done using lexical_cast, without having a separate name. That was also what
was implemented.

>>- basic_lexical_cast - A version that takes the interpreter (stringstream)
>>as a parameter, to support things like changing locale

>Use std::stringstream.

That's an option. I agree with you that it doesn't look much like a cast
anymore with that approach, so dropping it is OK by me.

However, like I said earlier here, even when considering that lexical_cast
should do a strict lexical casting, there may still be issues involving
whitespace, with conversions involving char and std::string (and other
character types).

>>- Being able to set number base (dec, oct or hex). This would be implicitly
>>possible using basic_lexical_cast

>Use std::stringstream.

>>- Being able to set precision for numerical conversions. This may also be
>>done with basic_lexical_cast

>The correct fix for this is to use the maximum precision. Otherwise...
>use std::stringstream :->

I get the point. :)

>>- Checking for change of sign, when using unsigned numbers. This is
>>addressed using an integer adapter, in the other lexical_cast_propositions

>I have not kept up to date with all the proposals. Do you have an
>example of this issue?

I found it when searching the list archive. It was a while ago. The postings
are here (http://aspn.activestate.com/ASPN/Mail/Message/1156496).

>>The lack of formatting is what is largely the reason I haven't used
>>lexical_cast much. With control over the formatting, it may be more widely
>>applicable.

>Use std::stringstream :->

So, if I understand you right, you're saying I should use stringstream? :)
Is that what you're trying to tell me? :)

>>Despite the extra code, to handle implicit conversions, where available, by
>>examining the resulting assembly output (from Intel C++), it's in fact able
>>to optimise _all_ of it away, producing identical code as if built-in
>>conversions were used directly.
>>
>>Here's an example.
>>
>>int n;
>>double d=1.23;
>>
>>int main()
>>{
>>n=boost::lexical_cast<int>(d);
>>}
>
>This is designed to fail with the current boost::lexical_cast. This is
>intentional behaviour: there was originally a discussion about how
>liberal/strict interpretation should be, and the general consensus leant
>towards strict. This design decision is not about to be reversed as
>there is code that depends on exactly that behaviour.

I understand and I think that makes sense, too. This way it uses a strict
lexical cast, using only the stream operators.

Regards,

Terje


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk