
Boost :

From: Terje Slettebø (tslettebo_at_[hidden])
Date: 2002-12-30 17:52:15


>From: "Thomas Witt" <witt_at_[hidden]>

> On Sunday 29 December 2002 01:29, Terje Slettebø wrote:
> > > there...) Anyway, the surprising result of this was that the space in
> > > "Hello there" caused the interpreter.eof() check in lexical_cast to
> > > fail.
> > > So if that check were not present, dest would have been assigned the
> > > value "Hello". Fortunately, the check was there, and instead I got a
> > > bad_lexical_cast exception.
> >
> > This issue comes up at regular intervals, since this is one of the known
> > problems of lexical_cast, but it hasn't yet been fixed. See e.g. this
> > posting (http://aspn.activestate.com/ASPN/Mail/Message/1454894).
> >
> > A proposition has been made to fix this and other things
> > (http://groups.yahoo.com/group/boost/files/lexical_cast_proposition/),
> > and I'll get to update it properly, and write the docs for it, soon. It
> > should work correctly as it is, as it's been tested in an extensive unit test
> > (also found at the same place).
> >
> > > Is this the way lexical_cast is intended to work?
> >
> > No, Kevlin has acknowledged that this is a known problem with the current
> > version of lexical_cast, the handling of whitespace in strings and
> > characters.
>
> I am still uncertain whether this is a problem with lexical_cast and whether
> it should be fixed.
>
> The stated purpose of lexical_cast is type conversion through string
> representation.

Well, the way it's implemented is to use a type's _stream_ representation, not
its string representation. That it uses std::stringstream (i.e. a string as the
underlying buffer of the stream) internally, for some of the conversions, is
really just an implementation detail. At least, this is my understanding of
its stated purpose - to use the stream representation. Perhaps you agree?

This means that allowing whitespace in strings and characters read into
std::string is a special case: stream extraction (operator>>) into a
std::string skips leading whitespace and stops reading at the next whitespace
character, so it wouldn't ordinarily allow that. Nevertheless, handling this
special case appears to be perceived as useful, as it provides more uniform
treatment of conversions. Consider:

int i=123;
point2d p(1,3); // user-defined type; its operator<< is assumed to write "(1, 3)"

std::string s1=boost::lexical_cast<std::string>(i);

// s1="123"

std::string s2=boost::lexical_cast<std::string>(p);

// Without allowing whitespace - throws exception
// Allowing whitespace - s2="(1, 3)"

I think it's reasonable to allow both to succeed. That requires a special
case for std::string.
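
To make the operator>> behaviour concrete, here is a minimal sketch that uses only the standard library. The helper names extract_word and extract_word_demo are made up for illustration and are not part of lexical_cast. The sketch writes some text to a std::stringstream, extracts a std::string from it, and reports whether the whole buffer was consumed, which is roughly what the eof() check is testing for.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Illustrative helper (hypothetical, not part of Boost): write `text` to a
// stringstream, then extract a std::string with operator>>, the way a
// stream-based conversion would. `consumed_all` reports whether the whole
// buffer was read.
std::string extract_word(const std::string& text, bool& consumed_all)
{
    std::stringstream interpreter;
    interpreter << text;

    std::string result;
    interpreter >> result;  // skips leading whitespace, stops at the next one

    // peek() yields eof only when nothing is left to read.
    consumed_all = (interpreter.peek() == std::char_traits<char>::eof());
    return result;
}

// Small usage check for the helper above.
inline void extract_word_demo()
{
    bool all = false;
    assert(extract_word("123", all) == "123" && all);      // fully consumed
    assert(extract_word("(1, 3)", all) == "(1," && !all);  // cut at the space
}
```

With "(1, 3)" only "(1," is extracted and the rest stays in the buffer, so a conversion that insists on an exhausted stream has to report failure.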

> I think this is a simple but powerful concept.
>
> To me the actual problem is not in lexical_cast but in the std::basic_strings
> stream operator semantics. Basically you cannot read strings containing
> whitespace as a whole. I.e. the integrity of a string containing whitespace
> is lost once you streamed it.

The problem arises only when reading _into_ a std::string. The purpose of
skipping and terminating on whitespace is to determine where the reading
should start and finish. However, in this case we already know where it
should start and finish: it should read the whole std::stringstream buffer.
Therefore, returning the std::stringstream's buffer (using str()) when the
target is std::string is, in my opinion, reasonable.
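
As a rough sketch of that idea (this is neither the current Boost implementation nor the code in the proposal; sketch_cast, bad_sketch_cast, detail::reader and sketch_cast_demo are hypothetical names used only here), the general case can extract with operator>> and require the buffer to be exhausted, while a specialisation for a std::string target simply returns str():

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical exception type for this sketch.
struct bad_sketch_cast : std::runtime_error
{
    bad_sketch_cast() : std::runtime_error("bad sketch_cast") {}
};

namespace detail
{
    // General case: extract with operator>> and require that nothing is
    // left in the buffer afterwards.
    template <typename Target>
    struct reader
    {
        static Target read(std::stringstream& interpreter)
        {
            Target result;
            if (!(interpreter >> result) ||
                interpreter.peek() != std::char_traits<char>::eof())
                throw bad_sketch_cast();
            return result;
        }
    };

    // Special case: a std::string target takes the whole buffer,
    // whitespace included (an empty buffer gives an empty string).
    template <>
    struct reader<std::string>
    {
        static std::string read(std::stringstream& interpreter)
        {
            return interpreter.str();
        }
    };
}

template <typename Target, typename Source>
Target sketch_cast(const Source& source)
{
    std::stringstream interpreter;
    if (!(interpreter << source))
        throw bad_sketch_cast();
    return detail::reader<Target>::read(interpreter);
}

// Small usage check for the sketch above.
inline void sketch_cast_demo()
{
    assert(sketch_cast<std::string>("Hello there") == "Hello there");
    assert(sketch_cast<int>("42") == 42);
}
```

Other targets keep the strict behaviour: in this sketch, sketch_cast<int>("12 34") still throws, because the buffer is not exhausted after reading 12.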

> This is a fundamental if at times undesirable property of std::basic_string
> and char const* for that matter. I don't know whether it is a good idea for
> lexical_cast to try to fix it.

Comments on the above?

If Boost/Kevlin (especially the latter) thinks there should be no special-case
treatment of whitespace and empty strings, then I'll go with that, of course.
However, it seems from his recent reply here, as well as the "Future
directions" info in the docs, that he finds it reasonable as well. After
all, it was a steady trickle of problem reports regarding whitespace that
led to the current proposal in the first place. Since then, wide character
support has been added as well.

Regards,

Terje


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk