Boost logo

Boost :

From: John Salmon (john_at_[hidden])
Date: 2007-01-30 13:03:20


On 1/30/07, Alexander Nasonov <alnsn_at_[hidden]> wrote:
> John Salmon <john <at> thesalmons.org> writes:
> > Yes. It's possible to write more "defensive" inserters and
> > extractors. That's beside the point. The patch allows lexical_cast
> > to work with Udts whose implementations are not so careful. It does
> > so while passing all existing regressions (except for two which are
> > modified to return a very reasonable result rather than throwing a
> > somewhat surprising exception).
>
> I don't have an access to my home system to check if your patch doesn't
> break this:
t>
> BOOST_CHECK_EQUAL(" a b c " == lexical_cast<std::string>(" a b c "));
>
> Does it?

The patch satisfies that test on my system (Linux, gcc-3.3.3). I was
surprised that such a test was not already in the regression suite.
Something like it obviously should be, so I added some very similar
tests just to be sure. These are in the original patch:

diff -u -r1.24 lexical_cast_test.cpp
--- libs/conversion/lexical_cast_test.cpp 28 Oct 2006 19:33:32
-0000 1.24
+++ libs/conversion/lexical_cast_test.cpp 27 Jan 2007 23:47:03 -0000
@@ -215,6 +213,12 @@
    BOOST_CHECK_EQUAL(" ", lexical_cast<std::string>(" "));
    BOOST_CHECK_EQUAL("", lexical_cast<std::string>(""));
    BOOST_CHECK_EQUAL("Test", lexical_cast<std::string>(std::string("Test")));
+ BOOST_CHECK_EQUAL("TrailingSpace ",
lexical_cast<std::string>(std::string("TrailingSpace ")));
+ BOOST_CHECK_EQUAL("TrailingSpace ",
lexical_cast<std::string>("TrailingSpace "));
+ BOOST_CHECK_EQUAL(" LeadingSpace ",
lexical_cast<std::string>(std::string(" LeadingSpace ")));
+ BOOST_CHECK_EQUAL(" LeadingSpace ", lexical_cast<std::string>("
LeadingSpace "));
+ BOOST_CHECK_EQUAL(" Embedded Space ",
lexical_cast<std::string>(std::string(" Embedded Space ")));
+ BOOST_CHECK_EQUAL(" Embedded Space ", lexical_cast<std::string>("
Embedded Space "));
    BOOST_CHECK_EQUAL(" ", lexical_cast<std::string>(std::string(" ")));
    BOOST_CHECK_EQUAL("", lexical_cast<std::string>(std::string("")));

>
> > Nothing in lexical_cast's documentation even hints that one should
> > beware of whitespace when reading and writing from streams. There is
> > no reason for lexical_cast to insist on this degree of care and
> > attention to detail in the implementations of user-defined types. The
> > naive pair of operator<< and operator>> in the original post work
> > perfectly well when operating on iostreams in their standard, default
> > state. They work perfectly well for the user who never heard of
> > lexical_cast who uses the standard istringstream idiom:
> >
> > istringstream iss("13 19");
> > Udt Foo;
> > iss >> Foo;
> > assert( Foo == Udt(13, 19) );
> >
> > Shouldn't lexical_cast do just as well? When I see something like the
> > above in a colleague's code, and I suggest lexical_cast, should I
> > really have to also warn him about carefully handling whitespace in
> > the Udt extractor?
>
> I'll think about it. But noskipws is there for a good reason. You can
> search the boost archive.

Yes. There have been a few discussions over the years. There are two
basic themes:
1 - somebody thinks that lexical_cast ought to work on a type whose
operator>> isn't smart enough to handle a non-default skipws flag.
The response is typically "your operator>> is broken", followed by
more or less discussion about whether lexical_cast should accommodate
"broken" operator>> anyway. We are repeating this theme.

2 - a desire for consistency between the handling of leading
whitespace and trailing whitespace.

I would suggest that the fact that questions of type 1 keep coming up
is an indication that the current behavior is surprising and
alternatives should be explored. If we can change it so that it is
less surprising to "newbies" without breaking anything else and
without introducing new complexity, I think we should.

I would suggest that the answer to questions of type 2 is to observe
that it's not trailing whitespace that causes the exception, it's
trailing *anything*. As such, it should be clearly documented. In
fact, the proposed text for the specification of lexical_cast in TR2
makes this very clear.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1973.html

Boost's documentation should be reconciled with the TR2 proposal.
Here's an excerpt:

<snip>

lexical_cast behavior is specified in terms of operator<< and
operator>> on a std::basic_stringstream object. Implementations are
not required to actually use a std::basic_stringstream object to
achieve the required behavior. Implementations are permitted to
provide specializations of the lexical_cast template.

    [Note: Implementations may use this "as if" leeway to achieve
efficiency. -- end note.]

template<typename Target, typename Source>
  Target lexical_cast(const Source& arg);

    Effects:

        * Inserts arg into an empty std::basic_stringstream object via
 operator<<.
        * Extracts the result, of type Target, from the
std::basic_stringstream object via operator>>.

    Throws: bad_lexical_cast if:

        * Source is a pointer type.
        * fail() for the std::basic_stringstream object is true after
either operator<< or operator>> is applied.
        * get() for the std::basic_stringstream object is not
std::char_traits<char_type>::eof() after both operator<< and
operator>> are applied.

    Returns: The result as created by the effects.

    Remarks: If Target is either std::string or std::wstring, stream
extraction takes the whole content of the string, including spaces,
rather than relying on the default operator>> behavior.
</snip>

>
> --
> Alexander Nasonov
>
> _______________________________________________
> Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
>
>


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk