(Sent as HTML, as it contains a table)

>From: "Gennaro Prota" <gennaro_prota@yahoo.com>

>On Thu, 25 Jul 2002 06:28:15 +0200, Terje Slettebø
><tslettebo@chello.no> wrote:
>
>>By the way, I wasn't actually sure what to put as string-terminator, there.
>>'\0' might be an alternative, but as this is supposed to work for other
>>character types, as well (such as wchar_t), as I understand, this would mean
>>that for wchar_t, you would have to have L'\0', instead. Therefore, I
>>guessed that NULL or 0 would be ok..

>Though NULL is required to expand to something that evaluates to 0 and
>so would give no error in that context, it is actually intended for
>pointers

I know. NULL was only used earlier, to kind of signify that this was used as
a "special value", and because I thought that NULL was required to be a
macro for int 0. However, I agree that this is an inappropriate place for
it (it's not a pointer, as you say), and the standard apparently doesn't
guarantee that it's a macro for int, only a macro for an integral value. So
this is now changed to 0 in the new version. Whether even this is sufficient
is an open question, as you point out below.

> (As you know, it only works because the language lacks a true
>'null' and the macro NULL expands to an expression that has *integral*
>type) . So a plain 0 is IMHO better style.

I agree.

> BTW I wandered through the
>standard looking for a guarantee that 0 converts to L'\0'. Where is
>it?

Where, indeed. :)

This is a problem of generic programming, involving literals with different
syntax, such as '\0' and L'\0'.

This was brought up in a recent "C Vu" article by Francis Glassborow ("C Vu"
June 2002, p.12, "Trouble with Literals". "C Vu" is the journal of the
ACCU). To quote: "We do not have a syntax to deal with the type of a literal
in a generic way."

Even if we use '\0' for char, and L'\0' for wchar_t, how do we generically
specify the end of the string of an unknown character type? There appears to
be no character trait for it. Perhaps there should be? Is there a portable
way to deal with endings of strings of arbitrary character types?

I used 0 to circumvent the problem of using '\0', L'\0', or some unknown
value, to mark the end of the string of arbitrary character type, and I
hoped it would convert to the appropriate end of string for the character
type. However, as you say, I haven't found a guarantee for this in the
standard.

What one could do, here, is to specialise pointer_to_char_base, for char and
wchar_t, to use the appropriate '\0' and L'\0', respectively, and leave it
to use 0 (the base template) for unknown character types. Perhaps it's
safest to do this?
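
A minimal sketch of that idea, to show what I mean. The names
string_terminator and generic_length are made up for illustration (they are
not part of lexical_cast or the standard library), and the base template
still relies on 0 converting to the right value for unknown character types:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical trait, for illustration only.
// Base template: used for unknown character types, falls back to 0.
template<typename CharT>
struct string_terminator
{
    static CharT value() { return CharT(0); }
};

// char: use the narrow literal.
template<>
struct string_terminator<char>
{
    static char value() { return '\0'; }
};

// wchar_t: use the wide literal.
template<>
struct string_terminator<wchar_t>
{
    static wchar_t value() { return L'\0'; }
};

// Example use: count characters up to the terminator, for any character type.
template<typename CharT>
std::size_t generic_length(const CharT* s)
{
    std::size_t n = 0;
    while (s[n] != string_terminator<CharT>::value())
        ++n;
    return n;
}
```

With this, generic_length("abc") and generic_length(L"ab") each compare
against a terminator of the matching literal type, and only other character
types go through the conversion from 0.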

>>Oops, you're right. Will be fixed. This is one that was missed by the unit
>>tests, simply because there was no test for a case where it should throw
>>an exception. The test system is made to enable such tests, as well, I just
>>hadn't included those tests. I'll add tests that check that it throws an
>>exception when expected, as well.
>>
>>Isn't it typical? The one thing you don't test for, is where you get a bug.
>>:)

>Yes, but even the test could have apparently "worked" (that's
>undefined behavior...)

I know what you mean. However, in this case "worked" means it should throw
an exception, if passed an empty string, so if that didn't happen, the test
would pick it up.

>>Actually, the code above doesn't throw an exception, if you try to convert
>>an empty string to a char, which I think it should, as mentioned, so it
>>should probably be changed to:
>>
>>if(arg[0] == 0 || arg[1] != 0)
>>  throw bad_lexical_cast();

>Yes, in the first place I thought you would like converting "" to '\0'
>and that's why I gave that code. This morning I began to think that
>empty strings would have better been punished with an exception, but,
>as expected, my newsreader told me you already did it :-)

:)

Yeah, I think the principle of least surprise may favour an exception, if
you try to convert an empty string to a character.
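
To make the check concrete, here is a rough, self-contained sketch of it,
together with a small helper for testing the failure path. The names
bad_char_conversion (a stand-in for bad_lexical_cast), string_to_char and
throws_on are hypothetical, and this is not the actual lexical_cast code:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for bad_lexical_cast, so the sketch is self-contained.
struct bad_char_conversion : std::runtime_error
{
    bad_char_conversion()
        : std::runtime_error("expected a string of exactly one character") {}
};

// Converts a zero-terminated string to a single character.
// Throws if the string is empty or has more than one character.
template<typename CharT>
CharT string_to_char(const CharT* arg)
{
    // arg[0] is checked first, so arg[1] is never read for an empty string.
    if (arg[0] == 0 || arg[1] != 0)
        throw bad_char_conversion();
    return arg[0];
}

// Returns true only if the conversion throws, so a unit test can check
// the cases where an exception is the expected result.
template<typename CharT>
bool throws_on(const CharT* arg)
{
    try { string_to_char(arg); }
    catch (const bad_char_conversion&) { return true; }
    return false;
}
```

A test for the empty-string case then just checks that throws_on("") is
true, which is the kind of test that was missing.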

>>>2) Stupid question of the day: is there any reason why all Source
>>>function parameters cannot be declared as Source const &?
>>
>>It's not a stupid question. :) Well, for some types, it may be more
>>efficient to pass by value, than by reference. Pass by reference typically
>>passes the address of the object, so for small types, just passing the
>>object may be more efficient, as you then avoid the indirection, when
>>operating on the object.

>Well, I know that. Actually I missed the reference to const in the
>select_base mechanism!

Ah, I guessed you knew, so I wondered. Then I understand. :)

The boost::call_traits also deals with the "reference to reference" problem.

By the way, the simulated partial specialisation only handles char and
wchar_t, while the partial specialisation handles any character type. So the
simulated version can't really replace the other one.
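
For illustration, the basic idea of choosing the parameter type can be
sketched as below. This is a deliberately simplified, hypothetical trait
named param, not the real boost::call_traits implementation, and it assumes
a compiler with C++11 <type_traits>, which is newer than the compilers
discussed here:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical, simplified trait in the spirit of call_traits<T>::param_type.
template<typename T>
struct param
{
    typedef typename std::conditional<
        std::is_scalar<T>::value,
        T,          // small built-in types: pass by value
        const T&    // everything else: pass by reference to const
    >::type type;
};

// A reference type is passed through unchanged, so no
// "reference to reference" is ever formed.
template<typename T>
struct param<T&>
{
    typedef T& type;
};
```

Here param<int>::type is int, param<std::string>::type is
const std::string&, and param<int&>::type stays int&.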

>(BTW, would we need a similar technique for std::max? :-)

Well, std::min and std::max are already defined to take their parameters by
const reference, what do you mean? Except perhaps to fix the problem
mentioned above.

>This leads to the stupid question of the month: why not Source const?

Well, what would it solve? Perhaps if the output operator is a const member
function, and lexical_cast is passed a non-const object?

>P.S.: Of course I know the C++ implications. What I'm trying to
>discover are boost's guidelines about these sorts of things (I'm quite
>new to this list), so forgive me if I'm asking something that
>everybody knows here.

I'm quite new here, myself, so no worry. :) You've been a good help.

>>By the way, this works correctly on Intel C++ 7.0 pre-beta... Perhaps if you
>>complain about this, you'll get that, too. :) I got that version, after I
>>reported some ICE when trying to compile BLL. They gave me that, to try
>>again. It still doesn't work, but other things, such as this, do.
>>
>>There are also other things that work on 7.0 pre-beta, such as Loki's
>>SmartPtr.h, which doesn't work on 6.0.

>Very odd. Another oddity is this: with your (previous) unit test I
>get no error compiling with VC++6.0, either using its original
>standard library or STLport 4.5.3 with SGI iostreams.

I've tested it using the same setup, so that makes sense. :)

>If I use Intel
>C++ 6.0 instead, all the tests with (unsigned short) wide-characters
>fail with both libraries.

>So, even if both compilers lack a distinct wchar_t type, VC++ 6.0
>works well with both libraries and Intel C++ 6.0 with none of them.
>Any clue?

I agree that it's odd. When working on that version, I spent a lot of time
building STLPort for the various compilers, to try to get some sensible
behaviour out of this. What I found is that many implementations have poor
support for wide characters. I summarize my findings in the table at the
end, here.

I altered the unit test, now, to try to get to the bottom of this, so that it
also outputs the desired result of the conversion (it used to output just the
source and target, and not the specified correct target).

Using this change, the answer became clear. For the following line (the
parameters are "do_test(correct_target,source,line)"):

test<int,wchar_t>::do_test(1,L'1',__LINE__);

Using Intel C++ 6.0 with intrinsic wchar_t (/Zc:wchar_t option), and debug
output (note that typeid(Type).name() still reports it as "unsigned short"):

Test - Succeeded (line 270)
Source type      = unsigned short (1)
Destination type = int (1) (Should have been (1))

Using Intel C++ 6.0, with no intrinsic wchar_t:

Test - Failed (line 270)
Source type      = unsigned short (49)
Destination type = int (49) (Should have been (1))

Using VC++ 6.0 (no intrinsic wchar_t), and debug output:

Test - Succeeded (line 270)
Source type      = unsigned short (49)
Destination type = int (1) (Should have been (1))

As you can see, Intel C++, with and without intrinsic wchar_t, works as it's
supposed to: without intrinsic wchar_t, L'1' really is an unsigned short, so
it's consistently treated as the number 49. VC++ 6.0, however, behaves
weirdly. It shows L'1' as "49", yet when passing it to stringstream, it
interprets it as if it's actually wchar_t, and writes it as "1". So it's kind
of a partial wchar_t support, there, where they have made "unsigned short"
behave as "wchar_t".
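
The difference can be reproduced with a small sketch like the following,
where to_wide_text is a hypothetical helper (not part of the unit test). It
assumes an implementation where wchar_t is a distinct type and L'1' has the
value 49, as in ASCII-compatible encodings:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Writes a value to a wide string stream and returns the resulting text.
template<typename T>
std::wstring to_wide_text(const T& value)
{
    std::wostringstream stream;
    stream << value;   // picks the operator<< overload for T
    return stream.str();
}
```

Under those assumptions, to_wide_text(L'1') should give the text "1", since
the character overload is chosen, while
to_wide_text(static_cast<unsigned short>(L'1')) should give "49", since the
value is formatted as a number. When wchar_t is only a typedef for unsigned
short, the stream has no way to tell the two cases apart.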

By the way, when checking this, I also found a bug in the "printer"
function, with code such as "stream << "..." ", where "stream" could be any
kind of stream (including using wide characters). Oops. Fixed now. This is
something that just happened to work, earlier, but which wasn't correct.

Strictly speaking, the test, if it reports failure, is non-conformant, as it
may use output to both std::cout and std::wcout, for the info of the
failure, in the same program, which as I understand is not allowed.

However, this is only in case of failure, as it otherwise doesn't write
anything but the result of the test. Besides, it appears to generally work,
anyway, and this is just for debugging. It seems it needs stream flushing,
when switching stream type, to work, at least.

If this would be a problem, the output could be turned off, when run as part
of the Boost regression tests.


Here are the results of running the unit test on the first version uploaded
(using Boost 1.28):
 
Platform       Compiler        Library                           Result          Remark
----------------------------------------------------------------------------------------------------------
Windows 2000   Intel C++ 6.0   MSVC standard library (Default)   100% Passed     Needs /Zc:wchar_t option
Windows 2000   Intel C++ 6.0   STLPort 4.5.3                     100% Passed     Needs /Zc:wchar_t option
Windows 2000   MSVC 6.0        MSVC standard library (Default)   100% Passed     No partial specialisation
Windows 2000   MSVC 6.0        STLPort 4.5.3                     100% Passed     No partial specialisation
Windows 2000   BCC 5.5         Rogue Wave 2.1.1 (Default)        100% Passed     Bad PS, not used
Windows 2000   BCC 5.6 (BCB 6) STLPort 4.5.0 (Default)           100% Passed     Bad PS, not used
Windows 2000   gcc 2.95.3      SGI standard library (Default)    100% Passed     (*)
 
(*) No wide character support in library, 45/104 tests not supported. gcc
2.95.x may also need #define BOOST_NO_STRINGSTREAM, as config.hpp is unable
to detect it.

The updated version is uploaded.

Wow, peer-review really works. :)

Thanks. :)


Regards,

Terje