
I've run the same tests with gcc 3.2.2 on RH9 without any problems, so I'll post this on the microsoft.public.vc.language newsgroup.

Keith MacDonald

"Keith MacDonald" <boost@mailclan.net> wrote in message news:c177ag$79p$1@sea.gmane.org...
tokenizer worked in Unicode for me, so I experimented with your example to find out what made the difference. To simplify building in different modes, I changed it to the following:
// ==== BEGIN CODE ====
// Unicode Build:   cl /D_UNICODE /EHsc /IF:\Dev\boost_1_31_0 tok.cpp
// DBCS Build:      cl /EHsc /IF:\Dev\boost_1_31_0 tok.cpp
//
#include <string>
#include <iostream>
#include <boost/tokenizer.hpp>

#ifdef _UNICODE
typedef std::basic_string<wchar_t> string_t;
#define _T(x) L##x
#define STDOUT std::wcout
#else
typedef std::basic_string<char> string_t;
#define _T(x) x
#define STDOUT std::cout
#endif

typedef string_t::value_type char_t;

typedef boost::tokenizer
<
    boost::char_separator<char_t>,
    string_t::const_iterator,
    string_t
>
MyTokenizer;

const boost::char_separator<char_t> sep(_T("a"));

int main()
{
#ifdef _BUG
    // Temporary string passed straight to the constructor
    MyTokenizer token(string_t(_T("abacadaeafag")), sep);
#else
    // Named string that stays alive for the whole loop
    string_t s(_T("abacadaeafag"));
    MyTokenizer token(s, sep);
#endif

    for (MyTokenizer::const_iterator it = token.begin(); it != token.end(); ++it)
        STDOUT << *it;

    return 0;
}
// ==== END CODE ====
The following table shows the output for each combination of _UNICODE and _BUG being defined ("def") or undefined ("undef"):
_UNICODE   _BUG    Output
-----------------------------
undef      def     " bcdefg"
def        def     ""
undef      undef   "bcdefg"
def        undef   "bcdefg"
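For reference, here is a minimal sketch of what the "bcdefg" rows amount to, written without Boost so it can be built with any compiler. The helpers split_on and join_tokens are hypothetical names made up for this illustration, and the sketch assumes the separator drops empty tokens. It is not a reimplementation of boost::char_separator.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost): split `s` on the single
// character `sep`, dropping empty tokens, for any character type.
template <typename CharT>
std::vector<std::basic_string<CharT> >
split_on(const std::basic_string<CharT>& s, CharT sep)
{
    std::vector<std::basic_string<CharT> > tokens;
    std::basic_string<CharT> current;
    for (typename std::basic_string<CharT>::const_iterator it = s.begin();
         it != s.end(); ++it)
    {
        if (*it == sep)
        {
            // End of a token: keep it only if it is non-empty.
            if (!current.empty())
                tokens.push_back(current);
            current.clear();
        }
        else
        {
            current += *it;
        }
    }
    if (!current.empty())
        tokens.push_back(current);
    return tokens;
}

// Hypothetical helper: concatenate the tokens with nothing between them,
// which mirrors the `STDOUT << *it` loop in the test program.
template <typename CharT>
std::basic_string<CharT>
join_tokens(const std::vector<std::basic_string<CharT> >& tokens)
{
    std::basic_string<CharT> out;
    for (std::size_t i = 0; i < tokens.size(); ++i)
        out += tokens[i];
    return out;
}
```

Calling join_tokens(split_on(std::string("abacadaeafag"), 'a')) or the std::wstring equivalent gives the same string in both character widths, which is the behaviour the two "undef" _BUG rows show.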
It seems that the tokenizer constructor is handling both Unicode and MBCS temporary strings incorrectly, with VC7.1.
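One possibility worth ruling out, offered here as an assumption and not as a confirmed diagnosis: if the tokenizer keeps only iterators into the string it is given and never copies it, then the _BUG branch leaves it pointing into a temporary that has already been destroyed. The sketch below uses a made-up class, IterRange, to show that pattern. It is not Boost code, and safe_usage is likewise a hypothetical name.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a tokenizer-like class (not Boost code).
// Assumption for this sketch: it stores only a pair of iterators into
// the caller's string and never copies the string itself.
template <typename CharT>
class IterRange
{
public:
    typedef std::basic_string<CharT> string_type;

    explicit IterRange(const string_type& s)
        : first_(s.begin()), last_(s.end()) {}

    // Return the viewed characters with every `sep` removed.
    string_type without(CharT sep) const
    {
        string_type out;
        for (typename string_type::const_iterator it = first_; it != last_; ++it)
            if (*it != sep)
                out += *it;
        return out;
    }

private:
    typename string_type::const_iterator first_;
    typename string_type::const_iterator last_;
};

// Safe pattern, matching the non-_BUG branch: the named string outlives
// the object that points into it.
inline std::string safe_usage()
{
    std::string s("abacadaeafag");
    IterRange<char> range(s);
    return range.without('a');
}

// Unsafe pattern, matching the _BUG branch (left as a comment on purpose,
// because running it would be undefined behaviour):
//     IterRange<char> range(std::string("abacadaeafag"));
//     range.without('a');   // the temporary no longer exists here
```

If that assumption holds for the real tokenizer, the output in the _BUG rows could legitimately differ between compilers, since reading through dangling iterators has no defined result.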
Keith MacDonald