Boost Users :


From: Keith MacDonald (boost_at_[hidden])
Date: 2004-02-21 04:10:43

Next message: Keith MacDonald: "[Boost-users] Re: tokenizer and wstring with VC7.1"
Previous message: john.wismar_at_[hidden]: "Re: [Boost-users] Counting matches with Regex V4"
In reply to: Douglas G. Hanley: "[Boost-users] tokenizer and wstring with VC7.1"
Next in thread: Keith MacDonald: "[Boost-users] Re: tokenizer and wstring with VC7.1"
Reply: Keith MacDonald: "[Boost-users] Re: tokenizer and wstring with VC7.1"
Reply: Bronek Kozicki: "[Boost-users] Re: tokenizer and wstring with VC7.1"

Tokenizer worked in Unicode for me, so I experimented with your example to
find out what made the difference. To simplify building in the different
modes, I changed it to the following:

// ==== BEGIN CODE ====
// Unicode Build: cl /D_UNICODE /EHsc /IF:\Dev\boost_1_31_0 tok.cpp
// DBCS Build: cl /EHsc /IF:\Dev\boost_1_31_0 tok.cpp
// Add /D_BUG to either command to construct the tokenizer from a temporary.
//
#include <string>
#include <iostream>
#include <boost/tokenizer.hpp>

#ifdef _UNICODE
    typedef std::basic_string<wchar_t> string_t;
    #define _T(x) L##x
    #define STDOUT std::wcout
#else
    typedef std::basic_string<char> string_t;
    #define _T(x) x
    #define STDOUT std::cout
#endif

typedef string_t::value_type char_t;

typedef boost::tokenizer <
    boost::char_separator<char_t>,
    string_t::const_iterator,
    string_t
> MyTokenizer;

const boost::char_separator<char_t> sep(_T("a"));

int main()
{
#ifdef _BUG
    MyTokenizer token(string_t(_T("abacadaeafag")), sep);
#else
    string_t s(_T("abacadaeafag"));
    MyTokenizer token(s, sep);
#endif

    for (MyTokenizer::const_iterator it = token.begin(); it != token.end(); ++it)
        STDOUT << *it;

    return 0;
}
// ==== END CODE ====

The following table shows the output for each combination of _UNICODE and
_BUG being defined or undefined:

_UNICODE  _BUG   Output
-----------------------------
undef     def    " bcdefg"
def       def    ""
undef     undef  "bcdefg"
def       undef  "bcdefg"

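So with VC7.1, constructing the tokenizer from a temporary string gives the
wrong output in both the Unicode and the MBCS build, while a named string
gives the expected output in both. The difference is the temporary, not the
character type.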
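
A possible explanation, as far as I understand the library: boost::tokenizer
keeps only a pair of iterators into the container it is given, not a copy of
it. A temporary string is destroyed at the end of the statement that
constructs the tokenizer, so the later loop would read through dangling
iterators. If that is right, the garbage output is undefined behaviour in the
calling code rather than a wide-character problem. Below is a minimal sketch
of one way to sidestep the lifetime question when a temporary is convenient.
It uses only the standard library; split_owned is a hypothetical helper, not
part of Boost, and it only imitates the drop-empty-tokens behaviour seen above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of Boost: splits `text` on any character in
// `separators`, drops empty tokens, and returns the tokens as owned strings.
template <typename CharT>
std::vector<std::basic_string<CharT> >
split_owned(const std::basic_string<CharT>& text,
            const std::basic_string<CharT>& separators)
{
    typedef std::basic_string<CharT> string_type;
    typedef typename string_type::size_type size_type;

    std::vector<string_type> tokens;

    // Skip any leading separators.
    size_type start = text.find_first_not_of(separators);
    while (start != string_type::npos) {
        // End of the current token (npos if it runs to the end of `text`).
        size_type stop = text.find_first_of(separators, start);

        // substr copies the characters, so the token does not refer to `text`.
        tokens.push_back(text.substr(start, stop - start));

        // Move to the start of the next token, if there is one.
        start = text.find_first_not_of(separators, stop);
    }
    return tokens;
}
```

Because the result owns its characters, a call such as
split_owned(std::wstring(L"abacadaeafag"), std::wstring(L"a")) is safe even
though both arguments are temporaries. The arguments have to be string
objects rather than literals so that the character type can be deduced.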

Keith MacDonald

"Douglas G. Hanley" <DHanley_at_[hidden]> wrote in message
news:8E1D6FAA50041A4CB4C2A0179B608D153B6412_at_ng-ald-mail.aldermaston.neverfailgrou
Has anybody managed to get tokenizer working for wide characters with VC7.1
(boost version 1.31.0)? The following example works fine...

typedef tokenizer<char_separator<std::string::value_type>,
    std::string::const_iterator, std::string> MyTokenizer;

const char_separator<std::string::value_type> sep("a");

MyTokenizer token(std::string("abacadaeafag"), sep);
for (MyTokenizer::const_iterator it = token.begin(); it != token.end(); ++it)
{
    std::cout << *it;
}

...while the following example produces no output...

typedef tokenizer<char_separator<std::wstring::value_type>,
    std::wstring::const_iterator, std::wstring> MyTokenizer;

const char_separator<std::wstring::value_type> sep(L"a");

MyTokenizer token(std::wstring(L"abacadaeafag"), sep);
for (MyTokenizer::const_iterator it = token.begin(); it != token.end(); ++it)
{
    std::wcout << *it;
}

Cheers,

Douglas.

_______________________________________________
Boost-users mailing list
Boost-users_at_[hidden]
http://lists.boost.org/mailman/listinfo.cgi/boost-users


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net