Subject: [Boost-users] [serialization?] converting utf8 string to unicode wstring
From: Igor R (boost.lists_at_[hidden])
Date: 2009-10-07 13:58:34


Hello,

I am trying to convert a UTF-8 std::string to a Unicode std::wstring with
the help of Boost's utf8_codecvt_facet.
I based my code on this example:
http://www.boost.org/doc/libs/1_40_0/libs/serialization/doc/codecvt.html .
The only difference is that my UTF-8 text resides in a std::string:

#include <iostream>
#include <locale>
#include <sstream>
#include <string>
#include "boost/archive/detail/utf8_codecvt_facet.hpp"
// link with boost/libs/serialization/src/utf8_codecvt_facet.cpp

int main()
{
 std::string utf;
 utf.resize(11);
 // hardcode some utf8 text
 utf[0] = 0xd7;
 utf[1] = 0x90;
 utf[2] = 0xd7;
 utf[3] = 0x99;
 utf[4] = 0xd7;
 utf[5] = 0x92;
 utf[6] = 0xd7;
 utf[7] = 0x95;
 utf[8] = 0xd7;
 utf[9] = 0xa8;
 utf[10] = 0x0;
 std::locale old_locale;
 std::locale utf8_locale(old_locale,
     new boost::archive::detail::utf8_codecvt_facet());
 std::locale::global(utf8_locale);
 std::stringstream in;
 in.imbue(utf8_locale);
 in.str(utf);
 std::wstringstream out;
 out << in;
 std::wcout << out.str() << std::endl;
}

The above code doesn't work: the "out" buffer doesn't contain the correct
Unicode interpretation of the string.
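
Two things in the code above look likely to be responsible. First, "out << in" inserts the stream object itself (before C++11, its conversion to a pointer value) rather than the characters it holds. Second, the standard string streams do not apply the locale's codecvt facet; among the standard stream buffers only the file buffers do. One possible way around both is to call the facet's in() member directly. The sketch below assumes the Boost facet is reachable through the standard std::codecvt<wchar_t, char, std::mbstate_t> interface, as the locale construction above suggests. The name widen_with_locale is hypothetical and not part of Boost.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

typedef std::codecvt<wchar_t, char, std::mbstate_t> wide_codecvt;

// Hypothetical helper (not a Boost API): convert `bytes` to wide characters
// using whatever codecvt<wchar_t, char, mbstate_t> facet `loc` carries.
std::wstring widen_with_locale(const std::string& bytes, const std::locale& loc)
{
    if (bytes.empty())
        return std::wstring();

    const wide_codecvt& cvt = std::use_facet<wide_codecvt>(loc);
    std::mbstate_t state = std::mbstate_t();

    // A UTF-8 sequence never yields more wide characters than it has bytes,
    // so one output slot per input byte is enough for a UTF-8 facet.
    std::vector<wchar_t> buffer(bytes.size());
    const char* from = bytes.data();
    const char* from_end = from + bytes.size();
    const char* from_next = from;
    wchar_t* to = &buffer[0];
    wchar_t* to_end = to + buffer.size();
    wchar_t* to_next = to;

    const std::codecvt_base::result r =
        cvt.in(state, from, from_end, from_next, to, to_end, to_next);

    if (r == std::codecvt_base::noconv)
        return std::wstring(bytes.begin(), bytes.end()); // facet did nothing
    if (r != std::codecvt_base::ok)
        throw std::runtime_error("codecvt::in failed (malformed or truncated input)");

    return std::wstring(to, to_next);
}
```

With the locale from the example this would be called as widen_with_locale(utf, utf8_locale). Since utf was resized to 11 bytes, the trailing 0x0 byte would presumably come back as an embedded L'\0' in the result.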
Actually, all I want is a C++ equivalent of the following WinAPI code:

#include <string>
#include "windows.h"
int main()
{
 std::string utf;
 utf.resize(11);
 // hardcode some utf8 text
 utf[0] = 0xd7;
 utf[1] = 0x90;
 utf[2] = 0xd7;
 utf[3] = 0x99;
 utf[4] = 0xd7;
 utf[5] = 0x92;
 utf[6] = 0xd7;
 utf[7] = 0x95;
 utf[8] = 0xd7;
 utf[9] = 0xa8;
 utf[10] = 0x0;

 wchar_t outBuff[11];
 MultiByteToWideChar(CP_UTF8, 0, utf.c_str(), -1, outBuff, 10);
}

...which works well.
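
For comparison, a portable stand-in for that call can be written with the standard library alone. The following is only a minimal sketch: utf8_to_wstring is a hypothetical name, the decoding is hand-rolled rather than taken from Boost or WinAPI, and it does not reject overlong encodings or encoded surrogates. It assumes wchar_t holds UTF-16 when it is 2 bytes wide (as on Windows) and full code points otherwise.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Boost or WinAPI): decode UTF-8 bytes
// into a std::wstring. Throws std::runtime_error on malformed input.
std::wstring utf8_to_wstring(const std::string& utf8)
{
    std::wstring result;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        unsigned long cp = 0;   // code point being assembled
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (extra > utf8.size() - i - 1)
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            throw std::runtime_error("code point out of Unicode range");

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // 16-bit wchar_t: emit a UTF-16 surrogate pair.
            cp -= 0x10000;
            result += static_cast<wchar_t>(0xD800 + (cp >> 10));
            result += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            result += static_cast<wchar_t>(cp);
        }
    }
    return result;
}
```

The ten text bytes hardcoded above decode to the five code points U+05D0, U+05D9, U+05D2, U+05D5 and U+05E8. Unlike the MultiByteToWideChar call with -1, this sketch does not stop at the first 0x0 byte, so passing the 11-byte utf string yields a sixth, embedded L'\0' character.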

Any idea would be greatly appreciated!

Thanks.


