Boost Users :

Subject: [Boost-users] [locale] how to sort utf8 std::string?
From: Frédéric Bron (frederic.bron_at_[hidden])
Date: 2013-01-09 05:07:04


I am reading std::string values from a file encoded in UTF-8 and
would like to sort them. I think I first have to convert them to wide
characters and then sort the resulting strings.
I have tried the conversion with
boost::locale::conv::to_utf<wchar_t>(utf8string, "UTF-8"),
but it seems that every non-ASCII character is replaced by a question mark.

Here is my program:
#include <iostream>
#include <fstream>
#include <string>

#include <boost/locale/encoding.hpp>

int main(int argc, char **argv) {
        if (argc<2) return 0;
        std::ifstream is(argv[1], std::ios::binary);
        while (is) {
                std::string line;
                std::getline(is, line);
                if (not line.empty()) {
                        try {
                                std::wstring ws =
                                        boost::locale::conv::to_utf<wchar_t>(line, "UTF-8", boost::locale::conv::stop);
                                std::wcout<<ws;
                        }
                        catch (...) {
                                std::cout<<"exception\n";
                        }
                }
        }
        is.close();
        return 0;
}
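
For illustration only, the "convert to wide characters, then sort" idea described above could be sketched without Boost roughly as follows. The helper names decode_utf8 and sort_by_code_point are hypothetical and not part of Boost.Locale; the decoder assumes well-formed UTF-8 and performs no validation. Note that this orders strings by Unicode code point, which is not the same as language-aware collation.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a Boost.Locale function): decode a UTF-8 byte
// string into Unicode code points. Assumes well-formed input; no validation.
std::u32string decode_utf8(const std::string& s) {
    std::u32string out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (c < 0x80)      { cp = c;        extra = 0; }
        else if (c < 0xE0) { cp = c & 0x1F; extra = 1; }
        else if (c < 0xF0) { cp = c & 0x0F; extra = 2; }
        else               { cp = c & 0x07; extra = 3; }
        ++i;
        for (std::size_t k = 0; k < extra && i < s.size(); ++k, ++i) {
            // Each continuation byte contributes its low 6 bits.
            cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}

// Hypothetical helper: sort UTF-8 strings by comparing decoded code points.
void sort_by_code_point(std::vector<std::string>& lines) {
    std::sort(lines.begin(), lines.end(),
              [](const std::string& a, const std::string& b) {
                  return decode_utf8(a) < decode_utf8(b);
              });
}
```

With this ordering, "Zoo" sorts before "apple" and "été" sorts after "zebra", so a dictionary-style order would presumably need a locale-aware comparison rather than a plain wide-character sort.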

Frédéric

