Subject: [Boost-users] Unicode regex example
From: Stefan Schweter (stefan_at_[hidden])
Date: 2013-08-13 15:45:10


Hi,

Here is a Unicode regex example that I want to match:

#include <clocale>
#include <iostream>
#include <boost/regex.hpp>
#include <boost/regex/icu.hpp>

int main()
{
        std::setlocale(LC_ALL, "");

        boost::wregex condition(L"\\p{u}");

        std::wstring test_word(L"Ü");

        if (boost::regex_match(test_word, condition)) {
                std::wcout << L"Matches!" << std::endl;
        }

        boost::wregex condition2(L"[[:upper:]]");

        if (boost::regex_match(test_word, condition2)) {
                std::wcout << L"Matches!" << std::endl;
        }

        boost::u32regex condition3 = boost::make_u32regex(L"\\p{u}");

        if (boost::u32regex_match(test_word, condition3)) {
                std::wcout << L"Matches using lib icu!" << std::endl;
        }

        return 0;
}

Compiled with -lboost_regex -licuuc.

Result: Only the last regex condition matches.

So I have a few questions:

1. Is u32regex + make_u32regex the *only* way to get my regex condition
matched?

2. Why does the [[:upper:]] class in the second regex condition not
match? For example, when I use:

echo "Ü" | grep '[[:upper:]]'

on the command line, it works properly :)
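
One way to narrow down question 2 might be to check how a given C++ locale classifies the character, independently of Boost. The sketch below assumes that boost::wregex consults a std::locale for [[:upper:]], rather than the C locale set by std::setlocale; I have not verified that against the Boost sources. The helper names is_upper_in and user_locale_or_classic are hypothetical and only for illustration.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical helper: reports whether `loc` classifies `ch` as uppercase,
// using the std::ctype<wchar_t> facet of that locale.
bool is_upper_in(wchar_t ch, const std::locale& loc)
{
        return std::isupper(ch, loc);
}

// Hypothetical helper: returns the user's preferred locale (from the
// environment), or the classic "C" locale if that locale is not available.
std::locale user_locale_or_classic()
{
        try {
                return std::locale("");
        } catch (const std::runtime_error&) {
                return std::locale::classic();
        }
}
```

Comparing is_upper_in(L'\u00DC', std::locale::classic()) with is_upper_in(L'\u00DC', user_locale_or_classic()) should show whether the locale alone explains the difference. If it does, calling std::locale::global(std::locale("")) before constructing the regex may be worth trying.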

Thanks in advance + regards,

Stefan

