Boost Users :

From: bthiesfield (bthiesfield_at_[hidden])
Date: 2002-06-24 03:39:44

Next message: jfo60540: "boost.python and g++ 3.1 - Help needed"
Previous message: Erik Arner: "[BGL] Newbie problems with property_map interface"
Next in thread: John Maddock: "Re: [Boost-Users] regex with double byte character sets"
Reply: John Maddock: "Re: [Boost-Users] regex with double byte character sets"

Hi all,

I am currently trying to use the Boost regex library with Japanese
language strings, and it appears that double-byte character sets (DBCS)
are not supported. For example, using the following code (compiled with
BOOST_REGEX_USE_C_LOCALE defined), I get the output strings

0 = "。"
1 = "English"

instead of the expected:

0 = "やゆよわをー。"
1 = "English"

This happens because, in the Japanese (SJIS) encoding, one of these
characters contains the byte value of the [ character (0x5B) as one of
its two bytes.

setlocale( LC_COLLATE, "Japanese" );
setlocale( LC_CTYPE, "Japanese" );

// the source file (and therefore this literal) is SJIS-encoded
const char * pszText = "やゆよわをー。 [english]";
const char * pszRule = "([^\\[]*)\\[([[:word:]]*)\\]";

// split the string into its components
std::vector<std::string> vPart;
std::string sText( pszText ); // regex_split needs a modifiable string
boost::regex eParseExpr( pszRule,
    boost::regbase::normal | boost::regbase::icase );
boost::regex_split( std::back_inserter(vPart), sText, eParseExpr );

Is there some way to modify the library to enable DBCS? For
example, can the char_traits be modified to enable DBCS processing?
(Keeping in mind that the biggest problem with DBCS is that a single
character may consist of 2 bytes, which tends to break any
assumptions about the size of a character.)
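
To illustrate the byte collision outside of the regex library, here is a minimal sketch that scans an SJIS string and reports only those [ bytes that are real single-byte characters. The helper names is_sjis_lead_byte and find_single_byte are hypothetical, not part of Boost, and the lead-byte ranges (0x81-0x9F and 0xE0-0xFC) are assumed to be the usual Shift-JIS ones.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if b is the first byte of a two-byte
// character, assuming the usual Shift-JIS lead-byte ranges.
inline bool is_sjis_lead_byte(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Hypothetical helper: offsets of every byte equal to ch that is a
// genuine single-byte character, i.e. not the trail byte of a
// two-byte character.
std::vector<std::size_t> find_single_byte(const std::string& s, char ch)
{
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        unsigned char b = static_cast<unsigned char>(s[i]);
        if (is_sjis_lead_byte(b) && i + 1 < s.size())
        {
            ++i; // skip the trail byte of this two-byte character
            continue;
        }
        if (s[i] == ch)
            hits.push_back(i);
    }
    return hits;
}
```

For the bytes "\x81\x5B\x81\x42 [english]" (the last two Japanese characters of the sample text followed by the bracketed word), a plain std::string::find('[') stops at offset 1, inside the first two-byte character, whereas find_single_byte reports only offset 5, the real bracket. A byte-oriented matcher sees the string the way find does, which would explain the truncated first field.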

Brodie.


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net