Boost Users :

From: Darren Cook (darren_at_[hidden])
Date: 2003-12-11 21:03:37


> Actually, I have a text with a lot of strange characters and Japanese
> ones (Hiragana, Katakana, Kanji, everything!) and I want to find these
> Japanese sentences in order to translate them and replace them in the text.
> Hence I need a way to identify a Japanese sentence. A kind
> of function const bool isJap( const wchar ) const would be fine.

Do you need to use regexes? I haven't tried Boost.Regex yet, so I can't help there.

Is your text just ASCII and Japanese? Or do you need to distinguish Japanese
from other languages as well?

If it is just ASCII and Japanese, you could define a Japanese char as anything
that is not ASCII (beware of Shift-JIS encoding, though, as the 2nd byte of a
double-byte character can fall in the ASCII range). If your data is Unicode it
should also be easy to treat European characters as non-Japanese.
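
For the Unicode case, something along these lines might do as a starting point. It is only a sketch: it assumes the text has already been decoded into Unicode code points, the function names (in_range, is_ascii, is_japanese) are made up for illustration, and the ranges are the standard Unicode blocks for kana and kanji. Note that the kanji block is shared with Chinese, so this cannot tell the two apart.

```cpp
// Sketch only: classifies a single Unicode code point.
// Function names are hypothetical; ranges are Unicode block boundaries.

constexpr bool in_range(char32_t c, char32_t lo, char32_t hi) {
    return c >= lo && c <= hi;
}

// The simple rule for ASCII-plus-Japanese text: anything below 0x80 is ASCII.
constexpr bool is_ascii(char32_t c) {
    return c < 0x80;
}

// The stricter rule for Unicode text that may also contain European characters.
constexpr bool is_japanese(char32_t c) {
    return in_range(c, 0x3040, 0x309F)    // Hiragana
        || in_range(c, 0x30A0, 0x30FF)    // Katakana
        || in_range(c, 0x4E00, 0x9FFF)    // CJK Unified Ideographs (kanji, shared with Chinese)
        || in_range(c, 0xFF66, 0xFF9F);   // Halfwidth Katakana
}
```

To find a Japanese sentence you would then scan the text for runs of characters where is_japanese() is true. You may also want to accept the CJK punctuation block (U+3000 to U+303F) inside a run, otherwise a sentence gets split at every comma.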

Darren

