Boost Users :

From: Ben Hutchings (ben.hutchings_at_[hidden])
Date: 2003-12-15 07:33:57


John Maddock <john_at_[hidden]> wrote:
<snip>
> It might be best to add a facility to add new character classes as a
> list of characters and ranges to include, something like:
>
> register_character_class("myname", "d-f");
>
> Then we add all the Unicode block ranges as standard for wide
> character regexes.

Aside from the unified Han characters (kanji/hanzi), characters of
the same category generally aren't neatly grouped together in Unicode.
The 128-character blocks tend to correspond to locales, communities
of use or specific legacy encodings, not to categories. To check
character categories you need a look-up table (or, more efficiently,
a two-level table).
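
To make the two-level idea concrete, here is a minimal sketch. It is
not Boost.Regex code: the CategoryTable class, the Category enum and
the 256-code-point block size are all assumptions chosen for
illustration, and a real table would be generated from the Unicode
Character Database rather than filled in by hand. The first level maps
the high bits of a code point to a block number; the second level holds
one category per code point in that block. Blocks that contain nothing
of interest all share block 0.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical category codes, far coarser than the real Unicode ones.
enum class Category : std::uint8_t { Other, Letter, Digit };

// Illustrative two-level table: index_[cp >> 8] selects a block,
// and the low 8 bits of cp select the entry within that block.
class CategoryTable {
public:
    static constexpr std::uint32_t kBlockBits = 8;  // assumed block size: 256
    static constexpr std::uint32_t kBlockSize = 1u << kBlockBits;
    static constexpr std::uint32_t kMaxCodePoint = 0x10FFFF;

    using Block = std::array<Category, kBlockSize>;

    CategoryTable() : index_((kMaxCodePoint >> kBlockBits) + 1, 0) {
        Block empty;
        empty.fill(Category::Other);
        blocks_.push_back(empty);  // block 0 is shared by every untouched range
    }

    // Assign a category to the inclusive range [first, last].
    void set_range(std::uint32_t first, std::uint32_t last, Category cat) {
        assert(first <= last && last <= kMaxCodePoint);
        for (std::uint32_t cp = first; cp <= last; ++cp) {
            std::uint16_t& slot = index_[cp >> kBlockBits];
            if (slot == 0) {
                // First write into this block: give it private storage.
                Block fresh = blocks_[0];
                blocks_.push_back(fresh);
                slot = static_cast<std::uint16_t>(blocks_.size() - 1);
            }
            blocks_[slot][cp & (kBlockSize - 1)] = cat;
        }
    }

    // Two array accesses per query, regardless of how scattered a category is.
    Category lookup(std::uint32_t cp) const {
        if (cp > kMaxCodePoint) return Category::Other;
        return blocks_[index_[cp >> kBlockBits]][cp & (kBlockSize - 1)];
    }

    std::size_t block_count() const { return blocks_.size(); }

private:
    std::vector<std::uint16_t> index_;  // first level: block number per 256 code points
    std::vector<Block> blocks_;         // second level: per-code-point categories
};
```

With this layout, registering ASCII digits, ASCII lower-case letters
and the Arabic-Indic digits (U+0660 to U+0669) allocates only two
blocks beyond the shared empty one, yet each lookup is still a pair of
array accesses. A flat table over the whole code space would need over
a million entries for the same result.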

