Boost Users :

From: John Maddock (john_at_[hidden])
Date: 2003-12-14 06:41:12

Next message: Eyal Fink: "Re: [Boost-users] problem with shared_ptr"
Previous message: Darren Cook: "Re: [Boost-users] find japanese character with boost regex++"
In reply to: Darren Cook: "Re: [Boost-users] find japanese character with boost regex++"
Next in thread: Darren Cook: "Re: [Boost-users] find japanese character with boost regex++"
Reply: Darren Cook: "Re: [Boost-users] find japanese character with boost regex++"

> Are the existing character-classes following a standard, or are you open to
> patches to extend them?

Yes, they follow the POSIX and ECMAScript standards, which give:

"alnum",
"alpha",
"cntrl",
"digit",
"graph",
"lower",
"print",
"punct",
"space",
"upper",
"xdigit",
"blank",
"word",
"unicode"

> It might be nice to have at least:
> [:hiragana:]
> [:katakana:]
> [:hankaku_katakana:]

Isn't that just [[:hiragana:][:katakana:]]?

> [:wide_alpha:]
> [:wide_num:]
> [:wide_alphanum:]

There should be no need for those - [[:alpha:]] will match wide alphabetic
characters perfectly well (provided the locale isn't "C").
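
To illustrate the point about [[:alpha:]] and locales, here is a minimal
sketch. It uses std::wregex from the standard library as a stand-in for the
Boost wide-character regex type, which is an assumption on my part; the helper
name is_alpha_run is hypothetical. Which non-ASCII characters count as
alphabetic depends entirely on the locale passed in, and the locale names
available vary by platform.

```cpp
#include <locale>
#include <regex>
#include <string>

// Hypothetical helper: true if `text` consists only of characters that the
// given locale classifies as alphabetic. Defaults to the classic "C" locale,
// where only ASCII letters are expected to qualify.
bool is_alpha_run(const std::wstring& text,
                  const std::locale& loc = std::locale::classic()) {
    std::wregex re;
    re.imbue(loc);               // set the locale first...
    re.assign(L"[[:alpha:]]+");  // ...then compile the pattern against it
    return std::regex_match(text, re);
}
```

With the default "C" locale, is_alpha_run(L"abc") holds and
is_alpha_run(L"ab1") does not. To have wide (for example full-width or
Japanese) letters recognised, pass a suitable non-"C" locale instead.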

> Defining the set of Japanese kanji would be harder.

How are they defined?

It might be best to provide a facility for defining new character classes as a
list of characters and ranges to include, something like:

register_character_class("myname", "d-f");

Then we could register all the Unicode block ranges as standard classes for
wide character regexes.
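
A rough sketch of how such a facility might behave, just to make the idea
concrete. Nothing here exists in the library: CharClassRegistry and its
matches member are hypothetical, and register_character_class is the proposed
name from above, not a real function. The spec string is assumed to be a
sequence of single characters and "lo-hi" ranges; the Hiragana and Katakana
ranges used in the usage note are the Unicode blocks U+3040-U+309F and
U+30A0-U+30FF.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical registry mapping a class name to inclusive character ranges.
class CharClassRegistry {
public:
    // `spec` lists single characters and "lo-hi" ranges, e.g. L"d-fx".
    // A '-' that is not between two characters is taken literally.
    void register_character_class(const std::string& name,
                                  const std::wstring& spec) {
        std::vector<std::pair<wchar_t, wchar_t>> ranges;
        for (std::size_t i = 0; i < spec.size(); ++i) {
            wchar_t lo = spec[i];
            wchar_t hi = lo;
            if (i + 2 < spec.size() && spec[i + 1] == L'-') {
                hi = spec[i + 2];
                i += 2;  // skip past the '-' and the upper bound
            }
            if (hi < lo) {
                throw std::invalid_argument("reversed range in class spec");
            }
            ranges.emplace_back(lo, hi);
        }
        classes_[name] = std::move(ranges);
    }

    // True if `c` falls inside any range registered under `name`.
    // Unknown class names match nothing.
    bool matches(const std::string& name, wchar_t c) const {
        auto it = classes_.find(name);
        if (it == classes_.end()) return false;
        for (const auto& r : it->second) {
            if (r.first <= c && c <= r.second) return true;
        }
        return false;
    }

private:
    std::map<std::string, std::vector<std::pair<wchar_t, wchar_t>>> classes_;
};
```

Usage would then look like reg.register_character_class("myname", L"d-f"),
after which reg.matches("myname", L'e') is true. The Unicode blocks could be
registered the same way, e.g. "hiragana" as L"\u3040-\u309F" and "katakana"
as L"\u30A0-\u30FF".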

John.
