Boost Users :

From: John Maddock (john_at_[hidden])
Date: 2005-03-10 11:34:22


>> If you're prepared to depend upon ICU,
>
> What's ICU == I see you?? :)

IBM's Unicode libraries:
http://www-306.ibm.com/software/globalization/icu/index.jsp

>> then the current cvs has
>> (optional) support for 16 and 32-bit Unicode character types, the traits
>
> it's like utf-16, but I replace all the chars above 0xFFFF with '?', so
> it's utf-16 that doesn't have 4-byte chars.
>
>> class design is also rather simplified and better documented, so that
>> would be the best bet if you wanted to define your own minimalist traits
>
> I don't really understand well what's character_traits etc (and how to
> create them myself), I only wanted that my regex16 would do the same job
> for chars 0-0x00FF as boost::regex does for 0-0xff, and the rest of the
> chars (>=0x0100) would be considered non-words (\W) and so that I could
> only use \xXXXX-\xXXXX notation for their ranges & patterns...

Unfortunately you still have to write yourself a traits class to do that; a
simple wrapper that forwards calls onto c_regex_traits<char> where
appropriate would do it. Note also that the traits class design is going to
change in the next release, which is why I'm nudging you towards the current
cvs state, rather than the last release.
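
To give a rough idea of the forwarding approach, here is a minimal sketch.
It is not the real Boost.Regex traits interface (which has more members and
differs between the last release and cvs). The names narrow_traits_stub,
forwarding_traits16 and regex16_traits_sketch are made up for illustration,
and the stub only stands in for the role c_regex_traits<char> would play.
Characters 0-0xFF are handed to the narrow traits; anything from 0x0100 up
is treated as a non-word character and left unchanged by case folding.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>

// Hypothetical stand-in for the narrow traits class (the role that
// c_regex_traits<char> plays in the discussion). It just wraps <cctype>.
struct narrow_traits_stub {
    static bool is_word(char c) {
        unsigned char u = static_cast<unsigned char>(c);
        return std::isalnum(u) != 0 || c == '_';
    }
    static char to_lower(char c) {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
};

// Hypothetical 16-bit wrapper: forwards to Narrow for 0..0xFF and applies
// a fixed fallback for everything else.
template <class Narrow>
struct forwarding_traits16 {
    typedef std::uint16_t char_type;

    static bool in_narrow_range(char_type c) { return c <= 0xFF; }

    // Characters >= 0x0100 are never word characters (so they match \W).
    static bool is_word(char_type c) {
        if (!in_narrow_range(c)) return false;
        return Narrow::is_word(static_cast<char>(static_cast<unsigned char>(c)));
    }

    // Case folding is forwarded for 0..0xFF; other characters pass through.
    static char_type translate_nocase(char_type c) {
        if (!in_narrow_range(c)) return c;
        char lowered = Narrow::to_lower(static_cast<char>(static_cast<unsigned char>(c)));
        return static_cast<unsigned char>(lowered);
    }
};

typedef forwarding_traits16<narrow_traits_stub> regex16_traits_sketch;

// Small usage check of the forwarding behaviour.
inline void demo_checks() {
    assert(regex16_traits_sketch::is_word('a'));
    assert(!regex16_traits_sketch::is_word(0x0100));
    assert(regex16_traits_sketch::translate_nocase('A') == 'a');
}
```

A real traits class would need the full set of members that basic_regex
expects, each forwarded in the same way, so check the traits documentation
for the version you build against.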

John.
