Boost Users:
From: John Maddock (john_at_[hidden])
Date: 2007-04-14 05:18:57
Sebastian Pipping wrote:
> It seems to me like Boost.Regex is much more
> powerful than QRegExp of QT. Can Boost.Regex
> work with QChar [1] as charT?
> The text "i.e. either char or wchar_t" on [2]
> made me unsure of this. Any limitations
> I should know about?
This is going to be a "yes but you will need to do some small amount of
work" kind of answer, sorry:-(
Basically, Boost.Regex needs to know some things about the character type,
like which code points are upper case and how to convert between cases, in
order to do its stuff. If QChar is a typedef for unsigned short, then
Boost.Regex won't have that information: by default it will try to get it
from the std::locale facets, which won't be specialised for unsigned
short.
So, the first option: if you don't mind using IBM's ICU library for Unicode
support (maybe Qt already uses it for its Unicode support?), then you can use
boost::u32regex to scan UTF-8, UTF-16, or UTF-32 encoded text, see
http://www.boost.org/libs/regex/doc/icu_strings.html
However, if Qt doesn't use ICU already, that's probably a dumb idea: having
two Unicode libs doesn't sound sensible to me :-( So that
leaves you needing to write your own traits class for QChar so you can use
basic_regex<QChar, my_traits_class>
as your regular expression type.
The formal requirements for the traits class are documented here:
http://www.boost.org/libs/regex/doc/concepts.html#traits , but I would
suggest that you use either c_regex_traits or icu_regex_traits as examples
to work from: basically cut and paste their interfaces into your code and
then slowly fill in the blanks. This traits class might be a worthwhile
addition to the library BTW if you're prepared to do a fully featured job
with it?
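To give an idea of what "cut and paste the interface, then fill in the blanks" might look like, here is a minimal sketch. It is not taken from c_regex_traits or icu_regex_traits: the member names follow the traits requirements as documented at the link above, but check them against your Boost version. Assumptions: my_char (a plain char16_t) is a hypothetical stand-in for QChar, the mask_* constants and the equals_ascii helper are invented for illustration, and every body is an ASCII-only placeholder that a real implementation would replace with proper Unicode lookups.

```cpp
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical stand-in for QChar: a bare 16-bit code unit (needs C++11).
typedef char16_t my_char;

// Skeleton only: ASCII placeholders mark the blanks to be filled in.
struct my_traits_class
{
    typedef my_char                      char_type;
    typedef std::size_t                  size_type;
    typedef std::basic_string<char_type> string_type;
    typedef std::locale                  locale_type;
    typedef unsigned                     char_class_type; // bitmask of classes

    // Illustrative class bits; a full version needs many more.
    enum { mask_digit = 1u, mask_alpha = 2u, mask_space = 4u };

    // Length of a null-terminated string of char_type.
    static size_type length(const char_type* p)
    {
        size_type n = 0;
        while (p[n] != 0) ++n;
        return n;
    }

    // Case-sensitive and case-insensitive equivalence mapping.
    char_type translate(char_type c) const { return c; }
    char_type translate_nocase(char_type c) const
    {
        return (c >= u'A' && c <= u'Z')
            ? static_cast<char_type>(c + (u'a' - u'A')) : c;
    }

    // Sort keys: placeholder collation is plain code-unit order.
    template <class It>
    string_type transform(It first, It last) const
    {
        return string_type(first, last);
    }
    template <class It>
    string_type transform_primary(It first, It last) const
    {
        string_type key;
        for (; first != last; ++first) key += translate_nocase(*first);
        return key;
    }

    // Map a class name such as "digit" to a bitmask; 0 means unknown.
    template <class It>
    char_class_type lookup_classname(It first, It last) const
    {
        if (equals_ascii(first, last, "digit")) return mask_digit;
        if (equals_ascii(first, last, "alpha")) return mask_alpha;
        if (equals_ascii(first, last, "space")) return mask_space;
        return 0;
    }

    // Placeholder: no named collating elements are recognised.
    template <class It>
    string_type lookup_collatename(It, It) const { return string_type(); }

    // True if c belongs to any of the classes set in m.
    bool isctype(char_type c, char_class_type m) const
    {
        if ((m & mask_digit) && c >= u'0' && c <= u'9') return true;
        if ((m & mask_alpha) && ((c >= u'a' && c <= u'z') ||
                                 (c >= u'A' && c <= u'Z'))) return true;
        if ((m & mask_space) && (c == u' ' || (c >= 0x09 && c <= 0x0D)))
            return true;
        return false;
    }

    // Numeric value of digit c in the given radix, or -1 if not a digit.
    int value(char_type c, int radix) const
    {
        int v = -1;
        if (c >= u'0' && c <= u'9')      v = c - u'0';
        else if (c >= u'a' && c <= u'f') v = 10 + (c - u'a');
        else if (c >= u'A' && c <= u'F') v = 10 + (c - u'A');
        return (v >= 0 && v < radix) ? v : -1;
    }

    // Store a new locale and return the previous one.
    locale_type imbue(locale_type l)
    {
        locale_type old = loc_;
        loc_ = l;
        return old;
    }
    locale_type getloc() const { return loc_; }

private:
    // Hypothetical helper: compare [first, last) with an ASCII name.
    template <class It>
    static bool equals_ascii(It first, It last, const char* name)
    {
        for (; first != last && *name; ++first, ++name)
            if (*first != static_cast<char_type>(*name)) return false;
        return first == last && *name == '\0';
    }

    locale_type loc_;
};
```

With something like this in place, the intended use would be basic_regex<my_char, my_traits_class>. Whether the skeleton compiles against the library as-is depends on the Boost version, so treat it as a checklist of members to implement and not as a finished traits class.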
Let me know if this helps, and/or if you get stuck.
John.