
Boost Users :

From: Gitit (yg-boost-users_at_[hidden])
Date: 2002-10-31 05:52:58


Hi,

We want to use regex++ (version 3.31) with UTF-8 strings.
I tried to match a single 2-byte UTF-8 character against the regex "." and the
match failed. It seems regex++ treats these 2 bytes as two separate characters.
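
The behaviour described above can be reproduced in a few lines. This is only a sketch: it uses std::regex from the C++ standard library as a stand-in for regex++ 3.31 (an assumption, since the two are different libraries), and the names matches_bytes and e_acute_utf8 are made up for the illustration. It shows that a narrow-character regex counts bytes, so "." cannot match a 2-byte UTF-8 sequence on its own.

```cpp
#include <regex>
#include <string>

// Returns true if the whole of `text` matches `pattern`, with every char
// (byte) treated as one character, as a narrow-character regex does.
// Hypothetical helper; std::regex stands in for regex++ here.
bool matches_bytes(const std::string& text, const std::string& pattern) {
    return std::regex_match(text, std::regex(pattern));
}

// U+00E9 (e with acute accent) is encoded in UTF-8 as the bytes 0xC3 0xA9.
const std::string e_acute_utf8 = "\xC3\xA9";
```

With these definitions, matches_bytes(e_acute_utf8, ".") is expected to be false, while matches_bytes(e_acute_utf8, "..") is expected to be true, because the matcher sees two one-byte characters.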

1) Is there a "native" way in the regex++ library for using UTF-8 strings?
Can we use UTF-8 strings to compare against a compiled regex (the regex is
in ASCII only)? Can the regex itself hold UTF-8 characters?

2) Is converting to wchar_t our only option? As far as I understand, wchar_t
does not cover the entire range of characters covered by UTF-8, so it may
not be enough. Any other ideas?
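
For the wchar_t route in question 2, a minimal sketch of the conversion step might look like the following. The helpers utf8_to_wide and matches_wide are hypothetical names, std::wregex again stands in for a wide-character regex++ instantiation (an assumption), and the decoder does only basic validation (it does not reject overlong encodings or encoded surrogates). Where wchar_t is 16 bits, code points above U+FFFF do not fit in one wchar_t, so the sketch falls back to a UTF-16 surrogate pair, which "." would then see as two characters.

```cpp
#include <cstddef>
#include <regex>
#include <stdexcept>
#include <string>

// Decodes UTF-8 into one wchar_t per code point, or into a UTF-16 surrogate
// pair when wchar_t is 16 bits and the code point is above U+FFFF.
// Hypothetical helper with minimal validation only.
std::wstring utf8_to_wide(const std::string& in) {
    std::wstring out;
    for (std::size_t i = 0; i < in.size();) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        unsigned long cp = 0;
        std::size_t len = 0;
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += len;
        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}

// Decodes `utf8` and matches the whole result against a wide pattern.
bool matches_wide(const std::string& utf8, const std::wstring& pattern) {
    return std::regex_match(utf8_to_wide(utf8), std::wregex(pattern));
}
```

After decoding, the 2-byte sequence 0xC3 0xA9 becomes the single wide character U+00E9, so a call such as matches_wide("\xC3\xA9", L".") is expected to succeed.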

Thanks,
Gitit.

