Boost Users :

From: Anjaly (anjaly_at_[hidden])
Date: 2007-09-30 23:52:43


The regex documentation says that the size of the character type passed
to make_u32regex determines the character encoding (UTF-8, UTF-16 or
UTF-32). I passed wchar_t (which I think is 4 bytes in size), expecting
the buffer to be treated as UTF-8 by u32regex_search regardless.
What I am actually trying to do is a UTF-8 search.
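
To make the documented rule concrete, here is a minimal sketch in plain
C++ of picking an encoding from the width of the character type. It makes
no Boost calls; Encoding and encoding_for are made-up names used only for
illustration, not part of Boost.Regex. As far as I understand the
documentation, the same width rule is applied separately to the pattern
and to the text being searched, so the type of one does not change how
the other is interpreted.

```cpp
#include <cstddef>

// Hypothetical names for illustration only; not part of Boost.Regex.
enum class Encoding { Utf8, Utf16, Utf32, Unknown };

// Choose an encoding purely from the size of the character type:
// 1 byte -> UTF-8, 2 bytes -> UTF-16, 4 bytes -> UTF-32.
template <class CharT>
constexpr Encoding encoding_for() {
    return sizeof(CharT) == 1 ? Encoding::Utf8
         : sizeof(CharT) == 2 ? Encoding::Utf16
         : sizeof(CharT) == 4 ? Encoding::Utf32
         : Encoding::Unknown;
}

// A char buffer maps to UTF-8 whatever type the pattern was built from.
static_assert(encoding_for<char>() == Encoding::Utf8,
              "char is always one byte wide");
```

Under this rule wchar_t maps to UTF-32 where it is 4 bytes wide and to
UTF-16 where it is 2 bytes wide (for example on Windows), so a UTF-8
search would need char data rather than wchar_t.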

                                 Anjaly G S

On Fri, 2007-09-28 at 10:53 +0100, John Maddock wrote:
> Jens Seidel wrote:
> > That's the valid byte order mark. See
> > e.g. http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058
>
> Right: it's a byte order mark for UTF16LE, but the user is trying to read it
> as a UTF8 sequence.
>
> If the file is indeed UTF16LE then it's up to the user to read it into a
> sequence of valid UTF16 code points before passing to Boost.Regex.
>
> HTH, John.