
Boost Users :

From: Jens Seidel (jensseidel_at_[hidden])
Date: 2007-09-28 05:43:04


On Fri, Sep 28, 2007 at 10:29:09AM +0100, John Maddock wrote:
> Anjaly wrote:
> >> Hi,
> >> Thank you for your response. I have caught the exception. Now the
> >> program does not crash, but the search is incomplete. Even if the
> >> file is encoded as UTF-16, the exception occurs (I have used a
> >> message box to show the reason for the exception). Is the problem due
> >> to reading the file and storing it in a char array, or due to building
> >> the regex expression? I have attached the file in which I am searching.
> >> Hope you can help me.
>
> The first byte in the file is 0xFF, which is not a valid UTF-8 byte;
> likewise the second byte is 0xFE, which is also not used in UTF-8: so there's
> no way to decode the file and convert to UTF-32.
>
> However, if I start reading from the third byte in the file, then the search
> does go through to the end: I can't guarantee that the content was correct
> though!

Those two bytes (0xFF 0xFE) are a valid byte order mark: they indicate
that the file is UTF-16 little-endian, not UTF-8. See
e.g. http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058
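
To illustrate the point about the byte order mark, here is a minimal
sketch of how a program might check the leading bytes and decode the
rest as UTF-16 instead of treating it as UTF-8. The names detect_bom,
utf16_to_utf32 and Encoding are hypothetical and not part of Boost; the
sketch assumes the whole file has already been read into a std::string
of raw bytes. It does not distinguish a UTF-32 little-endian BOM
(FF FE 00 00) from UTF-16, and it passes unpaired surrogates through
unchanged.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper types and functions, for illustration only.
enum class Encoding { Unknown, Utf8, Utf16LE, Utf16BE };

// Inspect the leading bytes for a byte order mark.
// bom_len receives the number of bytes to skip before decoding.
inline Encoding detect_bom(const std::string& bytes, std::size_t& bom_len) {
    auto b = [&](std::size_t i) { return static_cast<unsigned char>(bytes[i]); };
    if (bytes.size() >= 3 && b(0) == 0xEF && b(1) == 0xBB && b(2) == 0xBF) {
        bom_len = 3;
        return Encoding::Utf8;
    }
    if (bytes.size() >= 2 && b(0) == 0xFF && b(1) == 0xFE) {
        bom_len = 2;
        return Encoding::Utf16LE;
    }
    if (bytes.size() >= 2 && b(0) == 0xFE && b(1) == 0xFF) {
        bom_len = 2;
        return Encoding::Utf16BE;
    }
    bom_len = 0;
    return Encoding::Unknown;
}

// Decode UTF-16 bytes, starting at offset, into UTF-32 code points.
// Valid surrogate pairs are combined; a trailing odd byte is ignored.
inline std::vector<std::uint32_t> utf16_to_utf32(const std::string& bytes,
                                                 std::size_t offset,
                                                 bool little_endian) {
    std::vector<std::uint32_t> out;
    // Read one 16-bit code unit at byte index i in the requested byte order.
    auto unit = [&](std::size_t i) -> std::uint32_t {
        std::uint32_t first = static_cast<unsigned char>(bytes[i]);
        std::uint32_t second = static_cast<unsigned char>(bytes[i + 1]);
        return little_endian ? (first | (second << 8)) : ((first << 8) | second);
    };
    for (std::size_t i = offset; i + 1 < bytes.size(); i += 2) {
        std::uint32_t u = unit(i);
        if (u >= 0xD800 && u <= 0xDBFF && i + 3 < bytes.size()) {
            std::uint32_t low = unit(i + 2);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                out.push_back(0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00));
                i += 2;  // skip the low surrogate just consumed
                continue;
            }
        }
        out.push_back(u);
    }
    return out;
}
```

With the BOM skipped and the bytes decoded this way, the resulting
UTF-32 sequence is what a Unicode-aware search would then operate on,
rather than the raw char array.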

Jens

