Boost Users :


From: Ovanes Markarian (om_boost_at_[hidden])
Date: 2005-07-21 08:48:54

Next message: Adrian Grigore: "Re: [Boost-users] Linker Problems with VC 7.1/ Boost 1.32 / Stlport 4.61"
Previous message: Vladimir Prus: "Re: [Boost-users] 1.33 pointer to object serialization"
In reply to: John Maddock: "Re: [Boost-users] regex with multi-byte characters"
Next in thread: Jonathan Turkanis: "Re: [Boost-users] regex with multi-byte characters"

On Thu, July 21, 2005 14:54, John Maddock said:
>> What do you think? Could Boost.Regex make use of such a traits class, or
>> would you rather not include it in the distribution?
>
> I don't know, it depends what it does: how do you plan to handle character
> classification in a portable manner for unsigned short?
I plan to do it the same way Xerces-C does. As I understand it, they store a 2-byte code in a
short and operate on that directly. I still have to investigate exactly how it is done.

>
>> There are too many developers involved in the process for us to force
>> them all to recompile Xerces-C with specific settings. I don't think this
>> would be an option for us. In our case it can also lead to unpredictable
>> results if someone replaces xerces-c with a freshly compiled xerces-c
>> without ICU support. I am a little bit sceptical about this.
>
> OK let me try one more time: if you compile regex *only* with ICU support,
> and use the iterator based u32regex_match/u32regex_search algorithms (or
> their equivalent regex iterators) then it doesn't matter what character type
> Xerces or anything else uses as long as:
>
> It's an 8-bit type: then it'll be treated as an [unsigned] UTF-8 encoded
> string.
> Or: It's a 16-bit type, then it'll be treated as an [unsigned] UTF-16
> encoded string.
> Or: It's a 32-bit type, then it'll be treated as an [unsigned] UTF-32
> encoded string.
>
> Is that generic enough for you? :-)
Yes, I will run some tests. If they pass, I will compile regex with ICU support. Otherwise I
will write my own traits class for unsigned short characters.
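
To make the width rule quoted above concrete, here is a minimal standalone sketch. It is not
Boost.Regex or ICU code; assumed_encoding and decode_utf16 are hypothetical names invented for
this illustration, and it assumes 8-bit bytes. It only shows how the size of the character type
could select the encoding, and how 16-bit units such as unsigned short would then be read as
UTF-16.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: names the encoding that the width rule would assign
// to a code unit type. Assumes CHAR_BIT == 8, so sizeof 1/2/4 means 8/16/32 bits.
template <class CharT>
constexpr const char* assumed_encoding() {
    static_assert(sizeof(CharT) == 1 || sizeof(CharT) == 2 || sizeof(CharT) == 4,
                  "only 8-, 16- and 32-bit code units are covered by the rule");
    return sizeof(CharT) == 1 ? "UTF-8"
         : sizeof(CharT) == 2 ? "UTF-16"
                              : "UTF-32";
}

// Hypothetical helper: reads a range of 16-bit code units as UTF-16 and
// returns the code points. A valid surrogate pair becomes one code point;
// an unpaired surrogate is passed through unchanged (no error reporting).
template <class Iter>
std::vector<std::uint32_t> decode_utf16(Iter first, Iter last) {
    std::vector<std::uint32_t> out;
    while (first != last) {
        // Go through uint16_t so a signed short is not sign-extended.
        std::uint32_t u = static_cast<std::uint16_t>(*first++);
        if (u >= 0xD800 && u <= 0xDBFF && first != last) {
            std::uint32_t lo = static_cast<std::uint16_t>(*first);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                ++first;
                u = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
            }
        }
        out.push_back(u);
    }
    return out;
}
```

With this sketch, assumed_encoding<unsigned short>() yields "UTF-16" on platforms where
unsigned short is 16 bits wide, and decode_utf16 can be called on any pair of iterators over
such units, for example a buffer obtained from Xerces-C.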

Thanks a lot for your help.

>
> John.

With Kind Regards,

Ovanes

