Boost Users :

From: Ovanes Markarian (om_boost_at_[hidden])
Date: 2005-07-21 08:48:54

On Thu, July 21, 2005 14:54, John Maddock said:
>> What do you think? Could boost regex make usage of such traits_class or
>> you would not like to
>> include it into the distribution?
> I don't know, it depends what it does: how do you plan to handle character
> classification in a portable manner for unsigned short?
I plan to do it the same way Xerces-C does. As I understand it, they store a 2-byte character code
in the short and operate on it directly. I still have to investigate exactly how that is done.

>> There are too many developers involved in the process, that we force all
>> to recompile Xerces-C
>> with specific settings. I don't think this would be an option for us. In
>> our case it can also lead
>> to unpredictable results, if one replaces xerces-c with freshly compiled
>> xerces-c without icu
>> support. I am a little bit sceptical about this.
> OK let me try one more time: if you compile regex *only* with ICU support,
> and use the iterator based u32regex_match/u32regex_search algorithms (or
> their equivalent regex iterators) then it doesn't matter what character type
> Xerces or anything else uses as long as:
> It's an 8-bit type: then it'll be treated as an [unsigned] UTF-8 encoded
> string.
> Or: It's a 16-bit type, then it'll be treated as an [unsigned] UTF-16
> encoded string.
> Or: It's a 32-bit type, then it'll be treated as an [unsigned] UTF-32
> encoded string.
> Is that generic enough for you? :-)
Yes, I will run some tests. If they pass, I will compile regex with ICU support. Otherwise I
will write my own traits class for unsigned short characters.
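
To check that I have understood the width-to-encoding rule, here is a minimal standalone sketch of it. It does not use Boost.Regex or ICU, and it is not how either library implements this. The names assumed_encoding and utf16_to_code_points are hypothetical helpers I made up for illustration. It assumes 8-bit bytes and well-formed UTF-16 input.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Boost.Regex API): names the encoding that the
// rule quoted above would assume for a code unit type of this width.
// Assumes 8-bit bytes, so sizeof 1/2/4 means 8/16/32 bits.
template <class CharT>
const char* assumed_encoding()
{
    static_assert(sizeof(CharT) == 1 || sizeof(CharT) == 2 || sizeof(CharT) == 4,
                  "only 8-, 16- and 32-bit code units are handled");
    return sizeof(CharT) == 1 ? "UTF-8"
         : sizeof(CharT) == 2 ? "UTF-16"
                              : "UTF-32";
}

// Hypothetical helper: decodes UTF-16 held in any 16-bit type (for example
// unsigned short) into UTF-32 code points. Assumes well-formed input; a
// high surrogate at the very end of the range is passed through unchanged.
template <class It>
std::vector<std::uint32_t> utf16_to_code_points(It first, It last)
{
    std::vector<std::uint32_t> out;
    while (first != last) {
        std::uint32_t u = static_cast<std::uint16_t>(*first++);
        if (u >= 0xD800 && u <= 0xDBFF && first != last) {
            // High surrogate: combine it with the low surrogate that follows.
            std::uint32_t lo = static_cast<std::uint16_t>(*first++);
            assert(lo >= 0xDC00 && lo <= 0xDFFF);
            u = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
        }
        out.push_back(u);
    }
    return out;
}
```

With an unsigned short buffer holding 0x0041, 0xD83D, 0xDE00 this yields the two code points U+0041 and U+1F600, which is the kind of input I would expect the u32 algorithms to see from our Xerces-C strings.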

Thanks a lot for your help.

> John.

With Kind Regards,

