From: Shunsuke Sogame (mb2act_at_[hidden])
Date: 2006-07-06 01:43:23


Sean Parent wrote:
>> This is very close to what I have in mind. The main difference is that
>> the functions/algorithms in my mind take ranges instead of iterators.
>> Thus:
>>
>> to_lower(src, dest)
>> to_upper(src, dest)
> So long as you don't require ranges (or a pair of iterators makes a
> valid range and dest can still be an output iterator), that's fine:
> these should work on char* as well as container types. I don't know
> what kind of ranges you have for dest that allow dest to change size;
> that seems a bit problematic.
>
> I want iterators that can handle the encoding transform. I want to be
> able to write items like the following:
>
> std::string s = get_some_utf_8_xml_data();
>
> // Find the BOM character as a UTF-32 character
>
> utf_iterator_t i = std::find(utf_iterator_t(s.begin()),
>                              utf_iterator_t(s.end()), 0x0000FEFFUL);
>
> // base iterator points to start of UTF-8 character
> assert(*i.base() == 0xEFU);

Boost already has such iterators (unofficially?).
See <boost/regex/pending/unicode_iterator.hpp>.
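
To make the idea concrete, here is a minimal, self-contained sketch of an iterator adapter that decodes UTF-8 into UTF-32 code points and exposes base(), as in the quoted example. It does not use Boost and is not Boost's implementation (to my knowledge the header above provides boost::u8_to_u32_iterator for this purpose). The names utf8_to_utf32_iterator and find_bom_offset are hypothetical, and the sketch assumes well-formed UTF-8 input with no validation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <string>

// Hypothetical adapter (not Boost's code): walks a UTF-8 byte sequence via
// a base iterator and yields UTF-32 code points. Assumes well-formed input.
template <class BaseIt>
class utf8_to_utf32_iterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type        = std::uint32_t;
    using difference_type   = std::ptrdiff_t;
    using pointer           = const std::uint32_t*;
    using reference         = std::uint32_t;

    utf8_to_utf32_iterator() = default;
    explicit utf8_to_utf32_iterator(BaseIt pos) : pos_(pos) {}

    // Underlying iterator, positioned at the lead byte of the current character.
    BaseIt base() const { return pos_; }

    // Decode the character starting at the current position.
    std::uint32_t operator*() const {
        BaseIt it = pos_;
        const unsigned char lead = static_cast<unsigned char>(*it);
        const int len = length(lead);
        // Keep only the payload bits of the lead byte.
        std::uint32_t cp = (len == 1) ? lead : (lead & (0x7Fu >> len));
        for (int i = 1; i < len; ++i) {
            ++it;
            cp = (cp << 6) | (static_cast<unsigned char>(*it) & 0x3Fu);
        }
        return cp;
    }

    // Step over every byte of the current character.
    utf8_to_utf32_iterator& operator++() {
        std::advance(pos_, length(static_cast<unsigned char>(*pos_)));
        return *this;
    }
    utf8_to_utf32_iterator operator++(int) {
        utf8_to_utf32_iterator old(*this);
        ++*this;
        return old;
    }

    friend bool operator==(const utf8_to_utf32_iterator& a,
                           const utf8_to_utf32_iterator& b) {
        return a.pos_ == b.pos_;
    }
    friend bool operator!=(const utf8_to_utf32_iterator& a,
                           const utf8_to_utf32_iterator& b) {
        return !(a == b);
    }

private:
    // Number of bytes in the sequence introduced by this lead byte.
    static int length(unsigned char lead) {
        if (lead < 0x80u) return 1;
        if ((lead >> 5) == 0x06u) return 2;
        if ((lead >> 4) == 0x0Eu) return 3;
        if ((lead >> 3) == 0x1Eu) return 4;
        return 1;  // malformed lead byte: treated as a single unit
    }

    BaseIt pos_{};
};

// Byte offset of the first U+FEFF in a UTF-8 string, or s.size() if absent.
inline std::size_t find_bom_offset(const std::string& s) {
    using iter = utf8_to_utf32_iterator<std::string::const_iterator>;
    const iter hit =
        std::find(iter(s.begin()), iter(s.end()), std::uint32_t{0xFEFF});
    return static_cast<std::size_t>(hit.base() - s.begin());
}
```

The search runs over code points, while base() hands back the position in the original byte sequence, which is the property asked for in the quoted message. Note that a plain char may be signed, so comparing *i.base() against 0xEF is only reliable after a cast to unsigned char.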

-- 
Shunsuke Sogame
