Boost Users : |
Subject: Re: [Boost-users] Boost's direction regarding UTF8 -> UTF32 and UTF32 -> UTF8
From: Joel de Guzman (joel_at_[hidden])
Date: 2010-06-24 18:46:12
On 6/25/10 3:16 AM, Mathias Gaunard wrote:
> John Maddock wrote:
>
>> Really? That would be a bug, the intention is that they should always
>> throw an exception when given invalid input.
>
> The iterator adapter has no way of knowing it has reached the end.
>
> Consider this in u16_to_u32_iterator:
>
> void increment()
> {
>     // skip high surrogate first if there is one:
>     if(detail::is_high_surrogate(*m_position)) ++m_position;
>     ++m_position;
>     m_value = pending_read;
> }
>
> If the last character is a high surrogate, you increment the iterator
> twice, while it is only allowed to do it once.
>
> Fixing the bug means making the iterator adapter have knowledge of the
> beginning, the end, and the current position.
>
>
>> Of course a more complete solution would always be welcome....
>
> My library deals with this.
How? By storing the beginning, the end, and the current position?
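To make the question concrete, here is a minimal sketch of what a bounds-aware step might look like when the end of the range is known. It is not Boost's code and not Mathias's library: decode_next and the two surrogate helpers are hypothetical names chosen for illustration, and the error policy (throwing on a truncated or unpaired surrogate) is an assumption.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helpers for this sketch; not Boost's detail:: functions.
inline bool is_high_surrogate(std::uint16_t c) { return (c & 0xFC00u) == 0xD800u; }
inline bool is_low_surrogate(std::uint16_t c)  { return (c & 0xFC00u) == 0xDC00u; }

// Decodes one code point starting at pos and advances pos past it.
// Because `end` is known, a trailing high surrogate is reported as an
// error instead of the position being incremented past the range.
template <class It>
std::uint32_t decode_next(It& pos, It end)
{
    if (pos == end)
        throw std::out_of_range("decode_next: already at end");

    const std::uint16_t lead = *pos;
    ++pos;

    if (is_low_surrogate(lead))
        throw std::invalid_argument("unpaired low surrogate");
    if (!is_high_surrogate(lead))
        return lead;                      // single-unit code point

    // A high surrogate needs a second unit: check the bound first.
    if (pos == end)
        throw std::invalid_argument("truncated surrogate pair");

    const std::uint16_t trail = *pos;
    if (!is_low_surrogate(trail))
        throw std::invalid_argument("unpaired high surrogate");
    ++pos;

    return 0x10000u + ((std::uint32_t(lead) - 0xD800u) << 10)
                    + (std::uint32_t(trail) - 0xDC00u);
}
```

The point of the sketch is only that the check against end happens before the second increment; an iterator adapter would have to carry end (and begin, for decrement) as members to do the same.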
Regards,
-- Joel de Guzman http://www.boostpro.com http://spirit.sf.net
Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net