Boost :

From: Eric Niebler (eric_at_[hidden])
Date: 2004-10-20 14:48:31

Next message: Peter Dimov: "Re: [boost] Re: Any interest in adding unicode support to boost?"
Previous message: Peter Dimov: "Re: [boost] Re: Re: Re: Any interest in adding unicode support to boost?"
In reply to: Erik Wien: "[boost] Re: Re: Re: Any interest in adding unicode support to boost?"
Next in thread: Peter Dimov: "Re: [boost] Re: Any interest in adding unicode support to boost?"
Reply: Peter Dimov: "Re: [boost] Re: Any interest in adding unicode support to boost?"
Reply: Rogier van Dalen: "Re: [boost] Re: Any interest in adding unicode support to boost?"

Erik Wien wrote:
> The iterators used are bidirectional, not random access (impossible on UTF-8
> and UTF-16)

No. Andrei Alexandrescu explained a scheme to me whereby a UTF-16
encoded string can have a random-access iterator, and I think it should.
The basic idea is that you keep a plain array of 16-bit integers holding
the 16-bit characters and the first 16 bits of each surrogate pair. Then
you have a data structure that maps from string offsets to the second 16
bits of each surrogate pair. Random access involves a simple index and,
when the indexed element is the first half of a surrogate pair, a map
look-up. Sequential access requires no map look-up. And since surrogate
pairs are very rare, the map will almost always be empty and the look-up
can be skipped.
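
A minimal sketch of the storage scheme described above, in C++. The class
name utf16_indexed_string and its members are hypothetical and are not taken
from any Boost library or from Andrei's actual design; std::map is only one
possible choice for the side table. It covers indexed access only and
returns an unpaired surrogate unchanged.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical illustration: one array slot per character, with the second
// half of each surrogate pair stored out of line in a side table.
class utf16_indexed_string {
public:
    explicit utf16_indexed_string(const std::u16string& utf16) {
        for (std::size_t i = 0; i < utf16.size(); ++i) {
            const char16_t u = utf16[i];
            const bool is_high = u >= 0xD800 && u <= 0xDBFF;
            if (is_high && i + 1 < utf16.size() &&
                utf16[i + 1] >= 0xDC00 && utf16[i + 1] <= 0xDFFF) {
                // Key the low surrogate by the slot the high surrogate
                // is about to occupy, then skip it in the input.
                low_[units_.size()] = utf16[i + 1];
                ++i;
            }
            units_.push_back(u);
        }
    }

    // Number of characters (code points), not UTF-16 code units.
    std::size_t size() const { return units_.size(); }

    // Random access: a plain index, plus a map look-up only when the slot
    // holds a high surrogate and the side table is non-empty.
    char32_t operator[](std::size_t pos) const {
        const char16_t u = units_[pos];
        if (low_.empty() || u < 0xD800 || u > 0xDBFF) return u;
        const auto it = low_.find(pos);
        if (it == low_.end()) return u;  // unpaired high surrogate
        return 0x10000 + ((static_cast<char32_t>(u) - 0xD800) << 10) +
               (static_cast<char32_t>(it->second) - 0xDC00);
    }

private:
    std::vector<char16_t> units_;          // BMP units and high surrogates
    std::map<std::size_t, char16_t> low_;  // slot index -> low surrogate
};
```

With this layout, a string built from u"a\U0001F600b" has size() 3 and
element 1 decodes to U+1F600. A sequential iterator could presumably walk
the side table in step with the array to avoid look-ups altogether; that
part is not shown here.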

I think the default should be UTF-16 encoding, and that the iterator
should use a scheme like this to be random access. Rationale: there are
string algorithms that benefit from random access (Boyer-Moore comes to
mind).
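
To illustrate why such algorithms want random access, here is a sketch of
Boyer-Moore-Horspool, a simplified variant of Boyer-Moore (not the full
algorithm). The function name horspool_find is hypothetical, and
std::u32string stands in for any random-access sequence of characters. The
search compares from the end of the window and then jumps ahead by a
computed shift, which is cheap only if the text can be indexed directly.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper: returns the index of the first match of pat in text,
// or std::u32string::npos if there is none.
inline std::size_t horspool_find(const std::u32string& text,
                                 const std::u32string& pat) {
    const std::size_t n = text.size();
    const std::size_t m = pat.size();
    if (m == 0) return 0;
    if (m > n) return std::u32string::npos;

    // Shift table: distance from the last occurrence of each character
    // (excluding the final pattern position) to the end of the pattern.
    std::unordered_map<char32_t, std::size_t> shift;
    for (std::size_t i = 0; i + 1 < m; ++i) shift[pat[i]] = m - 1 - i;

    std::size_t pos = 0;
    while (pos + m <= n) {
        // Compare right to left; this needs indexed access into text.
        std::size_t j = m;
        while (j > 0 && text[pos + j - 1] == pat[j - 1]) --j;
        if (j == 0) return pos;

        // Jump ahead by the shift for the character under the window's end;
        // characters absent from the table allow a full-length jump.
        const auto it = shift.find(text[pos + m - 1]);
        pos += (it == shift.end()) ? m : it->second;
    }
    return std::u32string::npos;
}
```

With only bidirectional iterators, each jump of pos would cost a number of
steps proportional to the shift, which removes most of the benefit.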

-- 
Eric Niebler
Boost Consulting
www.boost-consulting.com


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk