
From: Peter Dimov (pdimov_at_[hidden])
Date: 2004-10-20 14:46:29


Erik Wien wrote:
> Right now I have a single encoded_string class that has two template
> parameters, namely encoding and encoding_traits. encoding_traits is a
> class where all encoding-specific implementation is kept, and this
> class is used to set up the encoded_string class to correctly represent
> strings in the given encoding.

Yes, that's close to what I thought.

Do not repeat the basic_string mistake of making encoding_traits a template
parameter. A traits class is never used in this way; once it is passed in as
a template parameter, your encoding_traits is actually a policy.

A traits class is independent of the components that use it. It is basically
a mapping from a type to something; in your case, a mapping between the
encoding parameter and the operations.

So, encoding_traits aside, you essentially have string<utf8>.
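
To illustrate, here is a minimal sketch of that arrangement. The utf8 tag,
the code_unit and sequence_length members, and length() are hypothetical
names chosen for the example, not Erik's actual interface. The point is only
that encoded_string takes the encoding alone and looks up
encoding_traits<Encoding> itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct utf8 {};  // hypothetical encoding tag

// Traits: a mapping from the encoding tag to its operations.
// The primary template is left undefined; each encoding specializes it.
template <class Encoding> struct encoding_traits;

template <> struct encoding_traits<utf8> {
    typedef unsigned char code_unit;

    // Number of code units in the sequence introduced by lead byte b.
    static std::size_t sequence_length(code_unit b) {
        if (b < 0x80) return 1;
        if ((b & 0xE0) == 0xC0) return 2;
        if ((b & 0xF0) == 0xE0) return 3;
        if ((b & 0xF8) == 0xF0) return 4;
        return 1;  // invalid lead byte: counted as one unit in this sketch
    }
};

// One template parameter only. The traits are looked up, not passed in.
template <class Encoding>
class encoded_string {
    typedef encoding_traits<Encoding> traits;
    typedef typename traits::code_unit code_unit;
    std::vector<code_unit> units_;

public:
    // Takes bytes that are assumed to be already encoded as Encoding.
    explicit encoded_string(const char* s) {
        assert(s != 0);
        for (; *s; ++s) units_.push_back(static_cast<code_unit>(*s));
    }

    // Length in code points (no validation is attempted here).
    std::size_t length() const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < units_.size();
             i += traits::sequence_length(units_[i]))
            ++n;
        return n;
    }
};
```

With this shape a user writes encoded_string<utf8> and nothing else; adding
another encoding means adding another encoding_traits specialization.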

> The iterators used are bidirectional, not random access (impossible
> on UTF-8 and UTF-16) and they are as of now not constant. It IS
> possible to assign a code unit to a UTF-8 encoded string through an
> iterator, even if the resulting code unit sequence would be longer
> than the one the iterator is pointing to. The underlying container is
> automatically resized to make room for the new sequence. (This is of
> course slow!)

This is another basic_string mistake that effectively rules out efficient
reference counting. ;-) Just make the iterators constant. The mutating
functionality can be obtained with explicit erase/insert/replace members.
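
As a rough sketch of what I mean (the shared_units class and its members are
hypothetical, and std::shared_ptr stands in for whatever reference counting
the real class would use): when only const iterators are exposed, the shared
buffer has to be copied only inside the mutating members, never as a side
effect of calling begin(). The sketch is single-threaded and works on raw
code units.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical code-unit string with a shared, reference-counted buffer.
class shared_units {
    typedef std::vector<unsigned char> buffer;
    std::shared_ptr<buffer> buf_;

    // Copy the buffer only if it is shared. Called by mutators only.
    void detach() {
        if (buf_.use_count() > 1) buf_ = std::make_shared<buffer>(*buf_);
    }

public:
    typedef buffer::const_iterator const_iterator;

    shared_units() : buf_(std::make_shared<buffer>()) {}

    explicit shared_units(const char* s) : buf_(std::make_shared<buffer>()) {
        assert(s != 0);
        for (; *s; ++s) buf_->push_back(static_cast<unsigned char>(*s));
    }

    // Read-only access never needs to unshare the buffer.
    const_iterator begin() const { return buf_->begin(); }
    const_iterator end() const { return buf_->end(); }
    std::size_t size() const { return buf_->size(); }

    bool shares_buffer_with(const shared_units& other) const {
        return buf_ == other.buf_;
    }

    // Replace count units starting at offset pos with the units of s.
    // The result may be longer or shorter than the range it replaces.
    void replace(std::size_t pos, std::size_t count, const shared_units& s) {
        assert(pos <= size() && count <= size() - pos);
        buffer repl(s.begin(), s.end());  // copy first in case s is *this
        detach();
        buffer::iterator first = buf_->begin() + pos;
        first = buf_->erase(first, first + count);
        buf_->insert(first, repl.begin(), repl.end());
    }
};
```

Copying a shared_units is then just a reference count increment, and the
"assign a longer sequence through an iterator" case becomes an ordinary call
to replace().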

