From: Daniel James (daniel_at_[hidden])
Date: 2005-03-21 04:00:45


Erik Wien wrote:
> Daniel James wrote:
>
>> Why should such a string class stop at Unicode? Wouldn't it be a good
>> idea to support other encodings? It might be better to have such a class
>> as part of a separate library, probably with 'pluggable' encodings,
>> which would include Unicode.
>
> That was the idea behind the "character_set_traits" class in the current
> prototype. You could just implement the traits for some other encoding,
> and you'd be set. The problem though (and in my opinion it's a big one),
> is that for the encoded_string class (and any iostream implementation
> based on the same concepts) to be usable at all as a Unicode string
> class, we would have to include a lot of functionality that is
> Unicode-specific (normalization is one example). What would we do with
> this functionality for Shift-JIS?

I have no idea ;) I know this is a complicated subject, and I'm far from
an expert.

I was writing about the suggested dynamic string, 'utf_string', possibly
better called 'any_string', or 'encoded_string'. IMO your library should
concentrate on Unicode (and perhaps encodings that are close enough to
Unicode), and leave other encodings to other libraries. A dynamically
encoded string class would probably require a different interface,
partly for efficiency's sake and partly because of the differences
between encodings. Also, it will be more important that it interacts
well with other string implementations.
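
To make the distinction concrete, here is a rough sketch of the two designs, not the prototype's actual code. The interface of the traits classes (a single static encode() function) is an assumption, as are the names utf8_traits, latin1_traits and static_string; only 'any_string' comes from the discussion above, and its members here are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical traits: each encoding supplies one static encode() function
// that appends the encoded form of a code point to a byte buffer.
struct utf8_traits {
    static void encode(std::uint32_t cp, std::string& out) {
        assert(cp <= 0x10FFFF);  // outside the Unicode code space otherwise
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
};

struct latin1_traits {
    static void encode(std::uint32_t cp, std::string& out) {
        // Code points that Latin-1 cannot represent become '?'.
        out += (cp <= 0xFF) ? static_cast<char>(cp) : '?';
    }
};

// Statically encoded: the encoding is part of the type, fixed at compile time.
template <class Traits>
class static_string {
public:
    void push_back(std::uint32_t cp) { Traits::encode(cp, bytes_); }
    const std::string& bytes() const { return bytes_; }
private:
    std::string bytes_;
};

// Dynamically encoded: the encoding is a run-time property of the object,
// so every append goes through an indirect call.
class any_string {
public:
    using encoder = void (*)(std::uint32_t, std::string&);
    explicit any_string(encoder e) : encode_(e) {}
    void push_back(std::uint32_t cp) { encode_(cp, bytes_); }
    const std::string& bytes() const { return bytes_; }
private:
    encoder encode_;
    std::string bytes_;
};
```

With static_string<utf8_traits> the compiler can inline the encoder, whereas any_string(&latin1_traits::encode) pays for an indirect call on every append. Neither class has an obvious place for Unicode-only operations such as normalization, which is the interface problem mentioned above.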

Daniel

