

From: Erik Wien (wien_at_[hidden])
Date: 2005-04-04 06:42:13


Daniel James wrote:
>> That was the idea behind the "character_set_traits" class in the
>> current prototype. You could just implement the traits for some other
>> encoding, and you'd be set. The problem though (and in my opinion it's
>> a big one), is that for the encoded_string class (and any iostream
>> implementation based on the same concepts) to be useable at all as a
>> Unicode string class, we would have to include a lot of functionality
>> that is Unicode specific. (Normalization is one example) What would we
>> do with this functionality for Shift-JIS?
>
> I have no idea ;)

Neither do I. :) That's why I feel it's a dead end.
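
As a rough illustration of the traits idea discussed above, here is a minimal sketch. All names in it (utf8_traits, latin1_traits, the decode() signature) are hypothetical; the prototype's actual character_set_traits interface is not shown in this thread and may well differ. Latin-1 stands in for "some other encoding" only because it fits in a few lines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; not the prototype's real interface.
using code_point = std::uint32_t;

// Traits for UTF-8: decode one code point starting at s[i] and advance i.
// Assumes well-formed input; no validation is performed.
struct utf8_traits {
    static code_point decode(const std::string& s, std::size_t& i) {
        assert(i < s.size());
        unsigned char lead = static_cast<unsigned char>(s[i++]);
        if (lead < 0x80) return lead;  // single-byte (ASCII) case
        int extra = (lead >= 0xF0) ? 3 : (lead >= 0xE0) ? 2 : 1;
        code_point cp = lead & (0x3F >> extra);  // payload bits of the lead byte
        while (extra-- > 0 && i < s.size())
            cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
        return cp;
    }
};

// Traits for Latin-1: every byte maps directly to one code point.
struct latin1_traits {
    static code_point decode(const std::string& s, std::size_t& i) {
        assert(i < s.size());
        return static_cast<unsigned char>(s[i++]);
    }
};

// A string whose encoding is fixed at compile time by its traits class.
template <class Traits>
class encoded_string {
public:
    explicit encoded_string(std::string bytes) : bytes_(std::move(bytes)) {}

    // Decode the whole byte sequence into code points.
    std::vector<code_point> code_points() const {
        std::vector<code_point> out;
        for (std::size_t i = 0; i < bytes_.size();)
            out.push_back(Traits::decode(bytes_, i));
        return out;
    }

private:
    std::string bytes_;
};
```

Decoding generalizes easily this way. What does not generalize is something like a normalize() member: it would be well defined for utf8_traits, but it has no obvious meaning for a Shift-JIS traits class.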

> I was writing about the suggested dynamic string, 'utf_string', possibly
> better called 'any_string', or 'encoded_string'.

Actually, it *is* already called encoded_string. I think
code_point_string would be a more descriptive name, given its function,
though. I'm not sure what it will end up being.

> IMO your library should
> concentrate on unicode (and perhaps encodings that are close enough to
> unicode), and leave other encodings to other libraries. A dynamically
> encoded string class would probably require a different interface,
> partly for efficiency's sake and partly because of the differences
> between encodings. Also, it will be more important that it interacts
> well with other string implementations.

Yes, that is basically how I am beginning to feel too. After all, since
Unicode is supported by all major players in the industry, it will (I
hope) eventually take over from all the encodings in existence today.
Support for those encodings will therefore not be as important in the
future, which makes concentrating exclusively on Unicode a more viable
approach.

- Erik


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk