From: Vladimir Prus (ghost_at_[hidden])
Date: 2004-10-21 01:21:30


Erik Wien wrote:

>> It's good to have one string class for library interoperability reasons.
>> Otherwise library A would demand utf8_string, library B would demand
>> utf16_string, and library C would demand utf32_string. No matter which
>> one you choose, you'll pay a price. (This doesn't change even if you
>> spell utf8_string as string<utf8>.)
>
> That is true. Though the strings of different encodings should be
> assignable to each other, libraries taking references to encoded_strings
> would need some conversion to be done.
>
> We have a similar problem today with basic_string<char> and
> basic_string<wchar_t>, and I think it could also be solved in a way that
> is very similar to what is done in the <string> header.

Just to clarify: string and wstring in the standard have a huge problem.
You can't directly convert a string to a wstring, because there's simply no
appropriate converting constructor.
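
To make the point concrete, here is a minimal sketch. The static_asserts
check that neither type is constructible from the other. The helper
widen_with_locale is a hypothetical name, not anything from the standard
or from Boost; it assumes a per-character widening through the
std::ctype<wchar_t> facet is acceptable, which is only safe for plain ASCII
and is not a real encoding conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <string>
#include <type_traits>

// Neither string type has a converting constructor taking the other.
static_assert(!std::is_constructible<std::wstring, std::string>::value,
              "std::wstring cannot be constructed from std::string");
static_assert(!std::is_constructible<std::string, std::wstring>::value,
              "std::string cannot be constructed from std::wstring");

// Hypothetical helper: widens each char individually via the locale's
// ctype facet. Assumption: the input is plain ASCII; multi-byte
// encodings such as UTF-8 are NOT decoded by this.
inline std::wstring widen_with_locale(
    const std::string& narrow,
    const std::locale& loc = std::locale::classic())
{
    std::wstring wide(narrow.size(), L'\0');
    if (!narrow.empty()) {
        const std::ctype<wchar_t>& facet =
            std::use_facet<std::ctype<wchar_t> >(loc);
        facet.widen(narrow.data(), narrow.data() + narrow.size(), &wide[0]);
    }
    // One wide character is produced per narrow character.
    assert(wide.size() == narrow.size());
    return wide;
}
```

So the conversion has to be spelled out by the caller, e.g.
std::wstring w = widen_with_locale(s); and anything encoding-aware needs
more machinery than this.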

> If we typedef a
> unicode_string or something as encoded_string<utf16>, and promote that as
> THE string class, most users would use that as their primary string
> representation, and simply be oblivious to the underlying encoding. (A
> good thing.)

That would still make it easy for a user to use some different encoding
without good reason.
 
> Advanced users could (just like we do today with basic_string) choose to
> support multiple encodings by templating their own functions on encoding
> as well.

Oh well. I just hope nobody will ever make an implementation of

   XML parser + XML Schema + XPath + XQuery + SOAP + HTML renderer

which is fully templated on the string type, unless that same person first
makes gcc ten times faster.
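
For reference, "templating on the string type" in the basic_string style
looks roughly like the sketch below. count_occurrences is a hypothetical
example function, not part of any proposal. The body has to live in a
header, and the compiler generates a separate instantiation for every
character type used, which is where the compile-time cost of a large,
fully templated library would come from.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical example: one generic definition, written once in a header.
// Each distinct CharT (char, wchar_t, ...) produces its own instantiation.
template <typename CharT>
std::size_t count_occurrences(const std::basic_string<CharT>& text,
                              CharT wanted)
{
    return static_cast<std::size_t>(
        std::count(text.begin(), text.end(), wanted));
}
```

Calling it with a std::string and with a std::wstring compiles the function
twice. That is harmless here, but it scales with the amount of code behind
the template.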

- Volodya


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk