From: Andy Little (andy_at_[hidden])
Date: 2004-10-22 16:52:34
"Erik Wien" <wien_at_[hidden]> wrote in message
> Hi. I am in the process of planning a library for handling unicode strings
> in C++, and would like to probe the interest in the boost community for
> something like that. I read through the unicode dicussion that was up back
> in april, and from what I could gather there was some amount of interest,
> but no one felt comfortable taking on the task as of yet.
> I really feel the C++ language needs some form of standardized unicode
> support, and developing such a library within the boost community would be
> a very good way to ensure it fits everybody's needs in the best possible way.
> If you have any, and I do mean ANY, thoughts on this, please do not hesitate
> to reply to this mail and let me know. I'm looking forward to your
FWIW, here are my thoughts:
There is no equivalence between std::basic_string (aka std::string, std::wstring)
and a sequence of characters conforming to an encoded sequence (aka
However, an encoded-string can (potentially) be converted to a string, but
not the other way round, because the std::string does not provide adequate
For an encoding scheme to work, the encoding must be provided, and must be
run time. The best way to do this for various encodings is to use packets,
with headers providing information about the contents, e.g. the type of
encoding, the number of characters, a checksum, etc. These packets themselves could
be manipulated in std::strings (including sequences of packets), which could
then be used to perform operations where the encoding is not important.
This should give the best combination of performance, both in speed and
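
A rough C++ sketch of the packet idea follows. Everything in it is an assumption made for illustration and not part of any existing library: the Packet struct, the Encoding tags, the 13-byte header layout (1 byte encoding, 4 bytes character count, 4 bytes payload size, 4 bytes checksum) and the toy checksum are all hypothetical. It only shows that packets with a self-describing header can be stored and concatenated in a plain std::string, and read back later without knowing the encoding in advance.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical encoding tags; a real scheme would define its own set.
enum class Encoding : std::uint8_t { Utf8 = 1, Utf16LE = 2 };

struct Packet {
    Encoding encoding;         // type of encoding used by the payload
    std::uint32_t char_count;  // number of characters, supplied by the caller
    std::string payload;       // raw encoded bytes
};

// Assumed header layout: encoding (1) + char count (4) + payload size (4)
// + checksum (4), all multi-byte fields big-endian.
const std::size_t kHeaderSize = 13;

// Toy checksum used as a placeholder; not a recommendation.
std::uint32_t checksum(const std::string& bytes) {
    std::uint32_t sum = 0;
    for (unsigned char c : bytes) sum = sum * 31u + c;
    return sum;
}

// Append a 32-bit value in big-endian byte order.
void put_u32(std::string& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((v >> shift) & 0xFF));
}

// Read a big-endian 32-bit value starting at pos (caller checks bounds).
std::uint32_t get_u32(const std::string& in, std::size_t pos) {
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < 4; ++i)
        v = (v << 8) | static_cast<unsigned char>(in[pos + i]);
    return v;
}

// Write header followed by payload into a plain byte string.
std::string serialize(const Packet& p) {
    std::string out;
    out.push_back(static_cast<char>(p.encoding));
    put_u32(out, p.char_count);
    put_u32(out, static_cast<std::uint32_t>(p.payload.size()));
    put_u32(out, checksum(p.payload));
    out += p.payload;
    return out;
}

// Parse one packet starting at pos. On success, fills out and advances pos
// past the packet. Returns false on truncated input or checksum mismatch.
bool parse(const std::string& in, std::size_t& pos, Packet& out) {
    if (pos > in.size() || in.size() - pos < kHeaderSize) return false;
    const std::uint32_t count = get_u32(in, pos + 1);
    const std::uint32_t size = get_u32(in, pos + 5);
    const std::uint32_t sum = get_u32(in, pos + 9);
    if (in.size() - pos - kHeaderSize < size) return false;
    std::string payload = in.substr(pos + kHeaderSize, size);
    if (checksum(payload) != sum) return false;
    out.encoding = static_cast<Encoding>(static_cast<unsigned char>(in[pos]));
    out.char_count = count;
    out.payload = payload;
    pos += kHeaderSize + size;
    return true;
}
```

Concatenating packets is then just serialize(a) + serialize(b), which is the kind of operation where the encoding does not matter; calling parse repeatedly on the result recovers each packet together with its encoding tag.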