From: Vladimir Prus (ghost_at_[hidden])
Date: 2004-10-19 01:43:37


Hi Robert,

> "Miro Jurisic" <macdev_at_[hidden]> wrote in message
>> There is a lot Unicode work to be done in the standard C++ library and
>> boost. C++ currently has no Unicode-aware string abstraction, and this is
>> a big problem for anyone who has to deal with Unicode strings in C++ code.
>> std::string is poorly suited for any Unicode-savvy work, for many reasons --
>> mainly having to do with the fact that std::string and STL and boost
>> algorithms using std::string::iterator don't know how to handle strings in
>> accordance with the Unicode spec.
>>

> I believe that STL and boost algorithms that handle std::string can (or
> should) be able to handle any std::basic_string<?> . That is my basis for
> the view that unicode shouldn't be a big issue.
...
> I'm willing to be convinced I'm wrong about this - but I just don't see
> it yet.

This was discussed extensively before. For example, Miro has pointed out
that even plain "find" is not suitable for Unicode strings, because some
characters can be represented by several wchar_t values.
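
As an illustration (not taken from the earlier discussion), here is a minimal
sketch of how a code-unit search can go wrong. It assumes "é" is written either
as the single value U+00E9 or as 'e' followed by U+0301 (combining acute
accent). The helper names cafe_precomposed, cafe_decomposed, has_code_units
and show_find_pitfalls are made up for this example.

```cpp
#include <cassert>
#include <string>

// "café" with a precomposed é: U+00E9 is a single wchar_t value.
inline std::wstring cafe_precomposed() {
    return std::wstring{L'c', L'a', L'f', static_cast<wchar_t>(0x00E9)};
}

// The same text with é spelled as 'e' followed by U+0301 (combining acute).
inline std::wstring cafe_decomposed() {
    return std::wstring{L'c', L'a', L'f', L'e', static_cast<wchar_t>(0x0301)};
}

// Hypothetical helper: a plain code-unit search via std::wstring::find.
inline bool has_code_units(const std::wstring& haystack,
                           const std::wstring& needle) {
    return haystack.find(needle) != std::wstring::npos;
}

inline void show_find_pitfalls() {
    const std::wstring e_acute(1, static_cast<wchar_t>(0x00E9));
    const std::wstring plain_e = L"e";

    // Both strings read as "café", yet they differ as wchar_t sequences.
    assert(cafe_precomposed() != cafe_decomposed());

    // Missed match: searching for precomposed é fails on the decomposed text.
    assert(has_code_units(cafe_precomposed(), e_acute));
    assert(!has_code_units(cafe_decomposed(), e_acute));

    // Spurious match: plain 'e' is found inside the decomposed é.
    assert(has_code_units(cafe_decomposed(), plain_e));
    assert(!has_code_units(cafe_precomposed(), plain_e));
}
```

A Unicode-aware search would presumably have to normalize both strings first
and only accept matches that end on a character boundary; nothing in
std::basic_string does either.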

Then, there's the issue of proper collation. Given that Unicode can contain
accents and various other "marks", it is not obvious that string::operator<
will always do the right thing.
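
A small sketch of the ordering problem, again only as an illustration: it
assumes precomposed é (U+00E9) and compares with the default
std::wstring::operator<, which orders by wchar_t value. The helper names
ecole and show_ordering_pitfall are invented for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: "école" with a precomposed é (U+00E9).
inline std::wstring ecole() {
    return std::wstring{static_cast<wchar_t>(0x00E9), L'c', L'o', L'l', L'e'};
}

inline void show_ordering_pitfall() {
    const std::wstring zebra = L"zebra";

    // operator< compares wchar_t values one by one. U+00E9 is greater than
    // 'z' (U+007A), so "école" ends up after "zebra".
    assert(zebra < ecole());
    assert(!(ecole() < zebra));
}
```

A dictionary would normally list "école" near the other words starting with
"e", well before "zebra", so the value-by-value order is not what a reader
expects.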

- Volodya

