From: Robert Ramey (ramey_at_[hidden])
Date: 2004-10-19 11:19:06


"Vladimir Prus" <ghost_at_[hidden]> wrote in message
news:cl2d2p$7a3$1_at_sea.gmane.org...
> This was discussed extensively before. For example, Miro has pointed out
> that even plain "find" is not suitable for Unicode strings because some
> characters can be represented with several wchar_t values.
>
> Then, there's an issue of proper collation. Given that Unicode can contain
> accents and various other "marks", it is not obvious that string::operator<
> will always do the right thing.
>

My reference (Stroustrup, The C++ Programming Language) shows the locale
class containing a function

template<class Ch, class Tr, class A> // compare strings using this locale
bool operator()(const basic_string<Ch, Tr, A>&,
                const basic_string<Ch, Tr, A>&) const;
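
As a minimal sketch of how that member is usually put to work: a std::locale
object can be passed directly as the comparison predicate to std::sort. The
helper names sort_with_locale and demo_classic_sort are hypothetical, and the
example uses only the classic "C" locale, which is always available. Whether a
named Unicode-aware locale exists, and how well it collates, depends on the
platform and the standard library, so none is assumed here.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: sort strings using the collation rules of `loc`.
// std::locale::operator() forwards to the std::collate<Ch> facet of `loc`.
template <class Ch>
void sort_with_locale(std::vector<std::basic_string<Ch>>& v, const std::locale& loc)
{
    std::sort(v.begin(), v.end(), loc);
}

// Hypothetical demo: in the classic locale, collation is a plain
// lexicographic comparison of character values.
inline void demo_classic_sort()
{
    std::vector<std::wstring> words = {L"pear", L"apple", L"fig"};
    sort_with_locale(words, std::locale::classic());
    assert(words.front() == L"apple");
    assert(words.back() == L"pear");
}
```

Passing the locale itself as the predicate keeps the call site short. Swapping
in a different locale object changes the ordering without touching the sort.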

So I always presumed that there was a "Unicode" locale that implemented this
as well as all other required information. Now that I think about it, I
realize that this was only a presumption that I never really checked. Now I
wonder what facilities most libraries provide for Unicode facets. I know
there are ANSI functions for translating between multi-byte and wide
character strings. I've used these functions and they did what I expected
them to do. I presumed they worked in accordance with the currently
selected locale and its related facets. If
basic_string<wchar_t>::operator<(...) isn't doing "the right thing", wouldn't
that just be a bug in the implementation of the standard library rather than
a candidate for a boost library?
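
One way to check the presumption rather than rely on it is to put the two
comparisons side by side. This is a rough sketch under stated assumptions, not
a statement about any particular library. The function names are hypothetical,
and only the classic locale is used. As far as the standard specifies it,
basic_string's operator< goes through char_traits<wchar_t>::compare, which
looks at code values and never consults a locale, while locale-aware ordering
comes from the std::collate facet.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Plain operator<: compares wchar_t code values via std::char_traits,
// independent of any locale.
inline bool less_by_code_value(const std::wstring& a, const std::wstring& b)
{
    return a < b;
}

// Locale-aware comparison through the std::collate<wchar_t> facet of `loc`.
inline bool less_by_collation(const std::wstring& a, const std::wstring& b,
                              const std::locale& loc)
{
    const std::collate<wchar_t>& coll =
        std::use_facet<std::collate<wchar_t>>(loc);
    return coll.compare(a.data(), a.data() + a.size(),
                        b.data(), b.data() + b.size()) < 0;
}

// Two spellings of the same accented letter: precomposed U+00E9, and
// 'e' followed by the combining acute accent U+0301.
inline std::wstring precomposed() { return L"\u00E9"; }
inline std::wstring decomposed()  { return L"e\u0301"; }

// Hypothetical demo: the two spellings are different wchar_t sequences,
// so equality and ordering by code value treat them as different strings.
inline void demo_compare()
{
    assert(precomposed().size() == 1);
    assert(decomposed().size() == 2);
    assert(precomposed() != decomposed());
    assert(less_by_code_value(decomposed(), precomposed()));
}
```

The two spellings of the accented letter are the case mentioned in the quoted
message: one character represented by more than one wchar_t value.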

Robert Ramey

