Boost :

From: Robert Ramey (ramey_at_[hidden])
Date: 2004-10-19 11:19:06

"Vladimir Prus" <ghost_at_[hidden]> wrote in message
> This was discussed extensively before. For example, Miro has pointed out
> that even plain "find" is not suitable for unicode strings because some
> characters can be represented with several wchar_t values.
> Then, there's an issue of proper collation. Given that Unicode can contain
> accents and various other "marks", it is not obvious that operator<
> will always do the right thing.

My reference (Stroustrup, The C++ Programming Language) shows the locale
class containing a function

template<class Ch, class Tr, class A> // compare strings using this locale
bool operator()(const basic_string<Ch, Tr, A>&,
                const basic_string<Ch, Tr, A>&) const;

So I always presumed that there was a "unicode" locale that implemented this
as well as all other required information. Now that I think about it, I
realize that it was only a presumption that I never really checked. Now I
wonder what facilities most libraries provide for unicode facets. I know
there are ANSI functions for translating between multi-byte and wide
character strings. I've used these functions and they did what I expected
them to do. I presumed they worked in accordance with the currently
selected locale and its related facets. If
basic_string<wchar_t>::operator<(...) isn't doing "the right thing", wouldn't
it be just a bug in the implementation of the standard library rather than a
candidate for a boost library?
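
To make the distinction concrete, here is a minimal sketch of the two kinds of comparison being discussed. The helper names (code_unit_less, locale_less, composed_e_acute, decomposed_e_acute) are hypothetical and exist only for illustration. It assumes nothing beyond the standard library: operator< on wstring compares wchar_t values one by one, while std::locale::operator() forwards to the locale's collate<wchar_t> facet. Whether any installed locale collates Unicode "properly" is exactly the open question, so the sketch does not claim that one does.

```cpp
#include <locale>
#include <string>

// Plain comparison: basic_string<wchar_t>::operator< compares the
// wchar_t values element by element and knows nothing about locales.
bool code_unit_less(const std::wstring& a, const std::wstring& b) {
    return a < b;
}

// Locale-aware comparison: std::locale::operator() delegates to the
// std::collate<wchar_t> facet of the given locale.
bool locale_less(const std::locale& loc,
                 const std::wstring& a, const std::wstring& b) {
    return loc(a, b);
}

// Two canonically equivalent spellings of "e with acute accent":
// one precomposed character versus a base letter plus a combining mark.
std::wstring composed_e_acute()   { return L"\u00e9"; }   // U+00E9
std::wstring decomposed_e_acute() { return L"e\u0301"; }  // U+0065 U+0301
```

With these helpers, composed_e_acute() == decomposed_e_acute() is false and the two strings have different lengths, even though a reader sees the same character. Passing std::locale::classic() to locale_less gives ordinary lexicographic ordering; a named locale could be passed instead, but which names exist and how their collate facets treat such sequences depends on the platform.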

Robert Ramey
