Boost logo

Boost Users :

From: Andy (atompkins_at_[hidden])
Date: 2006-03-07 13:29:36

Pavol Droba <droba_at_[hidden]> wrote in

> On Fri, Mar 03, 2006 at 11:29:58PM +0100, Olaf van der Spek wrote:
> > >
>> >
>> > Well, this is a more complicated point. Small optimization you are
>> > proposing does not really make any difference. I have read a paper
>> > (I don't remember where),
>> Why not? Isn't a simple compare much faster than two table lookup?
> Ok, maybe you are right. One compare will not cost much and it can
> speedup few cases.
>> > where author proposes to cache the results of tolower during the
>> > comparison.
>> I'm not sure how that would work, could you explain?
> I don't remember the details. The idea was to eliminate some of the
> virual calls, that take place when dealing with locales. The algorihtm
> was described on several pages. Unfortunaltely, I forgot where I have
> seen that paper.
> Regards, Pavol

I believe what you are looking for is found in Effective STL by Scott
Meyers (pages 229-237). On page 236-237 he writes a case insensitive
string comparison functor that has this optimization, here it is:

struct lt_str_2 :
  public std::binary_function<std::string, std::string, bool>
  struct lt_char {
    const char* tab; lt_char(const char* t) : tab(t) {} bool
    operator()(char x, char y) const {
      return tab[x-CHAR_MIN] < tab[y-CHAR_MIN];

  char tab[CHAR_MAX - CHAR_MIN + 1]; lt_str_2(const std::locale& L =
  std::locale::classic()) {
    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(L);
    for (int i=CHAR_MIN; i<=CHAR_MAX; ++i)
      tab[i-CHAR_MIN] = (char)i;

    ct.toupper(tab, tab + (CHAR_MAX - CHAR_MIN + 1));
  bool operator()(const std::string& x, const std::string& y) const {
    return std::lexicographical_compare(x.begin(), x.end(),
                                        y.begin(), y.end(),

He then talks about the pros and cons of this. "This isn't a fully
general solution ..., if your were working with 32-bit UCS-4
characters." "This might be slower if you're creating an lt_str_2
function object, using it to compare a few short string, and then
throwing it away." "For any substantial use, though, lt_str_2 will be
noticeable faster"

I hope that I am allowed to reproduce this. If I am not, I humbly ask
Scott Meyers for forgiveness.


Boost-users list run by williamkempf at, kalb at, bjorn.karlsson at, gregod at, wekempf at