
From: Christophe Meessen (christophe_at_[hidden])
Date: 2003-10-18 11:02:33


Hello,

using Boost.Threads with VC7.1 is not easy. It doesn't provide support
for static libraries, so I had to make a VC7.1 project to be able to
compile it. I hacked out on_thread_exit since there is no trivial
solution for it on Win32. That's OK for now since I don't use TSS.

I then compared the performance of using scoped_lock with using the
native Win32 critical section calls. To my surprise, scoped_lock is three
times slower than the native Win32 calls. Then I tried implementing my own
scoped_lock without support for the lock methods. I named this class
synchronize and the mutex class synchronizable. You may see the code below.

These tests were performed on a DELL Latitude C800 (800 MHz, 256 MByte).
The test consisted of calling a simple ++x instruction one billion times
(1 000 000 000).

Let x be a variable protected using the different techniques.
The loop was the following:

for( int i = 0; i < 1000000000; ++i ) ++x;

I used xtime to compute time.

I also tried it on Linux (second numbers) (1200 MHz, 512 MByte) by
replacing the CriticalSection calls with pthread_mutex calls. The
performance difference is incredible. Any clue? On Win32 the overhead
of scoped_lock is much more visible. On Linux, pthread is so sloooow
that the difference is not so significant.
Are there equivalent performance evaluations around for the boost
scoped_lock classes?

1° using Win32 native calls: 65 sec. (65 nsec/iteration) 178 sec
(pthreads)
---------------------------
code:
unsigned long operator++()
{
    // m_m is a CRITICAL_SECTION member, m_var the protected counter
    EnterCriticalSection( &m_m );
    unsigned long var = m_var++;
    LeaveCriticalSection( &m_m );
    return var;
}
It is overkill for 32-bit values, but I just wanted to measure the
mutex protection, not the assignment operations.

2° using scoped_lock call: 185 sec (185 nsec/iteration) 220 sec (pthreads)
--------------------------

code:
unsigned long operator++()
{
    scoped_lock l( m_m );   // m_m is a boost::mutex
    return ++m_var;
}
Very nice, short, and clean code. But three times slower than the native calls!

3° using synchronize call: 68 sec (68 nsec/iteration) 185 sec (pthreads)
-------------------------

code:
unsigned long operator++()
{
    synchronize on( m_m );   // m_m is a synchronizable (see annex)
    return ++m_var;
}
The code is equivalent to scoped_lock, but the performance is equivalent
to that of the native calls.

ANNEX
---------

The synchronize stuff is just a quick hack. I picked that name because
it is the same concept as synchronized in Java, although the word's
ordinary meaning is orthogonal to the programming semantics. The
programming model is the same as for scoped_lock, except that the
synchronize object does not carry the m_locked flag. That flag is
useless for the most frequent needs and makes scoped_lock not thread
safe. synchronize is thread safe and as fast as the native calls
(at least on Windows).

Code for synchronize.

class synchronize;

class synchronizable
{
    friend class synchronize;
public:
    synchronizable() { InitializeCriticalSection( &m_cs ); }
    ~synchronizable() { DeleteCriticalSection( &m_cs ); }
protected:
    void lock() { EnterCriticalSection( &m_cs ); }
    void unlock() { LeaveCriticalSection( &m_cs ); }
private:
    CRITICAL_SECTION m_cs;
};

class synchronize
{
public:
    synchronize( synchronizable& s ) : m_s(s) { m_s.lock(); }
    ~synchronize() { m_s.unlock(); }
private:
    synchronizable& m_s;
};

 

-- 
Bien cordialement,
Ch. Meessen

Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk