Subject: [Boost-users] the performance of boost::lock_free is slow in centos 6 and 7
From: gao1738_at_[hidden]
Date: 2016-07-08 01:39:13


Hi all,

I tried boost::lockfree::queue and ran into a performance issue.

I used the following two test programs:

lock_free_test.cc

#include <boost/thread/thread.hpp>
#include <boost/lockfree/queue.hpp>
#include <iostream>
#include<cstdio>

#include <boost/atomic.hpp>

boost::atomic_int producer_count(0);
boost::atomic_int consumer_count(0);

boost::lockfree::queue<int> queue(128);

const int iterations = 1000000;
const int producer_thread_count = 4;
const int consumer_thread_count = 4;

void producer(void)
{
  for (int i = 0; i != iterations; ++i) {
    int value = ++producer_count;
    while (!queue.push(value))
      ;
  }
}

boost::atomic<bool> done (false);
void consumer(void)
{
  int value;
  while (!done) {
    while (queue.pop(value))
      ++consumer_count;
  }

  while (queue.pop(value))
    ++consumer_count;
}

int main(int argc, char* argv[])
{
  using namespace std;
  cout << "boost::lockfree::queue is ";
  if (!queue.is_lock_free())
    cout << "not ";
  cout << "lockfree" << endl;

  boost::thread_group producer_threads, consumer_threads;  // thread groups

  for (int i = 0; i != producer_thread_count; ++i)
    producer_threads.create_thread(producer);

  for (int i = 0; i != consumer_thread_count; ++i)
    consumer_threads.create_thread(consumer);

  producer_threads.join_all();
  done = true;

  consumer_threads.join_all();

  cout << "produced " << producer_count << " objects." << endl;
  cout << "consumed " << consumer_count << " objects." << endl;
}

lock_test.cc

#include <boost/thread/thread.hpp>
#include <boost/lockfree/queue.hpp>
#include <iostream>
#include<cstdio>
#include <queue>

#include <boost/atomic.hpp>
using namespace std;

boost::mutex producer_count_mu;
boost::mutex consumer_count_mu;
int producer_count = 0;
int consumer_count = 0;

std::queue<int> message_queue;

boost::mutex queue_mutex;

const int iterations = 1000000;
const int producer_thread_count = 4;
const int consumer_thread_count = 4;

void producer(void)
{
  for (int i = 0; i != iterations; ++i) {
    queue_mutex.lock();
    int value = ++producer_count;
    message_queue.push(value);
    queue_mutex.unlock();
  }
}

// Atomic so the consumers' unlocked read of this flag is not a data race.
boost::atomic<bool> done (false);
void consumer(void)
{
  int value;
  while (!done) {
    queue_mutex.lock();
    while (!message_queue.empty()) {
      message_queue.pop();
      ++consumer_count;
    }
    queue_mutex.unlock();
  }

  queue_mutex.lock();
  while (!message_queue.empty()) {
    message_queue.pop();
    ++consumer_count;
  }
  queue_mutex.unlock();
}
int main(int argc, char* argv[])
{
  using namespace std;
  cout << "boost::lockfree::queue is ";
  // No lock-free queue in this program, so always report "not lockfree".
  cout << "not ";
  cout << "lockfree" << endl;

  boost::thread_group producer_threads, consumer_threads;  // thread groups

  for (int i = 0; i != producer_thread_count; ++i)
    producer_threads.create_thread(producer);

  for (int i = 0; i != consumer_thread_count; ++i)
    consumer_threads.create_thread(consumer);

  producer_threads.join_all();
  done = true;

  consumer_threads.join_all();

  cout << "produced " << producer_count << " objects." << endl;
  cout << "consumed " << consumer_count << " objects." << endl;
}

The compile commands are:
g++ -I/usr/local/include -L/usr/local/lib lock_free_test.cc -lboost_thread -lboost_system -o lock_free_test
g++ -I/usr/local/include -L/usr/local/lib lock_test.cc -lboost_thread -lboost_system -o lock_test

1. I first tested on my work computer, which runs Ubuntu 14.04 on a 2-core i5, with
boost version: 1.54
gcc version: 4.8.4
g++ version: 4.8.4

The test result is that:

time ./lock_test
boost::lockfree::queue is not lockfree
produced 4000000 objects.
consumed 4000000 objects.

real 0m3.844s
user 0m1.800s
sys 0m12.308s

time ./lock_free_test
boost::lockfree::queue is lockfree
produced 4000000 objects.
consumed 4000000 objects.

real 0m1.745s
user 0m6.886s
sys 0m0.000s

Here the lock-free version is faster: it takes about 55% less wall-clock time (1.745 s vs. 3.844 s).

2. Then I tested on a PC server running CentOS 6.4 with 8 cores (Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz)

boost version: 1.54
gcc version: 4.4.7
g++ version: 4.4.7

The test result is that:

time ./lock_test
boost::lockfree::queue is not lockfree
produced 4000000 objects.
consumed 4000000 objects.

real 0m3.900s
user 0m2.593s
sys 0m27.282s

 time ./lock_free_test
boost::lockfree::queue is lockfree
produced 4000000 objects.
consumed 4000000 objects.

real 0m5.470s
user 0m43.105s
sys 0m0.000s

Here the mutex-based version is faster than the lock-free version (3.900 s vs. 5.470 s wall-clock).

3. Finally I tested on a larger PC server running CentOS 7.1 with a 32-core CPU (Intel(R) Xeon(R) CPU E7-4820 v2 @ 2.00GHz)
boost version: 1.53
gcc version: 4.8.3
g++ version: 4.8.3

time ./lock_test
boost::lockfree::queue is not lockfree
produced 4000000 objects.
consumed 4000000 objects.

real 0m3.023s
user 0m1.929s
sys 0m20.706s

time ./lock_free_test
boost::lockfree::queue is lockfree
produced 4000000 objects.
consumed 4000000 objects.

real 0m9.804s
user 1m14.900s
sys 0m0.100s

The lock-free version is about 3 times slower than the mutex-based version!
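
To put the three comparisons side by side, here is a small sketch that only does the arithmetic on the "real" times reported above. The names Run, kRuns, slowdown and print_slowdowns are made up for this illustration and are not part of either test program.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdio>

// Wall-clock ("real") times in seconds, copied from the three runs above.
struct Run {
  const char* machine;
  double lock_real;       // ./lock_test (std::queue + boost::mutex)
  double lock_free_real;  // ./lock_free_test (boost::lockfree::queue)
};

const Run kRuns[] = {
  {"Ubuntu 14.04, 2 cores", 3.844, 1.745},
  {"CentOS 6.4, 8 cores",   3.900, 5.470},
  {"CentOS 7.1, 32 cores",  3.023, 9.804},
};

// A ratio above 1 means the lock-free build took longer than the mutex build.
double slowdown(const Run& r) { return r.lock_free_real / r.lock_real; }

// Prints one line per machine with the lock-free / mutex wall-clock ratio.
void print_slowdowns() {
  for (std::size_t i = 0; i < sizeof(kRuns) / sizeof(kRuns[0]); ++i)
    std::printf("%-24s lock-free/mutex = %.2f\n",
                kRuns[i].machine, slowdown(kRuns[i]));
}
```

The ratios come out at roughly 0.45, 1.40 and 3.24, so the lock-free build goes from about twice as fast on the 2-core machine to more than three times slower on the 32-core machine.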

My questions are:
1. Why does the lock-free version perform better on Ubuntu but much worse on CentOS 6 and 7?
    Is it caused by the kernel, the gcc version, or the Boost version?
    Does the lock-free version get slower the more CPUs the machine has?

2. In which cases should we use the Boost lock-free containers to get better performance?
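
One way to probe the "more CPUs" question separately from the queue itself might be to time only the shared atomic counter, since every producer in lock_free_test.cc does ++producer_count on the same variable. The sketch below is an assumption-laden illustration, not one of the programs measured above: it uses C++11 std::thread and std::atomic instead of Boost (so it needs -std=c++11 and -pthread), and run_contended and Result are hypothetical names.

```cpp
#include <atomic>
#include <chrono>
#include <thread>
#include <vector>

struct Result {
  long total;      // final counter value
  double seconds;  // wall-clock time from first thread start to last join
};

// Hypothetical helper: every thread increments one shared atomic counter,
// similar to ++producer_count in the lock-free test program.
Result run_contended(int thread_count, int iterations_per_thread) {
  std::atomic<long> counter(0);
  std::vector<std::thread> threads;
  const std::chrono::steady_clock::time_point start =
      std::chrono::steady_clock::now();
  for (int t = 0; t < thread_count; ++t) {
    threads.emplace_back([&counter, iterations_per_thread] {
      for (int i = 0; i < iterations_per_thread; ++i)
        ++counter;  // atomic read-modify-write on a shared cache line
    });
  }
  for (std::thread& th : threads) th.join();
  const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
  Result r = {counter.load(), elapsed.count()};
  return r;
}
```

Calling run_contended(n, 1000000) for n = 1, 2, 4, 8 on each machine and comparing the seconds field would show how much of the slowdown comes from contention on a single atomic variable rather than from the queue.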

Best Regards!

dennis
