From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2008-05-14 16:07:01

James Sutherland wrote:
> I have been testing thread performance on Linux and Mac. My Linux
> system has two dual-core processors and my Mac has one dual-core
> processor. Both are intel chips.
> For the code snippet given below, the execution time should ideally
> decrease as the number of threads increases. However, the opposite
> trend is observed. For example, using -O3 flags on my Linux desktop
> produces the following timings:
> 1 Thread: 0.66 sec
> 2 Threads: 0.9 sec
> 3 Threads: 1.2 sec
> 4 Threads: 1.4 sec
> I do not have a lot of experience with threads, and was wondering if
> this result surprises anyone?

Hi James,

Quoting your code out of order:
> for( int itask=0; itask<nTasks; ++itask ){
>   boost::thread_group threads;
>   for( int i=0; i<nThreads; ++i ){
>     threads.create_thread( MyStruct(itask++ + 100) );
>   }
>   threads.join_all();
> }

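Did you really want the ++itask in the first for()? itask is already
incremented once per thread in the create_thread line, so each pass of
the outer loop advances it by nThreads+1 rather than nThreads, and the
total number of tasks that actually run changes with nThreads.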
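
For comparison, here is a minimal sketch of a batch loop in which itask
is advanced in exactly one place. It is not your code: it uses
std::thread (C++11) in place of boost::thread_group so that it builds
without Boost, the thread body is a trivial placeholder, and
run_in_batches is a name made up for the illustration.

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Hypothetical helper: runs nTasks tasks in batches of at most nThreads
// threads and returns the tags that were actually handed to threads.
std::vector<int> run_in_batches(int nTasks, int nThreads)
{
    if (nThreads < 1) nThreads = 1;  // guard against a non-positive batch size

    std::vector<int> tags;
    int itask = 0;
    while (itask < nTasks) {
        // The last batch may be smaller than nThreads.
        const int batch = std::min(nThreads, nTasks - itask);
        std::vector<int> results(batch, 0);
        std::vector<std::thread> threads;

        // itask is incremented here and nowhere else.
        for (int i = 0; i < batch; ++i, ++itask) {
            const int tag = itask + 100;
            // Placeholder work: each thread records its own tag.
            threads.emplace_back([&results, i, tag] { results[i] = tag; });
        }
        for (auto& t : threads) t.join();

        tags.insert(tags.end(), results.begin(), results.end());
    }
    return tags;
}
```

With this shape every value of nThreads runs the same nTasks tasks, so
the timings compare equal amounts of work.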

> struct MyStruct
> {
>   explicit MyStruct(const int i) : tag(i) {}
>   void operator()() const
>   {
>     const int n = 100;
>     std::vector<int> nums(n,0);
>     for( int j=0; j<1000000; ++j )
>       for( int i=0; i<n; ++i )
>         nums[i] = i+tag;
>   }
> private:
>   int tag;
> };

So sizeof(MyStruct)==sizeof(int) [for the tag]. Now, if you were
creating the MyStruct objects like this:

MyStruct my_structs[n];

then I would say that they are all sharing a cache line, and that cache
line is being fought over by the different processors when they read
tag, and that you should add some padding. But you're not; you're
passing a temporary MyStruct to create_thread which presumably stores a
copy of it. How does boost::thread_group store the functors that are
passed to it? If it is storing them in some sort of array or vector
then that could still be the problem - and it could be fixed by adding
padding inside boost.thread, or by copying the functor onto the new
thread's stack.
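
To make the padding idea concrete, here is a minimal sketch. It assumes
a 64-byte cache line (typical for Intel chips, but check your hardware)
and a C++11 compiler for alignas; PaddedStruct and kCacheLine are names
made up for the illustration, not anything in Boost.

```cpp
#include <cstddef>

// Assumed cache line size in bytes; 64 is common on x86 but not guaranteed.
constexpr std::size_t kCacheLine = 64;

// alignas forces both the alignment and (through tail padding) the size of
// the struct up to a full cache line, so adjacent array elements can never
// share a line.
struct alignas(kCacheLine) PaddedStruct
{
    explicit PaddedStruct(int i = 0) : tag(i) {}
    int tag;
};

// Compile-time checks that the padding really took effect.
static_assert(sizeof(PaddedStruct) == kCacheLine, "one object per cache line");
static_assert(alignof(PaddedStruct) == kCacheLine, "aligned to a cache line");
```

Without alignas, much the same effect can be had by adding a char array
member of kCacheLine - sizeof(int) bytes after tag, though that only
fixes the size and not the alignment of the first element.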

Also, I would imagine that the compiler would keep tag in a register.
What happens if you declare it as const?
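
Something like the following is what I have in mind; treat it as a
sketch rather than a drop-in replacement. The member is declared const
and is also copied into a local before the loops, so the inner loop
never has to touch the functor's storage. MyStructLocalTag is a made-up
name, the iteration count is reduced, and operator() returns the last
element written (your original returns void) purely so the result can
be checked.

```cpp
#include <vector>

struct MyStructLocalTag
{
    explicit MyStructLocalTag(const int i) : tag(i) {}

    // Returns nums[n-1] so that callers can verify the computation;
    // this return value is an addition for illustration only.
    int operator()() const
    {
        const int n = 100;
        const int localTag = tag;  // thread-private copy on this thread's stack
        std::vector<int> nums(n, 0);
        for (int j = 0; j < 1000; ++j)      // fewer passes than the original
            for (int i = 0; i < n; ++i)
                nums[i] = i + localTag;     // reads the local, not the member
        return nums[n - 1];
    }

private:
    const int tag;  // const member: still copy-constructible, not assignable
};
```

If the timings do not change with this version, then reads of tag were
probably not the bottleneck in the first place.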

I suggest that you try adding some padding and see what happens.

