Boost Users :

Subject: Re: [Boost-users] Thread local storage
From: Oliver Abert (abert_at_[hidden])
Date: 2009-03-30 10:12:11

On 30.03.2009, at 14:08, Anthony Williams wrote:

> Oliver Abert <abert_at_[hidden]> writes:
>>> Thanks for alerting me to this thread Peter.
>>> Oliver Abert <abert_at_[hidden]> writes:
>>>> On 29.03.2009, at 19:36, Peter Dimov wrote:
>>>>> Oliver Abert:
>>>>>> Hi Everyone,
>>>>>> I am using Boost Threads (1.38) as my threading library, and I also
>>>>>> use thread_specific_ptr to store a small amount of data per thread
>>>>>> (I think currently it is about 5 different pointer values per
>>>>>> thread). Technically everything works fine, but I am having a
>>>>>> performance problem on Mac OS X. On Linux the code runs 10
>>>>>> times faster than on Mac OS. If I use pthreads on Mac OS I get
>>>>>> identical performance to the Linux version. Both versions are
>>>>>> running on the same machine, using 8 threads each.
>>>>> What does your profiler say?
>>>> About 80% of the time is spent in __spin_lock, which in turn was
>>>> called by pthread_once. If I use only one thread (instead of 8) the
>>>> percentage goes down to 2.5%, which is still a bit much for my
>>>> taste.
>>> pthread_once is called by the thread_specific_ptr code to ensure
>>> that the TLS key it uses has been allocated and is valid. It's a real
>>> pain if that is too slow.
>> Yes, I understand that so far, but there seems to be a more
>> serious problem. Is it possible that there is some unintended mutex
>> lock? It seems like exactly that is happening. Maybe it is
>> related to the static variables, which might get mutexed
>> automatically? I heard there is a bug in Apple gcc 4.0.1
>> regarding statics, but this morning I also tried the Intel 11.0
>> compiler with the same disappointing results. What makes me wonder
>> is that the same code runs just fine on Linux.
>> Some more background information: the problem is definitely caused
>> by calls to get() on the thread_specific_ptr. I am using it in a
>> relatively hot section of my code. Profiling is not so helpful, because
>> there are a bunch of unknown libraries in between my call and the
>> pthread_once call. And yes, I also used a debug build of Boost; I have
>> no clue what is happening in between.
> Could you show the code that accesses the thread_specific_ptr?

Okay, the call itself is simply:

HierarchyTraverser *ht = RenderThread::hierarchyTraverser();

(there is nothing Boost-related before or after that call)
where that function is:

inline HierarchyTraverser* RenderThread::hierarchyTraverser()
#ifdef BOOST

and mHierarchyTraverser is of type
  static boost::thread_specific_ptr<unsigned long int>

Hope that helps, but as you can see, it's basically pretty unspectacular.
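
Since the cost seems to come from calling get() inside a hot section, one possible workaround (not something confirmed in this thread) would be to look the pointer up once per hot section and reuse the raw pointer. The sketch below only illustrates that idea. RenderThreadSketch, lookups(), traversePerIteration() and traverseHoisted() are hypothetical names, HierarchyTraverser is reduced to a dummy struct, and a plain C++11 thread_local object with a call counter stands in for boost::thread_specific_ptr, so it shows the call pattern rather than the real pthread_once cost.

```cpp
#include <cassert>
#include <cstddef>

// Dummy stand-in for the real HierarchyTraverser (assumption, not the original type).
struct HierarchyTraverser {
    unsigned long visited = 0;
};

// Hypothetical stand-in for RenderThread. Every accessor call is counted so the
// number of per-thread lookups can be compared between the two loops below.
class RenderThreadSketch {
public:
    static HierarchyTraverser* hierarchyTraverser() {
        ++lookups();  // in the real code, this is where get() would pay its fixed cost
        thread_local HierarchyTraverser instance;  // one object per thread
        return &instance;
    }

    // Per-thread counter of accessor calls.
    static std::size_t& lookups() {
        thread_local std::size_t count = 0;
        return count;
    }
};

// Pattern from the thread: the accessor is called on every iteration.
inline void traversePerIteration(std::size_t iterations) {
    for (std::size_t i = 0; i < iterations; ++i) {
        HierarchyTraverser* ht = RenderThreadSketch::hierarchyTraverser();
        ++ht->visited;
    }
}

// Possible workaround: call the accessor once and reuse the raw pointer.
// This is only valid while the loop stays on the same thread.
inline void traverseHoisted(std::size_t iterations) {
    HierarchyTraverser* ht = RenderThreadSketch::hierarchyTraverser();
    assert(ht != nullptr);
    for (std::size_t i = 0; i < iterations; ++i) {
        ++ht->visited;
    }
}
```

With this stand-in, traversePerIteration(n) performs n lookups while traverseHoisted(n) performs exactly one. Whether that helps in the real code depends on whether the pointer can safely be cached for the duration of the hot section.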


> Anthony
