Subject: Re: [Boost-users] Thread local storage
From: Anthony Williams (anthony.ajw_at_[hidden])
Date: 2009-03-30 08:08:39


Oliver Abert <abert_at_[hidden]> writes:

>> Thanks for alerting me to this thread Peter.
>>
>> Oliver Abert <abert_at_[hidden]> writes:
>>
>>> On 29.03.2009, at 19:36, Peter Dimov wrote:
>>>
>>>> Oliver Abert:
>>>>> Hi Everyone,
>>>>>
>>>>> I am using Boost Threads (1.38) as threading library and I also use
>>>>> the thread_specific_ptr to store a minor amount of data per thread
>>>>> (I think currently it is like 5 different pointer values per
>>>>> thread). Technically everything works out fine, but I am having a
>>>>> performance problem on Mac OS X. On Linux the performance is 10
>>>>> times faster than on Mac OS. If I use pthreads on Mac OS I have
>>>>> identical performance to the Linux version. Both versions are
>>>>> running on the same machine, each using 8 threads.
>>>>
>>>> What does your profiler say?
>>>
>>> About 80% of the time is spent in __spin_lock, which in turn was called
>>> by pthread_once. If I use only one thread (instead of 8) the
>>> percentage goes down to 2.5% - which is still a bit much for my
>>> taste.
>>
>> pthread_once is called by the thread_specific_ptr code to ensure that
>> the TLS key it uses has been allocated and is valid. It's a real
>> pain if that is too slow.
>
> Yes, I understand that so far, but there seems to be some more
> serious problem. Is it possible that there is some unintended mutex
> lock? It seems like exactly that is happening. Maybe it is
> related to the static variables, which might get mutexed
> automatically? I heard there is a bug with the Apple gcc 4.0.1
> regarding statics, but this morning I also tried the Intel 11.0
> compiler with the same disappointing results. What makes me wonder
> is that the same code runs just fine on Linux.
>
> Some more background information: the problem is definitely caused by
> calls to get() of the shared pointer. I am using it in a relatively
> hot section of my code. Profiling is not so helpful, because there are
> a bunch of unknown libraries in between my call and the pthread_once
> call - and yes, I also used a debug build of Boost - I have no clue
> what is happening in between.

Could you show the code that accesses the thread_specific_ptr?
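
To make the pattern under discussion concrete, here is a minimal sketch
of a per-thread pointer that goes through pthread_once on every access,
as described in the quoted text above. It is not the Boost.Thread
source. The namespace `sketch`, the functions `get`, `reset`, `sum_slow`
and `sum_fast`, and the `int` payload are all hypothetical names chosen
for illustration. Only the pthread calls are real API.

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical illustration only; this is NOT the Boost.Thread implementation.
namespace sketch {

pthread_once_t key_once = PTHREAD_ONCE_INIT;
pthread_key_t  tls_key;

// Called by the pthread runtime at thread exit for non-null values.
void destroy_value(void* p) { delete static_cast<int*>(p); }

// Runs exactly once per process, however many threads race to call it.
void create_key() { pthread_key_create(&tls_key, &destroy_value); }

// Every access first checks, via pthread_once, that the key exists.
int* get() {
    pthread_once(&key_once, &create_key);
    return static_cast<int*>(pthread_getspecific(tls_key));
}

// Replace the calling thread's value, freeing the previous one.
void reset(int* p) {
    pthread_once(&key_once, &create_key);
    delete static_cast<int*>(pthread_getspecific(tls_key));
    pthread_setspecific(tls_key, p);
}

// Hot loop that pays for pthread_once + pthread_getspecific per iteration.
long sum_slow(std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += *get();
    return total;
}

// Same loop with the lookup hoisted: one get() call, then a plain pointer.
long sum_fast(std::size_t n) {
    const int* value = get();
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += *value;
    return total;
}

} // namespace sketch
```

If the real code resembles `sum_slow`, fetching the pointer once outside
the hot section, as in `sum_fast`, may sidestep the pthread_once cost
regardless of why it is slow on Mac OS X. This assumes the value is set
(via `reset`) before the loop runs and is not replaced during it.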

Anthony

-- 
Author of C++ Concurrency in Action | http://www.manning.com/williams
just::thread C++0x thread library   | http://www.stdthread.co.uk
Just Software Solutions Ltd         | http://www.justsoftwaresolutions.co.uk
15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net