Thread creation overhead & efficiency

I am a new user of boost::thread. After implementing it I found that it slowed down execution of my code. I suspect that this may be due to the overhead of thread creation. I am creating/destroying thread_group objects many times (tens of thousands) during code execution. Pardon my naïve question, but is this a problem for performance? I noticed some references to a thread pool on this list. Is a thread pool the best alternative in this situation? Thanks for any guidance, James

On Jan 29, 2008 5:29 PM, James Sutherland <James.Sutherland@utah.edu> wrote:
[...] overhead of thread creation. I am creating/destroying thread_group objects many times (tens of thousands) during code execution. Pardon my naïve question, but is this a problem for performance?
You may want to try http://threadpool.sourceforge.net/ Sebastian

On 1/29/08 9:54 AM, "Sebastian Gesemann" <s.gesemann@gmail.com> wrote:
You may want to try http://threadpool.sourceforge.net/
I was looking into that, but was wondering if anyone had experience to suggest that creation/destruction of threads was a significant overhead before I implement pools. James

Hi James -
My impression was that creating new threads on Windows can be quite expensive, but I don't have a lot of experience there. Creating threads in Linux using posix_threads (also via boost) seems to be quite fast.
However, I have noticed that thread_groups are much slower (factor of 4) than using my own vector<boost::thread*> and joining each individually. This is probably because I have only one controlling thread for my vector, so I don't need a mutex.
Brian
On Jan 29, 2008 10:28 AM, James Sutherland <James.Sutherland@utah.edu> wrote:
On 1/29/08 9:54 AM, "Sebastian Gesemann" <s.gesemann@gmail.com> wrote:
You may want to try http://threadpool.sourceforge.net/
I was looking into that, but was wondering if anyone had experience to suggest that creation/destruction of threads was a significant overhead before I implement pools.
James
_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users
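The "own vector, join each individually" approach Brian describes might look roughly like the sketch below. It assumes C++11 std::thread in place of boost::thread* and a hypothetical helper name, run_and_join. Because a single controlling thread owns the vector, the container itself needs no locking.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: runs work(i) for i in [0, n) on n threads that are
// created and joined by one controlling thread.
template <typename Work>
void run_and_join(std::size_t n, Work work) {
    std::vector<std::thread> threads;
    threads.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        threads.emplace_back(work, i);  // each thread gets its own copy of work
    for (std::thread& t : threads)
        t.join();                       // join each thread individually
}
```

This still creates and destroys n threads per call, so it only removes any locking overhead in the group container, not the cost of thread creation itself.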

Sorry, I have to amend my statement. I just ran an experiment which launched 10000 threads vs launching 4 threads and doing the same amount of work. Results seem to indicate that the former is quite expensive on Linux too -- about 2 orders of magnitude different in my test. Note that some of this effect could also be from side effects from time slicing, etc... in the kernel scheduler.
Brian
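An experiment of the kind Brian describes could be sketched as follows. This is not his actual test, which was not posted. The function names process_items and seconds are hypothetical, the "work" is a trivial counting loop, and C++11 std::thread and std::chrono are assumed.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Splits `items` units of stand-in work across `nthreads` threads
// (nthreads must be > 0) and returns how many units were processed.
std::size_t process_items(std::size_t nthreads, std::size_t items) {
    std::atomic<std::size_t> done{0};
    std::vector<std::thread> threads;
    threads.reserve(nthreads);
    const std::size_t base = items / nthreads;
    const std::size_t extra = items % nthreads;
    for (std::size_t t = 0; t < nthreads; ++t) {
        // The first `extra` threads take one additional item each.
        const std::size_t count = base + (t < extra ? 1 : 0);
        threads.emplace_back([&done, count] {
            std::size_t local = 0;
            for (std::size_t k = 0; k < count; ++k) ++local;  // stand-in for real work
            done += local;
        });
    }
    for (std::thread& th : threads) th.join();
    return done.load();
}

// Returns the wall-clock time, in seconds, taken by f().
template <typename F>
double seconds(F f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Comparing seconds([] { process_items(4, 1000000); }) with seconds([] { process_items(10000, 1000000); }) gives a rough idea of the per-thread cost on a given machine. Note that launching 10000 threads at once may exceed resource limits on some systems, and with work this small the result is dominated by creation, scheduling, and join overhead.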

On Tue, 29 Jan 2008 11:37:58 -0800, Brian Budge wrote:
Sorry, I have to amend my statement. I just ran an experiment which launched 10000 threads vs launching 4 threads and doing the same amount of work. Results seem to indicate that the former is quite expensive on Linux too -- about 2 orders of magnitude different in my test.
I think this also depends on whether your kernel has support for NPTL.
--
Sohail Somani
http://uint32t.blogspot.com

I have tried threadpool. It is my understanding that the code fragment below should run in constant time independent of the number of threads (assuming that the "Test::operator()" method runs in constant time). However, I observe an increase in runtime as I increase the pool size from say 1 to 5. Any thoughts on this? James
int main()
{
    const int nt = 5;
    const int nit = 10000;

    time_t t1 = clock();

    boost::threadpool::pool p(1);
    for( int j=0; j<nit; ++j ){
        for( int i=0; i<nt; ++i ){
            p.schedule( Test(i) );
        }
        p.wait();
    }

    std::cout << "t=" << difftime(clock(), t1) << std::endl;
    return 0;
}
participants (4)
- Brian Budge
- James Sutherland
- Sebastian Gesemann
- Sohail Somani