Subject: Re: [boost] [Block Pointer] benchmark
From: Frank Mori Hess (frank.hess_at_[hidden])
Date: 2011-05-26 19:44:36

On Wednesday, May 25, 2011, Phil Bouchard wrote:
> On 5/25/2011 1:39 PM, Frank Mori Hess wrote:
> > What version of boost are you referring to?
> boost_1_46_1
> > make_shared used to be slow due
> > to it doing a lot of unnecessary copying of the storage area for the
> > pointee, but that should have been fixed probably a couple years back
> > now. Copying a shared_ptr is fairly slow due to the atomic reference
> > counting, but I would expect a compiler to be able to elide the copy
> > of the make_shared return value.
> That optimization doesn't exist and it's worse with the Intel Compiler:

Well, I was thinking of the case where you are initializing a newly declared
shared_ptr as opposed to re-assigning the same shared_ptr over and over
again. Running a trimmed down and tweaked version of your benchmark
(attached) I get:

$ g++ -O3 -Wall pb_benchmark.cpp -lrt
fhess_at_tailpipe:~/test$ ./a.out
shared_ptr: 17500558 ns
shared_ptr: 12208956 ns
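
To make the distinction concrete, here is a minimal sketch of the two patterns: initializing a newly declared shared_ptr on every iteration versus re-assigning one shared_ptr declared outside the loop. It is an illustration only, not the attached benchmark. It assumes a C++11 compiler and uses std::shared_ptr and std::make_shared in place of the Boost versions so that it builds without Boost, and the function names sum_fresh and sum_reassigned are made up for the example.

```cpp
#include <memory>

// Pattern A: a new shared_ptr is declared and initialized on every
// iteration, so the compiler may construct the make_shared result
// directly in p (copy elision).
long sum_fresh(int iterations)
{
    long sum = 0;
    for (int i = 0; i < iterations; ++i)
    {
        std::shared_ptr<int> p = std::make_shared<int>(i);
        sum += *p;
    }
    return sum;
}

// Pattern B: a single shared_ptr is re-assigned on every iteration, so
// each pass creates a temporary and goes through the assignment operator
// instead of initializing a new object.
long sum_reassigned(int iterations)
{
    long sum = 0;
    std::shared_ptr<int> p;
    for (int i = 0; i < iterations; ++i)
    {
        p = std::make_shared<int>(i);
        sum += *p;
    }
    return sum;
}
```

Both functions compute the same sum; only the way the shared_ptr receives its value differs, which is the part the benchmark is sensitive to.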

Note, I moved the declaration of the shared_ptrs inside the loops. Also,
your benchmark on my machine shows a strong dependence on the order the
tests are run. The first run is slower, which is why the "new" result is
before the "make" result above, to demonstrate the effect.
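
One common way to reduce this kind of order dependence is to run each workload once untimed before measuring it. The following is a rough sketch of that idea under some assumptions: it needs C++11, it uses std::chrono::steady_clock (wall-clock time) rather than the CLOCK_PROCESS_CPUTIME_ID clock used in the attachment, it uses std::shared_ptr so that it builds without Boost, and the names timed_after_warmup, run_new and run_make are hypothetical.

```cpp
#include <chrono>
#include <memory>

// Hypothetical helper: run `work` once untimed as a warm-up, then return
// the duration of a second, timed run in nanoseconds.
template <typename Work>
long long timed_after_warmup(Work work)
{
    work(); // warm-up pass, not measured
    const auto start = std::chrono::steady_clock::now();
    work(); // measured pass
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Workloads mirroring the two loops in the attached benchmark.
void run_new()
{
    for (int i = 0; i < 100000; ++i)
    {
        std::shared_ptr<int> p(new int());
    }
}

void run_make()
{
    for (int i = 0; i < 100000; ++i)
    {
        std::shared_ptr<int> p = std::make_shared<int>();
    }
}
```

Calling timed_after_warmup(run_new) and timed_after_warmup(run_make) in either order should then give numbers that depend less on which test happened to run first, although a single warm-up pass is not guaranteed to remove the effect entirely.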

/*
        Copyright (c) 2011 Phil Bouchard <phil_at_[hidden]>.

        Distributed under the Boost Software License, Version 1.0.

        See accompanying file LICENSE_1_0.txt or copy at

        See for documentation.
*/

#include <sys/time.h>

#include <memory>
#include <iostream>
#include <boost/shared_ptr.hpp>
#include <boost/make_shared.hpp>
#include <ctime>
#include <iomanip>
#include <limits>

using namespace std;
using namespace boost;

timespec diff(timespec start, timespec end);

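int main(int argc, char* argv[])
{
        timespec ts[2];

        // shared_ptr and make_shared are qualified with boost:: because
        // <memory> also declares std::shared_ptr in C++11 and later, which
        // makes the unqualified names ambiguous with both using-directives.

        // Case 1: construct a new shared_ptr from a raw "new" on every iteration.
        cout << "new:" << endl;

        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, & ts[0]);
        for (int i = 0; i < 100000; ++ i)
        {
                boost::shared_ptr<int> p(new int());
        }
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, & ts[1]);
        cout << "shared_ptr:\t" << setw(numeric_limits<long>::digits10 + 2) << diff(ts[0], ts[1]).tv_nsec << " ns" << endl;

        // Case 2: initialize a new shared_ptr from make_shared on every iteration.
        cout << "make:" << endl;

        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, & ts[0]);
        for (int i = 0; i < 100000; ++ i)
        {
                boost::shared_ptr<int> p = boost::make_shared<int>();
        }
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, & ts[1]);
        cout << "shared_ptr:\t" << setw(numeric_limits<long>::digits10 + 2) << diff(ts[0], ts[1]).tv_nsec << " ns" << endl;

        return 0;
}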

// Returns end - start, borrowing one second when the nanosecond field
// of the difference would otherwise be negative.
timespec diff(timespec start, timespec end)
{
        timespec temp;
        if ((end.tv_nsec-start.tv_nsec)<0) {
                temp.tv_sec = end.tv_sec-start.tv_sec-1;
                temp.tv_nsec = 1000000000+end.tv_nsec-start.tv_nsec;
        } else {
                temp.tv_sec = end.tv_sec-start.tv_sec;
                temp.tv_nsec = end.tv_nsec-start.tv_nsec;
        }
        return temp;
}
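
Note that the benchmark prints only the tv_nsec field of the difference, which is fine for runs shorter than one second but silently drops whole seconds for longer ones. A possible alternative, sketched here with the hypothetical helper names to_nanoseconds and elapsed_ns, is to fold both fields into a single nanosecond count before printing. It assumes a POSIX system where <time.h> declares timespec.

```cpp
#include <time.h>

// Hypothetical helper: convert a timespec into a total nanosecond count.
long long to_nanoseconds(const timespec& t)
{
    return static_cast<long long>(t.tv_sec) * 1000000000LL + t.tv_nsec;
}

// Hypothetical helper: elapsed time between two timestamps, in nanoseconds,
// including whole seconds.
long long elapsed_ns(const timespec& start, const timespec& end)
{
    return to_nanoseconds(end) - to_nanoseconds(start);
}
```

With this, the output line could print elapsed_ns(ts[0], ts[1]) instead of diff(ts[0], ts[1]).tv_nsec; for the sub-second runs shown in this thread both give the same number.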


