From: Ian McCulloch (ianmcc_at_[hidden])
Date: 2005-11-25 18:58:58


Hi Robert,

I think you should check your benchmark code again; it is not doing what
you think it is doing.

class oprimitive
{
public:
    // default saving of primitives.
    template<class T>
    void save(const T & t)
    {
        save_binary(&t, sizeof(T));
    }

    // default saving of arrays.
    template<class T>
    void save_array(const T * p, std::size_t n)
    {
        save_binary(p, n*sizeof(T));
    }

    void save(const std::string &s) { abort(); }

    void save_binary(const void *address, std::size_t count)
    {
        std::memcpy(buffer, address, count);
        p += count;
    }
    
    std::size_t size() { return s;}
    void reserve(std::size_t n){
        s = n;
        p = 0;
        buffer = new char[n];
    }
    ~oprimitive(){
        delete buffer;
    }
private:
    std::size_t s;
    std::size_t p;
    char * buffer;
};

There is a bug here: the oprimitive::save_binary() function always writes to
the *start* of the buffer, because the memcpy() destination is 'buffer'
rather than 'buffer + p'. Incrementing 'p' therefore has no effect on where
the data lands. It is not too surprising that you see many repeated calls
to save_binary() with a small object running much faster than a single call
to save_binary() with a large object: in the first case a single memory
address is being overwritten repeatedly (with lots of scope for misleading
compiler optimizations!), whereas the second case is limited by the memory
bandwidth.
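
To make the bug concrete, here is a minimal sketch of a fixed-size buffer
that writes at the current offset instead of at the start. It is modelled
on the class quoted above but is not the benchmark code itself: the name
fixed_buffer, the constructor, the data() accessor and the assert() on the
remaining capacity are all assumptions added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-size buffer, modelled on the quoted oprimitive class.
class fixed_buffer
{
public:
    explicit fixed_buffer(std::size_t n) : s(n), p(0), buffer(new char[n]) {}
    ~fixed_buffer() { delete[] buffer; }   // array form matches new char[n]

    fixed_buffer(const fixed_buffer &) = delete;
    fixed_buffer & operator=(const fixed_buffer &) = delete;

    void save_binary(const void * address, std::size_t count)
    {
        // Still no resizing: the caller must have reserved enough space.
        assert(count <= s - p);
        // Write at the current offset, not at the start of the buffer.
        std::memcpy(buffer + p, address, count);
        p += count;
    }

    std::size_t size() const { return p; }      // bytes written so far
    const char * data() const { return buffer; }

private:
    std::size_t s;   // capacity in bytes
    std::size_t p;   // current write offset
    char * buffer;
};
```

With this version, two consecutive save_binary() calls land in adjacent
regions of the buffer, so the write loop actually touches the whole
allocation rather than one cache line.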

Secondly, the buffer in the oprimitive class has much less functionality
than the vector<char> buffer, as well as the buffer I used previously
(http://lists.boost.org/Archives/boost/2005/11/97156.php). In particular,
it does not check for buffer overflow when writing. Thus it has no
capability for automatic resizing/flushing, and is only useful if you know
in advance what the maximum size of the serialized data is. This kind of
buffer is of rather limited use, so I think that this is not a fair
comparison.
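
For comparison, a buffer that grows on demand might look roughly like the
sketch below. This is only an illustration of the extra work involved (a
capacity check and possible reallocation on every write); it is not
necessarily how the vector<char> buffer in the benchmark is implemented,
and the name growing_buffer and its data() accessor are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical growable buffer backed by std::vector<char>.
class growing_buffer
{
public:
    void save_binary(const void * address, std::size_t count)
    {
        const char * first = static_cast<const char *>(address);
        // insert() checks capacity and reallocates when needed, so the
        // caller does not have to know the final size in advance.
        data_.insert(data_.end(), first, first + count);
    }

    // Same interface shape as the quoted class: one element at a time...
    template<class T>
    void save(const T & t) { save_binary(&t, sizeof(T)); }

    // ...or a whole contiguous array in a single call.
    template<class T>
    void save_array(const T * p, std::size_t n) { save_binary(p, n * sizeof(T)); }

    std::size_t size() const { return data_.size(); }
    const char * data() const { return data_.data(); }

private:
    std::vector<char> data_;
};
```

The per-call capacity check is what the fixed buffer skips, which is one
reason the element-by-element loop can look cheaper there than it would
with a buffer that is safe to use for data of unknown size.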

FWIW, here are the results of the benchmark I just ran (times in seconds).
AMD64, g++ 3.4.4 on Linux 2.6.10, and cheap (slow!) memory ;)

vector<char> buffer:

Time using serialization library: 3.79
Size is 100000004
Time using direct calls to save in a loop: 3.42
Size is 100000000
Time using direct call to save_array: 0.16
Size is 100000000

primitive buffer (with the save_binary() function modified to do "buffer +=
count"):

Time using serialization library: 1.57
Size is 100000004
Time using direct calls to save in a loop: 1.35
Size is 100000000
Time using direct call to save_array: 0.16
Size is 100000000

Interestingly, on this platform/compiler combination, without the bug fix in
save_binary() it still takes 1.11 seconds ;) I would guess your Windows
compiler is doing some optimization that gcc is not, in that case.

Regards,
Ian

