From: Ion Gaztañaga (igaztanaga_at_[hidden])
Date: 2008-06-23 12:33:29
Dean Michael Berris wrote:
> The DLmalloc implementation you compared against was written in C
> which you adapted to C++, correct?
I really haven't touched the DLmalloc source. I took version 2.8.4 of
DLmalloc (dlmalloc_pre_2_8_4.c in the archive) and included that C file
in another file, dlmalloc_ext.c, which implements some new functions
(only the ones going to the global heap; I haven't implemented mspaces)
providing expand-in-place, burst allocation, and burst deallocation,
using some internal structures and functions of DLmalloc. The
"modified" DLmalloc (dlmalloc_ext.c) is (or should be) compilable with a C
compiler and exposes a C interface through functions and macros, so a C
programmer could take advantage of burst allocation.
The extension is not clean code, but digging into DLmalloc internals is
not easy. I then just built STL-like allocators wrapping those functions.
See the source at http://www.drivehq.com/web/igaztanaga/allocplus.zip for more details.
> I for one have been looking for a better alternative to both the
> standard allocator and the Boost.Pool allocator. A DLmalloc based
> implementation would be interesting to see if not as a
> Boost-provided/included implementation or one day a standard allocator
> alternative implementation.
I think you will find adaptive allocators very interesting. You also
have a Boost.Pool-like allocator (simple segregated storage, supporting
burst allocation) in the library. It was used to test adaptive pools
against classic node allocators.
> - have you tested how your implementation performs on multiple threads?
No. I've tested the library *without* locks, and burst allocation is
still faster (which means the speedup does not come from lock
minimization alone). I think multithreading allocators like ptmalloc or
nedmalloc, which are based on DLmalloc, could also benefit from these techniques.
The idea comes from Interprocess, where one just can't have per-thread
pools. That's why I implemented some classic strategies: buffer
reuse (realloc) and packed allocations (burst). These mechanisms are easy
to implement and to understand, and they offer real speed benefits for many
applications. Not every application is heavily multi-threaded.
> - have you tried measuring the direct effect of random-sized,
> random-timed, random-ordered allocations/deallocations?
No, I just ran out of time ;-). If anyone tests this, I would be glad.
> - i notice that you were using vmware; admittedly the effect of
> running Linux in a VMWare instance as a guest already causes
> performance degradation, have you considered using something else that
> takes advantage of processor virtualization features better (like
> Virtualbox)? or better yet, have you tried running it on a native
> Linux implementation?
Last time grub broke my startup configuration, I decided to erase my
Linux partition ;-). I know there is some performance impact, but it
should affect both the standard allocator and DLmalloc equally.
> I'm looking forward to the answers to my questions and the further
> improvement of this article. Thanks very much for sharing this!
After so much work implementing and testing, I needed to share this, at
least to feel it was worth the effort. It has already been useful for
improving the Interprocess containers, but if it turns out to be useful
with heap allocators too, that would be nice.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk