|
Boost : |
From: David White (dave_at_[hidden])
Date: 2002-04-18 06:13:51
Well, it is possible to write an allocator that is optimized for LIFO,
but doesn't require it, however I do think that it would be somewhat of
a speed hit. The main reason is, that with your proposed solution, the
allocator would have to store a list of all the blocks that have been
allocated, with the LIFO-only version, it wouldn't. How? Well this is
how it would work:
the allocator allocates memory in chunks - a chunk is fixed size, and
pretty big (exactly how big would require testing to find the optimal
size - let's say call it chunksize for now). Each chunk can be split
into blocks. So you can store your chunks like this:
stack<void*> chunks;
then you would have a pointer which would point at the prospective
location for the next memory allocation,
void* current;
The following invariant will always hold:
current == NULL || (current >= chunks.top() && current <=
chunks.top()+chunksize)
now when an allocation request is made, wanting s bytes, we do the
following [1] :
- If s > chunksize then fall back to the default allocator.
- Otherwise see if there is sufficient space left in this chunk for the
block, by comparing current to chunks.top()+chunksize.
- If there is then return the value of current, and have current move s
spaces forward, awaiting the next allocation.
- If there's not then we allocate a new block, push it onto chunks,
make current equal to chunks.top()+s, and return chunks.top().
Then, when a block is deallocated, we receive a request to deallocate
the block at p, which points to s bytes (note that the caller must
provide both the pointer and the size of the block. This shouldn't be a
problem, since vlarray knows this information anyway.), we do the
following:
- If s is greater than chunksize, we must have allocated this block from
the default allocator, forward the request to the default allocator.
- Otherwise, if current is equal to chunks.top(), then free the top most
chunk and pop it off the stack.
- make current equal to p
and that's it! All implemented using one simple stack and a pointer. The
blocks are stored by the users of the allocator - something they would
have to do anyway.
Now, I'm not sure if my approach would result in that much of a
time/space saving compared to your approach, I think it would offer
something though. However, if users did allocate vlarrays on the heap,
it would potentially effect performance quite badly. What's a better way
of approaching incorrect use of a library? Make the program crash
instantly with a hopefully useful state/assertion failure with comment,
or make the program work perfectly, but slowly? I tend to think the
first approach may be better, since the user will very quickly learn of
the problem - while with the second approach you could have the class
used for years, with users thinking that it provides them speed
improvements, but in reality, because of their misuse there are
slow-downs.
As for using mutexes, it can't be done. The only way to make it
threadsafe, is to have different copies of the allocator's stack stored
in thread-specific storage. Even if mutexes were viable, they wouldn't
be worth it, since the cost of a mutex is very significant compared to
the cost of a memory allocation.
Cheers,
David.
[1] I neglect to describe the issues of making sure that current is
properly aligned, but this shouldn't be hard; the user will just have to
pass in the object size.
On Thu, 2002-04-18 at 03:29, hicks wrote:
> Re: Query of Interest - vlarray
>
> As far as I understand, the desired behavior is a fast allocator.
> Allocation always occurs at the end of the stack,
> or to put it another way, just beyond the highest block of currently
> allocated memory.
> Because you intend to only allocate to temporary variables,
> (voluntary compliance), the highest block of currently
> allocated memory and the most recently allocated block of
> memory will (should) be the same.
> Supposing that the LIFO (most recently allocated,
> next to be freed) assumption is dropped
> since voluntary compliance may be violated.
> Instead, keep a freed bit for each allocated block.
> On deallocation, move the stack pointer only if the freed block
> is the last block, otherwise mark it free. If the stack pointer is moved,
> move it all the way down to the end of the last non-free block.
> The management time barely changes.
> Memory might run out, but a fatal error won't occur.
>
> Back to the issue of multithreading.
> In a multithreaded application, or a single threaded application linked
> to a multithreaded
> library supporting new, the call to new incurs a mutex (or whatever)
> overhead.
> Which is the bigger loss, the mutex call or the allocation strategy
> overhead?
>
>
>
> Craig Hicks
>
>
>
> _______________________________________________
> Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk