Subject: Re: [boost] Proposal: Monotonic Containers
From: Simonson, Lucanus J (lucanus.j.simonson_at_[hidden])
Date: 2009-06-10 14:16:35
David Bergman wrote:
>>>>
>>>> Ok, so you could use Thorsten's auto_buffer as your storage, which
>>>> actually would give you, potentially, the best of two worlds
>>>> (allocation speed and - you seem to push this angle - cache) and
>>>> dynamic heap beyond that initial segment.
>>>
>>>
>>> Cache coherency is King!
>
> Yes, I have noticed that you push that angle very hard.
>
...
> 2. Why is cache coherency so much better when you use an inline array
> than when Thorsten does it? He also allocates inline (at first) and
> you only have a first, anyway. So, can you please explain why cache
> coherency is so much more King in your...
Cache coherency is, I think, a misapplied term here. What you two seem to be talking about is spatial locality of reference. The stack typically has good spatial locality of reference relative to the heap because it is contiguous. A vector or similar contiguous data structure, unless it is very small, always has good spatial locality of reference simply because it is contiguous. You will see no significant benefit from allocating a very large vector on the stack rather than the heap because 1) the dynamic allocation cost is amortized over a very large number of vector elements, and 2) the benefit of the first few elements already being in cache, by virtue of being on the stack, is amortized over a very large number of elements that are likely not in cache and will perform the same as if they had been allocated on the heap.

Furthermore, the performance benefit of allocating a small vector on the stack rather than the heap will be dominated by the time saved by not going to the dynamic memory allocation routine, as opposed to the cache misses caused by freshly allocated dynamic memory being cold. Both of these performance concerns can be addressed by pooling allocators. I question the benefit of this stack- vs. heap-based allocation for containers when the terminology used to describe its benefit is misapplied. By the way, allocating large containers on the stack is an extremely, extremely foolish thing to do because it causes crashes and performance problems.
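To be concrete about the mechanism under discussion, here is a rough sketch (purely illustrative; it is not Thorsten's actual auto_buffer, and the class and member names are made up) of a container whose first N elements live in an inline array and which spills to the heap beyond that:

#include <cstddef>

// Illustrative only -- NOT auto_buffer's interface, just the
// inline-array-first, heap-beyond-that idea.
template <typename T, std::size_t N>
class small_buffer
{
public:
    small_buffer() : data_(inline_), size_(0), capacity_(N) {}
    ~small_buffer() { if (data_ != inline_) delete [] data_; }

    void push_back(const T& value)
    {
        if (size_ == capacity_)
            grow();                 // spill to the heap; elements stay contiguous
        data_[size_++] = value;
    }

    T& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    void grow()
    {
        std::size_t new_capacity = capacity_ * 2;
        T* new_data = new T[new_capacity];
        for (std::size_t i = 0; i != size_; ++i)
            new_data[i] = data_[i];
        if (data_ != inline_)
            delete [] data_;
        data_ = new_data;
        capacity_ = new_capacity;
    }

    small_buffer(const small_buffer&);            // non-copyable, for brevity
    small_buffer& operator=(const small_buffer&);

    T inline_[N];        // the first N elements live wherever the object does
    T* data_;
    std::size_t size_;
    std::size_t capacity_;
};

Small instances never touch the allocator and sit in whatever frame or object holds them; large instances behave like an ordinary heap-backed vector, which is exactly why the stack-vs-heap distinction washes out for large sizes.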
Cache coherency is a concurrency issue: when multiple processors (or cores) share memory, a processor that modifies shared memory in its cache must make sure the cached copy of the same address is updated (or invalidated) in the caches of the other cores/processors. It is a serious problem when you have to handle it in software, and it is best solved by hardware, which is tricky and expensive. Cell processors do not have hardware cache coherency, while Intel processors do. Since you don't have to worry about cache coherency in general, and when you do need to worry about it your life is hell and you definitely know it, and since it is only an issue on some platforms, I doubt any Boost library should ever consider cache coherency at all.
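For anyone curious what that cost looks like from software, here is a rough sketch (assuming C++11 threads/atomics and a 64-byte cache line, neither of which is part of this discussion; all names are made up) of "false sharing", which is the usual way the hardware coherency protocol becomes visible:

#include <atomic>
#include <cstdint>
#include <thread>

// Two threads each bump their own counter.  When the counters share a
// cache line, every write forces the coherency protocol to bounce that
// line between the cores' caches; padding them apart removes the traffic.
struct packed_counters
{
    std::atomic<std::uint64_t> a{0};   // a and b share a cache line
    std::atomic<std::uint64_t> b{0};
};

struct padded_counters
{
    alignas(64) std::atomic<std::uint64_t> a{0};   // each counter on its own line
    alignas(64) std::atomic<std::uint64_t> b{0};   // (64 bytes is an assumption)
};

template <typename Counters>
void bump_in_parallel(Counters& c, std::uint64_t iterations)
{
    std::thread t1([&c, iterations] {
        for (std::uint64_t i = 0; i != iterations; ++i)
            c.a.fetch_add(1, std::memory_order_relaxed);
    });
    std::thread t2([&c, iterations] {
        for (std::uint64_t i = 0; i != iterations; ++i)
            c.b.fetch_add(1, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
}

int main()
{
    packed_counters packed;
    padded_counters padded;
    // Timing these two calls typically shows the padded version running
    // markedly faster, purely because of cache-coherency traffic.
    bump_in_parallel(packed, 100000000);
    bump_in_parallel(padded, 100000000);
    return 0;
}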
Regards,
Luke