Subject: Re: [boost] [lockfree::fifo] Review
From: Gottlob Frege (gottlobfrege_at_[hidden])
Date: 2009-12-20 14:44:27
On Sun, Dec 20, 2009 at 11:17 AM, Tim Blechmann <tim_at_[hidden]> wrote:
> On 12/20/2009 04:57 PM, Chris M. Thomasson wrote:
>> "Tim Blechmann" <tim_at_[hidden]> wrote in message
>>>> Well, IMO, it should perform better because the producer and consumer
>>>> are not thrashing each other wrt the head and tail indexes.
>>> the performance difference is almost the same.
>> Interesting. Thanks for profiling it. Have you tried aligning everything on
>> cache line boundaries? I would try to ensure that the buffer is aligned and
>> padded along with the head and tail variables. For 64-byte cache line and
>> 32-bit pointers you could do:
How about we go through the ring buffer in steps of 2^n - 1 such that
each next element is on a separate cache line? i.e. instead of
    m_head = (m_head == T_depth - 1) ? 0 : (m_head + 1);
use
    m_head = (m_head + 7) % T_depth;
You still use each slot, just in a different order. You calculate 'n'
to be whatever you need based on the cell size, as long as the
resulting step size is relatively prime to T_depth (for a power-of-two
T_depth, any odd step works).
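To convince myself that a strided walk really does touch every slot, here is a rough sketch that brute-force checks it. The names next_index and visits_every_slot are made up for illustration and are not part of the fifo under review. It only checks slot coverage; it says nothing about where the slots actually land relative to cache lines, which depends on cell size and alignment.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: advance a ring index by a fixed stride,
// mirroring m_head = (m_head + 7) % T_depth from the text above.
constexpr std::size_t next_index(std::size_t i, std::size_t stride,
                                 std::size_t depth) {
    return (i + stride) % depth;
}

// Returns true if stepping by `stride` from slot 0 visits each of the
// `depth` slots exactly once and then wraps back to slot 0.
bool visits_every_slot(std::size_t stride, std::size_t depth) {
    std::vector<bool> seen(depth, false);
    std::size_t i = 0;
    for (std::size_t n = 0; n < depth; ++n) {
        if (seen[i]) {
            return false;  // revisited a slot before covering all of them
        }
        seen[i] = true;
        i = next_index(i, stride, depth);
    }
    return i == 0;
}
```

With a depth of 16, a stride of 7 covers all slots, while a stride of 6 (which shares a factor of 2 with 16) cycles through only 8 of them.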
I'm not sure if the false-sharing avoidance would be worth the cost of
using up more cache lines. Probably depends on how full the queue is.
Or might that be worse?