Subject: Re: [boost] Linear circular buffer
From: Soren Holstebroe (holstebroe_at_[hidden])
Date: 2009-06-25 13:43:31
>> The boost library contains a circular buffer template by Jan Gaspar. It
> I use it just fine in my high performance applications where I
> maintain the most recent N timings from API calls. Although iteration
> through the dataset may be a little painful, N is not large enough for
> me to notice (100 and 1000) the pain. The calls run in sub-microsecond
> time and the maintenance of automatically evicting older entries gives
> me (the user) a good guarantee that the data I'm getting is
I think your example is from a scenario where buffer processing is not
where most of the cpu time is spent. In that case Jan Gaspars circular
buffer is just fine. In fact, if what you want to do is to aggregate a
linear response (like an average function), all you have to do is to
look at the item going out of the buffer and item going in, you
wouldn't even have to look at whats in the middle of the buffer.
If your use of circular buffers is in applications, where buffer
processing is what the application is all about, then sub-microseconds
matter, if you are talking about 1 billion times 5 microseconds vs 1
billion times 3 microseconds.
One example is bioinformatics, where you want to drag a window across
the whole 3+ gigabases genome and maybe calculate some entropy measure
in that window.
Another is FIR filters in DSP, where you produce an output by
convolving each window with a second buffer.
> I agree about read performance, but I don't think you ever need to
> call 'linearize' if you don't intend to build a vector out of your
> circular buffer.
If your buffer processing is done by an external library, then you
need to linearize. In image processing and DSP, that would be a very
> Iterating through the container is O(N) where N is
> the size of the container, and the check for borders is O(1);
In the case where your buffer processing can use the iterators, the
overall complexity is the same, but the constant is not. You save a
border check if your buffer is linear.
> I can imagine that you can use the modulo (%) function given that you know
> the initial offset of the "first" entry in the buffer, and N being the
> size of the buffer
Modulo is (at least in the old days) calculated as the remainder of a
division. That is much slower than a border check.
> How about space efficiency then? I (the user) expect to have just 100
> elements' worth of memory used when I allocate 100 elements' worth of
> a circular buffer at construction time (same goes for the STL vector).
First, are you sure you get 100 elements allocated when you resize a
vector to 100?
A vector implementation might reserve a whole chunk of memory for you
when resize to make the next potential reallocation faster.
Anyway, my proposal is about speed and the allocation of, lets say
100, extra elements would not be an issue in all practical high
throughput application I can think of.
> Linear access time is nice especially if you intend to always just go
> through the whole data structure everytime -- having said this I don't
> think I'd necessarily want to pay for extra space usage if I can
> absolutely avoid it.
I have the exact opposite priority. I would not trade performance to
save a few hundred bytes, that might have been allocated anyway by the
allocator (or system paging or whatever).
> How do you intend to implement the internal buffer then without having
> to use more memory than necessary to store the elements?
As discussed, you are, as far as I know, not guaranteed that with
std::vector either in most implementations.
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk