From: Eric Niebler (eric_at_[hidden])
Date: 2007-07-22 22:33:25
Tom Brinkman wrote:
> boost::time_series is available here. The review stars at the end of the month.
> Here are some initial questions and observations.
> All questions in this post are related to the
> simple_moving_average_functor, which
> I have included at the bottom of this post.
> I have more questions, but I thought this would be a good start.
> 1) A simple way to do a "simple_moving_average" is to use a std::deque,
> adding elements using "push_back" and removing elements using "pop_front".
> This technique can be used for any of the smoothing algorithms that
> involve a fixed number of elements.
> Do any of the boost::time_series containers mimic the "deque" capability to
> add and remove elements at both ends of the array? I used a "deque" in
> the sample "functor" at bottom of this post.
No, none of the time series are like deque. Most of the time series
types use a vector of some ilk as the backing storage. I could probably
make this configurable. But I don't think you need a deque-ish time
series here, unless I'm mistaken. This is just an implementation detail,
and it doesn't need to be a time series, IIUC. Besides, deque is the
wrong data structure. You really want a circular buffer.
> 2) The "simple_moving_average_functor" has two functions,
> "finalize()" and "operator()". (see below) The second function passes
> in the offsets,
> for "start" and "stop". However, the "finalize" function does not
> pass any offsets.
> Im not shur how to integrate these "offsets" into the calculating the
> "moving average".
> The functor that I created below just ignores them. Any suggestions?
I think the right thing to do is pretend that values at skipped offsets
are zero, and factor them in to your moving average. The idea is that a
sparse time series with a 3 at offset 1 and a 4 at offset 10 should
yield the same moving average as a dense series with zeros everywhere
else. Make sense?
I think your implementation can be a lot more efficient. You're summing
all the values in the window at each iteration. Can't you store a
running sum, and a circular buffer of elements in the window? When you
get a new element, you subtract the value popped from the back and add
the value pushed to the front, divide by the size of the buffer, and
you're done. (Well, you also need to factor in any zeros at offsets that
have been skipped.)
> 3) After calling commit(), am I not able to add any more elements to a
> container. In other words, do I have to re-create the "time_series"
> container each time. Take for example, the case when Im calculating the
> "simple_moving_average" over a period of 10 days. As I move through the data
> set, all that I need to do is add another datat-point, and then remove
> the last data point.
> To re-populate the "time_series" container each time would be non-optimal. I
> wonder if "time_series" container could release that data, and let me start
> adding/removing elements to it again?
One of us is confused---I don't know which. :-) The moving average
function takes an input series and the size of the window, and creates a
new time series that contains the averages, right? So your functor will
stuff averages into an OutputInserter and when it's done, it calls
commit(). And then you're done. You don't need to add any more to the
Forgive me if I've totally made a hash of what you're trying to do.
> 4) Do "range_run_storage::transform" and "range_run_storage::for_each" support
> The example using "range_run_storage::transform" might look like this:
> boost::range_run_storage::transform(series1, series2,
> boost::lambda::_1 + boost::lambda::_2, inserter);
That should work. This version of range_run_storage::transform() is
documented to accept a BinaryFunction as its 3rd parameter. "_1 + _2"
should model BinaryFunction. range_run_storage::for_each() takes a
TernaryFunction. Check the docs.
> Example using "std::transform"
> boost::lambda::_1 + boost::lambda::_2, std::back_inserter(series3));
Careful here. Time series are not containers in the STL sense. They
don't have begin() and end() member functions. It's not clear what that
would mean when a series could stretch from -inf to +inf. (The only
exception is a clipped_series<>, which *does* model an STL Container,
because it is guaranteed to have a definite start and end offset.) And
using a std::back_inserter isn't going to work because time series do
not have an insert() member function.
Stick to the range_run_storage algorithms---that's what they're there for.
> 5) What is the prefered method to do simple math against the elements in a
> "time_series" container. Say for example I wanted to multiply the number
> 1.10 against each element. How would I do this using
Check out the range_run_storage::transform_inplace() algorithm.
-- Eric Niebler Boost Consulting www.boost-consulting.com The Astoria Seminar ==> http://www.astoriaseminar.com
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk