From: Eric Niebler (eric_at_[hidden])
Date: 2007-07-26 13:27:02
Hugo Duncan wrote:
> Hi,
>
>> And if anybody would like to peruse the documentation online, you can
>> find it here:
>>
>> http://boost-sandbox.sourceforge.net/libs/time_series/doc/html/
>
> Since I am currently doing some time series work, I thought I would have
> a look. First glance shows an impressive amount of clear thinking and
> extensibility.
Thanks.
> Below are some use cases that are important to me, and
> which I cannot immediately see as possible within the framework.
>
> I see no mention of how to handle large time series that need processing
> incrementally. There are a lot of concepts in there, so I may well have
> missed something. At the moment I handle these by reading the data into a
> circular buffer with enough capacity to provide as much time history as I
> need for the algorithms to run at each time point. Using some
> imagination, I can see using the library to do this in either of two ways.
>
> The first would be to use the existing series types and update the series
> values for each new datapoint - but I don't see a way of discarding data
> points.
>
> The second would be to define a new series type that accepted new values,
> discarded old values, and generally did all the book keeping - is that
> possible within the confines of the framework?
I don't think either of these is the right approach. It shouldn't be the
series' job to keep a circular buffer of data for the algorithm to use.
Rather, if an algorithm requires a buffer of previously seen data, it
should cache the data itself, as in the rolling average implementation I
sent around a few days ago.
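A minimal sketch of that idea, in which the algorithm rather than the series holds the window of history. This is not the rolling average implementation mentioned above; `rolling_average` and its interface are hypothetical names chosen for illustration and are not part of the time_series library:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical helper, NOT part of the time_series library: a rolling
// average that caches the last `window` samples itself, so the series
// feeding it never has to keep any history.
class rolling_average
{
public:
    explicit rolling_average(std::size_t window)
      : window_(window), sum_(0.0)
    {
        assert(window > 0);
    }

    // Feed one sample; returns the mean of the samples currently cached.
    double operator()(double sample)
    {
        cache_.push_back(sample);
        sum_ += sample;
        if (cache_.size() > window_)
        {
            sum_ -= cache_.front();   // discard the oldest sample
            cache_.pop_front();
        }
        return sum_ / static_cast<double>(cache_.size());
    }

private:
    std::size_t window_;        // maximum number of cached samples
    double sum_;                // running sum of the cached samples
    std::deque<double> cache_;  // the algorithm's own history buffer
};
```

Calling `avg(x)` once per incoming data point keeps memory use bounded by the window size, regardless of how long the series is. A long-running version would probably want to recompute the sum periodically to limit floating-point drift.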
The Sequence concept on which all the time series are built requires
readable and incrementable cursors. That means the time series
algorithms *should* all work with an "input" or single-pass series type
-- that is, one with a destructive read. That would be the way to go
IMO. I could see a time series type implemented in terms of std::istream
that reads runs from std::cin, for instance. Or more practically, one
that memory-maps parts of a huge file and traverses it with single pass
cursors. This would be a very interesting time series! The algorithms
haven't been tested with such a single pass series, but I don't see a
fundamental problem with it.
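To illustrate what single-pass traversal with a destructive read looks like, here is a rough sketch built on std::istream_iterator. It does not model the library's Sequence or cursor concepts; `sample`, `value_sum` and `for_each_sample` are hypothetical names, and the "offset value" text format is an assumption made only for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>

// Hypothetical record type: one "offset value" pair per sample.
struct sample
{
    double offset;
    double value;
};

inline std::istream &operator>>(std::istream &in, sample &s)
{
    return in >> s.offset >> s.value;
}

// Example visitor: sums the values and counts the samples it has seen.
struct value_sum
{
    double total;
    std::size_t count;
    value_sum() : total(0.0), count(0) {}
    void operator()(sample const &s) { total += s.value; ++count; }
};

// Visit every sample exactly once. Each read consumes the stream, so no
// history is stored here; the visitor keeps whatever state it needs and
// is returned to the caller, as std::for_each does.
template<typename Visitor>
Visitor for_each_sample(std::istream &in, Visitor visit)
{
    std::istream_iterator<sample> cur(in), end;
    for (; cur != end; ++cur)
        visit(*cur);
    return visit;
}

// Usage: a std::istringstream stands in for std::cin or a large file.
inline void example()
{
    std::istringstream in("0 1.5\n1 2.5\n2.5 4\n");
    value_sum result = for_each_sample(in, value_sum());
    assert(result.count == 3);
}
```

Once a sample has been read it cannot be revisited, which is why any algorithm that needs history has to cache it on its own side.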
> I also have data that has non-constant sampling periods. At the moment I
> handle these by piecewise sampling to another (constant period) timebase.
> Can this be fitted into the framework?
I'm not 100% sure I understand your use case. But most of the series
types and algorithms allow non-discrete sequences. That is, the offsets
can be floating point. Could that help?
> Finally, I don't see a convolution algorithm for applying filters.
> Probably easy enough to implement, and is in my view important enough to
> include in the core algorithms.
Yup, no convolution yet. Sure would be nice. Patches welcome! :-)
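As a starting point for such a patch, a direct (non-FFT) convolution over plain vectors might look like the sketch below. It assumes a dense, uniformly sampled series and does not use any time_series types; `convolve` is a hypothetical free function, not an existing library algorithm:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical free function, NOT an existing time_series algorithm.
// Full discrete convolution of a dense, uniformly sampled signal with a
// filter kernel: out[n] = sum over k of signal[n - k] * kernel[k].
// The result has signal.size() + kernel.size() - 1 elements.
std::vector<double> convolve(std::vector<double> const &signal,
                             std::vector<double> const &kernel)
{
    assert(!kernel.empty());   // a filter needs at least one tap
    if (signal.empty())
        return std::vector<double>();

    std::vector<double> out(signal.size() + kernel.size() - 1, 0.0);
    for (std::size_t i = 0; i < signal.size(); ++i)
        for (std::size_t k = 0; k < kernel.size(); ++k)
            out[i + k] += signal[i] * kernel[k];   // scatter each product
    return out;
}
```

This costs O(N*M) for N samples and M taps, which is usually fine for short filters. A version for the library would have to decide how to treat sparse series and non-integral offsets, which this sketch ignores.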
--
Eric Niebler
Boost Consulting
www.boost-consulting.com

The Astoria Seminar ==> http://www.astoriaseminar.com