From: Eric Niebler (eric_at_[hidden])
Date: 2007-08-09 19:27:14
Matthias Troyer wrote:
> On 8 Aug 2007, at 22:26, Eric Niebler wrote:
>> Matthias Troyer wrote:
>>> On 7 Aug 2007, at 12:06, Eric Niebler wrote:
>>>> I'm sorry you ran into trouble with floating-point offsets. To be
>>>> honest, they were added to the time_series library as a bit of an
>>>> after-thought, and they are admittedly not a seamless addition. With
>>>> integral offsets, runs are in fact half-open. Floating-point runs are
>>>> not -- it's not clear what that would mean. (E.g., a delta series D has
>>>> a 42 at [3.14,3.14) -- that's a zero-width half-open range! What would
>>>> D[3.14] return?) Some of the algorithms won't work with floating-point
>>>> offsets. If you feel this part of the library needs more thought, that's
>>>> a fair assessment. I'm certain that had you used integral offsets your
>>>> experience would have been less frustrating.
>>>> I think with better docs and more strict type checking, integral and
>>>> floating point offsets can both be supported without confusion.
>>> Don't confuse the sparse/dense time series with piecewise constant
>>> functions. A delta series D has 42 at 3.14, not at any interval
>>> [3.14,3.14) - intervals should be used only for the piecewise
>>> constant functions.
>> Ah, but the library is built on top of lower-level abstractions that
>> assume intervals. An interval (a run) is how algorithms on time series
>> are expressed. This design was chosen because it makes it possible to
>> write generic algorithms for lots of different types of series, and it
>> works very well for integral offsets. The question is whether the
>> abstractions upon which time series is built are compatible with a
>> sparse series with floating point offsets and, if so, what convention can
>> be used so that the algorithms naturally give the correct results both
>> with points and with runs. The way the library currently handles
>> floating point offsets is /almost/ right, but not quite.
> How do you express the delta series as a run then? length 1?
For integer offsets, yes. For floating point, no. The time series
currently has a notion of an "indivisible" run -- one that cannot be
divided into smaller time slices. For float, that is essentially a run
like [3.14,3.14] -- a closed range. That mostly works, but it leads to
some inconsistent handling of termination conditions, since all other
runs are half-open. The solution may involve nothing more than
establishing a convention, or it may involve promoting the concept of
Point to the same importance as Run and specializing algorithms
appropriately. It'll take some thought.
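
To make the distinction concrete, here is a minimal C++17 sketch of the two
conventions. It is an illustration only: the names run, contains and
indivisible_run are hypothetical and are not the time_series library's API,
and the "closed" flag is just one possible way to record the convention.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical illustration; not the time_series library's actual types.
template <typename Offset>
struct run {
    Offset begin;
    Offset end;
    bool closed;  // true: [begin, end]   false: [begin, end)
};

// Membership test that honours the run's open/closed convention.
template <typename Offset>
bool contains(run<Offset> const& r, Offset x) {
    return r.begin <= x && (r.closed ? x <= r.end : x < r.end);
}

// Smallest run that can hold a single sample at offset x.
// Integral offsets: the half-open run [x, x+1).
// Floating-point offsets: the closed run [x, x], since [x, x) would be empty.
template <typename Offset>
run<Offset> indivisible_run(Offset x) {
    if constexpr (std::is_integral_v<Offset>)
        return run<Offset>{x, static_cast<Offset>(x + 1), false};
    else
        return run<Offset>{x, x, true};
}
```

With these definitions, contains(indivisible_run(3), 3) holds for the integral
case, while the half-open float run run<double>{3.14, 3.14, false} contains no
offset at all, so a delta at 3.14 needs the closed form. The extra branch in
contains is the kind of special-cased termination condition described above.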
--
Eric Niebler
Boost Consulting
www.boost-consulting.com

The Astoria Seminar ==> http://www.astoriaseminar.com