
Boost : 
From: Eric Niebler (eric_at_[hidden])
Date: 2007-08-09 19:27:14
Matthias Troyer wrote:
> On 8 Aug 2007, at 22:26, Eric Niebler wrote:
>
>> Matthias Troyer wrote:
>>> On 7 Aug 2007, at 12:06, Eric Niebler wrote:
>>>
>>>> I'm sorry you ran into trouble with floating-point offsets. To be
>>>> honest, they were added to the time_series library as a bit of an
>>>> afterthought, and they are admittedly not a seamless addition. With
>>>> integral offsets, runs are in fact half-open. Floating-point runs are
>>>> not -- it's not clear what that would mean. (E.g., a delta series D has
>>>> a 42 at [3.14,3.14) -- that's a zero-width half-open range! What should
>>>> D[3.14] return?) Some of the algorithms won't work with floating-point
>>>> offsets. If you feel this part of the library needs more thought, that's
>>>> a fair assessment. I'm certain that had you used integral offsets your
>>>> experience would have been less frustrating.
>>>>
>>>> I think with better docs and stricter type checking, integral and
>>>> floating-point offsets can both be supported without confusion.
>>> Don't confuse the sparse/dense time series with piecewise constant
>>> functions. A delta series D has 42 at 3.14, not at any interval
>>> [3.14,3.14) -- intervals should be used only for the piecewise
>>> constant functions.
>>
>> Ah, but the library is built on top of lower-level abstractions that
>> assume intervals. An interval (a run) is how algorithms on time series
>> are expressed. This design was chosen because it makes it possible to
>> write generic algorithms for lots of different types of series, and it
>> works very well for integral offsets. The question is whether the
>> abstractions upon which time series is built are compatible with a
>> sparse series with floating-point offsets and if so, what convention can
>> be used so that the algorithms naturally give the correct results both
>> with points and with runs. The way the library currently handles
>> floating-point offsets is /almost/ right, but not quite.
>
> How do you express the delta series as a run then? length 1?
For integer offsets, yes. For floating point, no. The time series
currently has a notion of an "indivisible" run -- one that cannot be
divided into smaller time slices. For float, that is essentially a run
like [3.14,3.14] -- a closed range. That mostly works, but it leads to
some inconsistent handling of termination conditions, since all other
runs are half-open. The solution may involve nothing more than
establishing a convention, or it may involve promoting the concept of
Point to the same importance as Run and specializing algorithms
appropriately. It'll take some thought.
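
To make the distinction concrete, here is a minimal sketch. Everything in it
(Run, contains, overlaps, the *_point helpers) is a hypothetical name made up
for illustration; none of it is the time_series library's actual interface.
It only shows why a zero-width half-open run cannot hold a value, why a
closed [t,t] run can, and how the "closed" case leaks into the end-of-run
test that every algorithm has to perform.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical illustration only; not the time_series library's API.
template <typename Offset>
struct Run {
    Offset begin;
    Offset end;
    bool closed;  // false: half-open [begin, end); true: closed [begin, end]

    // The upper-bound test is where the two conventions diverge.
    bool contains(Offset t) const {
        return closed ? (begin <= t && t <= end)
                      : (begin <= t && t < end);
    }

    bool empty() const {
        return closed ? (end < begin) : !(begin < end);
    }
};

// Integral delta at t: a half-open run of length 1, [t, t+1).
inline Run<int> integral_point(int t) { return {t, t + 1, false}; }

// Floating-point delta written as a half-open run [t, t): zero width.
inline Run<double> half_open_point(double t) { return {t, t, false}; }

// Floating-point delta written as an "indivisible" closed run [t, t].
inline Run<double> closed_point(double t) { return {t, t, true}; }

// Two runs overlap iff the larger of the two begins lies inside both.
// Because contains() must consult `closed`, the termination condition
// is no longer uniform across runs.
template <typename Offset>
bool overlaps(const Run<Offset>& a, const Run<Offset>& b) {
    const Offset lo = std::max(a.begin, b.begin);
    return a.contains(lo) && b.contains(lo);
}

inline void check_examples() {
    assert(integral_point(3).contains(3));           // [3,4) holds 3
    assert(!integral_point(3).contains(4));          // ...but not 4
    assert(half_open_point(3.14).empty());           // [3.14,3.14) is empty
    assert(!half_open_point(3.14).contains(3.14));
    assert(closed_point(3.14).contains(3.14));       // [3.14,3.14] holds 3.14

    const Run<double> before{0.0, 3.14, false};      // [0, 3.14)
    const Run<double> after{3.14, 5.0, false};       // [3.14, 5)
    assert(!overlaps(before, closed_point(3.14)));   // 3.14 excluded on the left
    assert(overlaps(after, closed_point(3.14)));     // 3.14 included on the right
}
```

Under these assumptions, the integral case needs no flag at all, since
[t,t+1) is an ordinary half-open run. Only the floating-point point forces
the extra branch, which is the inconsistency described above. A separate
Point concept would move that branch out of Run entirely.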

--
Eric Niebler
Boost Consulting
www.boostconsulting.com

The Astoria Seminar ==> http://www.astoriaseminar.com