|
Boost : |
From: Zach Laine (whatwasthataddress_at_[hidden])
Date: 2007-08-13 17:15:58
On 8/12/07, Eric Niebler <eric_at_[hidden]> wrote:
>
> Zach Laine wrote:
> > On 8/10/07, Eric Niebler <eric_at_[hidden]> wrote:
> >> Discretization for the other series types is not useless. For sparse,
> >> for example, it enforces that samples may only exist at offsets that are
> >> a multiple of the discretization.
> >
> > I was reading the rest of your email and having a hard time understand
> > how we were talking past each other so completely. Here is where I
> > could first figure out how we'd gotten out of phase. I was operating
> > under the impression that sparse series could have values at arbitrary
> > offsets (not integer multiples of the discretization). It was this
> > misunderstanding on my part that lead to my suggestion to remove
> > discretization from some types. Consider it withdrawn. Though you're
> > probably sick of hearing this, the use of discretization should be
> > spelled out more explicitly in the documentation to make this clear.
>
> OK, I think we're on the same page now, but just to clarify the
> situation, the following two series are identical:
>
> sparse_series<int> s( discretization = 5 );
> dense_series<int> d( discretization = 5 );
>
> make_ordered_inserter(s)(1,1)(2,2)(3,3).commit();
> make_ordered_inserter(d)(1,1)(2,2)(3,3).commit();
>
> These both have a 1 at offset 1, a 2 at offset 2 and a 3 at offset 3.
> The offsets represent T=0,5,10 respectively due to the discretization.
> Because the discretization is a logical multiplicative factor of the
> offsets, the sparse series is guaranteed to have values only at points
> that are multiples of the discretization; that is, T=0,5,10.
I think I've got it; thanks for clearing this up.
> The docs have a ways to go to make this clear, I agree. But now that its
> clear and your suggestion is withdrawn, is your objection to the time
> series library also withdrawn?
I guess my vote is now a provisional yes. I like the library overall,
but there are too many details that appear not to be solidified for me
to vote yes without seeing them addressed:
The issues Steven has raised wrt the use of commit(). If there really
are no efficiency gains from the commit() technique, it is unidiomatic
enough that I think it should go away.
The documentation is pretty far enough from where I think it should
be. I used xpressive for the first time over the weekend, so I now
know you know how to write great docs. :)
The rolling-window algorithm should be added.
I'm still hazy on whether adding data to the high end and dropping
data off the low end is a reasonable and efficient usage of a series,
or, if not, whether there are methods users can employ to efficiently
process series that are too large to keep in memory all at once. If
this is easy to do with the library, I think it's important enough as
a use case that it deserves its own example.
coarse_grain() and fine_grain() need to be customizable.
I would still like to see a int/floating point mapping from sample
space to index space for dense_series<> that you alluded to in a
previous email.
So, with the understanding that these issues will be addressed
post-review, I vote yes. I leave it up to the review manager to
determine whether this volume of provisions means that my vote should
count as "yes, but please address this", or "no, it needs another
pass."
Zach Laine
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk