From: Eric Niebler (eric_at_[hidden])
Date: 2007-08-09 13:05:42
Paul A Bristow wrote:
>> -----Original Message-----
>> From: boost-bounces_at_[hidden]
>> [mailto:boost-bounces_at_[hidden]] On Behalf Of John Phillips
>> Sent: 31 July 2007 13:31
>> To: boost_at_[hidden]
>> Cc: boost-users_at_[hidden]
>> Subject: [boost] Time Series review - 7/30-8/8
> * What is your evaluation of the design?
> Fundamentally sound and potentially useful in many domains, not just the finance one for which it was designed.
> * What is your evaluation of the implementation?
> Looks OK. Some details of difficult areas like FP that can almost certainly be improved. (See also some detailed notes and
> questions below)
> * What is your evaluation of the documentation?
> Good, but far too steep to tempt most potential users. Needs an outline, a tutorial, and more examples, especially several outside
Yes, I'm getting that message loud and clear. :-)
> * What is your evaluation of the potential usefulness of the library?
> Although there are many problem for which a plain C array of doubles will suffice, it is surprisingly how soon more complicated data
> structures emerge, and that is where I see this package making the difficult much easier.
> * Did you try to use the library? With what compiler? Did you have any problems?
> No, time/ priorities did not permit, but I hope to.
> * How much effort did you put into your evaluation?
> A quicker reading than I would wish.
> * Are you knowledgeable about the problem domain?
> Somewhat, but with physical measurements, not finance.
> * Am I confident that the library will be maintained.
> Yes, definitely.
> * Do you think the library should be accepted as a Boost library?
> I note that a number of critical comments have been made of the library: I don't think any of these are showstoppers and I think it
> would be unfortunate if this prevented acceptance of the library. I am confident that these criticisms can be addressed.
Thank you. I also am confident the FP problem can be dealt with.
> (As an aside, I think it would also set an unfortunate example to reject a high quality library that has been kindly 'donated' by a
> commercial organisation. No doubt it took some effort to persuade them to do this, and that it took a *lot of effort for them to
> make a decision to do so*! I believe we want to encourage other organisations to make freely available, through Boost, work that
> they have funded.)
IMHO, if time_series is accepted, I hope it's for its merits and not
because it was donated by anyone.
> Some detailed observations:
> 1 Is the Time series is restricted to time? - the series time unit could be anything, voltage, age, - so not an ideal name?
The data structures are fairly general, that's true, which is why in
another message I suggested moving them into the range_run_storage
namespace so they can be reused. A "TimeSeries" is an
InfiniteRangeRunStorage with a discretization, and the Time_series
library includes a bunch of algorithms that are specific to the time
> So I am tempted to suggest Data_series or even Boost.Series?
> But a rose by any other name would smell as sweet ;-)
> (Can I claim a brownie point for the first Shakespeare quote in a Boost review? ;-)
+1 brownie point ;-)
> However there is plenty of precedent as you give http://en.wikipedia.org/wiki/Time_series
> and time is the most common 'variable'.
> 2 How would one signify missing data??? Use NaN? as zero?
NaN works if your data is floating point, some magic number otherwise.
You could even use optional<T> as your value type if you were so
inclined, so long as you define the appropriate operations on that type.
> 3 The series types are rather tersely defined.
> The learning curve would be flattened by more explanation
> and examples at the point of definition, aiming for lucidity as well as accuracy.
> 4 The delta_series should at least mention Dirac! Why does Heaviside get his name on a series and Dirac not? Wikipedia link would
> be good.
A link to wikipedia would be good. I wouldn't want Dirac to get jealous
> 3 Despite the warning, I am sure many will make the mistake of not committing.
> A stronger admonishment might be useful?
> "You must .commit() the returned inserter when you are done with it."
I could do that, but I've been getting lots of negative feedback about
.commit(), and I think it might make sense to provide a higher-level
wrapper. I could imagine something like:
(1, 2) (2, 4) (3, 6) (4, 8);
Here, assign_ordered() returns a type that is movable and automatically
.commit()'s in its destructor.
I might also rip out the inserter-as-an-output-iterator duality, as it
may end up leading people astray.
> 4 I'd like to see examples with each definition, and preferably not all from finance, lest people imagin that it is only suitable
> for finance when it is actually at least as useful for all things physical and even sociological.
> 5 It may astonish Boosters to know that there are very many potential users who will not understand "This operation is amortized
> O(1)". Links like [@http://en.wikipedia.org/wiki/Amortized_analysis amortized 0(1)] would help them.
> 6 A proof of application using circular buffer would be most valuable. There are very many people with an endless torrent of data
> hitting them, without an infinite supply of memory. Showing how to get information from the data flood could sell time_series to
> 7 A single example, in a single finance arena (nice and commented) is really not enough.
> 8 The docs really needed a much more gentle tutorial introduction, so that users can see the wood for the metaprocessing-trees.
> Typos - I only spotted one in Zeros and Sparse Data
> accomlish should be accomplish.
Thanks for your feedback.
-- Eric Niebler Boost Consulting www.boost-consulting.com The Astoria Seminar ==> http://www.astoriaseminar.com
Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk