Subject: Re: [boost] [chrono/date] Performance goals and design summary
From: Vicente J. Botet Escriba (vicente.botet_at_[hidden])
Date: 2013-05-05 04:15:08


On 05/05/13 09:31, Vicente J. Botet Escriba wrote:
> On 05/05/13 04:31, Howard Hinnant wrote:
>> Fwiw, I went to
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3344.pdf to
>> experiment with the "Modified Following" benchmark which has some
>> code shown for it.
>>
>> I wanted to see what it would look like with two date types: field
>> and serial, instead of just one (date) which has been implemented as
>> both.
>>
>> Here is the original code from N3344:
>>
>> bool
>> isNonBusinessDay(const Date& targetDate,
>> const Date& startDate,
>> const int* calendar)
>> {
>> int offset = targetDate - startDate;
>> int wordIndex = offset / 8 / sizeof(int);
>> int bitIndex = offset - wordIndex * sizeof(int);
>> return (1 == (calendar[wordIndex] & (1 << bitIndex)));
>> }
>>
>> Date
>> modifiedFollowing(const Date& targetDate,
>> const Date& startDate,
>> const int* calendar)
>> {
>> Date date(targetDate);
>> while (isNonBusinessDay(date, startDate, calendar))
>> ++date;
>> if (targetDate.month() == date.month())
>> return date;
>> date = targetDate;
>> do
>> {
>> --date;
>> } while (isNonBusinessDay(date, startDate, calendar));
>> return date;
>> }
>>
>> For purposes of discussion, assume we now have two date types:
>>
>> 1. Serial type named day_point, which is a chrono::time_point.
>> 2. Field type named ymd (not a great name, just trying to
>> differentiate it).
>>
>> All isNonBusinessDay does is subtract two Dates. This is clearly the
>> domain of a serial type:
>>
>> bool
>> isNonBusinessDay(const day_point& targetDate,
>> const day_point& startDate,
>> const int* calendar)
>> {
>> days offset = targetDate - startDate;
>> days wordIndex = offset / (8 * sizeof(int));
>> days bitIndex = offset - wordIndex * sizeof(int);
>> return (1 == (calendar[wordIndex.count()] &
>> (1 << bitIndex.count())));
>> }
>>
>> The modifications are trivial and performance is not compromised.
>> The code is perhaps a little more type safe, utilizing the days
>> units, but nothing spectacular. I would expect the exact same
>> assembly to be generated.
>>
>> modifiedFollowing is more interesting. If it takes two day_points
>> (two serial dates), it might look like this:
>>
>> day_point
>> modifiedFollowing(const day_point& targetDate,
>> const day_point& startDate,
>> const int* calendar)
>> {
>> day_point date(targetDate);
>> while (isNonBusinessDay(date, startDate, calendar))
>> ++date;
>> // serial->field twice
>> if (ymd(targetDate).month() == ymd(date).month())
>> return date;
>> date = targetDate;
>> do
>> {
>> --date;
>> } while (isNonBusinessDay(date, startDate, calendar));
>> return date;
>> }
>>
>> The only difference is the need to convert the serial dates to a ymd
>> type so that the month can be extracted. This is done exactly twice,
>> on the commented line. Otherwise the code is remarkably similar and,
>> I would argue, exactly as efficient.
>>
>> <disclaimer>
>> std::chrono::time_point is currently missing operator-- and
>> operator++. I view this as a defect that should be corrected.
>> </disclaimer>
>>
>> One could explore passing in targetDate as a ymd type instead:
>>
>> day_point
>> modifiedFollowing(const ymd& targetDate,
>> const day_point& startDate,
>> const int* calendar)
>> {
>> day_point date(targetDate); // field->serial
>> day_point sdate = date;
>> while (isNonBusinessDay(date, startDate, calendar))
>> ++date;
>> if (targetDate.month() == ymd(date).month()) // serial->field
>> return date;
>> date = sdate;
>> do
>> {
>> --date;
>> } while (isNonBusinessDay(date, startDate, calendar));
>> return date;
>> }
>>
>> This rewrite trades one serial->field conversion for one
>> field->serial conversion. It might be a win if the client actually
>> has a ymd already for input, as my measurements are showing that
>> serial->field conversions are more expensive than field->serial. My
>> measurements are raw and new, so I could be off on that. And no
>> doubt such a measurement is going to depend upon things like hardware
>> and algorithms (caches).
>>
>> But the main point is that having two date types is not disruptive,
>> and mainly serves to give the client more options in optimizing his
>> date algorithms.
>>
>>
> I agree completely that we need several date types and that the
> standard (the library) must describe the performance provided by
> each one.
> <snip>
>
> For the purpose of showing the Date concept I will use a template
>
>
> template <typename Date1, typename Date2>
> days_date
> modifiedFollowing(const Date1& targetDate,
> const Date2& startDate,
> const int* calendar)
> {
> // serial->field or nothing (so that month() is efficient - we
> // could store the month also).
> ymd_date ymdTargetDate(targetDate);
> // field->serial or nothing, but at least there is a conversion
> // (as we are doing day arithmetic).
> days_date date(targetDate);
> days_date sdate = date;
> while (isNonBusinessDay(date, startDate, calendar))
> ++date;
> // serial->field; no need to convert explicitly
> if (ymdTargetDate.month() == date.month())
> return date;
> date = sdate;
> do
> {
> --date;
> } while (isNonBusinessDay(date, startDate, calendar));
> return date;
> }
>
>
> As you can see, two date types with the same interface do not have
> only drawbacks ;-) Providing the same interface and being
> convertible to each other helps the user.
>
One additional advantage of using date.month() rather than an explicit
conversion:
* class days_date could use a better algorithm than converting the
days_date to a ymd_date, e.g. it could convert to an ordinal_date and
use a table to get the month from is_leap() and day_of_year().

Who knows better how to make these operations efficient: the class
days_date itself, or the user?
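
To make the table idea concrete, here is a minimal sketch. It assumes a 1-based day_of_year and a hypothetical free function named month_from_ordinal; neither the name nor the signature comes from the proposal, and a real days_date::month() would first have to derive is_leap and day_of_year from its day count.

```cpp
// Hypothetical sketch, not part of any proposed interface.
// cumulative_days[leap][m] is the number of days in the first m months.
constexpr int cumulative_days[2][13] = {
    {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365},
    {0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 366}
};

// day_of_year is 1-based (1..365, or 1..366 in a leap year).
// Returns the month as a number in 1..12.
inline int month_from_ordinal(bool is_leap, int day_of_year)
{
    const int* table = cumulative_days[is_leap ? 1 : 0];
    // No month is longer than 31 days, so this 0-based guess is either
    // exact or one month too small.
    int m = (day_of_year - 1) / 31;
    // One table lookup corrects the guess when it is too small.
    if (day_of_year > table[m + 1])
        ++m;
    return m + 1;
}
```

The point is only that the month can be obtained with one division and one table lookup, without building a complete ymd_date.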

Best,
Vicente

