Boost logo

Boost :

Subject: Re: [boost] [chrono/date] Performance goals and design summary
From: Vicente J. Botet Escriba (vicente.botet_at_[hidden])
Date: 2013-05-05 03:31:16


Le 05/05/13 04:31, Howard Hinnant a écrit :
> Fwiw, I went to http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3344.pdf to experiment with the "Modified Following" benchmark which has some code shown for it.
>
> I wanted to see what it would look like with two date types: field and serial, instead of just one (date) which has been implemented as both.
>
> Here is the original code from N3344:
>
> bool
> isNonBusinessDay(const Date& targetDate,
> const Date& startDate,
> const int* calendar)
> {
> int offset = targetDate - startDate;
> int wordIndex = offset / 8 / sizeof(int);
> int bitIndex = offset - wordIndex * sizeof(int);
> return (1 == (calendar[wordIndex] & (1 << bitIndex)));
> }
>
> Date
> modifiedFollowing(const Date& targetDate,
> const Date& startDate,
> const int* calendar)
> {
> Date date(targetDate);
> while (isNonBusinessDay(date, startDate, calendar))
> ++date;
> if (targetDate.month() == date.month())
> return date;
> date = targetDate;
> do
> {
> --date;
> } while (isNonBusinessDay(date, startDate, calendar));
> return date;
> }
>
> For purposes of discussion, assume we now have two date types:
>
> 1. Serial type named day_point, which is a chrono::time_point.
> 2. Field type named ymd (not a great name, just trying to differentiate it).
>
> All isNonBusinessDay does is subtract two Dates. This is clearly the domain of a serial type:
>
> bool
> isNonBusinessDay(const day_point& targetDate,
> const day_point& startDate,
> const int* calendar)
> {
> days offset = targetDate - startDate;
> days wordIndex = offset / (8 * sizeof(int));
> days bitIndex = offset - wordIndex * sizeof(int);
> return (1 == (calendar[wordIndex.count()] & (1 << bitIndex.count())));
> }
>
> The modifications are trivial and expense is not compromised. The code is perhaps a little more type safe, utilizing the days units, but nothing spectacular. I would expect the exact same assembly to be generated.
>
> modifiedFollowing is more interesting. If it takes two day_points (two serial dates), it might look like this:
>
> day_point
> modifiedFollowing(const day_point& targetDate,
> const day_point& startDate,
> const int* calendar)
> {
> day_point date(targetDate);
> while (isNonBusinessDay(date, startDate, calendar))
> ++date;
> if (ymd(targetDate).month() == ymd(date).month()) // serial->field twice
> return date;
> date = targetDate;
> do
> {
> --date;
> } while (isNonBusinessDay(date, startDate, calendar));
> return date;
> }
>
> The only difference is the need to convert the serial dates to a ymd type so that the month can be extracted. This is done exactly twice on the line commented. Otherwise the code is remarkably similar, and I would argue, the exact same efficiency.
>
> <disclaimer>
> std::chrono::time_point is currently missing operator-- and operator++. I view this a defect that should be corrected.
> </disclaimer>
>
> One could explore with passing in targetDate as a ymd type instead:
>
> day_point
> modifiedFollowing(const ymd& targetDate,
> const day_point& startDate,
> const int* calendar)
> {
> day_point date(targetDate); // field->serial
> day_point sdate = date;
> while (isNonBusinessDay(date, startDate, calendar))
> ++date;
> if (targetDate.month() == ymd(date).month()) // serial->field
> return date;
> date = sdate;
> do
> {
> --date;
> } while (isNonBusinessDay(date, startDate, calendar));
> return date;
> }
>
> This rewrite trades one serial->field conversion for one field->serial conversion. It might be a win if the client actually has a ymd already for input, as my measurements are showing that serial->field conversions are more expensive than field->serial. My measurements are raw and new, so I could be off on that. And no doubt such a measurement is going to depend upon things like hardware and algorithms (caches).
>
> But the main point is that having two date types is not disruptive, and mainly serves to give the client more options in optimizing his date algorithms.
>
>
I agree completely that we heed several dates and the standard (the
library) must describe the performances provided by each one.
Next follow the same using two dates days_date and ymd_date with the
expected interfaces.

days_date
modifiedFollowing(const ymd_date& targetDate,
                   const days_date& startDate,
                   const int* calendar)
{
     days_date date(targetDate); // field->serial
     days_date sdate = date;
     while (isNonBusinessDay(date, startDate, calendar))
         ++date;
     if (targetDate.month() == date.month()) // serial->field ++ No need to convert explicitly
         return date;
     date = sdate;
     do
     {
         --date;
     } while (isNonBusinessDay(date, startDate, calendar));
     return date;
}

++ No need to convert explicitly

For the purpose of showing the Date concept I will use a template

template <typename Date1, typename Date2>
days_date
modifiedFollowing(const Date1& targetDate,
                   const Date2& startDate,
                   const int* calendar)
{
     ymd_date ymdTargetDate(targetDate); // serial->field or nothing (so that the month() is efficient - we could store the month also).
     days_date date(targetDate); // field->serial or nothing, but at least there is a conversion (as we are making day arithmetic)
     days_date sdate = date;
     
     while (isNonBusinessDay(date, startDate, calendar))
         ++date;
     if (ymdTargetDate.month() == date.month()) // serial->field ++ No need to convert explicitly
         return date;
     date = sdate;
     do
     {
         --date;
     } while (isNonBusinessDay(date, startDate, calendar));
     return date;
}

As you can see the two dates with the same interface have not only drawbacks ;-) Providing the same interface and been convertible to each other helps the user.

Best,
Vicente


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk