Boost logo

Boost :

Subject: Re: [boost] [chrono/date] Performance goals and design summary
From: Howard Hinnant (howard.hinnant_at_[hidden])
Date: 2013-05-04 22:31:49


Fwiw, I went to http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3344.pdf to experiment with the "Modified Following" benchmark which has some code shown for it.

I wanted to see what it would look like with two date types: field and serial, instead of just one (date) which has been implemented as both.

Here is the original code from N3344:

bool
isNonBusinessDay(const Date& targetDate,
                 const Date& startDate,
                 const int* calendar)
{
    int offset = targetDate - startDate;
    int wordIndex = offset / 8 / sizeof(int);
    int bitIndex = offset - wordIndex * sizeof(int);
    return (1 == (calendar[wordIndex] & (1 << bitIndex)));
}

Date
modifiedFollowing(const Date& targetDate,
                  const Date& startDate,
                  const int* calendar)
{
    Date date(targetDate);
    while (isNonBusinessDay(date, startDate, calendar))
        ++date;
    if (targetDate.month() == date.month())
        return date;
    date = targetDate;
    do
    {
        --date;
    } while (isNonBusinessDay(date, startDate, calendar));
    return date;
}

For purposes of discussion, assume we now have two date types:

1. Serial type named day_point, which is a chrono::time_point.
2. Field type named ymd (not a great name, just trying to differentiate it).

All isNonBusinessDay does is subtract two Dates. This is clearly the domain of a serial type:

bool
isNonBusinessDay(const day_point& targetDate,
                 const day_point& startDate,
                 const int* calendar)
{
    days offset = targetDate - startDate;
    days wordIndex = offset / (8 * sizeof(int));
    days bitIndex = offset - wordIndex * sizeof(int);
    return (1 == (calendar[wordIndex.count()] & (1 << bitIndex.count())));
}

The modifications are trivial and expense is not compromised. The code is perhaps a little more type safe, utilizing the days units, but nothing spectacular. I would expect the exact same assembly to be generated.

modifiedFollowing is more interesting. If it takes two day_points (two serial dates), it might look like this:

day_point
modifiedFollowing(const day_point& targetDate,
                  const day_point& startDate,
                  const int* calendar)
{
    day_point date(targetDate);
    while (isNonBusinessDay(date, startDate, calendar))
        ++date;
    if (ymd(targetDate).month() == ymd(date).month()) // serial->field twice
        return date;
    date = targetDate;
    do
    {
        --date;
    } while (isNonBusinessDay(date, startDate, calendar));
    return date;
}

The only difference is the need to convert the serial dates to a ymd type so that the month can be extracted. This is done exactly twice on the line commented. Otherwise the code is remarkably similar, and I would argue, the exact same efficiency.

<disclaimer>
std::chrono::time_point is currently missing operator-- and operator++. I view this a defect that should be corrected.
</disclaimer>

One could explore with passing in targetDate as a ymd type instead:

day_point
modifiedFollowing(const ymd& targetDate,
                  const day_point& startDate,
                  const int* calendar)
{
    day_point date(targetDate); // field->serial
    day_point sdate = date;
    while (isNonBusinessDay(date, startDate, calendar))
        ++date;
    if (targetDate.month() == ymd(date).month()) // serial->field
        return date;
    date = sdate;
    do
    {
        --date;
    } while (isNonBusinessDay(date, startDate, calendar));
    return date;
}

This rewrite trades one serial->field conversion for one field->serial conversion. It might be a win if the client actually has a ymd already for input, as my measurements are showing that serial->field conversions are more expensive than field->serial. My measurements are raw and new, so I could be off on that. And no doubt such a measurement is going to depend upon things like hardware and algorithms (caches).

But the main point is that having two date types is not disruptive, and mainly serves to give the client more options in optimizing his date algorithms.

Howard


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk