Subject: Re: [boost] [chrono/date] conversion between concrete dates
From: Howard Hinnant (howard.hinnant_at_[hidden])
Date: 2013-05-11 18:22:41


On May 10, 2013, at 1:45 PM, "Vicente J. Botet Escriba" <vicente.botet_at_[hidden]> wrote:

> When I add validation on the source date format I get
>
> clang 3.2
> * empty field->serial ~6.3ns.
> * field->serial ~13.4ns.
> * empty serial->field ~1ns.
> * serial->field ~17.9ns.
>
> gcc-4.8.0
> * empty field->serial ~7.5ns.
> * field->serial ~15.7ns.
> * empty serial->field ~1ns.
> * serial->field ~21.7ns.

I've been experimenting with adding validation today. I'm guessing that all of your validation is in a translation unit hidden from the testing loop. Is that correct?

I've been putting my validation in a header because I want to make it constexpr, and constexpr stuff has weak linkage. The motivation for making it constexpr is that for any part of the validation that involves compile-time information, the validation happens at compile time.

And my first experiments today involve putting some of the validation back into the unit specifiers, in contrast to the direction I was heading earlier.

Specifically:

// invariants:
// 1 <= d_
class day
{
    int d_;

    static
    constexpr
    int
    __attribute__((__always_inline__))
    check_invariants(int d)
    {
        return 1 <= d ? d : throw bad_date{};
    }
public:
    constexpr
    explicit
    __attribute__((__always_inline__))
    day(int d)
        : d_(check_invariants(d))
        {}

    constexpr
    __attribute__((__always_inline__))
    operator int() const
        {return d_;}
};

// invariants:
// 1 <= m_ && m_ <= 12
class month
{
    int m_;

    static
    constexpr
    int
    __attribute__((__always_inline__))
    check_invariants(int m)
    {
        return 1 <= m && m <= 12 ? m : throw bad_date{};
    }
public:
    constexpr
    explicit
    __attribute__((__always_inline__))
    month(int m)
        : m_(check_invariants(m))
        {}

    constexpr
    __attribute__((__always_inline__))
    operator int() const
        {return m_;}
};

// invariants:
// none
class year
{
    int y_;
public:
    constexpr
    explicit
    __attribute__((__always_inline__))
    year(int y)
        : y_(y)
        {}

    constexpr
    __attribute__((__always_inline__))
    operator int() const
        {return y_;}

    constexpr
    bool
    __attribute__((__always_inline__))
    is_leap() const
        {return y_ % 4 == 0 && (y_ % 100 != 0 || y_ % 400 == 0);}
};

Because of a bug in clang (http://llvm.org/bugs/show_bug.cgi?id=12848) I've had to mark everything with always_inline to get the compiler to optimize it properly. But once done, it does the optimizations nicely.
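To make the compile-time versus run-time behaviour of this style of validation concrete, here is a minimal standalone sketch of the month specifier. It is an illustration, not the code under test: bad_date is assumed to be an ordinary exception type (the message does not show its definition), the always_inline attributes are omitted, and is_valid_month is a hypothetical helper added only for demonstration.

```cpp
#include <cassert>
#include <stdexcept>

// Assumed exception type; the message uses bad_date without defining it.
struct bad_date : std::runtime_error
{
    bad_date() : std::runtime_error("bad date") {}
};

// invariant: 1 <= m_ && m_ <= 12
class month
{
    int m_;

    static constexpr int check_invariants(int m)
    {
        // Inside a constant expression the throw branch must not be reached,
        // so an out-of-range constant is rejected by the compiler.
        return 1 <= m && m <= 12 ? m : throw bad_date{};
    }
public:
    constexpr explicit month(int m) : m_(check_invariants(m)) {}
    constexpr operator int() const { return m_; }
};

// Validated at compile time; no check is left for run time.
constexpr month may{5};
static_assert(may == 5, "month converts back to its int value");

// constexpr month bad{13};   // would not compile: the throw is reached

// Hypothetical helper: validates at run time and reports the outcome.
inline bool is_valid_month(int m)
{
    try { (void)month{m}; return true; }
    catch (const bad_date&) { return false; }
}
```

The same constructor serves both cases: with a constant argument the check is evaluated by the compiler, and with a run-time argument it throws bad_date on failure.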

Now the ymd_date (or whatever name) constructors can be carefully crafted to not re-validate information that is already known. For example if the ymd_date constructor takes a month (not an int), then there is no need for it to re-validate the month at that point. The month is known to be valid.

I've removed what I call "range checking", which means there is no validation on year.

Here is a partial implementation of what I'm testing for ymd_date:

class ymd_date
{
    year y_;
    month m_;
    day d_;

    static
    constexpr
    day
    __attribute__((__always_inline__))
    check_invariants(year y, month m, day d)
    {
        return m != 2 ?
               (
                   d <= limit[m-1] ? d : throw bad_date{}
               ) :
               (
                   y.is_leap() ? (d <= 29 ? d : throw bad_date{}) :
                                 (d <= 28 ? d : throw bad_date{})
               );
    }

    static
    constexpr
    day
    __attribute__((__always_inline__))
    check_invariants(year y, month_day md)
    {
        return md.month() != 2 || md.day() <= 28 || y.is_leap() ?
                   md.day() : throw bad_date{};
    }
public:
    constexpr
    __attribute__((__always_inline__))
    ymd_date(year y, month m, day d)
        : y_(y),
          m_(m),
          d_(check_invariants(y_, m_, d))
        {}

The class is holding objects of type year, month and day instead of 3 ints (or whatever) so that the invariants of the individual components are not compromised when storing into, or returning from the ymd_date (i.e. they don't have to unnecessarily undergo re-validation).

The ymd_date validator taking year, month and day doesn't have to validate the month, it is known to be valid. It doesn't have to validate the year, there is nothing to validate. It only has to validate the day. And it doesn't need to check that the day >= 1, the day constructor already took care of that.

Looking at the assembly generated at -O3, I find that if either the month or day is a compile-time object, the validation code is reduced. For example it is common for day to be the first of the month, or perhaps the 5th, or any other fixed number <= 28. When this happens, and I construct a:

    ymd_date ymd(year(y), month(m), day(1));

I can see in the generated assembly that everything disappears except the check that 1 <= m <= 12. Similarly when only the month is compile-time information I'm seeing the constraint checking on d is simplified, especially for the case that the month is not February.

But even when all three unit specifiers are run time information, when I run this through a field->serial conversion:

    const int Ymin = 1900;
    const int Ymax = 2100;
    volatile int k;
    int count = 0;
    auto t0 = std::chrono::high_resolution_clock::now();
    for (int y = Ymin; y <= Ymax; ++y)
    {
        for (int m = 1; m <= 12; ++m)
        {
            int last = days_in_month(y, m);
            for (int d = 1; d <= last; ++d)
            {
                ymd_date ymd{year(y), month(m), day(d)};
                k = days_from(ymd.year(), ymd.month(), ymd.day());
                ++count;
            }
        }
    }
    auto t1 = std::chrono::high_resolution_clock::now();
    typedef std::chrono::duration<float, std::nano> sec;
    auto encode = t1 - t0;
    std::cout << encode.count() / count << '\n';
    std::cout << sec(encode).count() / count << '\n';

I'm seeing times that are only 0.1ns to 0.2ns slower. This information is preliminary. My optimizer may again be getting the best of me. But in this case, I do not believe I have the option of moving the validation out of the translation unit with the test loop since I believe that this really must be constexpr to take advantage of common cases like:
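The timing loop calls days_in_month and days_from, neither of which is shown in this message. For anyone wanting to reproduce the test, the following is one possible pair of helpers, offered as an assumption rather than as the code actually measured: days_from here counts days relative to 1970-01-01 in the proleptic Gregorian calendar and takes plain ints, and serials_are_consecutive is a hypothetical self-check that walks the same triple loop as the benchmark.

```cpp
#include <cassert>

constexpr bool is_leap(int y)
{
    return y % 4 == 0 && (y % 100 != 0 || y % 400 == 0);
}

// Number of days in month m (1..12) of year y.
constexpr int days_in_month(int y, int m)
{
    return m == 2 ? (is_leap(y) ? 29 : 28)
                  : (m == 4 || m == 6 || m == 9 || m == 11 ? 30 : 31);
}

// Assumed field->serial conversion: days since 1970-01-01.
// Years are shifted to start on March 1 so the leap day falls at the end.
inline int days_from(int y, int m, int d)
{
    y -= m <= 2;                                    // Jan and Feb belong to the prior shifted year
    const int era = (y >= 0 ? y : y - 399) / 400;   // 400-year cycle index
    const int yoe = y - era * 400;                  // year of era, [0, 399]
    const int doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;  // day of shifted year
    const int doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;            // day of era
    return era * 146097 + doe - 719468;
}

// Hypothetical self-check: every date in [ymin, ymax] must map to the
// serial that follows the previous date's serial.
inline bool serials_are_consecutive(int ymin, int ymax)
{
    int expected = days_from(ymin, 1, 1);
    for (int y = ymin; y <= ymax; ++y)
        for (int m = 1; m <= 12; ++m)
            for (int d = 1; d <= days_in_month(y, m); ++d)
                if (days_from(y, m, d) != expected++)
                    return false;
    return true;
}
```

Because the loop bounds come from days_in_month, every date the benchmark constructs is valid, so the validation never throws and the measurement reflects only the cost of the checks themselves.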

          ymd_date ymd{year(y), month(m), day(1)};

If the validation really is this cheap, this removes the motivation for the unchecked field types. And I currently don't see a motivation for a checked serial type. Only an unchecked serial type makes sense to me since the only thing that can go wrong with it is for it to move out of range. And that range can easily be made ridiculously large (+/- tens of thousands of years, if not millions of years).

This style of validation checking has renewed my interest in the month_day type. A month_day type can be created and validated once, and then the ymd_date object can be constructed multiple times with a run-time year and the fixed month_day type with a faster validation check than with separate year, month and day components:

    static
    constexpr
    day
    __attribute__((__always_inline__))
    check_invariants(year y, month_day md)
    {
        return md.month() != 2 || md.day() <= 28 || y.is_leap() ?
                   md.day() : throw bad_date{};
    }

    constexpr
    __attribute__((__always_inline__))
    ymd_date(year y, month_day md)
        : y_(y),
          m_(md.month()),
          d_(check_invariants(y_, md))
        {}

And if month_day happens to be constexpr, and the day happens to be <= 28, or the month happens to not be February, this validation completely disappears at compile time. This is made possible because the month_day constructor has already non-redundantly performed other parts of the validation (and at compile time if the month_day is constexpr).
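The month_day type itself is not shown in this message, so the following is only a sketch of how the pieces might fit together. Everything beyond the check_invariants logic quoted above is an assumption: the max_day table, the bad_date definition, the get_* accessor names (used here in place of month() and day()), and the is_valid helper are all hypothetical, and the always_inline attributes are omitted.

```cpp
#include <cassert>
#include <stdexcept>

// Assumed exception type; not defined in the message.
struct bad_date : std::runtime_error
{
    bad_date() : std::runtime_error("bad date") {}
};

class day    // invariant: 1 <= d_
{
    int d_;
public:
    constexpr explicit day(int d) : d_(1 <= d ? d : throw bad_date{}) {}
    constexpr operator int() const { return d_; }
};

class month  // invariant: 1 <= m_ && m_ <= 12
{
    int m_;
public:
    constexpr explicit month(int m)
        : m_(1 <= m && m <= 12 ? m : throw bad_date{}) {}
    constexpr operator int() const { return m_; }
};

class year   // no invariant
{
    int y_;
public:
    constexpr explicit year(int y) : y_(y) {}
    constexpr operator int() const { return y_; }
    constexpr bool is_leap() const
        { return y_ % 4 == 0 && (y_ % 100 != 0 || y_ % 400 == 0); }
};

// Assumed table: the largest day each month can ever have (Feb 29 allowed).
constexpr int max_day[12] = {31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

class month_day  // invariant: d_ <= max_day[m_ - 1]
{
    month m_;
    day d_;
public:
    constexpr month_day(month m, day d)
        : m_(m), d_(d <= max_day[m - 1] ? d : throw bad_date{}) {}
    constexpr month get_month() const { return m_; }
    constexpr day get_day() const { return d_; }
};

class ymd_date
{
    year y_;
    month m_;
    day d_;

    // Only Feb 29 in a non-leap year can still be invalid at this point.
    static constexpr day check_invariants(year y, month_day md)
    {
        return md.get_month() != 2 || md.get_day() <= 28 || y.is_leap()
                   ? md.get_day() : throw bad_date{};
    }
public:
    constexpr ymd_date(year y, month_day md)
        : y_(y), m_(md.get_month()), d_(check_invariants(y_, md)) {}
    constexpr year get_year() const { return y_; }
    constexpr month get_month() const { return m_; }
    constexpr day get_day() const { return d_; }
};

// Validated once, at compile time.
constexpr month_day july4{month{7}, day{4}};

// The month is not February, so the condition in check_invariants is
// true for every year and no run-time check is needed.
inline ymd_date independence_day(int y) { return ymd_date{year{y}, july4}; }

// Hypothetical helper: reports whether y-m-d forms a valid date.
inline bool is_valid(int y, int m, int d)
{
    try { (void)ymd_date{year{y}, month_day{month{m}, day{d}}}; return true; }
    catch (const bad_date&) { return false; }
}
```

In this arrangement month_day accepts February 29 unconditionally, and the only question left for the ymd_date constructor is whether the year is a leap year.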

Howard

