From: Michael Kenniston (msk_at_[hidden])
Date: 2002-04-22 13:31:19
I vote in favor of accepting GDTL, given the understanding that
time zones will be added later. I have minor quibbles here and
there, but I think the basic approach (providing decoupled sets
of classes for different time systems) is sound.
This is a sufficiently complex task that there is no way anyone
is going to get it perfect on the first try (both Jeff and I have
created similar things in the past without ever being totally
satisfied with the results). Therefore, as Paul pointed out,
it's a good idea to start getting experience with it so we can
learn what needs to be improved.
Before responding to some of the previous review comments, I'd
like to make a few observations. As in the GDTL documentation,
"time system" here means "a system used by humans for
reckoning times and/or dates, and which includes the definition
of a set of values and rules for manipulating them." Obviously,
a time system can be viewed as an abstract data type. UTC,
TAI, Local Times, and all calendars are examples of time
systems.
1) GDTL is meant to be quite general, so what Jeff is actually
trying to do is set up an extensible framework. He doesn't
have to personally implement every plausible calendar or clock,
but he does have to consider all the implications that they
might have on the architecture (or at least, as many implications
as we can think of).
Some examples of time systems may appear oddball and irrelevant,
but I believe it's important to consider them and make sure the
framework can handle them. My experience is that if you spend
a lot of time getting the obscure corner cases correct and
consistent, you will be rewarded with greater integrity of the
system as a whole. (I.e. the cleanest solution to a special case
is often to solve the general case.) Furthermore, the definition
of "oddball" will vary enormously depending on one's cultural
background. Finally, some of the inconvenient and difficult
properties of seldom-implemented time systems (which many
libraries simply ignore or sweep under the rug) turn out to
show up - in very subtle ways - even in our everyday systems.
Thinking about how to get things like the Islamic and Mayan
calendars right has greatly helped me to understand the
fundamental properties of UTC and time zones.
2) Some time systems have a most unfortunate property that,
for lack of a better term, I call "unpredictability". For
details and a formal definition see the wiki page
http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?GDTL/Trade-Offs
The gist of the problem is that there is no order-preserving mapping
from the elements of an unpredictable time system onto a set
of consecutive integers, i.e. there is no counting representation.
The obvious instance of an unpredictable time system is the
Islamic religious calendar, where certain months begin on the
day when a specified phase of the moon is observed. If the
weather is overcast over an entire region so the moon cannot be
observed, the month doesn't begin yet. This literally means you
cannot predict the length of that month ahead of time. A time
system with such a characteristic strikes most westerners as
peculiar, but our own systems of Local Time, and even UTC, have
exactly the same property! It's much more subtle, because in the
case of UTC it involves only one (leap) second every couple of
years, and in the case of Local Time the unpredictability of
DST rule changes may only rear its ugly head once every decade
or two, but the fundamental issue is the same.
3) Since it is a candidate for boost, and that implies high
quality, GDTL must be correct - and not just sometimes, or
most of the time, but all the time. This is a lot harder
than it sounds: In order to show correctness (via proof,
testing, code review, or whatever), one must be able to
/define/ correctness. The obvious way to do that is to
define correctness for a GDTL class as behavior corresponding
to some international standard (like UTC or TAI). True
correspondence is quite difficult to implement on most
current machines, and anything less is just plain wrong.
There are lots of Date-and-Time libraries already out there,
and even a couple of books on the topic, so it's only worth
Jeff's time to create a new one (and our time to review it)
if GDTL is going to be significantly better than the existing
libraries. I see two such areas where GDTL can improve on
weaknesses in existing art: using templates to enable the
user to control things like precision/epoch/range, and being
very careful about correctness.
4) The distinction between /representation/ and /conversion/
is important. Even if a class cannot be implemented correctly
using a counted representation, it can still be possible to
provide a conversion to a counted system. The catch is that
the conversion will be approximate and unstable. This is
acceptable (well, actually I'd rather not accept it, but it's
unavoidable) for a conversion, but not for a class's internal
representation.
5) The ptime logic/arithmetic that Jeff has implemented corresponds
to GMT, which although long officially deprecated is still what most
computers actually implement. We do need to add real UTC (among
other things), but what is currently in GDTL is a good place to start.
Now on to some specific comments:
Bill Kempf wrote:
> I *strongly* feel that the gdtl
> namespace "plumbing" *must* provide a mechanism for universal
> representation.
I disagree, and in fact claim that a "universal representation"
is impossible to implement correctly. I've implemented three
date-and-time libraries myself, the latest two of which were
grounded in such a counted/physical/universal time system, and
that experience led me to conclude that this approach is unworkable
for a truly general library. Jeff and I have discussed this, and
I fully support his position that this is the wrong approach.
The key is that, as mentioned above, many time systems are not
counted. Even UTC is not a counted system, although most people
assume that you can just use time_t to represent it. For example,
the time_t value corresponding to Dec 31 23:59:59 1998 is 915148799,
and the time_t for Jan 01 00:00:00 1999 is 915148800. (These numbers
were the same on both MSVC/NT and gnu/SunOS, which is what I have
access to right now.) If we were using GMT, which is what the
unix/posix routines were originally designed for, this would be
ok, but for UTC it's wrong: What value should we use for
Dec 31 23:59:60 1998? This was a legitimate time (a leap second),
but there is no integer available that fits into the sequence
properly.
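The collision can be reproduced with a minimal sketch. Python is used here purely for illustration; its `calendar.timegm` applies the same leap-second-free arithmetic as the time_t routines described above. The variable names are made up for the example.

```python
import calendar

# Broken-down UTC fields: (year, month, day, hour, minute, second).
before_midnight = calendar.timegm((1998, 12, 31, 23, 59, 59))
leap_second = calendar.timegm((1998, 12, 31, 23, 59, 60))
after_midnight = calendar.timegm((1999, 1, 1, 0, 0, 0))

print(before_midnight, leap_second, after_midnight)

# The two ordinary instants map to consecutive integers ...
assert after_midnight - before_midnight == 1
# ... so the leap second has no integer of its own: the arithmetic simply
# carries second 60 over and aliases it onto the following midnight.
assert leap_second == after_midnight
```

Two distinct UTC instants end up sharing one integer, which is the sense in which the counted representation fails here.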
Of course we could modify the routines to reserve an integer in this
specific case, but the problem doesn't go away. Let's look at the
pair Dec 31 23:59:59 2002 and Jan 01 00:00:00 2003. Should we assign
the integers N and N+1, or N and N+2? We simply don't know yet,
and if/when we guess wrong we end up with a non-counted system.
Thus, in order for a representation of UTC to be stable, it cannot
use a counted internal representation.
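A rough sketch of that instability, in Python for illustration only. The helper `leap_aware_count` and both tables are hypothetical, and the leap second at the end of 2002 is an assumption made for the sake of the example, not a claim about what was actually announced.

```python
import calendar

def leap_aware_count(utc_fields, leap_boundaries):
    """Leap-free count plus one for every listed leap second already elapsed.

    leap_boundaries holds the leap-free timestamp of the midnight that
    immediately follows each inserted leap second.
    """
    base = calendar.timegm(utc_fields)
    return base + sum(1 for boundary in leap_boundaries if boundary <= base)

last_second_2002 = (2002, 12, 31, 23, 59, 59)
first_second_2003 = (2003, 1, 1, 0, 0, 0)

# Table as known when a value is computed and stored: nothing announced.
table_today = []
# HYPOTHETICAL later announcement of a leap second at the end of 2002.
table_revised = [calendar.timegm(first_second_2003)]

gap_today = (leap_aware_count(first_second_2003, table_today)
             - leap_aware_count(last_second_2002, table_today))
gap_revised = (leap_aware_count(first_second_2003, table_revised)
               - leap_aware_count(last_second_2002, table_revised))

print(gap_today, gap_revised)
```

Under these assumptions the same pair of timestamps is N and N+1 with one table and N and N+2 with the other, so any count stored before the announcement silently changes meaning afterwards.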
This is a classic case of something that "everybody knows" being
wrong. Everybody knows their windows/unix/linux clock routines
are right, yet in fact what people are doing is taking a GMT clock
and setting it to UTC, resulting in a hybrid mishmash that doesn't
match any international or national standard. When you check the
radio or TV for the "correct" time and set that into the "gmtime"
of your computer, you are actually committing a type error. The
only reason we don't notice is that the clock chips on our PCs
are so bad that the clock is always off by much more than a second
anyway. :-(
Joachim Achtzehnter wrote:
> To be honest, I think additional calendars such as a Mayan calendar,
> that may or may not be added in the future, are not as relevant to the
> vast majority of users. What I regard as more important is to support
> a few widely used calendar/time systems well.
Certainly the vast majority of users will only be using TAI, UTC, and
Local Time, and GDTL must support them with tools that are convenient
and correct. However, there are other potential users, and if the GDTL
framework doesn't support them, some other framework will and we'll end
up just continuing the proliferation of libraries. (I /am/ willing to
draw the line at relativistic effects, though. :-)
Bill Seymour wrote:
> Why not always store civil time as UTC and convert to some
> specified time zone only at I/O time when human-readable
> dates and times are required? Doesn't that make the
> problems go away?
Unfortunately, no. The conversions between civil and UTC are themselves
unpredictable, so if we used UTC internally that would make the
representation of local times unstable.
This can get a bit confusing, so here's an example. Star-crossed
lovers Joe and Sally agree to meet 10 years from now at the Sears
Tower, at 4:15 PM on April 22, 2012, local Chicago time. Sally enters
the date into her PDA, but Joe writes it down on paper. If the PDA
stores local time by immediately converting it into UTC /using the
current DST rules/, and the legislature changes the rules in 2008,
then Joe and Sally will not show up for their rendez-vous at the
same time. Tragedy and heartbreak ensue, followed by a bad TV
mini-series.
In general, when a timepoint is specified in an unpredictable system,
it must be stored in that system - without conversion - in order to
remain correct.
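The Joe and Sally scenario can be sketched as follows, in Python for illustration only. Both UTC offsets are assumptions: the first stands for the rules in force when the appointment is entered, the second for a hypothetical rule change under which DST has not yet begun on that date.

```python
from datetime import datetime, timedelta, timezone

# HYPOTHETICAL UTC offsets for Chicago on April 22, 2012.
OFFSET_WHEN_STORED = timezone(timedelta(hours=-5))   # rules at entry time (DST on)
OFFSET_AFTER_CHANGE = timezone(timedelta(hours=-6))  # assumed new rules (DST off)

agreed_local = datetime(2012, 4, 22, 16, 15)  # 4:15 PM local, as agreed

# Joe keeps the local time exactly as written, with no conversion.
joe_paper = agreed_local

# Sally's PDA converts to UTC immediately, using the rules known at entry time.
pda_stored_utc = agreed_local.replace(tzinfo=OFFSET_WHEN_STORED).astimezone(timezone.utc)

# Years later the PDA converts back for display, using the changed rules.
sally_pda = pda_stored_utc.astimezone(OFFSET_AFTER_CHANGE).replace(tzinfo=None)

print("Joe arrives at  ", joe_paper)
print("Sally arrives at", sally_pda)
```

With these assumed offsets the stored UTC value never changes, yet the local time it maps back to shifts by an hour, whereas the unconverted local value stays correct.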
Ross Smith wrote:
> Time intervals and points in time
> are natural physical quantities that should be manipulated
> independently of the units we choose to measure them with.
<snip>
> The calendar types would have no
> arithmetic properties themselves; they exist only to handle the
> conversion between the opaque time types and calendar-specific
> representations.
If only we could do it that way. I built a couple of libraries based
on that principle, only to realize later that they could not be made
to work correctly with unpredictable time systems.
Joachim Achtzehnter wrote:
> Some calendar/time systems (those most
> influenced by politicians) are so screwy that it is difficult to avoid
> ambiguity in mappings, but that is another matter.
Actually, I think this is the crux of the matter. If we assume a clean
predictable mapping from "physical" time to "civil" time, then we are
condemning ourselves to a library riddled with exceptions. If we base
the library on a model that accounts for the screwiness that the real
world forces upon us, the structure of the library can stay cleaner.
George Heintzelman wrote:
> For example, suppose you're in
> the financial world. Even putting aside the issue of business days, if
> I add 1 year to a particular instant, I want to get an instant which
> represents the same wall-clock time on the same date next year. But
> that interval, while a logical constant, is not a fixed number of
> seconds, because of leap days and leap seconds. So I would not be able
> to specify the desired interval as a generic opaque interval here.
An excellent example. Not only is the time system being used
unpredictable, it's also irregular, providing yet another reason why
we cannot base all time systems on physics.
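A small sketch of the irregularity, in Python for illustration only. The helper name is made up, and Python's naive `datetime` ignores leap seconds, so only the leap-day effect is visible here.

```python
from datetime import datetime

def same_wall_clock_next_year(t):
    # Same date and wall-clock time one calendar year later.
    # Raises ValueError for Feb 29, which has no counterpart the next year.
    return t.replace(year=t.year + 1)

start_a = datetime(2002, 3, 1, 9, 0)
start_b = datetime(2003, 3, 1, 9, 0)  # the following year contains Feb 29, 2004

seconds_a = (same_wall_clock_next_year(start_a) - start_a).total_seconds()
seconds_b = (same_wall_clock_next_year(start_b) - start_b).total_seconds()

print(seconds_a, seconds_b)
# "One year" is the same logical interval in both cases, but not the same
# number of seconds.
assert seconds_a != seconds_b
```

The Feb 29 case noted in the comment is a further hint that "add 1 year" is an operation defined by the calendar rather than by a fixed duration.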
Joachim Achtzehnter wrote:
> These concepts [duration of 1 year] make sense only in
> the context of a calendar/time system. They are clearly not the same
> concept as the physical duration defined above. In my view it would be
> a mistake to pretend they are the same.
There is a basic philosophical difference here. The idea of creating
a class which corresponds to the "real" physical time is quite
attractive, but after having built a library that way I now believe the
idea to be a seductive mirage. While I won't argue the metaphysics of
whether such a physical reality exists, I will argue that the only things
we can model are human-defined time systems. When you speak of physical
time, I'd guess you mean TAI (atomic clock) time, which is the best
clock we have right now. However, 100 years ago GMT was physical time,
since the most accurate clock we had back then was the earth's
rotation. I expect 100 years from now the cesium atom will no longer
be the gold standard for timekeeping (in fact NIST is already working
on its replacement).
Therefore, I will argue that /all/ time systems are human artifacts,
and the library will benefit if they are all treated as equally valid.
Although this is a philosophical position, my argument is actually
based on pragmatism - I simply couldn't get things to work out right
when I tried to base everything on "physical" time.
Ross Smith wrote:
> Calendar-specific operations like "add 5 working days" bear a
> superficial resemblance to arithmetic but are really something else,
> and should be kept separate from true time arithmetic, just as the
> civil-calendar and physical-time types they operate on should be
> separate.
How about "add 5 days"? Is a "day" a physical quantity? When we
were using GMT it was, but now it's not. Adding "5 working days",
"5 rotations of the earth", or "5*24*3600 seconds" are all just
operations. Some of them have pleasant properties in common with
the addition of numbers, and it will certainly be useful to
document that, but I'm not convinced that there is otherwise
anything conceptually special about them.
Joachim Achtzehnter wrote:
> Strongly agree: Dealing with different time zones can be a very
> frustrating experience. It is precisely applications that need to do
> this that would most benefit from library support. At the very minimum
> I would expect full support for one local timezone plus UTC, but
> multiple timezones would be a real benefit.
I too strongly agree. I'm used to dealing with a dozen or so
time zones (including DST and irregular special cases)
simultaneously and consider that capability to be essential.
I'm simply willing to wait for the next version to get it.
Ditto for TAI, UTC, Julian, MJD, etc.
--
- Michael Kenniston  mkenniston_at_[hidden]  msk_at_[hidden]  http://www.xnet.com/~msk/