From: James Fowler (boost_list_at_[hidden])
Date: 2005-03-04 16:09:55

Brian Braatz wrote:

>First off- note that Timer can take a start time on the ctor: ...
>high_resolution_timer does not however: ...
> high_resolution_timer() { ...
> if (!QueryPerformanceFrequency(&frequency))
> throw std::runtime_error("Couldn't acquire frequency");
> I need to feed the timer the START time (or get FROM it the start
>time would be preferred)
>The reason I need this is because for logging purposes (working with the
>Profiler from Christopher Diggins) I need to log the start and stop
>times, and I need it to (for internal cross reference) be the same
>number for multiple purposes.
IMHO, based on the example you gave, you're mixing two separate
problems: how to capture a representation of current time, and how to
present an interface to capture durations of time. I haven't really
examined the Profiler mentioned lately in detail, so maybe you've
actually already dealt with this in the real code. If so, I'd like to
take a look, as I'm working through similar issues myself. If not, I'm
working on something that may be useful to you.

Here's some food for thought:

    * interface issues for capturing durations
          o consistent, portable representation
          o normalized conversion(s) for use in external expressions
            (i.e., as double representing seconds)
          o support for internal expressions (add, subtract, compare)
          o variety of use cases for establishing boundary moments
                + constructor starts, destructor stops (and stores to
                  target provided to constructor)
                + manual start / pause / continue / stop / restart
                  (clear accumulated duration)
                + auxiliary class (takes reference to instance of
                  duration, constructor calls start|continue, destructor
                  calls stop)
                + and numerous other variations on the theme...
          o support for iostreams
          o represented as a class (possibly templatized on mechanism
            for actual time capture)
                + internal representation of
                  start_time/stop_time/accumulated duration can be based
                  on portable POD types
    * interface for capturing high resolution time
          o can be very challenging to balance:
                + resolution (as high as possible??? !!!)
                + reliability and accuracy (otherwise resolution doesn't
                  mean much...)
                + efficiency (all this with zero runtime overhead
                  please, and fetch me a fresh cup of coffee while
                  you're at it)
          o variations in interface
                + hard to find one ideal portable API to build on, lots
                  of choices
                      # ACE has some nice timer wrappers
                        but that doesn't help unless you're using ACE
                      # QueryPerformanceCounter(...) &
                        QueryPerformanceFrequency(...) on Windows
                      # time() is ubiquitous, but with pitiful resolution...
                      # gettimeofday() available on various POSIX systems
                      # gethrtime() on Solaris
                        systems (maybe RTLinux too)
                      # reading tsc counters on Pentium-based systems
                      # and there are assuredly a few more out there
                + representation - some use an integer type, some use
                  structs, ...
                + resolution - from very coarse (1 second from time())
                  to very fine (nanoseconds for gethrtime())
                + context - "wall clock" time (like gettimeofday()),
                  process time (like clock()), and more...
                + scaling - constant (sec/usec in gettimeofday()) vs.
                  runtime dynamic (QueryPerformanceCounter(...) /
                + offset - fixed (gettimeofday()) vs. floating
                + overflow for floating offset representations
                + calibration for floating offset representations
                  (baseline for conversion to known offset)
          o variations in behavior - due to implementation or dynamic
            environmental influences
                + known issues, like QueryPerformanceCounter() sometimes
                + degree of separation from actual "real time clock"
                      # query RTC hardware directly (should be most
                        stable - but not necessarily fastest...)
                      # query counters slaved to CPU clock (like the
                        "rdtsc" instruction for Pentiums)
                      # query global internal "current time" value
                        incremented by periodic process (interrupts...)
                + local uncertainty, i.e. the first (in hard real time)
                  of two nearly simultaneous calls may return a
                  slightly "later" value than the second
                      # called in one thread?
                      # called in multiple threads on one CPU?
                      # called in multiple processes on one CPU?
                      # called in multiple threads/processes on multiple
                + performance impact - overhead varies
                      # based on API calls used, can differ by platform
                        / kernel version / etc...
                      # based on frequency called (impact on caches, SMP
                + precision - useful precision may be less than
                  representation allows
                + response to system "idling"- load-based CPU speed
                  throttling, sleep & hibernation periods, etc.
                + drift - representations with "wall clock" context but
                  floating offsets may (or may not) diverge over time
                  from calibration points
    * thoughts on requirements for an ideal portable C++ high resolution
      time capture mechanism
          o generic interface
                + specifies opaque "raw_hrtime" form in which time is stored
                      # not necessarily an object, may use POD type as
                      # don't want to require constructor/destructor
                + common signatures for key operations
                      # store "now" in an instance of raw_hrtime
                      # move/copy instance of raw_hrtime
                      # convert from raw_hrtime into normalized form(s)
                            * separate conversions for floating and
                              fixed offsets
                            * conversion to integral POD type(s)
                                  o also provide converted equivalents
                                    for scale and offset
                            * conversion to double (in seconds)
                      # overflow check
                      # get scaling factor
                      # get offset
                + traits to make variations in interface of underlying
                  API available for compile-time or run-time use
                      # size & alignment requirements for raw_hrtime
                      # resolution, context, mode (fixed|variable) for
                        scaling and offset
                + traits / operations addressing potential variations in
                      # optional, may only represent "best guess"
                + common signatures for potentially optional operations
                  (default implementation may be feasible)
                      # calibration
                      # drift detection
          o ideally zero runtime overhead added by generic wrapper for
            each capture of "now" in raw_hrtime
                + highest priority is capture
                + secondary priority is direct (unprocessed) output
                      # output raw_hrtime without forcing conversion
                            * do NOT apply - or even query - scaling and
                      # optionally store additional information as
                        necessary to decode raw form
                            * traits on interface variation
                            * scaling/offset values
                      # can treat as raw (possibly aligned) opaque chunk
                        of memory
                + local manipulation only by explicit request
                      # deferred (lazy-evaluation) for any conversions
                            * use scaling and offset to provide
                              normalized representation
                      # calibration on demand
    * "Friendly" class wrapper for hires time type...
          o A common class interface implemented around the generic
            raw_hrtime interface should be able to provide
                + Convenient features like automatically fetching "now"
                  in the constructor
                + automatic conversion to normalized form (like double)
                + comparison and math operations
                + iostream output
                + and do all this in clean, portable code
          o A templatized version can be created to allow compile time
            variation across multiple raw_hrtime implementations
          o A concrete version could provide "best effort" default
            support for hi-res time
                + may need more than one...
          o A hybrid version could potentially allow for selection of
            the "closest" available implementation based on filtering
            raw_hrtime variants
                + example: VARIABLE_OFFSET_OK + MAX_SMP_CONSISTENCY
                + may or may not be worth it if this requires runtime
                  performance hit...
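
As a rough illustration of the outline above, here is a minimal sketch of a "raw_hrtime" policy, a duration class templatized on it, and the auxiliary class whose constructor continues and whose destructor stops. Every name in it (steady_raw_hrtime, basic_duration, scoped_timing, and their members) is hypothetical and not taken from any existing library. It also assumes C++11 and uses std::chrono::steady_clock as a stand-in for a platform call such as QueryPerformanceCounter(...) or gethrtime(), which postdates this message.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical "raw_hrtime" policy. steady_clock stands in for a
// platform-specific capture call; this is an assumption, not a given.
struct steady_raw_hrtime {
    // Opaque raw form: a plain integer tick count (POD, no ctor/dtor).
    using raw_type = std::int64_t;

    // Traits describing the underlying API.
    static constexpr bool fixed_scale  = true;   // tick length known at compile time
    static constexpr bool fixed_offset = false;  // epoch is unspecified ("floating")

    // Capture "now" without applying any scaling or offset.
    static raw_type now() {
        return static_cast<raw_type>(
            std::chrono::steady_clock::now().time_since_epoch().count());
    }

    // Scaling factor: seconds represented by one tick.
    static double seconds_per_tick() {
        using period = std::chrono::steady_clock::period;
        return static_cast<double>(period::num) / static_cast<double>(period::den);
    }

    // Deferred conversion of a raw tick difference to seconds.
    static double to_seconds(raw_type ticks) {
        return static_cast<double>(ticks) * seconds_per_tick();
    }
};

// Accumulated duration, templatized on the capture mechanism.
template <class Clock>
class basic_duration {
public:
    using raw_type = typename Clock::raw_type;

    basic_duration() : accumulated_(0), start_(0), running_(false) {}

    // Clear the accumulated duration and start again.
    void restart() { accumulated_ = 0; running_ = false; resume(); }

    // Start or continue; does nothing if already running.
    void resume() {
        if (!running_) { start_ = Clock::now(); running_ = true; }
    }

    // Stop or pause; adds the elapsed raw ticks to the total.
    void stop() {
        if (running_) { accumulated_ += Clock::now() - start_; running_ = false; }
    }

    bool running() const { return running_; }
    raw_type raw() const { return accumulated_; }  // unconverted total
    double seconds() const { return Clock::to_seconds(accumulated_); }

private:
    raw_type accumulated_;
    raw_type start_;
    bool running_;
};

// Auxiliary class: constructor continues the duration, destructor stops it.
template <class Clock>
class scoped_timing {
public:
    explicit scoped_timing(basic_duration<Clock>& d) : d_(d) { d_.resume(); }
    ~scoped_timing() { d_.stop(); }
    scoped_timing(const scoped_timing&) = delete;
    scoped_timing& operator=(const scoped_timing&) = delete;

private:
    basic_duration<Clock>& d_;
};
```

A caller would declare a basic_duration<steady_raw_hrtime>, wrap each region of interest in a scoped_timing, and call seconds() only when a normalized value is actually needed. The capture path then touches nothing but raw tick counts, in keeping with the "highest priority is capture" point above.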

Hopefully this gives a taste of the complexities surrounding the
seemingly simple task of getting a precise representation of "now". It
can be a very dangerous area to explore for anyone with perfectionistic
tendencies ;) . IMHO this is a classic "can't see the forest for the
trees" issue (and I've gotten lost in that forest a few times over the
years...). With the myriad possibilities for differing results driven
by the matrix of potential variation, trying to predict just how a
given combination will behave becomes an exasperating challenge. How
can we get past that? Stop trying to predict the results. Go produce
the results instead. Line up all the possible combinations that can
work on a given platform, turn them loose and take notes. Pinning exact
timing on individual moments may always involve a little uncertainty,
but analyzing average trends can lead to deeper insights and justifiable
confidence in the system overall. Some of the issues with
QueryPerformanceCounter(...) may certainly be "real", yet their impact
on the real world of your applications may prove irrelevant with a
little testing.
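
One way to "turn them loose and take notes" is sketched below: call a clock repeatedly and record the average cost per call, the smallest step it ever reports, and how often it appears to run backwards. The names clock_report and survey_clock are hypothetical, and the sketch assumes a C++11 clock type with the std::chrono interface rather than the platform APIs discussed above.

```cpp
#include <chrono>
#include <cstddef>

// Notes taken while exercising one clock (hypothetical structure).
struct clock_report {
    double mean_call_ns;         // average time per now() call, loop overhead included
    double min_step_ns;          // smallest nonzero forward step observed (0 if none)
    std::size_t backward_steps;  // times a later call returned an earlier value
};

// Call Clock::now() `samples` times back to back and summarize what was seen.
template <class Clock>
clock_report survey_clock(std::size_t samples) {
    using ns = std::chrono::duration<double, std::nano>;
    clock_report r = {0.0, 0.0, 0};
    if (samples == 0) return r;

    const typename Clock::time_point first = Clock::now();
    typename Clock::time_point prev = first;
    for (std::size_t i = 0; i < samples; ++i) {
        const typename Clock::time_point cur = Clock::now();
        if (cur < prev) {
            ++r.backward_steps;  // the "local uncertainty" case
        } else if (cur > prev) {
            const double step = ns(cur - prev).count();
            if (r.min_step_ns == 0.0 || step < r.min_step_ns) r.min_step_ns = step;
        }
        prev = cur;
    }
    r.mean_call_ns = ns(prev - first).count() / static_cast<double>(samples);
    return r;
}
```

Running, say, survey_clock<std::chrono::steady_clock>(1000000) and the same for std::chrono::system_clock on a given machine gives numbers to compare directly. The results only describe that machine under that load, so they are better read as trends than as guarantees.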

>There you go, those are my requests. Now I will sit here quietly and
>hope there is a Santa Claus....
So was this less, or more, than you asked for? Although I suppose you'd
rather have it done than have it described. :) Alas, for that you'll
have to be patient, but perhaps not for long. Since I'm in the process
of writing documentation which needed to address high resolution timing
anyway, organizing my thoughts in this response probably helped me more
than it will help you! At least, until the whole package is ready to go
under the tree...

- james

 James Fowler, Open Sea Consulting, Marietta, Georgia, USA
 Do C++ Right., opening soon!
