Bug #1941

timer intervals incorrectly rounded to clock resolution

Added by Bryan Cantrill over 7 years ago. Updated about 7 years ago.

Status: Resolved
Priority: Normal
Category: kernel
Start date: 2012-01-01
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:

Description

If one currently attempts to create (say) a 1,000 usec POSIX interval timer
on x86, the actual cyclic created will not have the specified interval
of 1,000,000 nanoseconds, but rather 1,000,005 nanoseconds. Lest this
difference seem small, the 5 ppm error amounts to 5 microseconds per second;
after a mere 200 seconds of operation of such a timer, timer execution will
be off by an entire interval (namely, a millisecond). This is entirely
busted, and violates the most basic tenet of POSIX interval timers: that
they fire at the rate of the specified interval. The defective code is
astonishingly deliberate; in timer_settime():

        /*
         * From the man page:
         *      Time values that are between two consecutive non-negative
         *      integer multiples of the resolution of the specified timer
         *      shall be rounded up to the larger multiple of the resolution.
         * We assume that the resolution of any clock is less than one second.
         */
        if (it->it_backend->clk_clock_getres(&res) == 0 && res.tv_nsec > 1) {
                long rem;

                if ((rem = when.it_interval.tv_nsec % res.tv_nsec) != 0) {
                        when.it_interval.tv_nsec += res.tv_nsec - rem;
                        timespecfix(&when.it_interval);
                }
                if ((rem = when.it_value.tv_nsec % res.tv_nsec) != 0) {
                        when.it_value.tv_nsec += res.tv_nsec - rem;
                        timespecfix(&when.it_value);
                }
        }
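
To make the arithmetic concrete, here is a minimal userland sketch of the rounding performed by the excerpt above, plus the resulting drift. It is not the kernel code: `ts_normalize()` is a hypothetical stand-in for `timespecfix()` (assumed here only to carry nanosecond overflow into seconds), `round_like_excerpt()` and `lateness_ns()` are illustrative names, and the 15 nsec resolution used in the usage note is a made-up value chosen only because it reproduces the 1,000,005 nsec figure; the report does not state the actual x86 resolution.

```c
#include <assert.h>
#include <time.h>

#define	NSEC_PER_SEC	1000000000L

/* Hypothetical stand-in for timespecfix(): carry nsec overflow into sec. */
static void
ts_normalize(struct timespec *ts)
{
	while (ts->tv_nsec >= NSEC_PER_SEC) {
		ts->tv_nsec -= NSEC_PER_SEC;
		ts->tv_sec++;
	}
}

/*
 * Round ts->tv_nsec up to the next multiple of res_nsec, as the excerpt
 * does for both it_interval and it_value. A resolution of 1 nsec or less
 * leaves the value untouched, matching the excerpt's res.tv_nsec > 1 check.
 */
static void
round_like_excerpt(struct timespec *ts, long res_nsec)
{
	long rem;

	assert(res_nsec >= 0);
	if (res_nsec > 1 && (rem = ts->tv_nsec % res_nsec) != 0) {
		ts->tv_nsec += res_nsec - rem;
		ts_normalize(ts);
	}
}

/*
 * Accumulated lateness, in nanoseconds, after the number of firings that
 * should have occurred in `seconds` at the requested interval.
 */
static long long
lateness_ns(long requested_nsec, long programmed_nsec, long seconds)
{
	long long firings = (long long)seconds * NSEC_PER_SEC / requested_nsec;

	return (firings * (programmed_nsec - requested_nsec));
}
```

With the assumed 15 nsec resolution, rounding a `tv_nsec` of 1,000,000 yields 1,000,005, and `lateness_ns(1000000, 1000005, 200)` returns 1,000,000: after 200 seconds the timer is late by one full millisecond interval, as described above.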

Here is the relevant section of the man page for timer_settime():

       Time values that are between two consecutive non-negative integer
       multiples of the resolution of the specified timer will be rounded up
       to the larger multiple of the resolution. Quantization error will not
       cause the timer to expire earlier than the rounded time value.

These two sentences say absolutely nothing about changing the programmed
interval -- merely that the timer will not fire earlier than the specified
time. Indeed, changing the programmed interval brings the implementation
into direct contradiction of the defined semantics of it_interval:

       The reload value of the timer is set to the value specified by the
       it_interval member of value. When a timer is armed with a non-zero
       it_interval, a periodic (or repetitive) timer is specified.

Note that this does not say that the "reload value of the timer is set
to the value specified by the it_interval member of value, rounded up to
the nearest value that is evenly divided by the clock resolution."
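
The difference between the two readings can be sketched numerically. The following is a toy model, not the kernel's behavior: times are plain nanosecond counts starting at zero, the function names are illustrative, and the resolution passed in is whatever hypothetical value the caller picks. The first reading rounds the reload value itself, so the error grows with every firing; the second keeps the reload value exact and only quantizes each expiration upward, so the timer never fires early and is never late by a full resolution unit.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest multiple of res that is >= v (requires v >= 0 and res >= 1). */
static int64_t
ceil_to_multiple(int64_t v, int64_t res)
{
	int64_t rem;

	assert(v >= 0 && res >= 1);
	rem = v % res;
	return (rem == 0 ? v : v + (res - rem));
}

/*
 * Reading A (the behavior this bug objects to): the interval itself is
 * rounded up, so the n-th expiry lands at n times the rounded interval.
 */
static int64_t
nth_expiry_rounded_interval(int64_t interval, int64_t res, int64_t n)
{
	return (n * ceil_to_multiple(interval, res));
}

/*
 * Reading B: the interval is kept exactly as requested; only the absolute
 * expiry time is quantized up to the clock resolution.
 */
static int64_t
nth_expiry_exact_interval(int64_t interval, int64_t res, int64_t n)
{
	return (ceil_to_multiple(n * interval, res));
}
```

For a 1,000,000 nsec interval, a hypothetical 15 nsec resolution, and n = 200,000, reading A puts the expiry 1,000,000 nsec past the ideal time of 200 seconds, while reading B puts it 10 nsec past.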

It is unfortunate that our current facilities for arbitrary resolution
interval timers are being sullied by a misread of the standard (a misread
that might well extend to a standards "test" that is itself broken).
Fortunately, the fix here is simple: the code that modifies the time and
interval to be evenly divided by the clock resolution should simply be
ripped out.

History

#1

Updated by Ilya Yanok about 7 years ago

I've taken a short look at this problem and have some comments. I have to admit I'm completely new to Illumos, so I'm probably missing something.

First of all, I've found that CLOCK_HIGHRES timers are (almost) unaffected: they expose a fixed resolution of 2 nsec regardless of the actual resolution of the cyclics backend used, so the error introduced by the rounding is at most 1 nsec.

Next, CLOCK_REALTIME timers don't rely on system ticks either (they use cyclics internally), but the realtime_timeout() interface used for these timers has nsecs_per_tick resolution hardcoded into it. So simply ripping out the rounding code won't give us the expected result: we will probably do better on the first expiration, but all subsequent events will be aligned to the tick boundary anyway.

We might want to use timeout_generic() with some smaller resolution (how small can it be?) instead of realtime_timeout(). I can prepare the patch, but I don't have enough understanding to predict whether we would break something by using, for example, a 1 nsec resolution.

#2

Updated by Bryan Cantrill about 7 years ago

  • Assignee set to Bryan Cantrill

#3

Updated by Rich Lowe about 7 years ago

  • Category set to kernel
  • Status changed from New to Resolved
  • % Done changed from 0 to 100
  • Tags deleted (needs-triage)

Resolved in r13616 commit:5d28731f11c2
