Bug #11968
APIC calibration should explicitly initialise the PIT
Status: Closed
% Done: 100%
Description
As part of calibrating the APIC tick rate, we use the Intel 8254 Programmable Interval Timer (PIT). This ancient hardware still appears, often emulated, in most physical and virtual x86 systems.
In the past, Google have fixed bug 130531009 in the PIT emulation they provide in the Google Compute Engine (GCE) hypervisor. After that bug fix, the situation is much improved, but an apparent quirk in the emulation still remains.
In apic_calibrate_impl(), we disable interrupts and loop waiting for the PIT counter to hit a particular point in its periodic cycle. On a current GCE instance, we see the PIT value decrease on the first few reads, and then come to rest at 0. Subsequent reads in the first loop get stuck on this zero value. Looking at the implementation of the function, we're pretty much assuming that anybody who has used the PIT before us has returned it to an appropriate state for this operation. With other PIT implementations, our efforts to do this appear to be good enough, but in GCE it seems the PIT is stuck after counting down once, for whatever reason.
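The failure mode described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: sim_pit_t, sim_pit_read() and sim_wait_for_window() are hypothetical names invented for illustration, the counter is advanced once per read rather than by a real clock, and the wait loop is bounded so that the stuck case returns instead of hanging. It is not the code in apic_calibrate_impl().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a PIT counter; not the real hardware interface. */
typedef struct {
	uint16_t count;		/* current counter value */
	uint16_t reload;	/* value reloaded after reaching 0 (periodic) */
	bool periodic;		/* false models the "stuck at 0" behaviour */
} sim_pit_t;

/* Return the current value, then advance the model by one step. */
static uint16_t
sim_pit_read(sim_pit_t *pit)
{
	uint16_t v = pit->count;

	if (pit->count > 0)
		pit->count--;
	else if (pit->periodic)
		pit->count = pit->reload;
	/* Otherwise the counter stays at 0 forever. */
	return (v);
}

/*
 * Poll until the counter value falls within [lo, hi], giving up after
 * max_reads polls.  Returns true if the window was reached.
 */
static bool
sim_wait_for_window(sim_pit_t *pit, uint16_t lo, uint16_t hi,
    unsigned max_reads)
{
	assert(lo <= hi);
	for (unsigned i = 0; i < max_reads; i++) {
		uint16_t v = sim_pit_read(pit);
		if (v >= lo && v <= hi)
			return (true);
	}
	return (false);
}
```

With a non-periodic model that starts a few counts above zero, the wait never succeeds for any window that excludes 0; with the periodic model the counter reloads and eventually passes through the window.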
Using kmdb, I was able to demonstrate (through the use of ::in and ::out) that resetting the PIT to a periodic mode (i.e., mode 2 or 3) at the top of the function and loading a counter value was enough to "unstick" it. Rick McNeal from Nexenta reports that adding a call to microfind() at the top of the function also unsticks the PIT, which makes sense; the last thing microfind() does is reprogram the PIT into mode 3 (square wave) and load a value, in an attempt to leave things as we expect we might have found them.
Adding the PIT program and load sequence at the top of apic_calibrate_impl() has the advantage of being documented PIT behaviour, and behaviour that works in the other places we use the PIT today. This should dramatically reduce the risk of making such a change.
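For reference, a program and load sequence of this kind might look roughly like the sketch below. The port numbers (0x43 for the control register, 0x40 for counter 0) and the control word layout are standard 8254 behaviour. Everything with an sk_ or SK_ prefix is a hypothetical name for this illustration, and sk_outb() merely records writes so the sequence can be inspected in user space; in the kernel this would be a real port write. This is not the code from the actual change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard 8254 I/O ports and control-word fields (names are made up). */
#define	SK_PIT_CTL_PORT		0x43	/* mode/command register */
#define	SK_PIT_CTR0_PORT	0x40	/* counter 0 data port */
#define	SK_PIT_SEL_CTR0		0x00	/* bits 7-6: select counter 0 */
#define	SK_PIT_RW_LSB_MSB	0x30	/* bits 5-4: low byte, then high byte */
#define	SK_PIT_BINARY		0x00	/* bit 0: 16-bit binary counting */

/* Stand-in for port I/O: log each write instead of touching hardware. */
typedef struct {
	uint16_t port;
	uint8_t val;
} sk_io_t;

static sk_io_t sk_log[8];
static size_t sk_nlog;

static void
sk_outb(uint16_t port, uint8_t val)
{
	if (sk_nlog < sizeof (sk_log) / sizeof (sk_log[0]))
		sk_log[sk_nlog++] = (sk_io_t){ port, val };
}

/* Build the control word for counter 0 in the given mode (0-5). */
static uint8_t
sk_pit_ctl_word(unsigned mode)
{
	assert(mode <= 5);
	return (SK_PIT_SEL_CTR0 | SK_PIT_RW_LSB_MSB |
	    (uint8_t)(mode << 1) | SK_PIT_BINARY);	/* bits 3-1: mode */
}

/* Program counter 0: write the mode, then the count, low byte first. */
static void
sk_pit_init(unsigned mode, uint16_t count)
{
	sk_outb(SK_PIT_CTL_PORT, sk_pit_ctl_word(mode));
	sk_outb(SK_PIT_CTR0_PORT, (uint8_t)(count & 0xff));
	sk_outb(SK_PIT_CTR0_PORT, (uint8_t)(count >> 8));
}
```

Calling sk_pit_init(2, 0xFFFF) logs three writes: the control word 0x34 to port 0x43, followed by 0xFF twice to port 0x40. Mode 3 would produce the control word 0x36 instead.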
Updated by Joshua M. Clulow over 2 years ago
Code review: https://code.illumos.org/c/illumos-gate/+/164
Updated by Joshua M. Clulow over 2 years ago
Note that Mode 3: Square Wave Mode is, as it happens, a poor choice here. This mode starts at the initial count value and counts down in steps of two for each clock pulse. This results in a value of apic_ticks_per_SFnsecs that is exactly half what we used to get before the change.
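The halving follows from simple arithmetic: if the calibration measures how many APIC ticks elapse while the PIT counter drops by a fixed amount, a counter that decrements by two per clock pulse covers that drop in half as many pulses. The sketch below works through this; the function name, the fixed ratio of APIC ticks per PIT pulse, and the example numbers are all assumptions made for illustration and are not taken from the calibration code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * APIC ticks that elapse while the PIT counter decreases by pit_delta,
 * given that the counter drops by pit_step on every PIT clock pulse and
 * that apic_per_pulse APIC ticks occur per PIT clock pulse.
 */
static uint64_t
sk_apic_ticks_in_window(uint32_t pit_delta, uint32_t pit_step,
    uint32_t apic_per_pulse)
{
	assert(pit_step > 0);
	uint64_t pulses = pit_delta / pit_step;	/* real PIT clocks elapsed */
	return (pulses * apic_per_pulse);
}
```

For a counter delta of 4096 and 25 APIC ticks per pulse, a step of one (mode 2) gives 102400 ticks, while a step of two (mode 3) gives 51200, exactly half.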
Updated by Joshua M. Clulow over 2 years ago
Testing Notes
I have built and tested this change under OpenIndiana in three places. In the two non-GCE cases I confirmed that the result of the calibration, as stored in apic_ticks_per_SFnsecs, was the same before and after the change.
- physical machine, Intel NUC6i7KYB, w/ Intel(r) Core(tm) i7-6770HQ CPU @ 2.60GHz
  apic_ticks_per_SFnsecs = 12582
- virtual machine, running under SmartOS KVM/QEMU
  apic_ticks_per_SFnsecs = 524302
- virtual machine, running under Google Compute Engine (GCE), where without this change we cannot boot
Updated by Electric Monk over 2 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit 11ed32a0b3b424ec966d0330d0efaf049baaf8d2
commit 11ed32a0b3b424ec966d0330d0efaf049baaf8d2
Author: Joshua M. Clulow <josh@sysmgr.org>
Date: 2019-11-15T23:41:23.000Z

11968 APIC calibration should explicitly initialise the PIT
Reviewed by: John Levon <john.levon@joyent.com>
Reviewed by: Paul Winder <paul@winders.demon.co.uk>
Reviewed by: Garrett D'Amore <garrett@damore.org>
Approved by: Robert Mustacchi <rm@fingolfin.org>