deadman fires spuriously when running on VMware
When the system hibernates and resumes, the counter that it uses to
measure time is reset to nearly zero. To compensate, the clock
subsystem adds the counter's value to the current time whenever the
counter goes backwards by more than a second or two.
Unfortunately, when running on VMware, the hypervisor sometimes sends
the counter backwards by more than that in the course of normal
operation. We then add a value almost as large as the current uptime to
the clock, so the uptime of the system suddenly doubles and the clock
ends up off by days or weeks.
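
The failure mode described above can be illustrated with a minimal sketch. This is not the actual kernel code: the type `naive_clock_t`, the function `naive_clock_update`, the constant `BACKWARD_SLACK`, and the choice to express the counter directly in nanoseconds are all assumptions made for illustration.

```c
#include <stdint.h>

#define NSEC_PER_SEC    1000000000ULL
/* Hypothetical tolerance for small backward steps ("a second or two"). */
#define BACKWARD_SLACK  (2 * NSEC_PER_SEC)

/* Hypothetical clock state; the counter is assumed to tick in nanoseconds. */
typedef struct {
	uint64_t now;   /* current time, in nanoseconds */
	uint64_t last;  /* counter value at the last accepted reading */
} naive_clock_t;

/* Sketch of the old behaviour: any large backward jump is taken as a resume. */
static uint64_t
naive_clock_update(naive_clock_t *c, uint64_t counter)
{
	if (counter >= c->last) {
		/* Normal case: advance by the elapsed counter delta. */
		c->now += counter - c->last;
		c->last = counter;
	} else if (c->last - counter > BACKWARD_SLACK) {
		/* Assume the counter restarted near zero: add its whole value. */
		c->now += counter;
		c->last = counter;
	}
	/* Small backward steps are ignored. */
	return (c->now);
}
```

With the clock at 1001 seconds, a glitched reading of 900 seconds is more than `BACKWARD_SLACK` behind the last reading, so this sketch adds the full 900 seconds and reports 1901 seconds, nearly double the real uptime.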
This can cause a variety of problems; one of them is that it may cause
the deadman subsystem to trigger, thinking that the system has been
unresponsive for a long time.
The fix is to change the way we handle sudden backwards jumps in time:
if the counter jumps backwards a lot, but its new value is still larger
than some small value (a second or two), we should not add it to the
current time. Instead, we assume that the jump is probably a result of
VMware's glitch, and we don't add to the time until we start getting
reliable readings again.
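
The rule above might be sketched as follows. This is an illustration under stated assumptions, not the committed change: `guarded_clock_t`, `guarded_clock_update`, `BACKWARD_SLACK`, and `RESUME_CAP` are hypothetical names, the thresholds are placeholder values, and the counter is assumed to tick in nanoseconds.

```c
#include <stdint.h>

#define NSEC_PER_SEC    1000000000ULL
/* Hypothetical tolerance for small backward steps. */
#define BACKWARD_SLACK  (2 * NSEC_PER_SEC)
/* Hypothetical cap: a counter above this did not just restart near zero. */
#define RESUME_CAP      (1 * NSEC_PER_SEC)

/* Hypothetical clock state; the counter is assumed to tick in nanoseconds. */
typedef struct {
	uint64_t now;   /* current time, in nanoseconds */
	uint64_t last;  /* counter value at the last accepted reading */
} guarded_clock_t;

static uint64_t
guarded_clock_update(guarded_clock_t *c, uint64_t counter)
{
	if (counter >= c->last) {
		/* Reliable reading: advance by the elapsed counter delta. */
		c->now += counter - c->last;
		c->last = counter;
	} else if (c->last - counter > BACKWARD_SLACK &&
	    counter <= RESUME_CAP) {
		/* Large jump to a near-zero value: plausible resume. */
		c->now += counter;
		c->last = counter;
	}
	/*
	 * Otherwise the reading is either a small backward step or a large
	 * jump to a still-large value (the presumed glitch). Add nothing and
	 * keep the last accepted reading until the counter catches up.
	 */
	return (c->now);
}
```

In this sketch a glitched reading of 900 seconds against a clock at 1000 seconds leaves the time unchanged, while a reading of half a second after a real resume is still added in full. Keeping `last` untouched during the glitch means the next reading at or beyond it advances the clock by only the true elapsed delta.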
Updated by Electric Monk about 6 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
commit e014e7f89c5273294b22953615734b04c11b1b4f
Author: Paul Dagnelie <firstname.lastname@example.org>
Date: 2016-02-22T17:54:32.000Z
6641 deadman fires spuriously when running on VMware
Reviewed by: Matthew Ahrens <email@example.com>
Reviewed by: Dan Kimmel <firstname.lastname@example.org>
Reviewed by: Josef 'Jeff' Sipek <email@example.com>
Reviewed by: Igor Kozhukhov <firstname.lastname@example.org>
Approved by: Dan McDonald <email@example.com>