System time changes combined with spurious wakeups cause DP_POLL to return prematurely
The dvpoll argument of the DP_POLL ioctl includes a relative timeout in
milliseconds. According to the poll(7d) man page, the ioctl should block until
events are ready for any of the file descriptors, the timeout expires, or
a signal is received. In cases where the system time changes, however, DP_POLL
may return prematurely (before the timeout has expired and with no events ready).
When DP_POLL returns prematurely, it appears to applications as though the timeout
has expired. This has caused a number of problems for Java applications using
java.nio.channels.Selector, which is implemented via DP_POLL and /dev/poll.
The problem is a result of dpioctl using cv_waituntil_sig to wait for events
to be ready or for a signal to be delivered. cv_waituntil_sig takes an
absolute timeout in the future and blocks until a signal is received, the
timeout expires, or it is woken up (that is, events are ready). Because
cv_waituntil_sig takes an absolute timeout, it has additional logic to handle
cases where the system time changes. If the time has changed, it
returns -1, which callers interpret as the timeout expiring.
dpioctl interprets cv_waituntil_sig returning -1 as the specified timeout
expiring; it doesn't differentiate between the time changing and the actual
timeout expiring. This in turn causes the premature return from DP_POLL to the caller.
In the common case we don't see this problem, because the system time changing
doesn't cause threads blocked in cv_waituntil_sig to wake up. To trigger it,
something needs to wake up the blocked thread after the time has been changed.
We typically see this happen as a result of the process forking from a
different thread, which causes all of the threads blocked in cv_waituntil_sig to
wake up. The dpioctl logic correctly detects this as a spurious wakeup (that is, no
events were ready) and calls back into cv_waituntil_sig, but because the
timechanged variable has been incremented, cv_waituntil_sig immediately returns -1.
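The sequence above can be sketched as a small user-space simulation. This is
not the kernel code: model_waituntil, model_dp_poll_buggy, struct scenario,
mono_ms and timechanged_gen are all hypothetical names made up for
illustration, and the model ignores the size of the clock step and tracks only
the change counter. It is meant to show why a clock change followed by a
spurious wakeup ends the wait early, while either one alone does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; these are not the kernel's symbols. */
static long mono_ms;         /* simulated elapsed time since the poll began */
static int  timechanged_gen; /* stands in for the timechanged counter */

/* What happens to the thread while it is blocked (all values are assumptions). */
struct scenario {
    long spurious_wakeup_at_ms;   /* e.g. a fork in another thread; -1 for none */
    bool clock_set_while_blocked; /* system time is set during the first sleep */
};

/*
 * Model of an absolute-deadline wait. Returns -1 for "timed out" and 1 for
 * "woken up". It also returns -1 at once if the clock was set after the
 * caller took its timecheck snapshot.
 */
static int model_waituntil(long deadline_ms, int timecheck, struct scenario *s)
{
    if (timecheck != timechanged_gen)
        return -1;                       /* time changed: reported as -1 */
    if (s->clock_set_while_blocked) {
        timechanged_gen++;               /* bumps the counter, wakes nobody */
        s->clock_set_while_blocked = false;
    }
    if (s->spurious_wakeup_at_ms >= 0 && s->spurious_wakeup_at_ms < deadline_ms) {
        mono_ms = s->spurious_wakeup_at_ms;
        s->spurious_wakeup_at_ms = -1;   /* only one wakeup is scripted */
        return 1;                        /* woken with no events ready */
    }
    mono_ms = deadline_ms;               /* slept until the deadline */
    return -1;
}

/*
 * Model of the problematic loop: every -1 is treated as timeout expiry.
 * Returns the elapsed time at which "timeout expired" is reported.
 */
static long model_dp_poll_buggy(long timeout_ms, struct scenario s)
{
    assert(timeout_ms >= 0);
    mono_ms = 0;                         /* each call starts a fresh simulation */
    int timecheck = timechanged_gen;     /* snapshot taken once, never refreshed */
    for (;;) {
        if (model_waituntil(timeout_ms, timecheck, &s) == -1)
            return mono_ms;
        /* Spurious wakeup: no events are ready, so wait again. */
    }
}
```

In this model, a 1000 ms poll with a clock change and a spurious wakeup at
200 ms reports a timeout after only 200 ms. With the wakeup alone, or the
clock change alone, it reports the timeout at the full 1000 ms.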
I've written a simple test program in C that triggers the behavior; I'll
attach it to the bug. I've tested a fix for the issue that changes dpioctl to
use a wrapper around cv_relwait_sig rather than cv_waituntil_sig.
cv_relwait_sig expects a relative timeout rather than an absolute timeout, so
we avoid the problems associated with the time changing.
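To illustrate why a relative wait sidesteps the problem, here is a user-space
simulation of one plausible shape for such a loop. It is a sketch under
assumptions, not the actual fix: model_relwait, model_dp_poll_fixed,
struct scenario, mono_ms and timechanged_gen are hypothetical names, and
recomputing the remaining time after each spurious wakeup is my assumption
about how a wrapper might preserve the caller's total timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; these are not the kernel's symbols. */
static long mono_ms;         /* simulated elapsed time since the poll began */
static int  timechanged_gen; /* still bumped by clock changes, but never read */

/* What happens to the thread while it is blocked (all values are assumptions). */
struct scenario {
    long spurious_wakeup_at_ms;   /* e.g. a fork in another thread; -1 for none */
    bool clock_set_while_blocked; /* system time is set during the first sleep */
};

/*
 * Model of a relative wait: sleeps for at most rel_ms and never consults the
 * clock-change counter. Returns -1 when the full interval elapsed and 1 when
 * woken early.
 */
static int model_relwait(long rel_ms, struct scenario *s)
{
    long wake_at = mono_ms + rel_ms;
    if (s->clock_set_while_blocked) {
        timechanged_gen++;               /* has no effect on this wait */
        s->clock_set_while_blocked = false;
    }
    if (s->spurious_wakeup_at_ms >= mono_ms && s->spurious_wakeup_at_ms < wake_at) {
        mono_ms = s->spurious_wakeup_at_ms;
        s->spurious_wakeup_at_ms = -1;   /* only one wakeup is scripted */
        return 1;                        /* woken with no events ready */
    }
    mono_ms = wake_at;                   /* slept for the whole interval */
    return -1;
}

/*
 * Model of a loop built on the relative wait. After a spurious wakeup it
 * waits only for the time that is still left. Returns the elapsed time at
 * which "timeout expired" is reported.
 */
static long model_dp_poll_fixed(long timeout_ms, struct scenario s)
{
    assert(timeout_ms >= 0);
    mono_ms = 0;                         /* each call starts a fresh simulation */
    long deadline_ms = mono_ms + timeout_ms;
    for (;;) {
        long remaining_ms = deadline_ms - mono_ms;
        if (remaining_ms <= 0 || model_relwait(remaining_ms, &s) == -1)
            return mono_ms;
        /* Spurious wakeup: no events are ready, so wait for the remainder. */
    }
}
```

With the same inputs that end the wait early when the clock-change check is in
play (a clock change plus a spurious wakeup at 200 ms of a 1000 ms timeout),
this loop reports the timeout only at 1000 ms.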
Updated by Eric Schrock almost 9 years ago
- Status changed from New to Resolved
user: Matt Amdur <Matt.Amdur@delphix.com>
date: Thu Oct 20 07:54:20 2011 -0700
1605 System time changes combined with spurious wakeups cause DP_POLL to return prematurely
Reviewed by: Adam Leventhal <email@example.com>
Reviewed by: George Wilson <firstname.lastname@example.org>
Reviewed by: Richard Lowe <email@example.com>
Reviewed by: Robert Mustacchi <firstname.lastname@example.org>
Approved by: Eric Schrock <Eric.Schrock@delphix.com>