Bug #14740 (open): cyclics should panic when juggling for too long

Added by Patrick Mooney 23 days ago.

Status: In Progress
Priority: Normal
Category: kernel
Start date:
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

Recently, while developing a change to bhyve, we encountered a situation where a misbehaving cyclic (one that was repeatedly rescheduling itself in the past) caused a system to become unresponsive. When cyclic_juggle_one_to attempted to remove the misbehaving cyclic so it could be moved to a different CPU, we fell into its exponentially increasing delay loop:

        /*
         * Remove the cyclic from the source.  As mentioned above, we cannot
         * block during this operation; if we cannot remove the cyclic
         * without waiting, we spin for a time shorter than the interval, and
         * reattempt the (non-blocking) removal.  If we continue to fail,
         * we will exponentially back off (up to half of the interval).
         * Note that the removal will ultimately succeed -- even if the
         * cyclic handler is blocked on a resource held by a thread which we
         * have preempted, priority inheritance assures that the preempted
         * thread will preempt us and continue to progress.
         */
        for (delay = NANOSEC / MICROSEC; ; delay <<= 1) {
                /*
                 * Before we begin this operation, disable kernel preemption.
                 */
                kpreempt_disable();
                if (cyclic_remove_here(src, ndx, &when, CY_NOWAIT))
                        break;

                /*
                 * The operation failed; enable kernel preemption while
                 * spinning.
                 */
                kpreempt_enable();

                CYC_PTRACE("remove-retry", idp, src);

                if (delay > (cyclic->cy_interval >> 1))
                        delay = cyclic->cy_interval >> 1;

                /*
                 * Drop the RW lock to avoid a deadlock with the cyclic
                 * handler (because it can potentially call cyclic_reprogram()).
                 */
                rw_exit(&idp->cyi_lock);
                drv_usecwait((clock_t)(delay / (NANOSEC / MICROSEC)));
                rw_enter(&idp->cyi_lock, RW_WRITER);
        }

While this loop drops and reacquires cyi_lock, cpu_lock is held for the duration. Since the loop never succeeded, contrary to the included comment, all other operations requiring cpu_lock were wedged. Although defending against the underlying problem (the programmer error of rescheduling the offending cyclic in the past) is not feasible, we could make it easier to debug the system when it happens. Since the system is not going to make much forward progress while cpu_lock is held for such an extended period (forever, in our case), panicking would provide all the clues necessary in the resulting dump to diagnose the issue.

We propose a threshold of ~30 seconds for the delay, after which we initiate the panic. That should be long enough to steer clear of any legitimate system behavior that merely falls prey to load or scheduling issues, without being so long a wait that a developer, unable to NMI the unresponsive machine, is tempted to physically reboot it.
