Bug #7331

open

i40e spends too much time in interrupt context when optics are removed

Added by Hans Rosenfeld almost 6 years ago. Updated almost 6 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Start date: 2016-08-26
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:

Description

I'm seeing the following problem on a system with a 4-port i40e:

When the fibers are pulled (including the optics, as I have been told), each i40e instance keeps one CPU 100% busy in the adminq interrupt, doing one link check after another. It seems that we stay in i40e_intr_adminq_work() for a very long time because more requests come in while we are still busy processing one. All of them are link check requests, and each link check is done while holding the general lock. We noticed that a system suffering from this interrupt storm will hang at boot or at shutdown.

Analyzing this with DTrace showed that every time we take a request from the queue, between 4 and 16 more requests are still pending. The total runtime of i40e_intr_adminq_work() can add up to several hundred milliseconds.
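To illustrate why a drain-until-empty loop can stay in the handler for so long, here is a minimal user-space model in C. It is only a sketch and not i40e driver code: the type adminq_model_t and the functions model_handle_one() and model_drain_all() are hypothetical names made up for this illustration, and the arrival pattern (a fixed number of new requests posted while each request is handled) is an assumption that simplifies the behaviour described above.

```c
#include <stdint.h>

/*
 * Hypothetical model of an admin queue, for illustration only.
 * None of these names exist in the i40e driver.
 */
typedef struct adminq_model {
	uint32_t pending;		/* requests currently waiting */
	uint32_t arrivals_left;		/* requests still to be posted */
	uint32_t arrive_per_evt;	/* posted while one request is handled */
} adminq_model_t;

/*
 * Handle a single request. While we are busy with it, up to
 * arrive_per_evt new requests are posted to the queue (assumption).
 */
static void
model_handle_one(adminq_model_t *q)
{
	uint32_t n = q->arrive_per_evt;

	q->pending--;
	if (n > q->arrivals_left)
		n = q->arrivals_left;
	q->arrivals_left -= n;
	q->pending += n;
}

/*
 * Drain the queue until it is empty, as one handler invocation would.
 * Returns how many requests were handled before the loop could exit.
 */
uint32_t
model_drain_all(adminq_model_t *q)
{
	uint32_t handled = 0;

	while (q->pending > 0) {
		model_handle_one(q);
		handled++;
	}
	return (handled);
}
```

In this model a single invocation that starts with one pending request does not return until every later arrival has been handled as well, so its runtime is the per-request cost multiplied by the total number of requests, not by the one request that triggered it.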

I have tried patching the driver to skip the link check, and with that change the system apparently behaves normally. I have also tried just removing the mutex_enter/mutex_exit of the general lock, but that didn't improve the situation.
