Bug #7331

i40e spends too much time in interrupt context when optics are removed

Added by Hans Rosenfeld almost 4 years ago. Updated almost 4 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Start date: 2016-08-26
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:

Description

I'm seeing the following problem on a system with a 4-port i40e:

When the fibers are pulled (including the optics, as I have been told), each i40e instance keeps one CPU 100% busy in the adminq interrupt, doing one link check after another. It seems that we stay in i40e_intr_adminq_work() for a very long time because more requests come in while we're still busy processing one. All of them are for link check, which is done while holding the general lock. We noticed that a system suffering from this interrupt storm will hang at boot or at shutdown.

Analyzing this with DTrace showed that every time we take a request from the queue, between 4 and 16 more requests remain. The total runtime of a single call to i40e_intr_adminq_work() can add up to several hundred milliseconds.
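
To make the arithmetic behind that observation concrete, here is a toy model in C of a handler that drains a queue until it is empty while new requests keep arriving. It is not driver code: drain_until_empty(), the drain_stats_t type, and all parameter values are assumptions made up for illustration, and the per-event cost is not a measured number.

```c
#include <assert.h>

/*
 * Toy model (not driver code): drain an event queue the way a
 * "process until empty" interrupt handler would. Every handled event
 * costs service_us microseconds, and arrivals_per_event new events are
 * queued while it is being handled, up to total_events overall.
 */
typedef struct drain_stats {
	unsigned handled;	/* events processed in this one invocation */
	unsigned max_pending;	/* most events left right after a dequeue */
	unsigned long busy_us;	/* modelled time spent inside the handler */
} drain_stats_t;

static drain_stats_t
drain_until_empty(unsigned initial, unsigned total_events,
    unsigned arrivals_per_event, unsigned service_us)
{
	drain_stats_t st = { 0, 0, 0 };
	unsigned pending = initial;
	unsigned arrived = initial;

	assert(initial <= total_events);

	while (pending > 0) {
		pending--;			/* take one request */
		if (pending > st.max_pending)
			st.max_pending = pending;
		st.handled++;
		st.busy_us += service_us;	/* e.g. one link check */

		/* New requests show up while we were busy with this one. */
		for (unsigned i = 0; i < arrivals_per_event &&
		    arrived < total_events; i++) {
			pending++;
			arrived++;
		}
	}
	return (st);
}
```

With the assumed values drain_until_empty(5, 200, 1, 1500), the model never leaves the loop until all 200 events are handled, sees 4 requests remaining after each dequeue while arrivals continue, and accumulates 300000 us (300 ms) in a single invocation. The point is only that a drain-until-empty loop has no upper bound on its runtime as long as events arrive at least as fast as they are serviced.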

I have tried patching the driver to skip the link check, and with that change the system appears to behave normally. I have also tried just removing the mutex_enter()/mutex_exit() of the general lock, but that didn't improve the situation.
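
For comparison, a middle ground between checking the link for every request and skipping the check entirely would be to coalesce: note that a link event was seen and do a single check once the queue has been drained. The sketch below only counts how many expensive checks each strategy would perform. It is a hypothetical illustration, not something that was tried on this system, and the toy_* names and event types are invented rather than taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event types; the real adminq opcodes are not modelled. */
typedef enum toy_event {
	TOY_EV_LINK_STATUS,
	TOY_EV_OTHER
} toy_event_t;

typedef struct toy_result {
	unsigned link_checks;	/* expensive link checks performed */
	unsigned other_handled;	/* non-link events handled */
} toy_result_t;

/* One link check per link event, as in the behaviour reported above. */
static toy_result_t
toy_drain_per_event(const toy_event_t *evs, size_t n)
{
	toy_result_t r = { 0, 0 };

	assert(evs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (evs[i] == TOY_EV_LINK_STATUS)
			r.link_checks++;
		else
			r.other_handled++;
	}
	return (r);
}

/* Remember that a link event was seen; check the link once at the end. */
static toy_result_t
toy_drain_coalesced(const toy_event_t *evs, size_t n)
{
	toy_result_t r = { 0, 0 };
	bool link_dirty = false;

	assert(evs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (evs[i] == TOY_EV_LINK_STATUS)
			link_dirty = true;
		else
			r.other_handled++;
	}
	if (link_dirty)
		r.link_checks = 1;
	return (r);
}
```

For a batch of four link events and one other event, the per-event variant performs four link checks and the coalesced variant performs one. Whether the hardware or firmware keeps generating events regardless of how they are handled is a separate question that this sketch does not address.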

History

#1

Updated by Hans Rosenfeld almost 4 years ago

A few more properties of the device:

        name='api-version' type=string items=1 dev=none
            value='1.4'
        name='firmware-build' type=string items=1 dev=none
            value='892b'
        name='firmware-version' type=string items=1 dev=none
            value='4.40'
        name='printed-board-assembly' type=string items=1 dev=none
            value=''

#2

Updated by Robert Mustacchi almost 4 years ago

When this happens is the interface sending traffic, plumbed up with IP addresses, or just sitting there doing nothing?

#3

Updated by Hans Rosenfeld almost 4 years ago

All four interfaces are plumbed, but they are down and have no IP addresses configured.
