Bug #7189

network failover doesn't always set LSO property

Added by Arne Jansen almost 4 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Start date: 2016-07-18
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:

Description

After a network failover (mpath based) from igb to e1000g (82574), we observed malfunctioning (very slow) TCP connections. TCP was still sending large segments down the stack, expecting the hardware to split them up, as igb does. Unfortunately, e1000g doesn't support that on our hardware, so those segments were simply discarded. Only the retransmits were sent in packets <= MTU, so the data did get through eventually.
The TCP connection still had the LSO (large segment offload) flag set, which should normally be adjusted on network failover. The adjustment normally happens in conn_ip_output, in the block

    nce = ixa->ixa_nce;
    if (nce->nce_is_condemned) {
        error = ip_verify_nce(mp, ixa);
    [...]

On failover, the nce was correctly condemned, but the preceding block in conn_ip_output

    if (ire->ire_generation != ixa->ixa_ire_generation) {
        error = ip_verify_ire(mp, ixa);
        if (error != 0) {
    [...]

had already been triggered and had switched the connection to a fresh nce, so the nce_is_condemned check did not fire afterwards.

It might be sufficient to just swap those two blocks, but it would be good if someone with deeper knowledge of that part of the stack could have a look.
Also, there are several more places where ixa->ixa_nce gets switched. It may be necessary to update the connection properties there, too.
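
To make the ordering problem easier to reason about, here is a small stand-alone toy model in C. It is only a sketch and not the illumos code: every toy_* name is hypothetical, and it assumes that the LSO property is refreshed only on the condemned-nce path while the ire generation path merely swaps in a fresh nce, as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the real structures; all names are hypothetical. */
struct toy_nce {
	bool is_condemned;	/* set when the old interface went away */
	bool lso_capable;	/* can the NIC behind this nce do LSO? */
};

struct toy_ire {
	unsigned generation;
};

struct toy_ixa {
	struct toy_nce *nce;
	unsigned ire_generation;
	bool lso_enabled;	/* connection property that TCP consults */
};

/* Models the ire generation path: picks a fresh nce, leaves LSO untouched. */
static void
toy_verify_ire(struct toy_ixa *ixa, const struct toy_ire *ire,
    struct toy_nce *fresh)
{
	ixa->ire_generation = ire->generation;
	ixa->nce = fresh;
}

/* Models the condemned-nce path: picks a fresh nce AND refreshes LSO. */
static void
toy_verify_nce(struct toy_ixa *ixa, struct toy_nce *fresh)
{
	ixa->nce = fresh;
	ixa->lso_enabled = fresh->lso_capable;
}

/* Order described in this report: ire check first, then nce check. */
static void
toy_output_ire_first(struct toy_ixa *ixa, const struct toy_ire *ire,
    struct toy_nce *fresh)
{
	assert(ixa != NULL && ixa->nce != NULL);
	if (ire->generation != ixa->ire_generation)
		toy_verify_ire(ixa, ire, fresh);
	if (ixa->nce->is_condemned)	/* fresh nce: never true here */
		toy_verify_nce(ixa, fresh);
}

/* Swapped order: nce check first, then ire check. */
static void
toy_output_nce_first(struct toy_ixa *ixa, const struct toy_ire *ire,
    struct toy_nce *fresh)
{
	assert(ixa != NULL && ixa->nce != NULL);
	if (ixa->nce->is_condemned)
		toy_verify_nce(ixa, fresh);
	if (ire->generation != ixa->ire_generation)
		toy_verify_ire(ixa, ire, fresh);
}

/*
 * Simulates one failover from an LSO-capable NIC to one without LSO and
 * returns the connection's LSO flag after the first output call.
 */
static bool
toy_lso_after_failover(bool nce_first)
{
	struct toy_nce old_nce = { .is_condemned = true, .lso_capable = true };
	struct toy_nce new_nce = { .is_condemned = false, .lso_capable = false };
	struct toy_ire ire = { .generation = 2 };
	struct toy_ixa ixa = {
		.nce = &old_nce, .ire_generation = 1, .lso_enabled = true
	};

	if (nce_first)
		toy_output_nce_first(&ixa, &ire, &new_nce);
	else
		toy_output_ire_first(&ixa, &ire, &new_nce);
	return (ixa.lso_enabled);
}
```

In this model, toy_lso_after_failover(false) leaves the LSO flag set (the stale state we observed), while toy_lso_after_failover(true) clears it. That only shows the swap would help under the stated assumptions; it says nothing about the other places where ixa->ixa_nce gets switched.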

As a workaround we disabled LSO on igb.