
Feature #7865

Allow i40e to use multiple rings

Added by Paul Winder almost 4 years ago. Updated almost 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
-
Start date:
2017-02-14
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage
Gerrit CR:

Description

Currently, i40e has a single ring (and interrupt vector) for processing transmit and receive data and this seems to limit the potential throughput. The best I could achieve was around 30Gb/s with large MTU sizes.

The plan is to make the implementation as simple as possible, i.e. to keep a single group with multiple rings assigned to a single PF. Multiple VF's (and groups) could be a separate project.


Related issues

Related to illumos gate - Bug #8318: i40e polling panics on debug after 7865 (Closed, Robert Mustacchi, 2017-06-04)

#1

Updated by Paul Winder almost 4 years ago

One of the motivations for this work was a large variability in throughput on the i40e when placed under load: using iperf with multiple connections would invariably lead to at least one of them dropping almost to zero. Initially I still had this problem, but I managed to track it down. I believe it was an interaction between the ring polling code disabling interrupts while the interrupt routine was still active. Serialising the enabling and disabling of interrupts in i40e_rx_ring_intr_[en|dis]able with i40e_ring_rx() stopped the throughput drops and removed a lot of the variability.

In tests using iperf with an MTU of 9000 and a single connection, I got the following throughput figures:

iperf -c 172.16.101.2  -t 30 -w 8M -P 1
------------------------------------------------------------
Client connecting to 172.16.101.2, TCP port 5001
TCP window size: 8.00 MByte
------------------------------------------------------------
[  3] local 172.16.101.1 port 43639 connected with 172.16.101.2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-30.0 sec  77.2 GBytes  22.1 Gbits/sec

With two connections:
iperf -c 172.16.101.2  -t 30 -w 8M -P 2
------------------------------------------------------------
Client connecting to 172.16.101.2, TCP port 5001
TCP window size: 8.00 MByte
------------------------------------------------------------
[  4] local 172.16.101.1 port 60196 connected with 172.16.101.2 port 5001
[  3] local 172.16.101.1 port 52806 connected with 172.16.101.2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-30.0 sec  69.4 GBytes  19.9 Gbits/sec
[  3]  0.0-30.0 sec  68.5 GBytes  19.6 Gbits/sec
[SUM]  0.0-30.0 sec   138 GBytes  39.5 Gbits/sec

Using an MTU of 1500, the maximum throughput I could get was 24.5Gb/s, which translates to more than 2,000,000 packets/sec.

#2

Updated by Robert Mustacchi almost 4 years ago

Hmm, if MAC was allowing the ring processing thread to run at the same time as the interrupt thread, that would explain a good deal about the cyclical performance we're seeing, as we'd likely end up delivering duplicate packets to TCP, which would at times kill the window, or worse.

#3

Updated by Paul Winder almost 4 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 90
#4

Updated by Paul Winder almost 4 years ago

  • % Done changed from 90 to 100
#5

Updated by Electric Monk almost 4 years ago

  • Status changed from In Progress to Closed

git commit 396505af9432aab52f4853cfde77ca834a9cce76

commit  396505af9432aab52f4853cfde77ca834a9cce76
Author: Paul Winder <paul.winder@tegile.com>
Date:   2017-02-23T14:46:40.000Z

    7865 Allow i40e to use multiple rings
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Dale Ghent <daleg@omniti.com>
    Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>

#6

Updated by Yuri Pankov over 3 years ago

  • Related to Bug #8318: i40e polling panics on debug after 7865 added
