Bug #12441

mlxcx default queue sizes are a bit on the small size

Added by Paul Winder 9 months ago. Updated 8 months ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
driver - device drivers
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

When stress testing (on 25G NICs), I noticed the driver was consistently blocking because of full send and receive queues.

With the larger sizes shown below, the rings can still fill up, but this appears to have minimal effect on throughput.

#
# Sizing of event and completion queues.
#
# The number of entries on each queue will be (1 << *_size_shift) -- so
# a value of 10 would mean 1024 entries.
#
#eq_size_shift = 10;
#cq_size_shift = 12;

#
# Sizing of send and receive queues.
#
# Note that this determines the size of the RX and TX rings that mlxcx will
# advertise to MAC. It also determines how many packet buffers we will allocate
# when starting the interface.
#
#sq_size_shift = 13;
#rq_size_shift = 12;
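To make the shift arithmetic concrete, here is a minimal sketch of how a `*_size_shift` value maps to an entry count. The helper name `queue_entries` is hypothetical and is not taken from the mlxcx source; only the `1 << *_size_shift` relationship comes from the comment above.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of mlxcx): number of entries on a
 * queue configured with the given *_size_shift value.
 */
static inline uint32_t
queue_entries(unsigned int size_shift)
{
	/* Guard against shifting past the width of the type. */
	if (size_shift >= 32)
		return (0);
	return ((uint32_t)1 << size_shift);
}
```

With the values shown above, `queue_entries(13)` gives 8192 send queue entries, and `queue_entries(12)` gives 4096 entries for the receive and completion queues.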

Related issues

Related to illumos gate - Bug #12383: Slow down and lock up in mlxcx receive interrupt path (Closed, Paul Winder)

#1

Updated by Paul Winder 9 months ago

  • Related to Bug #12383: Slow down and lock up in mlxcx receive interrupt path added
#2

Updated by Paul Winder 8 months ago

Per review feedback, the defaults are set based on the max. speed supported by the port.

Event queue shift size is left at its current default of 9. It is the completion and work queues which take all the traffic.

CQ, SQ and RQ sizes are left at their original settings for ports whose maximum supported speed is 10Gb/s or less; the values in the description above are used for speeds of 25Gb/s and up.
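The speed-based selection can be sketched roughly as follows. This is an illustration, not the driver's code: the type `queue_shifts_t` and the function `pick_queue_shifts` are hypothetical names, the original (smaller) defaults are passed in by the caller because this issue does not state them, and treating every speed below 25Gb/s as the "10Gb/s or less" case is an assumption.

```c
#include <stdint.h>

/* Hypothetical container for the three shifts that depend on port speed. */
typedef struct queue_shifts {
	unsigned int cq_size_shift;
	unsigned int sq_size_shift;
	unsigned int rq_size_shift;
} queue_shifts_t;

/* Larger values from the issue description: CQ 12, SQ 13, RQ 12. */
static const queue_shifts_t large_shifts = { 12, 13, 12 };

/*
 * Pick default shifts from the port's maximum supported speed.
 * Assumption: anything below 25Gb/s keeps the caller's original values.
 */
static queue_shifts_t
pick_queue_shifts(uint32_t max_speed_gbps, queue_shifts_t original)
{
	if (max_speed_gbps >= 25)
		return (large_shifts);
	return (original);
}
```

The event queue shift is deliberately absent from the structure, since it keeps its existing default regardless of port speed.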

The issue typically manifests itself on the Rx side: when using a tool like iperf, throughput rates are below expected, and when using dtrace to monitor mlxcx_rq_refill_task(), the task is constantly firing. It only fires when Rx processing attempts to refill the receive ring, fails to get any buffers off the free list, and the receive ring is running out of unused entries.
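The condition under which the refill task fires can be written as a small predicate. This is only a sketch of the logic described above: the function name, its parameters, and the idea of a numeric low watermark for "running out of unused entries" are all assumptions, not details taken from the mlxcx source.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical predicate: should the refill task be scheduled?
 *
 * bufs_obtained  - buffers taken off the free list during this refill
 * posted_entries - unused entries still posted on the receive ring
 * low_watermark  - assumed threshold below which the ring is running out
 */
static bool
refill_task_needed(size_t bufs_obtained, size_t posted_entries,
    size_t low_watermark)
{
	/* The refill made progress, so there is no need to defer work. */
	if (bufs_obtained > 0)
		return (false);
	/* No buffers were available; fire only if the ring is nearly empty. */
	return (posted_entries < low_watermark);
}
```

Under this model, a larger receive queue and buffer pool make both conditions less likely to hold at once, which matches the observation that the task fires constantly with the smaller sizes.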

#3

Updated by Electric Monk 8 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 80 to 100

git commit 22d052287ba7ed169757650e2eec25fedbae163a

commit  22d052287ba7ed169757650e2eec25fedbae163a
Author: Paul Winder <pwinder@racktopsystems.com>
Date:   2020-04-14T15:40:07.000Z

    12383 Slow down and lock up in mlxcx receive interrupt path
    12438 mlxcx should pass receive messages to mac layer more frequently
    12439 mlxcx send rings can overflow
    12440 mlxcx should not block in the send path
    12441 mlxcx default queue sizes are a bit on the small size
    Reviewed by: Garrett D'Amore <garrett@damore.org>
    Reviewed by: Andy Stormont <astormont@racktopsystems.com>
    Reviewed by: Igor Kozhukhov <igor@dilos.org>
    Reviewed by: Robert Mustacchi <rm@fingolfin.org>
    Approved by: Garrett D'Amore <garrett@damore.org>
