Bug #12958

i40e allocates large amounts of DMA

Added by Paul Winder 25 days ago. Updated 17 days ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
% Done: 100%
Difficulty: Medium

Description

The i40e driver pre-allocates all its DMA for all its rings during the call to mc_start(9E).

Typically one instance will have 32 groups with a total of 512 rings. Each ring will ask for 0x600 DMA buffers for tx and 0x800 DMA buffers for rx. With jumbo frames (9K), if you do the maths, this ends up being ~16GB of DMA.
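The arithmetic behind the ~16GB figure can be sketched as follows. The ring and buffer counts come from the description above; the per-buffer size of 9 KiB is an assumption made only for illustration, since the driver's real buffer size (headers, alignment, rounding) is not stated in this issue.

```python
# Figures taken from the description above.
RINGS = 512                 # 32 groups, 512 rings in total
TX_BUFS_PER_RING = 0x600    # 1536 tx DMA buffers per ring
RX_BUFS_PER_RING = 0x800    # 2048 rx DMA buffers per ring

# Assumption: each jumbo-frame buffer is about 9 KiB. The real size used
# by the driver is not given in this issue.
ASSUMED_BUF_BYTES = 9 * 1024


def total_buffers(rings=RINGS):
    """Number of DMA buffers requested across all rings."""
    return rings * (TX_BUFS_PER_RING + RX_BUFS_PER_RING)


def total_dma_bytes(rings=RINGS, buf_bytes=ASSUMED_BUF_BYTES):
    """Rough DMA footprint in bytes under the assumed buffer size."""
    return total_buffers(rings) * buf_bytes


if __name__ == "__main__":
    bufs = total_buffers()
    gib = total_dma_bytes() / 2**30
    print(f"{bufs} buffers, about {gib:.2f} GiB")
```

Under that assumption the estimate comes to 1,835,008 buffers and 15.75 GiB, which is in line with the ~16GB quoted above.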

I have been doing work alongside #12957 to reduce this and improve command line responsiveness.


Related issues

Related to illumos gate - Bug #12957: Some ipadm and dladm commands are slow on i40e (Closed)

Related to illumos gate - Bug #12262: Suboptimal vmem hash table slows down boot (Closed)

History

#1

Updated by Paul Winder 25 days ago

  • Related to Bug #12957: Some ipadm and dladm commands are slow on i40e added
#2

Updated by Paul Winder 25 days ago

Note: Since I originally added this, I realised I was running with a 16 MSI-X vector limit (the default is 8). With the default limit, the memory allocations would be ½ of the values below and, since the majority of the time is spent in ring setup/teardown, I'd expect the times to be a bit over ½ those stated. The benefits in terms of percentage improvement will be similar.

To address this, I've changed the point at which the DMA is allocated and freed. I have moved it from allocating for all rings in all groups in i40e_start() to allocating individually for each ring as i40e_ring_start()/i40e_ring_stop() are called.
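The difference between the two allocation points can be illustrated with a toy model. This is only a sketch, not the driver code: `DeviceModel`, its methods, and the choice of 16 active rings are hypothetical names and values used for illustration, and the model merely counts buffers rather than performing any DMA.

```python
# tx + rx buffers per ring, using the counts from the description.
BUFS_PER_RING = 0x600 + 0x800


class DeviceModel:
    """Toy model that only counts buffers; it performs no real DMA."""

    def __init__(self, n_rings, per_ring):
        self.n_rings = n_rings
        self.per_ring = per_ring       # True models the new behaviour
        self.held = [0] * n_rings      # buffers currently held by each ring

    def start(self):
        # Old behaviour: every ring gets its buffers when the device starts.
        if not self.per_ring:
            self.held = [BUFS_PER_RING] * self.n_rings

    def stop(self):
        # Old behaviour: everything is released when the device stops.
        if not self.per_ring:
            self.held = [0] * self.n_rings

    def ring_start(self, idx):
        # New behaviour: a ring gets its buffers only when it is started.
        if self.per_ring:
            self.held[idx] = BUFS_PER_RING

    def ring_stop(self, idx):
        # New behaviour: a ring releases its buffers when it is stopped.
        if self.per_ring:
            self.held[idx] = 0

    def buffers_held(self):
        return sum(self.held)


def buffers_after_starting(n_active, per_ring, n_rings=512):
    """Buffers held after starting the device and n_active of its rings."""
    dev = DeviceModel(n_rings, per_ring)
    dev.start()
    for idx in range(n_active):
        dev.ring_start(idx)
    return dev.buffers_held()


if __name__ == "__main__":
    # 16 active rings is a hypothetical number chosen for the example.
    print("up front:", buffers_after_starting(16, per_ring=False))
    print("per ring:", buffers_after_starting(16, per_ring=True))
```

In this model the up-front variant holds buffers for all 512 rings regardless of how many are in use, while the per-ring variant holds buffers only for the rings that have been started.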

Based on this I have produced this table to illustrate how many calls are made into the DMA allocation and free routines and how long the command line utility takes to run the command.

In each case the tests were started with an un-configured NIC.

Aggregates

Before After
Command Alloc Free Time Alloc Free Time
create-aggr (2 i/fs) 3672064 37s 114752 1.2s
create-if aggr 0.1s 229504 229504 3.4s
create-addr 5s 114752 1.6s
set linkprop mtu 3672064 3672074 10min 114752 229504 5.2s
delete-aggr 3672064 20s 114752 0.6s

VNICs

Before After
Command Alloc Free Time Alloc Free Time
create-vnic 1836032 16s 114752 0.5s
create-if 0.1s 0.1s
create-addr 0.5s 0.5s

In the "Before" case, the large allocation is a one-off: subsequent VNICs created on the same link would not incur the same large allocation. That does not detract from the fact that it would allocate ~16GB (with jumbo frames) of DMA for a single VNIC.

I40e

Before After
Command Alloc Free Time Alloc Free Time
create-if 1836032 20s 172128 114752 2.3s
create-addr 0.5s 57376 1.0s
delete-if 1836032 10.3s 114752 0.6s

So, if you have a single aggr, you will use 1GB vs 32GB, and for a single VNIC 500MB vs 16GB. Obviously, the more aggr/vnic/addr groups are created, the more the usage will increase, per these tables, but in the most common scenarios there are significant memory savings.

Also, note that the memory usage estimates are for a single interface; the memory usage can be multiplied up for dual or quad port PCI adapters.
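One way to read the call counts in the tables is as whole multiples of a per-ring figure. The sketch below assumes 0x600 + 0x800 buffer allocations plus 2 further allocations per ring; the extra 2 is a guess that happens to divide the counts evenly (it might correspond to the descriptor rings), and it is not confirmed anywhere in this issue. The dictionary keys are descriptive labels only.

```python
# Assumption: 0x600 tx buffers + 0x800 rx buffers + 2 further allocations
# per ring. The "+ 2" is a guess that makes the table counts divide evenly.
PER_RING_CALLS = 0x600 + 0x800 + 2   # 3586

# Call counts copied from the tables above.
TABLE_COUNTS = {
    "create-aggr (2 i/fs), before": 3672064,
    "create-aggr (2 i/fs), after": 114752,
    "create-if aggr, after": 229504,
    "create-vnic, before": 1836032,
    "create-vnic, after": 114752,
    "i40e create-if, after (first count)": 172128,
    "i40e create-addr, after": 57376,
}


def rings_implied(calls):
    """Rings implied by a call count, if it is a whole multiple."""
    rings, remainder = divmod(calls, PER_RING_CALLS)
    if remainder:
        raise ValueError(f"{calls} is not a multiple of {PER_RING_CALLS}")
    return rings


if __name__ == "__main__":
    for label, calls in TABLE_COUNTS.items():
        print(f"{label}: {calls} calls = {rings_implied(calls)} rings")
```

On this reading, the "Before" counts correspond to all 512 rings of an interface (1024 for the two-interface aggr), while the "After" counts correspond to between 16 and 64 rings.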

There are some notable timing improvements with this change alongside #12957.

#3

Updated by Marcel Telka 25 days ago

  • Related to Bug #12262: Suboptimal vmem hash table slows down boot added
#4

Updated by Paul Winder 25 days ago

Tested alongside #12957.

#5

Updated by Electric Monk 22 days ago

  • Gerrit CR set to 802
#6

Updated by Electric Monk 17 days ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit aa2a44afcbfb9d08096ea5af01f0bb30d4b7f9a6

commit  aa2a44afcbfb9d08096ea5af01f0bb30d4b7f9a6
Author: Paul Winder <pwinder@racktopsystems.com>
Date:   2020-07-23T06:41:38.000Z

    12957 Some ipadm and dladm commands are slow on i40e
    12958 i40e allocates large amounts of DMA
    12972 Remove reference to deprecated ddi_power from i40e
    Reviewed by: Garrett D'Amore <garrett@damore.org>
    Reviewed by: Igor Kozhukhov <igor@dilos.org>
    Reviewed by: Robert Mustacchi <rm@fingolfin.org>
    Reviewed by: Randy Fishel <randyf@sibernet.com>
    Approved by: Dan McDonald <danmcd@joyent.com>
