Bug #13207
closedCreating an aggr with more than 128 Tx or Rx rings panics
100%
Description
The easiest way to trigger this is to create an aggr with more than 2 links, and at least 2 of those links are mlxcx. Eg
dladm create-aggr -l mlxcx0 -l mlxcx1 -l mlxcx2 aggr0
When you do this the system panic with an assertion failure:
assertion failed: aggr_add_pseudo_tx_group(port, &grp->lg_tx_group) == 0 (0x5 == 0x0), file: ../../common/io/aggr/aggr_grp.c, line: 1550
And then if the link was permanent (ie not created with the "-t" option) the system goes into panic<->reboot loop.
Related issues
Updated by Paul Winder almost 3 years ago
EIO
returned by aggr_add_pseudo_tx_ring()
. EIO
is returned when the MAX_RINGS_PER_GROUP
limit (128) is reached. Two issues:
- Why is the limit reached?
- The system does not need to panic.
The limit is reached because mlxcx has a default of 64 rings per group, so after 2 links in the aggr MAX_RINGS_PER_GROUP
is reached. That default is too high and will be addressed in another bug report.
The code in aggr_grp_create()
needs to be fixed to remove the VERIFY's.
This snippet:
/* * Attach each port if necessary. */ for (port = grp->lg_ports; port != NULL; port = port->lp_next) { /* * Create the pseudo ring for each HW ring of the * underlying port. Note that this is done after the * aggr registers its MAC. */ VERIFY3S(aggr_add_pseudo_tx_group(port, &grp->lg_tx_group), ==, 0); for (i = 0; i < grp->lg_rx_group_count; i++) { VERIFY3S(aggr_add_pseudo_rx_group(port, &grp->lg_rx_groups[i]), ==, 0); } if (aggr_port_notify_link(grp, port)) link_state_changed = B_TRUE; /* * Initialize the callback functions for this port. */ aggr_port_init_callbacks(port); }
Updated by Paul Winder almost 3 years ago
- Related to Bug #13208: Create aggr fails when underlying links have more than 128 Tx rings added
Updated by Paul Winder almost 3 years ago
I tested by creating an aggr with too many mlxcx links and (before fixing #13208), dladm now fails:
# dladm create-aggr -t -l mlxcx0 -l mlxcx1 -l mlxcx2 aggr0 dladm: create operation failed: I/O error
I also hard code an error to simulate the failure at what was the 2nd VERIFY to confirm that failure path worked.
Creating aggr with 2 mlxcx was succesful, and aggr with more than 2 of other NIC varieties was successful
Updated by Electric Monk almost 3 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit adc528899caad05412c9b8d397e277900adf786b
commit adc528899caad05412c9b8d397e277900adf786b Author: Paul Winder <paul@winder.uk.net> Date: 2020-12-08T18:36:43.000Z 13207 Creating an aggr with more than 128 Tx or Rx rings panics Reviewed by: Robert Mustacchi <rm@fingolfin.org> Reviewed by: Garrett D'Amore <garrett@damore.org> Approved by: Dan McDonald <danmcd@joyent.com>