Bug #14291
cxgbe: asserts when requesting more queues than available
Status: Closed
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:
Description
The following assertion fails when the driver requests more queues than are available to the underlying Physical Function (PF). The driver needs to query the firmware for the number of queues actually available and restrict the maximum queue allocation accordingly.
# mdb vmcore.0
> ::msgbuf
[...]
PCIE-device: pci1425,0@0,4, t4nex3
PCI Express-device: pci1425,0@0,4, t4nex3
t4nex3 is /pci@0,0/pci8086,2f08@3/pci1425,0@0,4
NOTICE: t4nex3: Chelsio T540-LP-CR rev 1
NOTICE: t4nex3: S/N: PT11190200, P/N: 110123750A0
NOTICE: t4nex3: Firmware version: 1.26.3.0
NOTICE: t4nex3: Bootstrap version: 1.1.0.0
NOTICE: t4nex3: TP Microcode version: 0.1.4.9
NOTICE: t4nex3: No Expansion ROM loaded
NOTICE: t4nex3: Serial Configuration version: e101000
NOTICE: t4nex3: VPD version: 82
NOTICE: t4nex3: (28 rxq, 160 txq total) 30 MSI-X interrupts.
[...]
NOTICE: cxgbe10: Multiple Rings Enabled
NOTICE: cxgbe10 registered
cxgbe10 is port 2 on t4nex3
cxgbe10 is /pci@0,0/pci8086,2f08@3/pci1425,0@0,4/cxgbe@2
NOTICE: cxgbe11: Multiple Rings Enabled
NOTICE: cxgbe11 registered
cxgbe11 is port 3 on t4nex3
cxgbe11 is /pci@0,0/pci8086,2f08@3/pci1425,0@0,4/cxgbe@3
NOTICE: cxgbe6: Multiple Rings Enabled
NOTICE: cxgbe6 registered
cxgbe6 is port 0 on t4nex3
cxgbe6 is /pci@0,0/pci8086,2f08@3/pci1425,0@0,4/cxgbe@0
NOTICE: cxgbe7: Multiple Rings Enabled
NOTICE: cxgbe7 registered
cxgbe7 is port 1 on t4nex3
cxgbe7 is /pci@0,0/pci8086,2f08@3/pci1425,0@0,4/cxgbe@1
[...]
WARNING: cxgbe6: failed to create Ethernet egress queue: 12
WARNING: t4nex3: failed to allocate egress queue(2): 12
[...]
panic[cpu24]/thread=fffffeb313104420: assertion failed: pi->flags & PORT_INIT_DONE, file: ../../../common/io/cxgbe/t4nex/t4_nexus.c, line: 2584

fffffe00f6115540 fffffffffbdcd415 ()
fffffe00f6115570 t4nex:port_full_uninit+2f ()
fffffe00f61155c0 t4nex:port_full_init+51 ()
fffffe00f6115620 t4nex:t4_init_synchronized+7a ()
fffffe00f6115660 t4nex:t4_mc_start+29 ()
fffffe00f61156b0 mac:mac_start+65 ()
fffffe00f6115710 dls:dls_open+fc ()
fffffe00f6115780 dld:dld_str_attach+150 ()
fffffe00f61157f0 dld:dld_str_open+dc ()
fffffe00f6115830 dld:dld_open+27 ()
fffffe00f61158e0 genunix:qattach+10e ()
fffffe00f6115a00 genunix:stropen+32c ()
fffffe00f6115ad0 specfs:spec_open+4d0 ()
fffffe00f6115b40 genunix:fop_open+a4 ()
fffffe00f6115ce0 genunix:vn_openat+208 ()
fffffe00f6115e50 genunix:copen+431 ()
fffffe00f6115e80 genunix:openat32+1a ()
fffffe00f6115eb0 genunix:open32+1c ()
fffffe00f6115f00 unix:brand_sys_sysenter+2d2 ()
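The restriction proposed in the description (cap the per-port queue count at what the PF can actually provide) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the code from the actual fix: `clamp_queues_per_port` is a hypothetical helper, the PF-wide limit is assumed to have been obtained from firmware already, and an even split across ports is assumed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the cxgbe driver): cap a per-port
 * queue request so that nports * result never exceeds the PF-wide
 * limit. pf_limit is assumed to be the queue count reported by
 * firmware for this PF.
 */
static uint32_t
clamp_queues_per_port(uint32_t requested, uint32_t pf_limit, uint32_t nports)
{
	uint32_t fair_share;

	assert(nports > 0);

	/* Integer division: any remainder queues are simply left unused. */
	fair_share = pf_limit / nports;

	return (requested < fair_share ? requested : fair_share);
}
```

For example, a request of 256 queues per port against a PF limit of 100 on a 4-port adapter would be reduced to 25 per port, while a request of 8 would pass through unchanged.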
Updated by Rahul Lakkireddy 8 months ago
Tested the fix: the maximum Txq allocation is now restricted to the 100 queues available to the underlying PF (25 per port on the 4-port T540-LP-CR).
# cat /etc/system.d/01-cxgb4
set ddi_msix_alloc_limit=32

# cat /usr/kernel/drv/t4nex.conf
max-ntxq-10G-port=256;
max-nrxq-10G-port=256;

# cat /var/adm/messages
PCIE-device: pci1425,0@0,4, t4nex3
PCI Express-device: pci1425,0@0,4, t4nex3
t4nex3 is /pci@0,0/pci8086,2f08@3/pci1425,0@0,4
NOTICE: t4nex3: Chelsio T540-LP-CR rev 1
NOTICE: t4nex3: S/N: PT11190200, P/N: 110123750A0
NOTICE: t4nex3: Firmware version: 1.26.4.0
NOTICE: t4nex3: Bootstrap version: 1.1.0.0
NOTICE: t4nex3: TP Microcode version: 0.1.4.9
NOTICE: t4nex3: No Expansion ROM loaded
NOTICE: t4nex3: Serial Configuration version: e101000
NOTICE: t4nex3: VPD version: 82
t4nex3: (28 rxq, 100 txq total) 30 MSI-X interrupts.

# kstat cxgbe:6:config
module: cxgbe                           instance: 6
name:   config                          class:    net
        controller                      t4nex3
        crtime                          1074109.118369739
        factory_mac_address             0007435196F0
        first_rxq                       0
        first_txq                       0
        idx                             0
        nrxq                            7
        ntxq                            25
        snaptime                        1074965.306722630

# kstat cxgbe:7:config
module: cxgbe                           instance: 7
name:   config                          class:    net
        controller                      t4nex3
        crtime                          1074109.716031810
        factory_mac_address             0007435196F8
        first_rxq                       7
        first_txq                       25
        idx                             1
        nrxq                            7
        ntxq                            25
        snaptime                        1074971.510631792

# kstat cxgbe:10:config
module: cxgbe                           instance: 10
name:   config                          class:    net
        controller                      t4nex3
        crtime                          1074104.793409571
        factory_mac_address             000743519700
        first_rxq                       14
        first_txq                       50
        idx                             2
        nrxq                            7
        ntxq                            25
        snaptime                        1074975.224074285

# kstat cxgbe:11:config
module: cxgbe                           instance: 11
name:   config                          class:    net
        controller                      t4nex3
        crtime                          1074105.243791378
        factory_mac_address             000743519708
        first_rxq                       21
        first_txq                       75
        idx                             3
        nrxq                            7
        ntxq                            25
        snaptime                        1074977.635534784
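The kstat values above follow a simple pattern: each port's `first_txq` and `first_rxq` equal its `idx` multiplied by the per-port queue count. The sketch below reproduces that arithmetic for illustration only. `struct port_queues` and `layout_port_queues` are hypothetical names, not driver code, and contiguous allocation in port-index order is an assumption read off the output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-port view mirroring the kstat fields shown above. */
struct port_queues {
	uint32_t first_txq;
	uint32_t ntxq;
	uint32_t first_rxq;
	uint32_t nrxq;
};

/*
 * Assign each port a contiguous block of queues in port-index order,
 * so that port i starts at i * ntxq (Tx) and i * nrxq (Rx).
 */
static void
layout_port_queues(struct port_queues *ports, uint32_t nports,
    uint32_t ntxq, uint32_t nrxq)
{
	uint32_t i;

	assert(ports != NULL);

	for (i = 0; i < nports; i++) {
		ports[i].first_txq = i * ntxq;
		ports[i].ntxq = ntxq;
		ports[i].first_rxq = i * nrxq;
		ports[i].nrxq = nrxq;
	}
}
```

With 4 ports, 25 Tx queues and 7 Rx queues per port, this yields Tx offsets 0, 25, 50, 75 and Rx offsets 0, 7, 14, 21, for totals of 100 txq and 28 rxq, matching the adapter summary line in the log.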
Updated by Electric Monk 5 months ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit 77ac03cbec412857d39c4898c9ed10abb6061418
commit 77ac03cbec412857d39c4898c9ed10abb6061418
Author: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Date: 2022-03-14T16:53:32.000Z

    14291 cxgbe: asserts when requesting more queues than available
    Reviewed by: Igor Kozhukhov <igor@dilos.org>
    Reviewed by: Ryan Zezeski <ryan@oxide.computer>
    Approved by: Robert Mustacchi <rm@fingolfin.org>