Bug #11358
closed
i40e_alloc_ring_mem() unwinds when it shouldn't
Start date:
Due date:
% Done:
100%
Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:
Description
While working on 11357 I came across a panic during i40e initialization.
> ::status
debugging crash dump /var/crash/volatile/vmcore.20 (64-bit) from sys76
operating system: 5.11 joyent_20180926T230144Z (i86pc)
image uuid: (not set)
panic message: BAD TRAP: type=e (#pf Page fault) rp=fffffe00068a2430 addr=0 occurred in module "i40e" due to a NULL pointer dereference
dump content: kernel pages only
> $C
fffffe00068a2570 i40e_free_rx_dma+0x2a(0, 1)
fffffe00068a25d0 i40e_free_ring_mem+0x89(fffffe053ce2b000, 1)
fffffe00068a2610 i40e_alloc_ring_mem+0x8e(fffffe053ce2b000)
fffffe00068a2650 i40e_start+0x110(fffffe053ce2b000, 1)
fffffe00068a2690 i40e_m_start+0x60(fffffe053ce2b000)
fffffe00068a26e0 mac_start+0x8e(fffffe053d2b89c0)
fffffe00068a2730 dls_open+0x83(fffffe0535485bc8, fffffe0535487c48, fffffe05656237b8)
fffffe00068a2790 dld_str_attach+0x1b0(fffffe05656237b8, 0)
fffffe00068a2800 dld_str_open+0xad(fffffe0546a1e808, fffffe00068a2a58, 0)
fffffe00068a2840 dld_open+0x37(fffffe0546a1e808, fffffe00068a2a58, 3, 0, fffffe0535a3fef0)
fffffe00068a28f0 qattach+0x12b(fffffe0535a59000, fffffe00068a2a58, 3, fffffe0535a3fef0, 0, 0)
fffffe00068a29f0 stropen+0x34f(fffffe05500a0200, fffffe00068a2a58, 3, fffffe0535a3fef0)
fffffe00068a2ab0 spec_open+0x281(fffffe00068a2c68, 3, fffffe0535a3fef0, 0)
fffffe00068a2b20 fop_open+0x9e(fffffe00068a2c68, 3, fffffe0535a3fef0, 0)
fffffe00068a2cd0 vn_openat+0x235(8043570, 0, 3, 6a8, fffffe00068a2de0, 0, 12, 0, fffffe0000000007)
fffffe00068a2e40 copen+0x214(ffd19553, 8043570, 3, fef546a8)
fffffe00068a2e70 openat32+0x27(ffd19553, 8043570, 2, fef546a8)
fffffe00068a2ea0 open32+0x25(8043570, 2, fef546a8)
fffffe00068a2f00 _sys_sysenter_post_swapgs+0x253()
i40e_free_ring_mem() is passing a NULL pointer to i40e_free_rx_dma(). The NULL value comes from i40e_trqpairs[i].itrq_rxdata. If i40e_alloc_rx_data() fails, i40e_alloc_ring_mem() jumps to its cleanup code.
i40e_alloc_ring_mem()
        if (i40e_alloc_rx_data(i40e, &i40e->i40e_trqpairs[i]) == B_FALSE)
                goto unwind;
        ...
unwind:
        i40e_free_ring_mem(i40e, B_TRUE);
        return (B_FALSE);
But if i40e_alloc_rx_data() fails early then itrq_rxdata is never set and will be NULL.
i40e_alloc_rx_data()
static boolean_t
i40e_alloc_rx_data(i40e_t *i40e, i40e_trqpair_t *itrq)
{
        i40e_rx_data_t *rxd;

        rxd = kmem_zalloc(sizeof (i40e_rx_data_t), KM_NOSLEEP);
        if (rxd == NULL)
                return (B_FALSE);
        itrq->itrq_rxdata = rxd;
Furthermore, if any of the later allocations in i40e_alloc_rx_data() fails, it frees the rxd and sets itrq_rxdata back to NULL.
i40e_alloc_rx_data()
cleanup:
        i40e_free_rx_data(rxd);
        itrq->itrq_rxdata = NULL;
        return (B_FALSE);
The solution is to not unwind in i40e_alloc_ring_mem() if i40e_alloc_rx_data() fails, because i40e_alloc_rx_data() performs its own cleanup.
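The ownership rule described here can be sketched as a small userland model. This is only an illustration and not the committed change: rx_data_t, trqpair_t, model_zalloc(), fail_countdown, free_pair() and the two-allocation layout are hypothetical stand-ins for the driver's types and for kmem_zalloc(), and releasing the earlier, fully allocated pairs on failure is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for i40e_rx_data_t and i40e_trqpair_t. */
typedef struct rx_data {
	int *rxd_ring;
} rx_data_t;

typedef struct trqpair {
	rx_data_t *itrq_rxdata;
} trqpair_t;

/* Test hook: 0 = never fail, N = fail the Nth allocation from now. */
static int fail_countdown = 0;

/* Models kmem_zalloc(..., KM_NOSLEEP): zeroed memory or NULL. */
static void *
model_zalloc(size_t size)
{
	if (fail_countdown > 0 && --fail_countdown == 0)
		return (NULL);
	return (calloc(1, size));
}

/* On any failure this cleans up after itself and leaves itrq_rxdata NULL. */
static bool
alloc_rx_data(trqpair_t *itrq)
{
	rx_data_t *rxd = model_zalloc(sizeof (*rxd));

	if (rxd == NULL)
		return (false);		/* early failure: itrq_rxdata never set */
	itrq->itrq_rxdata = rxd;

	rxd->rxd_ring = model_zalloc(16 * sizeof (int));
	if (rxd->rxd_ring == NULL) {
		free(rxd);		/* late failure: undo our own work */
		itrq->itrq_rxdata = NULL;
		return (false);
	}
	return (true);
}

/* Like i40e_free_rx_dma(), this assumes the rx data exists. */
static void
free_pair(trqpair_t *itrq)
{
	rx_data_t *rxd = itrq->itrq_rxdata;

	assert(rxd != NULL);	/* models the NULL dereference in the panic */
	free(rxd->rxd_ring);
	free(rxd);
	itrq->itrq_rxdata = NULL;
}

static bool
alloc_ring_mem(trqpair_t *pairs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!alloc_rx_data(&pairs[i])) {
			/*
			 * pairs[i] already cleaned up after itself, so only
			 * the earlier, fully allocated pairs are released.
			 */
			while (i-- > 0)
				free_pair(&pairs[i]);
			return (false);
		}
	}
	return (true);
}
```

In this model a failed alloc_rx_data() call is never followed by free_pair() on the same pair, so the free path never sees a NULL itrq_rxdata regardless of which allocation failed.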
I tested this by verifying the panic doesn't happen anymore.
Updated by Electric Monk almost 3 years ago
- Status changed from New to Closed
git commit 09aee6126f680324a9b019f9b4c77309dc611bf9
commit 09aee6126f680324a9b019f9b4c77309dc611bf9
Author: Ryan Zezeski <rpz@joyent.com>
Date: 2019-07-15T18:17:00.000Z

    11356 Want Fortville TSO support
    11357 want i40e multi-group support
    11358 i40e_alloc_ring_mem() unwinds when it shouldn't
    11359 Rework i40e transmit descriptor logic
    Portions contributed by: Rob Johnston <rob.johnston@joyent.com>
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
    Reviewed by: Randy Fishel <randyf@sibernet.com>
    Approved by: Garrett D'Amore <garrett@damore.org>