Bug #11358

i40e_alloc_ring_mem() unwinds when it shouldn't

Added by Robert Mustacchi 3 months ago. Updated 3 months ago.

Status: Closed
Priority: Normal
Assignee:
Category: driver - device drivers
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:

Description

While working on 11357 I came across a panic during i40e initialization.

> ::status
debugging crash dump /var/crash/volatile/vmcore.20 (64-bit) from sys76
operating system: 5.11 joyent_20180926T230144Z (i86pc)
image uuid: (not set)
panic message: BAD TRAP: type=e (#pf Page fault) rp=fffffe00068a2430 addr=0 occurred in module "i40e" due to a NULL pointer dereference
dump content: kernel pages only

> $C
fffffe00068a2570 i40e_free_rx_dma+0x2a(0, 1)
fffffe00068a25d0 i40e_free_ring_mem+0x89(fffffe053ce2b000, 1)
fffffe00068a2610 i40e_alloc_ring_mem+0x8e(fffffe053ce2b000)
fffffe00068a2650 i40e_start+0x110(fffffe053ce2b000, 1)
fffffe00068a2690 i40e_m_start+0x60(fffffe053ce2b000)
fffffe00068a26e0 mac_start+0x8e(fffffe053d2b89c0)
fffffe00068a2730 dls_open+0x83(fffffe0535485bc8, fffffe0535487c48, fffffe05656237b8)
fffffe00068a2790 dld_str_attach+0x1b0(fffffe05656237b8, 0)
fffffe00068a2800 dld_str_open+0xad(fffffe0546a1e808, fffffe00068a2a58, 0)
fffffe00068a2840 dld_open+0x37(fffffe0546a1e808, fffffe00068a2a58, 3, 0, fffffe0535a3fef0)
fffffe00068a28f0 qattach+0x12b(fffffe0535a59000, fffffe00068a2a58, 3, fffffe0535a3fef0, 0, 0)
fffffe00068a29f0 stropen+0x34f(fffffe05500a0200, fffffe00068a2a58, 3, fffffe0535a3fef0)
fffffe00068a2ab0 spec_open+0x281(fffffe00068a2c68, 3, fffffe0535a3fef0, 0)
fffffe00068a2b20 fop_open+0x9e(fffffe00068a2c68, 3, fffffe0535a3fef0, 0)
fffffe00068a2cd0 vn_openat+0x235(8043570, 0, 3, 6a8, fffffe00068a2de0, 0, 12, 0, fffffe0000000007)
fffffe00068a2e40 copen+0x214(ffd19553, 8043570, 3, fef546a8)
fffffe00068a2e70 openat32+0x27(ffd19553, 8043570, 2, fef546a8)
fffffe00068a2ea0 open32+0x25(8043570, 2, fef546a8)
fffffe00068a2f00 _sys_sysenter_post_swapgs+0x253()

i40e_free_ring_mem() is passing a NULL pointer to i40e_free_rx_dma(). The NULL value comes from i40e_trqpairs[i].itrq_rxdata. If i40e_alloc_rx_data() fails, i40e_alloc_ring_mem() jumps to its cleanup code.

i40e_alloc_ring_mem()

        if (i40e_alloc_rx_data(i40e, &i40e->i40e_trqpairs[i]) ==
            B_FALSE)
            goto unwind;
...
unwind:
    i40e_free_ring_mem(i40e, B_TRUE);
    return (B_FALSE);

But if i40e_alloc_rx_data() fails at its first allocation, itrq_rxdata is never set and remains NULL.

i40e_alloc_rx_data()

static boolean_t
i40e_alloc_rx_data(i40e_t *i40e, i40e_trqpair_t *itrq)
{
    i40e_rx_data_t *rxd;

    rxd = kmem_zalloc(sizeof (i40e_rx_data_t), KM_NOSLEEP);
    if (rxd == NULL)
        return (B_FALSE);
    itrq->itrq_rxdata = rxd;

Furthermore, if any later allocation in i40e_alloc_rx_data() fails, it frees rxd and sets itrq_rxdata back to NULL.

i40e_alloc_rx_data()

cleanup:
    i40e_free_rx_data(rxd);
    itrq->itrq_rxdata = NULL;
    return (B_FALSE);

The solution is to not unwind in i40e_alloc_ring_mem() when i40e_alloc_rx_data() fails, because i40e_alloc_rx_data() performs its own cleanup.
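
The ownership rule behind this can be sketched in user space. The following is a minimal model, not the committed change: every toy_* name, TOY_NRINGS, and the fail_ring knob are hypothetical, and calloc()/free() stand in for the kernel allocators. It also assumes that rings set up before the failing one still need to be released, which the description above does not spell out. The assert() in toy_free_rx_dma() stands in for the NULL dereference seen in the stack trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define	TOY_NRINGS	4	/* hypothetical ring count */

typedef struct toy_rx_data {
	int *rxd_dma;		/* stands in for per-ring DMA resources */
} toy_rx_data_t;

typedef struct toy_trqpair {
	toy_rx_data_t *itrq_rxdata;
} toy_trqpair_t;

typedef struct toy_dev {
	toy_trqpair_t trqpairs[TOY_NRINGS];
	int fail_ring;		/* ring whose rx data allocation fails, -1 for none */
} toy_dev_t;

/* Mirrors the quoted contract: on failure itrq_rxdata is left NULL. */
static bool
toy_alloc_rx_data(toy_dev_t *dev, int ring)
{
	toy_rx_data_t *rxd;

	if (ring == dev->fail_ring)
		return (false);	/* simulated allocation failure */
	rxd = calloc(1, sizeof (*rxd));
	if (rxd == NULL)
		return (false);
	dev->trqpairs[ring].itrq_rxdata = rxd;
	return (true);
}

static bool
toy_alloc_rx_dma(toy_rx_data_t *rxd)
{
	rxd->rxd_dma = calloc(16, sizeof (int));
	return (rxd->rxd_dma != NULL);
}

/* Uses rxd unconditionally, so a NULL argument is fatal here. */
static void
toy_free_rx_dma(toy_rx_data_t *rxd)
{
	assert(rxd != NULL);
	free(rxd->rxd_dma);
	rxd->rxd_dma = NULL;
}

/* Releases rings [0, nrings); each of them must have rx data. */
static void
toy_free_ring_mem(toy_dev_t *dev, int nrings)
{
	for (int i = 0; i < nrings; i++) {
		toy_rx_data_t *rxd = dev->trqpairs[i].itrq_rxdata;

		toy_free_rx_dma(rxd);
		free(rxd);
		dev->trqpairs[i].itrq_rxdata = NULL;
	}
}

static bool
toy_alloc_ring_mem(toy_dev_t *dev)
{
	for (int i = 0; i < TOY_NRINGS; i++) {
		if (!toy_alloc_rx_data(dev, i)) {
			/* Ring i cleaned up after itself; skip it. */
			toy_free_ring_mem(dev, i);
			return (false);
		}
		if (!toy_alloc_rx_dma(dev->trqpairs[i].itrq_rxdata)) {
			/* Ring i has rx data now, so include it. */
			toy_free_ring_mem(dev, i + 1);
			return (false);
		}
	}
	return (true);
}
```

With fail_ring set to 2, toy_alloc_ring_mem() returns false and every itrq_rxdata pointer ends up NULL without toy_free_rx_dma() ever seeing a NULL argument. Passing the failing ring's index to the free routine as well would trip the assert, which is the toy equivalent of the panic above.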

I tested this by verifying that the panic no longer occurs.

History

#1

Updated by Electric Monk 3 months ago

  • Status changed from New to Closed

git commit 09aee6126f680324a9b019f9b4c77309dc611bf9

commit  09aee6126f680324a9b019f9b4c77309dc611bf9
Author: Ryan Zezeski <rpz@joyent.com>
Date:   2019-07-15T18:17:00.000Z

    11356 Want Fortville TSO support
    11357 want i40e multi-group support
    11358 i40e_alloc_ring_mem() unwinds when it shouldn't
    11359 Rework i40e transmit descriptor logic
    Portions contributed by: Rob Johnston <rob.johnston@joyent.com>
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
    Reviewed by: Randy Fishel <randyf@sibernet.com>
    Approved by: Garrett D'Amore <garrett@damore.org>
