Project

General

Profile

Actions

Bug #14758

closed

spurious presence detect change notification when enabling PCIe hotplug interrupt for the first time

Added by Rick Altherr 14 days ago. Updated about 14 hours ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
kernel
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

While chasing #14747, I noticed that the race was caused by a hotplug event occurring as soon as the hotplug interrupt was enabled by pcieb_attach. The specific event delivered was a slot presence detect change notification which is very suspect as no hardware changes were made during the entire boot. This was only happening on slots where a device is actually present. Even more confusing is that since a device is present in the slot, the PCIe link was fully up and placeholder device nodes were already created. This spurious hotplug event triggers a state change from ENABLED to PRESENT which is a downgrade that destroys the child device nodes, powers off the slot, and leaves the slot disconnected/unconfigured.

Looking at pciehpc_enable_intr, this shouldn't be happening. All of the interrupt status bits are cleared before enabling the notification sources and the top-level hotplug interrupt:

static int
pciehpc_enable_intr(pcie_hp_ctrl_t *ctrl_p)
{
    pcie_hp_slot_t    *slot_p = ctrl_p->hc_slots[0];
    pcie_bus_t    *bus_p = PCIE_DIP2BUS(ctrl_p->hc_dip);
    uint16_t    reg;

    /* clear any interrupt status bits */
    reg = pciehpc_reg_get16(ctrl_p,
        bus_p->bus_pcie_off + PCIE_SLOTSTS);
    pciehpc_reg_put16(ctrl_p,
        bus_p->bus_pcie_off + PCIE_SLOTSTS, reg);

    /* read the Slot Control Register */
    reg = pciehpc_reg_get16(ctrl_p,
        bus_p->bus_pcie_off + PCIE_SLOTCTL);

    /*
     * enable interrupts: power fault detection interrupt is enabled
     * only when the slot is powered ON
     */
    if (slot_p->hs_info.cn_state >= DDI_HP_CN_STATE_POWERED)
        pciehpc_reg_put16(ctrl_p, bus_p->bus_pcie_off +
            PCIE_SLOTCTL, reg | PCIE_SLOTCTL_INTR_MASK);
    else
        pciehpc_reg_put16(ctrl_p, bus_p->bus_pcie_off +
            PCIE_SLOTCTL, reg | (PCIE_SLOTCTL_INTR_MASK &
            ~PCIE_SLOTCTL_PWR_FAULT_EN));

    return (DDI_SUCCESS);
}

Setting a breakpoint at pciehpc_enable_intr, I continued until it fired on a slot with a device physically present, in this case pcie9 which has a corresponding PCIe root port at 40/1/1. At that point, the pcie_hp_slot_t's hs_info.cn_state shows DDI_HP_CN_STATE_ENABLED indicating the link was up and functioning as expected. Note that the PCIe capability is at config offset 0x58 for this root port and SLOTCTL is at offset 0x18 within the capability.

[1]> 58+18::rdpcicfg 40 1 1
5000c0

SLOTCTL currently only has bits 7:6 set to 0b11 which is turning the attention indicator off. SLOTSTS has only bits 4 (Command Completed) and 6 (Presence Detect State). This all makes sense so far.

SLOTSTS bits are either R1WC or RO so let's follow along with pciehp_enable_intr and clear any pending notifications.

[1]> 58+18::wrpcicfg 40 1 1 5000c0
[1]> 58+18::rdpcicfg 40 1 1
4000c0

Command Completed is cleared as expected. Now, what happens if we enable the interrupt sources all in one go:

[1]> 58+18::wrpcicfg 40 1 1 5010ff
[1]> 58+18::rdpcicfg 40 1 1
5810fa

SLOTCTL didn't accept setting Attention Button Pressed Enable or MRL Sensor Changed Enable which makes sense as this port doesn't support those features. SLOTSTS has Command Completed, Presence Detect State, and Present Detect Changed set. So, the interrupt is happening because the root port actually generated a presence detect changed notification except presence detect state didn't change?!?

What happens if we clear the enabled, clear the pending notifications, and only enable the presence detect changed notification:

[1]> 58+18::wrpcicfg 40 1 1 5800c0
[1]> 58+18::rdpcicfg 40 1 1
5000c0
[1]> 58+18::wrpcicfg 40 1 1 5000c8
[1]> 58+18::rdpcicfg 40 1 1
5000c8

Huh. The notification didn't fire this time. Let try again by starting over in a clean boot from power-on:

kmdb: stop at pcie`pciehpc_enable_intr
kmdb: target stopped at:
pcie`pciehpc_enable_intr:       pushq  %rbp
[1]> $r
%rax = 0xfffffffff7c43d70 pcie`pciehpc_enable_intr %r9  = 0xfffffb04506d3730
%rbx = 0xfffffdc58aefc5c8                 %r10 = 0x0000000000000001
%rcx = 0x0000000000000000                 %r11 = 0xfffffdc59fc8d040
%rdx = 0x0000000000060400                 %r12 = 0xfffffdc58ae99080
%rsi = 0x0000000000000001                 %r13 = 0xfffffdc58ae99080
%rdi = 0xfffffdc59f7432c0                 %r14 = 0x0000000000000000
%r8  = 0x0000000000000000                 %r15 = 0xfffffb04506d3960

%rip = 0xfffffffff7c43d70 pcie`pciehpc_enable_intr
%rbp = 0xfffffb04506d3950
%rsp = 0xfffffb04506d3928
%rflags = 0x00000246
  id=0 vip=0 vif=0 ac=0 vm=0 rf=0 nt=0 iopl=0x0
  status=<of,df,IF,tf,sf,ZF,af,PF,cf>

%cs = 0x0030    %ds = 0x0038    %es = 0x0038    %fs = 0x0000
%gs = 0x0000    %gsbase = 0xfffffdc59f730000    %kgsbase = 0x200000000
%trapno = 0x3   %err = 0x0      %cr2 = 0x0      %cr3 = 0x7fff1000
[1]> 0xfffffdc58aefc5c8::print struct dev_info devi_bus.port_up.priv_p | ::print pcie_bus_t bus_bdf bus_pcie_off bus_hp_ctrl
bus_bdf = 0x4009
bus_pcie_off = 0x58
bus_hp_ctrl = 0xfffffdc59f7432c0
[1]> 58+18::rdpcicfg 40 1 1
5000c0
[1]> 58+18::wrpcicfg 40 1 1 5000c8
[1]> 58+18::rdpcicfg 40 1 1
5800c8

Ok. So the very first time Presence Detect Changed Enable is set, it will fire if Presence Detect State is currently set. That's annoying.

Actions #1

Updated by Rick Altherr 14 days ago

Rearranging pciehpc_enable_intr to enable all the notification sources first, then clear the pending notifications, then enable the top-level interrupt should prevent the spurious interrupt from occurring. I tested this idea out in kmdb by manually enabling the notification sources and then letting pciehpc_enable_intr do it's normal process of clear the pending events before enabling the interrupt and it seemed to work:

kmdb: stop at pcie`pciehpc_enable_intr
kmdb: target stopped at:
pcie`pciehpc_enable_intr:       pushq  %rbp
[1]> $r
%rax = 0xfffffffff7c43d70 pcie`pciehpc_enable_intr %r9  = 0xfffffb04506d3730
%rbx = 0xfffffdc58aefc5c8                 %r10 = 0x0000000000000001
%rcx = 0x0000000000000000                 %r11 = 0xfffffdc59fc8d040
%rdx = 0x0000000000060400                 %r12 = 0xfffffdc58ae99080
%rsi = 0x0000000000000001                 %r13 = 0xfffffdc58ae99080
%rdi = 0xfffffdc59f7432c0                 %r14 = 0x0000000000000000
%r8  = 0x0000000000000000                 %r15 = 0xfffffb04506d3960

%rip = 0xfffffffff7c43d70 pcie`pciehpc_enable_intr
%rbp = 0xfffffb04506d3950
%rsp = 0xfffffb04506d3928
%rflags = 0x00000246
  id=0 vip=0 vif=0 ac=0 vm=0 rf=0 nt=0 iopl=0x0
  status=<of,df,IF,tf,sf,ZF,af,PF,cf>

%cs = 0x0030    %ds = 0x0038    %es = 0x0038    %fs = 0x0000
%gs = 0x0000    %gsbase = 0xfffffdc59f730000    %kgsbase = 0x200000000
%trapno = 0x3   %err = 0x0      %cr2 = 0x0      %cr3 = 0x7fff1000
[1]> 0xfffffdc58aefc5c8::print struct dev_info devi_bus.port_up.priv_p | ::print pcie_bus_t bus_bdf bus_pcie_off bus_hp_ctrl
bus_bdf = 0x4009
bus_pcie_off = 0x58
bus_hp_ctrl = 0xfffffdc59f7432c0
[1]> 58+18::rdpcicfg 40 1 1
5000c0
[1]> 58+18::wrpcicfg 40 1 1 0010df
[1]> 58+18::rdpcicfg 40 1 1
5810da
[1]> :c
.....
teapot console login: root
root@teapot:~# cfgadm -l
Ap_Id                          Type         Receptacle   Occupant     Condition
pcie0                          unknown      disconnected unconfigured unknown
pcie1                          unknown      empty        unconfigured unknown
pcie2                          unknown      empty        unconfigured unknown
pcie3                          unknown      empty        unconfigured unknown
pcie4                          unknown      empty        unconfigured unknown
pcie5                          unknown      empty        unconfigured unknown
pcie6                          unknown      empty        unconfigured unknown
pcie7                          unknown      empty        unconfigured unknown
pcie8                          unknown      empty        unconfigured unknown
pcie9                          nvme/hp      connected    configured   ok
pcie16                         unknown      disconnected unconfigured unknown
pcie17                         unknown      empty        unconfigured unknown
pcie18                         unknown      disconnected unconfigured unknown
pcie19                         unknown      empty        unconfigured unknown
root@teapot:~#

Actions #2

Updated by Electric Monk 14 days ago

  • Gerrit CR set to 2199
Actions #3

Updated by Rick Altherr 4 days ago

Tested by booting on Oxide platform known to exhibit this behavior. With a breakpoint set on pciehpc_intr, verified the breakpoint did not fire during a normal boot with no insertion/removal events. Removed device from a PCIe hotplug slot and verified interrupt fired to notify removal. Reinserted devices and verified interrupt fired to notify insertion.

Booted successfully on i86pc AMD Rome system but lack system with hotplug to verify interrupt firing.

Actions #4

Updated by Electric Monk about 14 hours ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit fe426563f5e7383c12abf6a347e131898d1a7f6c

commit  fe426563f5e7383c12abf6a347e131898d1a7f6c
Author: Rick Altherr <rick@oxidecomputer.com>
Date:   2022-07-05T21:07:04.000Z

    14758 spurious presence detect change notification when enabling PCIe hotplug interrupt for the first time
    Reviewed by: Robert Mustacchi <rm+illumos@fingolfin.org>
    Approved by: Richard Lowe <richlowe@richlowe.net>

Actions

Also available in: Atom PDF