blown assert in i40e_intr_io_clear_cause()
As reported by sjorge in illumos-joyent#189 (https://github.com/joyent/illumos-joyent/issues/189), the following DEBUG assertion trips on shutdown. Robert Mustacchi and I have encountered this assertion before as well.
#ifdef DEBUG
	/*
	 * Verify that the interrupt in question is disabled. This is a
	 * prerequisite of modifying the data in question.
	 */
	reg = I40E_READ_REG(hw, I40E_PFINT_DYN_CTLN(i));
	VERIFY0(reg & I40E_PFINT_DYN_CTLN_INTENA_MASK);
#endif
I've finally spent a little time looking at this, and there are a few problems. The first is that when we iterate on debug bits to clear out the linked lists, we do so based on the number of queues. This is incorrect: we should iterate over the number of interrupt vectors and index into them properly. I've also opted to remove this debug assertion. On re-reading the code, we've already theoretically disabled all the interrupts by this point, so it should be fine here. Also, when we bring the device back up, we reset the PF anyway, so any lingering linked list entries should be fine.
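To illustrate the indexing problem, here is a minimal, self-contained sketch. It is not the driver code: the register array, `NUM_VECTORS`, `NUM_QUEUES`, `QUEUE_EOL`, `lnklstn_write()` and `clear_cause_by_vector()` are all hypothetical stand-ins. It assumes that vector 0 is reserved for non-I/O causes and that I/O vector v (1-based) maps to per-vector register index v - 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes: vector 0 is non-I/O, vectors 1..3 carry I/O. */
#define	NUM_VECTORS	4
#define	NUM_QUEUES	8	/* more queues than I/O vectors */
#define	QUEUE_EOL	0x7ffu	/* hypothetical end-of-list marker */

/* Simulated per-vector link list registers, one per I/O vector. */
uint32_t lnklstn[NUM_VECTORS - 1];

/* Simulated register write that rejects out-of-range indices. */
void
lnklstn_write(uint32_t idx, uint32_t val)
{
	assert(idx < NUM_VECTORS - 1);
	lnklstn[idx] = val;
}

/*
 * Terminate every I/O vector's link list. The loop is bounded by the
 * number of vectors, not the number of queues, and I/O vector v is
 * assumed to map to register index v - 1. Returns the number of
 * registers written.
 */
uint32_t
clear_cause_by_vector(void)
{
	uint32_t v, count = 0;

	for (v = 1; v < NUM_VECTORS; v++) {
		lnklstn_write(v - 1, QUEUE_EOL);
		count++;
	}
	return (count);
}
```

In this sketch, a loop bounded by `NUM_QUEUES` instead would call `lnklstn_write()` with indices 3 through 7 and trip the bounds check, which is the kind of mismatch described above.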
To test this, I've rebooted several systems on debug bits. Systems that used to reliably trip this assertion no longer do so. A few other manual starts and stops have also seemed fine.
Updated by Electric Monk almost 3 years ago
- Status changed from New to Closed
commit 093e84535f35ec94776a855ada3dac96daf5d602
Author: Robert Mustacchi <firstname.lastname@example.org>
Date: 2019-08-28T21:11:13.000Z

    11577 blown assert in i40e_intr_io_clear_cause()
    Reviewed by: Dan McDonald <email@example.com>
    Reviewed by: Ryan Zezeski <firstname.lastname@example.org>
    Reviewed by: Randy Fishel <email@example.com>
    Reviewed by: Andrew Stormont <firstname.lastname@example.org>
    Approved by: Dan McDonald <email@example.com>