apix may lose interrupts occurring while softint is running at same IPL
During development of the emlxs driver, Bob Warning of ATTO found that hardware interrupts could go missing. He analyzed the issue and traced it to a bug in apix. Here's what happens:
Assume apix_do_interrupt() is currently processing a softint at IPL x. While it is running, a hardint comes in at the same IPL x and is put on the pending queue, which sets bit x in x_intr_pending. Eventually the softint finishes, but pending hardints are not processed at that point, so the hardint stays pending for an indefinite amount of time.
Now if a 2nd hardint comes in at IPL x, bit x in x_intr_pending is cleared and the 2nd hardint is processed. Once that finishes, apix processes pending hardints, but the first hardint is forgotten because its bit has already been cleared.
The solution is to process pending hardints after a softint finishes, just as is already done after a hardint finishes.
The problem has been seen with emlxs in target mode and with qlt. Apparently it's much easier to trigger it when using COMSTAR as that seems to trigger lots of softints. This bug may also be the cause of the lost interrupts that were reported with nvme, but I have no proof that this is actually the case.
Updated by Electric Monk almost 7 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
commit cb214887258e20b89cd275946a280fee9c4b47fa
Author: Bob Warning <RWarning@atto.com>
Date: 2017-01-10T09:48:14.000Z

    7724 apix may lose interrupts occuring while softint is running at same IPL
    Reviewed by: Hans Rosenfeld <firstname.lastname@example.org>
    Reviewed by: Dan McDonald <email@example.com>
    Reviewed by: Robert Mustacchi <firstname.lastname@example.org>
    Approved by: Gordon Ross <email@example.com>