NVMe driver sporadically loses track of completed I/O requests, which leads to zpool hangs and a machine panic.
We tested the new set of NVMe fixes and encountered this blocking issue. It can be easily reproduced by running two zfs send/recv operations on a large dataset simultaneously.
In summary, the NVMe SSD indicates that an I/O has completed (the corresponding completion entry is present in the Completion Queue), but the driver never gets a chance to process it.
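
The condition described above can be illustrated with a minimal, self-contained C sketch. This is not the driver's code: all type and function names (cqe_t, cq_t, ctrl_t, cq_init, ctrl_post, cq_consume, cq_count_unprocessed) and the queue depth are hypothetical. Only the 16-byte completion entry layout and the Phase Tag convention are taken from the NVMe specification. The sketch shows how an entry can be recognized as "posted by the device but not yet consumed by the driver", which is the kind of check one might apply to a Completion Queue found in a crash dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CQ_DEPTH 4              /* hypothetical queue depth for this sketch */

/* 16-byte completion queue entry, laid out as in the NVMe specification. */
typedef struct {
    uint32_t dw0;               /* command specific */
    uint32_t dw1;               /* reserved */
    uint16_t sq_head;           /* submission queue head pointer */
    uint16_t sq_id;             /* submission queue identifier */
    uint16_t cid;               /* command identifier */
    uint16_t status;            /* bit 0: Phase Tag, bits 15:1: status field */
} cqe_t;

/* Driver-side view of the queue (hypothetical structure). */
typedef struct {
    cqe_t    entries[CQ_DEPTH];
    uint16_t head;              /* next entry the driver will consume */
    uint16_t phase;             /* Phase Tag expected for a new entry */
} cq_t;

/* Simulated controller-side state (hypothetical structure). */
typedef struct {
    uint16_t tail;              /* next slot the device will write */
    uint16_t phase;             /* Phase Tag the device currently writes */
} ctrl_t;

static void cq_init(cq_t *q, ctrl_t *c)
{
    memset(q, 0, sizeof (*q));  /* all Phase Tags start at 0 */
    q->phase = 1;               /* first pass: new entries carry phase 1 */
    c->tail = 0;
    c->phase = 1;
}

/* Simulated device: post a successful completion for command `cid`. */
static void ctrl_post(cq_t *q, ctrl_t *c, uint16_t cid)
{
    cqe_t *e = &q->entries[c->tail];
    e->cid = cid;
    e->status = c->phase;       /* status field 0 (success) plus Phase Tag */
    if (++c->tail == CQ_DEPTH) {
        c->tail = 0;
        c->phase ^= 1u;         /* device inverts the phase on each wrap */
    }
}

/* Driver: consume one entry if the Phase Tag at head matches. */
static bool cq_consume(cq_t *q, uint16_t *cid_out)
{
    assert(cid_out != NULL);
    cqe_t *e = &q->entries[q->head];
    if ((e->status & 1u) != q->phase)
        return false;           /* nothing new at head */
    *cid_out = e->cid;
    if (++q->head == CQ_DEPTH) {
        q->head = 0;
        q->phase ^= 1u;         /* driver inverts its expectation on wrap */
    }
    return true;
}

/* Count entries the device has posted that the driver has not consumed. */
static unsigned cq_count_unprocessed(const cq_t *q)
{
    uint16_t idx = q->head, phase = q->phase;
    unsigned n = 0;
    while (n < CQ_DEPTH && (q->entries[idx].status & 1u) == phase) {
        n++;
        if (++idx == CQ_DEPTH) {
            idx = 0;
            phase ^= 1u;
        }
    }
    return n;
}
```

In this model, a non-zero result from cq_count_unprocessed() while the driver still considers the matching commands outstanding corresponds to the situation reported here: the completion entry is in the queue, but nothing has consumed it.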
Please see the attached text file for an analysis of two cases. We have a number of crash dumps, but unfortunately we cannot share them, publicly or privately, without going through a lengthy approval process.
For reference, our NVMe drives are as follows:
- 22 x Intel DC P3600 400GB NVMe PCIe 3.0, MLC 2.5" 20nm SSDPE2ME400G4
- 2 x Intel DC P3700 800GB NVMe PCIe 3.0 HET MLC 2.5" 20nm SSDPE2MD800G4