PCI hotplug probe doesn't properly handle ARI devices
If, after enabling the hotplug service, one uses cfgadm to unconfigure an NVMe device and then tries to configure it again, the configure step fails. The problem is in pcicfg_configure(). In this case the device has only one physical function, but it also supports ARI, and this combination causes us to fail.
In this particular flow, we look at the first function and determine that we need to enable ARI mode. When ARI mode is enabled, we're able to programmatically determine the next function to visit. In this case, there is no next function to visit on this hardware device, so we jump to the done part of the loop without ever taking another lap. However, success is determined by the number of laps of the loop we've taken rather than by the number of functions we've successfully visited, so the configure is reported as failed even though the first function was visited. I suspect that this particular case hadn't been hit before.
The fix is easy. Instead of using the loop counter to determine success, keep track of how many functions we've successfully visited and use that instead.
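To make the difference concrete, here is a minimal sketch of the two ways of judging success. It is a toy model and not the code in pcicfg_configure(): device_t, probe_by_laps(), probe_by_visited() and MAX_FUNCTIONS are all hypothetical names, and the model assumes ARI function numbers stay below 8, which real ARI does not require.

```c
#include <stdbool.h>

#define MAX_FUNCTIONS 8		/* assumption: toy limit for this sketch */

/* Hypothetical model of one device; not an illumos data structure. */
typedef struct {
	bool present[MAX_FUNCTIONS];	/* which function numbers respond */
	bool ari;			/* device supports ARI */
	int next_fn[MAX_FUNCTIONS];	/* ARI "next function", 0 ends chain */
} device_t;

/* Old behaviour (modelled): success means the loop counter advanced. */
static bool
probe_by_laps(const device_t *dev)
{
	int func = 0;

	while (func < MAX_FUNCTIONS) {
		if (func == 0 && !dev->present[0])
			break;		/* nothing answers at function 0 */
		if (dev->ari) {
			int next = dev->next_fn[func];
			/* 0 ends the chain; also stop on a non-increasing */
			/* value so this sketch cannot loop forever. */
			if (next <= func)
				break;
			func = next;
		} else {
			func++;
		}
	}
	return (func != 0);
}

/* Fixed behaviour (modelled): count the functions actually visited. */
static bool
probe_by_visited(const device_t *dev)
{
	int func = 0;
	int visited = 0;

	while (func < MAX_FUNCTIONS) {
		if (dev->present[func])
			visited++;
		else if (func == 0)
			break;		/* nothing answers at function 0 */
		if (dev->ari) {
			int next = dev->next_fn[func];
			if (next <= func)
				break;
			func = next;
		} else {
			func++;
		}
	}
	return (visited > 0);
}
```

For a device with only function 0 present, ARI supported, and a next-function value of 0, probe_by_laps() returns false because the counter never leaves 0, while probe_by_visited() returns true because one function was visited. For a non-ARI device both return true, and for an empty slot both return false.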
With this in place, we can successfully use cfgadm configure and unconfigure on NVMe devices, a step on the way to full hotplug.
Updated by Electric Monk 5 months ago
- Status changed from New to Closed
commit 5c7348165122ba7c5485ccebd060a97705154b26
Author: Robert Mustacchi <email@example.com>
Date: 2019-06-03T05:16:35.000Z

10938 PCI hotplug probe doesn't properly handle ARI devices
Reviewed by: Jerry Jelinek <firstname.lastname@example.org>
Reviewed by: Toomas Soome <email@example.com>
Reviewed by: Peter Tribble <firstname.lastname@example.org>
Reviewed by: Andy Fiddaman <email@example.com>
Reviewed by: Gergő Doma <firstname.lastname@example.org>
Approved by: Dan McDonald <email@example.com>