mac_unregister() needs to mod_hash_remove() BEFORE holding the perimeter.
While digging through old workspaces, I found this one, which was originally reported to me by Circonus.
Essentially, there's a lock-ordering inversion between mac_unregister() and other parts of the mac/GLDv3 code. The mod_hash_remove() call takes the mod_hash's own lock, which other mod_hash users, notably mac_pool_update(), also acquire.
Reproducing this bug is straightforward: get ifconfig(1M) plumb/unplumb looping, get modunload(1M) looping, and get poold starting and stopping. The three attached scripts just need to be run concurrently in the background, and eventually one of the ifconfig(1M) processes will hang. A kernel coredump will then show mac_unregister() holding the mac perimeter and trying to acquire the mod_hash lock, while mac_pool_update() holds the mod_hash lock and tries to enter the mac perimeter.
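To make the ordering concrete, here is a minimal user-space sketch of the fixed ordering. It is not the illumos code: the two pthread mutexes are hypothetical stand-ins for the mod_hash lock and the mac perimeter, the `registered` flag stands in for the hash entry, and the `*_sketch` function names are invented for illustration. The point is only that, once the hash removal happens before the perimeter is taken, neither path waits for the hash lock while holding the perimeter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins, NOT the real illumos locks or data structures. */
static pthread_mutex_t hash_lock = PTHREAD_MUTEX_INITIALIZER; /* models the mod_hash lock */
static pthread_mutex_t perimeter = PTHREAD_MUTEX_INITIALIZER; /* models the mac perimeter */
static bool registered = true;  /* models the mac's entry in the hash */
static int pool_updates = 0;    /* counts updates applied to the mac */

/*
 * Models the mac_pool_update() side as described in the report:
 * it takes the hash lock first, then enters the perimeter.
 */
void
pool_update_sketch(void)
{
	pthread_mutex_lock(&hash_lock);
	if (registered) {
		pthread_mutex_lock(&perimeter);
		pool_updates++;
		pthread_mutex_unlock(&perimeter);
	}
	pthread_mutex_unlock(&hash_lock);
}

/*
 * Models the fixed mac_unregister() ordering: remove the entry from the
 * hash (taking and dropping the hash lock) BEFORE entering the perimeter.
 * The buggy ordering would have taken the perimeter first and then the
 * hash lock, which is the reverse of pool_update_sketch().
 */
void
unregister_fixed_sketch(void)
{
	pthread_mutex_lock(&hash_lock);
	registered = false;
	pthread_mutex_unlock(&hash_lock);

	pthread_mutex_lock(&perimeter);
	/* The entry must already be gone from the hash at this point. */
	assert(!registered);
	/* ... teardown under the perimeter would go here ... */
	pthread_mutex_unlock(&perimeter);
}
```

With this ordering, a thread in unregister_fixed_sketch() never holds the perimeter while waiting on the hash lock, so running it concurrently with pool_update_sketch() cannot produce the circular wait seen in the coredump.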
Updated by Electric Monk about 5 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
commit 8241ccbb39665a24ebedcca509f82ef3f0b6dd83
Author: Dan McDonald <email@example.com>
Date: 2017-03-03T21:24:44.000Z

6470 mac_unregister() needs to mod_hash_remove() BEFORE holding the perimeter.
Reviewed by: Ryan Zezeski <firstname.lastname@example.org>
Reviewed by: Michael Speer <email@example.com>
Approved by: Richard Lowe <firstname.lastname@example.org>