Serialize ZTHR operations to eliminate races
ZoL 61c3391acc9 Serialize ZTHR operations to eliminate races
Updated by Dan McDonald about 3 years ago
From ZoL commiter Serapheim Dimitropoulos:
This issue was filed in ZoL (https://github.com/zfsonlinux/zfs/issues/7744) but we've also hit it
internally within DelphixOS (illumos-variant) and thus affects illumos too.
The bug was discovered by an assertion failure during a ztest run. The specific assertion is the
`ASSERT;` statement located in `zthr_resume()` of zthr.c and can fail when
multiple threads are trying to destroy/export a pool and at least one of them encounters an
Timeline includes (4 threads: 3 [thread A, B, C] trying to destroy the pool and 1 [thread D] the actual zthr):
1] Thread A starts and sends a cancel request to Thread D waiting for it to be cancelled
2] Thread B starts and sends a cancel request to Thread D waiting for it to be cancelled
3] Thread D gets cancelled and both thread A and B wake up.
4] Thread A fails its destroy because Thread B has a reference to the pool, so it decided to restart thread D as part of its cleanup.
5] Thread C starts and sends a cancel request to Thread D waiting for it to be cancelled
6] Thread B fails its destroy because Thread C has a reference to the pool, so it decided to restart thread D which is being cancelled and thus hits the assertion.
After encountering this bug and spending close attention on how that piece of infrastructure
and its locking was organized together with the wider pool locking we found a few more
variants of the above race. The ZoL fix should get rid of all of the races of this class.
As for the impact of the actual bug; it is minimal. The case has only been reproduced within
ztest which actually avoids a lot of locks on its path (compared to issuing commands from
the userland utilities). In addition, if this bug were to come up in a non-debug build (no ASSERTS)
then the result would be one leaked thread that is sleeping. If you want to test whether your
system has been affected, do the math and ensure that you don't have more than the expected
If your version of ZFS supports condensing after removal and the pool checkpoint, and the split
in the arc reclaim jobs you should have 4 zthr threads per pool. If you see on the stack of mdb/kmdb
that there exists 5 zthrs then you must have hit this bug.
Updated by Electric Monk about 3 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
commit 6a316e1f6d32750bb8fcf2558dcb17b90ca580fd Author: Serapheim Dimitropoulos <firstname.lastname@example.org> Date: 2019-04-29T18:49:59.000Z 10844 Serialize ZTHR operations to eliminate races Portions contributed by: Jerry Jelinek <email@example.com> Reviewed by: Matt Ahrens <firstname.lastname@example.org> Reviewed by: Tom Caputi <email@example.com> Reviewed by: Brian Behlendorf <firstname.lastname@example.org> Reviewed by: Kody Kantor <email@example.com> Approved by: Dan McDonald <firstname.lastname@example.org>