Bug #9694
closed
Parallel dump hangs
100%
Description
Parallel dump is currently disabled because it hangs. Apparently the last of the dump threads misses the wakeup, which hangs the dump.
The hang occurs when the main task thread is waiting for all of the helper threads to complete, and the helper threads are waiting for the main task thread to close the queue of requests (helperq).
The main task thread always opens the helperq. However, if no helpers are being used (serial dump), it doesn't close it. This is not a problem if there really are no helper threads, as nothing is waiting on that queue and its eventual close. However, a follow-on fix was added after the initial integration of parallel dump, in which the main task thread checks that at least one helper thread has registered itself and, if not, assumes that there are no helpers. The problem is that at the time of the check it is possible that there are helper threads but none of them have registered yet. The main task thread thus assumes (incorrectly) that there are no helper threads, does a serial lzjb dump, and does not close the open helperq on completion. However, there are helper threads, they are sitting waiting on the helperq, and the taskq waits for them. This results in a hang.
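The race described above can be modelled in user space. The following is a minimal sketch in Python, not the kernel code: `HelperQueue`, `helper`, `buggy_main`, and `demo` are hypothetical names, and an event stands in for the helperq being closed. The helper is deliberately held back so that the main thread's registration check runs first, which is the ordering that produces the hang.

```python
import threading


class HelperQueue:
    """Hypothetical stand-in for the helperq and its registration count."""

    def __init__(self):
        self.closed = threading.Event()   # set when the main thread closes the queue
        self.registered = 0               # number of helpers that have registered
        self.lock = threading.Lock()


def helper(q, may_register):
    may_register.wait()                   # models a helper that is slow to start
    with q.lock:
        q.registered += 1
    q.closed.wait()                       # blocks until the queue is closed


def buggy_main(q):
    with q.lock:
        no_helpers = q.registered == 0    # check can run before any helper registers
    if no_helpers:
        return "serial"                   # bug: the queue is left open
    q.closed.set()
    return "parallel"


def demo():
    q = HelperQueue()
    may_register = threading.Event()
    t = threading.Thread(target=helper, args=(q, may_register), daemon=True)
    t.start()
    mode = buggy_main(q)                  # runs before the helper has registered
    may_register.set()                    # helper now registers and waits on the queue
    t.join(timeout=0.2)
    stuck = t.is_alive()                  # True: helper is still waiting for a close
    q.closed.set()                        # release the helper so the demo can exit
    t.join()
    return mode, stuck
```

In this model `demo()` reports that the main thread chose the serial path while the helper was still blocked on the open queue. In the real system nothing releases the helper, so the taskq wait never completes.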
There are 2 aspects that need to be addressed:
1. The main task thread should close the helperq when there is no more data to be compressed. This will fix the hang.
2. The main task thread should allow a little time for the helper threads to register before assuming that there are no helpers and reverting to serial dump.
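Both aspects can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the change that was integrated: the class and function names, the `grace` timeout, and the use of a Python condition variable are all hypothetical. It shows a bounded wait for helper registration and an unconditional close of the queue once there is no more data.

```python
import threading
import time


class HelperQueue:
    """Hypothetical model of the helperq: registration count plus a closed flag."""

    def __init__(self):
        self.cond = threading.Condition()
        self.registered = 0
        self.closed = False

    def register(self):
        with self.cond:
            self.registered += 1
            self.cond.notify_all()

    def wait_for_helpers(self, timeout):
        # Aspect 2: give helpers a bounded amount of time to register.
        with self.cond:
            return self.cond.wait_for(lambda: self.registered > 0, timeout)

    def close(self):
        with self.cond:
            self.closed = True
            self.cond.notify_all()

    def wait_closed(self):
        with self.cond:
            self.cond.wait_for(lambda: self.closed)


def helper(q, start_delay):
    time.sleep(start_delay)               # models a helper that starts late
    q.register()
    q.wait_closed()                       # returns once the main thread closes the queue


def run_dump(n_helpers, start_delay=0.05, grace=1.0):
    q = HelperQueue()
    threads = [threading.Thread(target=helper, args=(q, start_delay), daemon=True)
               for _ in range(n_helpers)]
    for t in threads:
        t.start()
    try:
        mode = "parallel" if q.wait_for_helpers(grace) else "serial"
        # ... compression and output of the dump data would happen here ...
    finally:
        q.close()                         # Aspect 1: always close, serial or parallel
    for t in threads:
        t.join(timeout=5)
    hung = any(t.is_alive() for t in threads)
    return mode, hung
```

Calling `run_dump(1, start_delay=0.3, grace=0.05)` models a helper that registers after the grace period has expired. The dump falls back to serial, but because the queue is still closed at the end, the late helper exits instead of hanging.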
Updated by Electric Monk about 5 years ago
- Status changed from In Progress to Closed
- % Done changed from 50 to 100
git commit 6ccea42291d6cef3970fbb35ece075406851267f
commit 6ccea42291d6cef3970fbb35ece075406851267f
Author: Joyce McIntosh <joyce.mcintosh@nexenta.com>
Date: 2018-08-07T19:46:08.000Z

    9694 Parallel dump hangs
    Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
    Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
    Reviewed by: Evan Layton <evan.layton@nexenta.com>
    Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
    Reviewed by: John Levon <levon@movementarian.org>
    Approved by: Richard Lowe <richlowe@richlowe.net>