Bug #9694

closed

Parallel dump hangs

Added by Yuri Pankov almost 3 years ago. Updated almost 3 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: kernel
Start date: 2018-08-02
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

Parallel dump is currently disabled because it hangs. Apparently the last of the dump threads misses the wakeup and hangs the dump.

The hang occurs when the main task thread is waiting for all of the helper threads to complete, and the helper threads are waiting for the main task thread to close the queue of requests (helperq).
The main task thread always opens the helperq. However, if no helpers are being used (serial dump), it doesn't close it. This is not a problem if there really are no helper threads, as nothing is waiting on that queue and its eventual close. However, a follow-on fix was added after the initial integration of parallel dump, in which the main task thread checks that at least one helper thread has registered itself and, if not, assumes that there are no helpers. The problem is that at the time of the check it is possible that there are helper threads but none of them have registered yet. The main task thread thus assumes (incorrectly) that there are no helper threads, does a serial lzjb dump, and does not close the open helperq on completion. However, there are helper threads, and they are sitting waiting on the helperq while the taskq waits for them. This results in a hang.

There are 2 aspects that need to be addressed:
1. The main task thread should close the helperq when there is no more data to be compressed. This will fix the hang.
2. The main task thread should allow a little time for the helper threads to register before assuming that there are no helpers and reverting to serial dump.
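
The race and the two proposed changes can be illustrated with a small user-space model. This is only a sketch in Python threads, not the kernel code and not the committed fix: `HelperQueue`, `run_dump`, `grace`, and `always_close` are hypothetical names invented for this illustration, and the helpers discard their chunks rather than compressing them.

```python
import threading
import time


class HelperQueue:
    """Toy stand-in for helperq: get() blocks until work arrives or close()."""

    def __init__(self):
        self._cv = threading.Condition()
        self._items = []
        self._open = True

    def put(self, item):
        with self._cv:
            self._items.append(item)
            self._cv.notify()

    def get(self):
        """Return the next item, or None once the queue is closed and empty."""
        with self._cv:
            while not self._items and self._open:
                self._cv.wait()
            return self._items.pop(0) if self._items else None

    def close(self):
        with self._cv:
            self._open = False
            self._cv.notify_all()


def run_dump(chunks, nhelpers, helper_start_delay, grace, always_close):
    """Run a toy dump and return (mode, number_of_helpers_still_blocked).

    grace        -- seconds the main thread waits for a helper to register
    always_close -- close helperq even when falling back to serial dump
    """
    helperq = HelperQueue()
    registered = threading.Event()

    def helper():
        time.sleep(helper_start_delay)  # models a helper that registers late
        registered.set()
        while helperq.get() is not None:
            pass  # a real helper would compress the chunk here

    threads = [threading.Thread(target=helper, daemon=True)
               for _ in range(nhelpers)]
    for t in threads:
        t.start()

    # Aspect 2: allow helpers some time to register before assuming none exist.
    parallel = registered.wait(timeout=grace)
    if parallel:
        for chunk in chunks:
            helperq.put(chunk)
    # In serial mode the main thread would compress the chunks itself here.

    # Aspect 1: close helperq once there is no more data, in either mode.
    if parallel or always_close:
        helperq.close()

    for t in threads:
        t.join(timeout=0.5)  # bounded wait, so the model itself never hangs
    stuck = sum(t.is_alive() for t in threads)
    return ("parallel" if parallel else "serial"), stuck
```

With `grace=0.0` and `always_close=False`, late-registering helpers end up blocked on the queue, which corresponds to the hang described above. Setting `always_close=True` lets them exit even after a fallback to serial dump, and a non-zero `grace` lets the dump actually run in parallel.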

Actions #1

Updated by Yuri Pankov almost 3 years ago

  • Description updated (diff)
Actions #2

Updated by Electric Monk almost 3 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 50 to 100

git commit 6ccea42291d6cef3970fbb35ece075406851267f

commit  6ccea42291d6cef3970fbb35ece075406851267f
Author: Joyce McIntosh <joyce.mcintosh@nexenta.com>
Date:   2018-08-07T19:46:08.000Z

    9694 Parallel dump hangs
    Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
    Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
    Reviewed by: Evan Layton <evan.layton@nexenta.com>
    Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
    Reviewed by: John Levon <levon@movementarian.org>
    Approved by: Richard Lowe <richlowe@richlowe.net>
