Bug #9079

race condition in starting and ending condensing thread for indirect vdevs

Added by Serapheim Dimitropoulos 11 days ago. Updated 7 days ago.

Status: Closed
Start date: 2018-02-08
Priority: Normal
Due date:
Assignee: Serapheim Dimitropoulos
% Done: 100%
Category: zfs - Zettabyte File System
Target version: -
Difficulty: Medium
Tags: needs-triage

Description

The timeline of the race condition is the following:
[1] Thread A is about to finish condensing the first vdev in spa_condense_indirect_thread(),
so it calls the spa_condense_indirect_complete_sync() sync task, which sets the
spa_condensing_indirect field to NULL. While waiting for the sync task to finish, thread A
sleeps until the txg is done. Only after that does thread A acquire spa_async_lock
and set spa_condense_thread to NULL.
[2] While thread A waits for the txg to finish, thread B, which is running spa_sync(), checks
whether it should condense the second vdev in vdev_indirect_should_condense() by checking
the spa_condensing_indirect field, which was already set to NULL by the sync task issued from
spa_condense_indirect_thread() in thread A. So it goes on and tries to spawn a new condensing
thread in spa_condense_indirect_start_sync(), and the assertion there fails because thread A
has not yet set spa_condense_thread to NULL (which is basically the last thing it does before
returning).
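
The interleaving above can be replayed deterministically in a small, single-threaded C sketch. Everything prefixed with toy_ is a hypothetical stand-in written for illustration. Only the two field names come from the description, and none of this is the actual ZFS code. The sketch assumes that the start path requires spa_condense_thread to be NULL, as the description implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two spa_t fields named in the description. */
typedef struct toy_spa {
	void *spa_condensing_indirect;	/* non-NULL while a condense is in progress */
	void *spa_condense_thread;	/* non-NULL while the condensing thread exists */
} toy_spa_t;

/* Step [1], first half: the completion sync task clears the in-progress state. */
static void
toy_complete_sync(toy_spa_t *spa)
{
	spa->spa_condensing_indirect = NULL;
}

/* Step [1], second half: the last thing thread A does before returning. */
static void
toy_thread_exit(toy_spa_t *spa)
{
	spa->spa_condense_thread = NULL;
}

/* Step [2]: the check made from spa_sync(); it consults only one field. */
static bool
toy_should_condense(const toy_spa_t *spa)
{
	return (spa->spa_condensing_indirect == NULL);
}

/*
 * Step [2], continued: try to start a new condensing thread. Returns false
 * where the real code would trip its assertion, i.e. when the previous
 * thread pointer has not been cleared yet.
 */
static bool
toy_start_sync(toy_spa_t *spa, void *new_state, void *new_thread)
{
	if (spa->spa_condense_thread != NULL)
		return (false);
	spa->spa_condensing_indirect = new_state;
	spa->spa_condense_thread = new_thread;
	return (true);
}

/* Replays the timeline; returns true if the race window was hit. */
static bool
toy_replay_race(void)
{
	int state_a, thread_a, state_b, thread_b;	/* only their addresses are used */
	toy_spa_t spa = { &state_a, &thread_a };
	bool hit;

	toy_complete_sync(&spa);	/* thread A's sync task has run */
	/* Thread A is still asleep waiting for the txg; thread B runs now. */
	hit = toy_should_condense(&spa) &&
	    !toy_start_sync(&spa, &state_b, &thread_b);
	toy_thread_exit(&spa);		/* thread A finally clears its pointer */
	assert(spa.spa_condense_thread == NULL);
	return (hit);
}
```

Calling toy_replay_race() returns true: the "should condense" check passes while the thread pointer is still set, which is the window described in steps [1] and [2].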

The main issue here is that we rely on both spa_condensing_indirect and spa_condense_thread to
signify whether a condensing thread is running. Ideally we would only use one throughout the
codebase. In addition, for managing spa_condense_thread we currently use spa_async_lock, which
basically ties condensing to scrubbing when it comes to pausing and resuming those actions
during spa export.
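
As an illustration of the "only use one" idea, the sketch below keeps a single state word as the sole indicator of whether a condensing thread is running, so the check and the start cannot disagree. This is an assumption-laden toy, not the change that was committed for this bug: the toy_condense_t type, its functions, and the use of C11 atomics in place of a dedicated lock are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states for the single indicator. */
enum { TOY_IDLE = 0, TOY_RUNNING = 1 };

/* Hypothetical container; one field answers "is a condense running?". */
typedef struct toy_condense {
	atomic_int tc_state;
} toy_condense_t;

/*
 * Atomically move IDLE -> RUNNING. Returns false if a condense is already
 * running, so the caller never needs to consult a second field.
 */
static bool
toy_condense_try_start(toy_condense_t *tc)
{
	int expected = TOY_IDLE;

	return (atomic_compare_exchange_strong(&tc->tc_state, &expected,
	    TOY_RUNNING));
}

/* Query used by the "should we condense?" path. */
static bool
toy_condense_is_running(toy_condense_t *tc)
{
	return (atomic_load(&tc->tc_state) == TOY_RUNNING);
}

/* Called by the condensing thread as its final step. */
static void
toy_condense_finish(toy_condense_t *tc)
{
	assert(atomic_load(&tc->tc_state) == TOY_RUNNING);
	atomic_store(&tc->tc_state, TOY_IDLE);
}
```

Because the state lives in its own field with its own synchronization, this toy also does not depend on a lock shared with unrelated work such as scrubbing.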

History

#1 Updated by Electric Monk 7 days ago

  • % Done changed from 0 to 100
  • Status changed from New to Closed

git commit 667ec66f1b4f491d5e839644e0912cad1c9e7122

commit  667ec66f1b4f491d5e839644e0912cad1c9e7122
Author: Serapheim Dimitropoulos <serapheim@delphix.com>
Date:   2018-02-13T16:25:39.000Z

    9079 race condition in starting and ending condesing thread for indirect vdevs
    Reviewed by: Matt Ahrens <mahrens@delphix.com>
    Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
    Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
