Bug #9187

racing condition between vdev label and spa_last_synced_txg in vdev_validate

Added by Brad Lewis over 1 year ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Category: zfs - Zettabyte File System
Start date: 2018-02-23
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

ztest failed with an uncorrectable I/O error despite having the fix for #7163. Both sides of the mirror report CANT_OPEN_BAD_LABEL, which distinguishes this failure from that issue.

This looks like a race condition between vdev_validate and spa_sync:
1. Thread A (spa_sync): the vdev label is updated to the latest txg.
2. Thread B (vdev_validate): the vdev label's txg is compared to spa_last_synced_txg and is found to be ahead.
3. Thread A (spa_sync): spa_last_synced_txg is updated to the latest txg.
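
The interleaving above can be replayed deterministically in a small model. This is only a sketch: the struct and the model_* functions are hypothetical stand-ins, not ZFS code, and the check "label txg must not exceed the last synced txg" is an assumption about what the comparison in step 2 amounts to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two values involved in the race. */
struct pool_model {
	uint64_t label_txg;		/* txg recorded in the vdev label */
	uint64_t last_synced_txg;	/* models spa_last_synced_txg */
};

/* Step 1: the sync thread writes the new txg into the label. */
void
model_sync_write_label(struct pool_model *p, uint64_t txg)
{
	assert(txg >= p->label_txg);	/* txgs only move forward */
	p->label_txg = txg;
}

/* Step 3: the sync thread publishes the txg as the last synced one. */
void
model_sync_publish_txg(struct pool_model *p, uint64_t txg)
{
	assert(txg >= p->last_synced_txg);
	p->last_synced_txg = txg;
}

/* Step 2: unguarded check; a label ahead of the last synced txg is rejected. */
bool
model_validate_unguarded(const struct pool_model *p)
{
	return (p->label_txg <= p->last_synced_txg);
}

/* Replays steps 1-3 in order and returns what the validator saw in step 2. */
bool
model_replay_race(void)
{
	struct pool_model p = { .label_txg = 10, .last_synced_txg = 10 };
	bool seen_valid;

	model_sync_write_label(&p, 11);			/* thread A */
	seen_valid = model_validate_unguarded(&p);	/* thread B */
	model_sync_publish_txg(&p, 11);			/* thread A */
	return (seen_valid);
}
```

In this model the validator rejects a label that is perfectly good, simply because it looked between steps 1 and 3; the same check made before step 1 or after step 3 passes.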

Solution: do not check the txg in vdev_validate unless the config lock is held.
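
A minimal sketch of that idea follows. It is not the code from the commit: validate_view and sketch_validate_guarded are hypothetical names, and the config_lock_held flag assumes that holding the config lock means spa_sync cannot be partway through its update.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of what a validator can observe. */
struct validate_view {
	uint64_t label_txg;		/* txg read from the vdev label */
	uint64_t last_synced_txg;	/* models spa_last_synced_txg */
	bool config_lock_held;		/* assumption: sync is not mid-update */
};

/* Returns true when the label should be accepted as far as the txg goes. */
bool
sketch_validate_guarded(const struct validate_view *v)
{
	/*
	 * Without the lock the two values may come from different moments,
	 * so the comparison proves nothing; skip it.
	 */
	if (!v->config_lock_held)
		return (true);

	/* With the lock held the values are consistent, so compare them. */
	return (v->label_txg <= v->last_synced_txg);
}
```

With this guard, a label that is one txg ahead is accepted when the lock is not held (the race window) and still rejected when the lock is held and the label is genuinely ahead.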

History

#1

Updated by Electric Monk over 1 year ago

  • % Done changed from 0 to 100
  • Status changed from New to Closed

git commit d1de72cfa29ab77ff80e2bb0e668a6afa5bccaf0

commit  d1de72cfa29ab77ff80e2bb0e668a6afa5bccaf0
Author: Pavel Zakharov <pavel.zakharov@delphix.com>
Date:   2018-03-19T18:21:36.000Z

    9187 racing condition between vdev label and spa_last_synced_txg in vdev_validate
    Reviewed by: George Wilson <george.wilson@delphix.com>
    Reviewed by: Matt Ahrens <matthew.ahrens@delphix.com>
    Approved by: Robert Mustacchi <rm@joyent.com>
