Bug #12143

scan code should check the return value of zfs_btree_first

Added by Kody Kantor about 2 months ago. Updated about 1 month ago.

Status: Closed
Priority: Normal
Assignee:
Category: zfs - Zettabyte File System
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Bite-size
Tags:

Description

Gernot reported on smartos-discuss that he has systems reliably panicking when running a zpool scrub. The message thread is here: https://smartos.topicbox.com/groups/smartos-discuss/T7028875b79636021/crash-while-scrubing

Gernot provided a dump with the given stack:

> ::stack
scan_io_queue_fetch_ext+0xdc(fffffe0bc70e8880)
scan_io_queues_run_one+0xb1(fffffe0bc70e8880)
taskq_thread+0x2cd(fffffe0ba85d7270)
thread_start+0xb()

At this point scan_io_queue_fetch_ext has just called zfs_btree_first(), which returned NULL. That is a legitimate return value here, because the btree is empty, as shown by its height of -1:

> fffffe0bc70e8880::print dsl_scan_io_queue_t q_exts_by_size.bt_height | =D
                -1

zfs_btree_first returns NULL when the height is -1. scan_io_queue_fetch_ext does not check for this and passes the NULL pointer to the range tree inline functions that follow, which leads to the panic we see in this dump.
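The failure mode and the shape of the fix can be illustrated with a minimal sketch. This is not the illumos or ZoL code: toy_tree_t, toy_tree_first, and toy_queue_fetch are hypothetical stand-ins, and the only assumption carried over from the dump is that a height of -1 means an empty tree whose "first" lookup returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a btree; a height of -1 means it is empty. */
typedef struct toy_tree {
	int64_t tt_height;
	uint64_t *tt_elems;	/* sorted elements; unused when empty */
	size_t tt_count;
} toy_tree_t;

/* Return a pointer to the first element, or NULL if the tree is empty. */
static uint64_t *
toy_tree_first(toy_tree_t *t)
{
	assert(t != NULL);
	if (t->tt_height == -1)
		return (NULL);
	return (&t->tt_elems[0]);
}

/*
 * Fetch the next value to process. Returns 1 and stores the value in
 * *out, or returns 0 when the tree is empty. The NULL check is the
 * point of the sketch: without it, the dereference below would fault
 * on an empty tree.
 */
static int
toy_queue_fetch(toy_tree_t *t, uint64_t *out)
{
	uint64_t *first = toy_tree_first(t);

	if (first == NULL)
		return (0);
	*out = *first;
	return (1);
}
```

A caller would treat a return of 0 as "nothing queued" and skip the rest of its per-extent work, rather than handing a NULL pointer to later helpers.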

ZoL already has a fix for this in commit 516a83f8861269de9f795a96c471623e3bd67009 ('Function name and comment updates'). We should pull this commit into illumos.

History

#1

Updated by Robert Mustacchi about 2 months ago

  • Description updated
#2

Updated by Kody Kantor about 1 month ago

Gernot (the original reporter) said via email that he tested this change on his affected systems and it has resolved the issue.
Additionally, I ran a few manual zpool replace, zpool detach, and zpool scrub commands on a zpool on my OpenIndiana machine and didn't experience any panics or errors.

#3

Updated by Electric Monk about 1 month ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

commit  bfb9edc9bd178b0ce7fa2fbe1fc66e18e316af4e
Author: Paul Dagnelie <pcd@delphix.com>
Date:   2020-01-06T19:44:07.000Z

    12143 scan code should check the return value of zfs_btree_first
    Portions contributed by: Kody Kantor <kody.kantor@joyent.com>
    Reviewed by: Sara Hartse <sara.hartse@delphix.com>
    Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed by: Matt Ahrens <matt@delphix.com>
    Reviewed by: Jason King <jason.king@joyent.com>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
    Approved by: Dan McDonald <danmcd@joyent.com>
