Bug #7328

the zpool command hangs when a drive is retired from the system

Added by Pavel Cahyna about 4 years ago.

Status: New
Priority: Normal
Assignee: -
Category: zfs - Zettabyte File System
Start date: 2016-08-26
Due date:
% Done: 0%
Estimated time:
Difficulty:
Tags: needs-triage
Gerrit CR:

Description

System version omnios-c91bcdf

While testing bug #7327, I created a zpool on a disk, let fmd retire that disk by injecting an over-temperature fault, and then ran zpool status:

# /usr/lib/fm/fmd/fminject over-temperature.errlog
# fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Aug 26 09:10:39 3a218a20-72cd-6f65-a829-be7e1bac817d  DISK-8000-12   Major     injected

Host        : nfsserv2
Platform    : X10DRH-LN4        Chassis_id  : 123456789
Product_sn  : 

Fault class : fault.io.disk.over-temperature
Affects     : dev:///:devid=id1,sd@n5000c50057fb9faf//scsi_vhci/disk@g5000c50057fb9faf
                  faulted and taken out of service
FRU         : "Slot 07" (hc://:product-id=LSI-CORP-SAS2X36:server-id=:chassis-id=50030480016c8c3f:serial=Z1Z33W7Y00009424GJQS:part=SEAGATE-ST4000NM0023:revision=0003/ses-enclosure=0/bay=6/disk=0)
                  faulty

Description : A disk's temperature exceeded the limits established by
              its manufacturer.
              Refer to http://illumos.org/msg/DISK-8000-12 for more
              information.

Response    : None.

Impact      : Performance degradation is likely and continued disk operation
              beyond the temperature threshold can result in disk
              damage and potential data loss.

Action      : Ensure that the system is properly cooled, that all fans are
              functional, and that there are no obstructions of airflow to the
              affected
              disk.

# zpool status test
  pool: test
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: none requested
^C^Z
^Z^C

The zpool command became unkillable: ^C and ^Z had no effect, and neither did kill -9 or kill -STOP. It should instead report a reasonable explanation, for example that the vdev does not exist or is inaccessible, and it certainly should not lock up this way.
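
Because the hung process ignores signals, a test script that calls zpool status directly would hang with it. The following is a minimal sketch of a way to detect the hang without blocking, assuming a POSIX shell. The helper name run_with_deadline and the exit code 124 are illustrative choices, not part of zpool or fmd.

```shell
#!/bin/sh
# Hypothetical helper: run a command in the background and poll for its exit.
# It never sends a signal and never waits on a process that is still alive,
# so the caller stays responsive even if the command is unkillable.
# Usage: run_with_deadline SECONDS COMMAND [ARGS...]
run_with_deadline() {
    deadline=$1
    shift
    "$@" &
    pid=$!
    elapsed=0
    # kill -0 only tests whether the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$deadline" ]; then
            echo "pid $pid still running after ${deadline}s: $*" >&2
            return 124   # assumed convention for "timed out"
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # The command has exited; collect and return its exit status.
    wait "$pid"
    return $?
}
```

With this helper, the failing step could be checked as `run_with_deadline 30 zpool status test`; a return value of 124 would indicate that the command was still running after 30 seconds. The stuck process itself is left in place, since the report shows that it cannot be killed.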


Related issues

Related to illumos gate - Bug #7329: zpool unusable on all pools when one pool has a problem (New, 2016-08-26)

History

#1

Updated by Pavel Cahyna about 4 years ago

  • Related to Bug #7329: zpool unusable on all pools when one pool has a problem added
