Bug #14022


zpool online -e breaks access to pool

Added by Andy Fiddaman 2 months ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Start date:
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

Attempting to expand a ZFS pool in a single operation using zpool online -e does not work: pool access hangs, and the machine with it.
If the partition is expanded first using format(1M), however, the expansion succeeds.
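For reference, the two-step sequence that does work, as a sketch (pool and disk names are taken from the reproducer below; the commands are echoed rather than executed, since they need a live illumos system and format's partition menu is interactive):

```shell
# Two-step workaround sketch: grow the slice first, then expand the pool.
# Assumes the single-disk pool "rpool" on c1t0d0 from the reproducer below.
POOL=rpool
DISK=c1t0d0
# Step 1: interactively expand slice 1 in format(1M)'s partition menu.
echo "format -e $DISK"
# Step 2: with the slice already grown, this step succeeds.
echo "zpool online -e $POOL $DISK"
```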

root@omnios:~# zpool status
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c1t0d0    ONLINE       0     0     0

errors: No known data errors

root@omnios:~# zpool get expandsz rpool
NAME   PROPERTY    VALUE     SOURCE
rpool  expandsize  2G        -

root@omnios:~# prtvtoc /dev/dsk/c1t0d0
* /dev/dsk/c1t0d0 EFI partition map
*
* Dimensions:
*         512 bytes/sector
*    20971520 sectors
*    16777149 accessible sectors
*
* Flags:
*   1: unmountable
*  10: read-only
*
* Unallocated space:
*         First       Sector      Last
*         Sector       Count      Sector
*            34         222         255
*
*                            First       Sector      Last
* Partition  Tag  Flags      Sector       Count      Sector  Mount Directory
       0     12    00          256      524288      524543
       1      4    00       524544    16236255    16760798
       8     11    00     16760799       16384    16777182
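As a sanity check on these numbers (pure arithmetic, runnable anywhere): the label covers 16777149 of the disk's 20971520 sectors, and the remainder matches the 2G EXPANDSZ reported above.

```shell
# Unlabelled space = (total sectors - accessible sectors) * 512 bytes/sector.
total=20971520
accessible=16777149
bytes=$(( (total - accessible) * 512 ))
echo "unlabelled: $(( bytes / 1024 / 1024 )) MiB"   # ~2 GiB, i.e. EXPANDSZ
```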

root@omnios:~# zpool online -e rpool c1t0d0

[0]> ::stacks -c vdev_online
THREAD           STATE    SOBJ                COUNT
fffffe064f725400 SLEEP    CV                      1
                 swtch+0x133
                 cv_wait+0x68
                 zfs`txg_wait_synced_impl+0xa5
                 zfs`txg_wait_synced+0xd
                 zfs`spa_vdev_state_exit+0xb9
                 zfs`vdev_online+0x23a
                 zfs`zfs_ioc_vdev_set_state+0x7b
                 zfs`zfsdev_ioctl+0x1fd
                 cdev_ioctl+0x2b
                 specfs`spec_ioctl+0x45
                 fop_ioctl+0x5b
                 ioctl+0x153

[0]> ::msgbuf
...
WARNING: Pool 'rpool' has encountered an uncorrectable I/O failure and has been suspended; `zpool clear` will be required before the pool can be written to.
[0]> ::zfs_dbgmsg
disk vdev '/dev/dsk/c1t0d0s1': vdev_disk_open: failed to get size
spa=rpool async request task=1
rpool: metaslab allocation failure: zio fffffe065a5b49f8, size 512, error 28
rpool: metaslab allocation failure: zio fffffe065a5d3dd0, size 4096, error 28
rpool: metaslab allocation failure: zio fffffe065a5c2228, size 4096, error 28
rpool: metaslab allocation failure: zio fffffe065a5c3508, size 4096, error 28
rpool: metaslab allocation failure: zio fffffe065a586cb0, size 49152, error 28
rpool: metaslab allocation failure: zio fffffe065a5d1200, size 4096, error 28
rpool: zil block allocation failure: size 36864, error 28
rpool: metaslab allocation failure: zio fffffe065a584930, size 4096, error 28
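Error 28 in these messages is ENOSPC: with the slice size misread, every allocation fails as if the pool had no space left. A quick lookup via Python's errno table (python3 here is my tooling, not part of the report):

```shell
# errno 28 on illumos (and Linux) is ENOSPC, "No space left on device".
python3 -c 'import errno, os; print(errno.errorcode[28], "-", os.strerror(28))'
```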

[0]> ::spa
ADDR                 STATE NAME
fffffe064a8eb000    ACTIVE rpool
[0]> fffffe064a8eb000::spa_vdevs
ADDR             STATE     AUX          DESCRIPTION
fffffe064b8ee000 CANT_OPEN NO_REPLICAS  root
fffffe064b8eb000 CANT_OPEN OPEN_FAILED    /dev/dsk/c1t0d0s1

This is a bhyve VM, so:

carolina# mdb /dev/zvol/rdsk/rpool/test/root
> ::load disk_label
> ::gpt
Signature: EFI PART (valid)
Revision: 1.0
HeaderSize: 92 bytes
HeaderCRC32: 0xfcd5dc82 (should be 0xfcd5dc82)
Reserved1: 0 (should be 0x0)
MyLBA: 1 (should be 1)
AlternateLBA: 20971519
FirstUsableLBA: 34
LastUsableLBA: 20971486
DiskGUID: 5897e0f9-b01a-4cd7-d30f-95c37f715c2f
PartitionEntryLBA: 2
NumberOfPartitionEntries: 9
SizeOfPartitionEntry: 0x80 bytes
PartitionEntryArrayCRC32: 0x59d5ed6a (should be 0x59d5ed6a)

PART TYPE                STARTLBA      ENDLBA        ATTR     NAME
0    EFI_SYSTEM          256           524543        0        loader
1    EFI_USR             524544        20955102      0        zfs
2    EFI_UNUSED
3    EFI_UNUSED
4    EFI_UNUSED
5    EFI_UNUSED
6    EFI_UNUSED
7    EFI_UNUSED
8    EFI_RESERVED        20955103      20971486      0
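A hypothetical host-side setup matching this layout (the zvol name comes from the mdb invocation above; the 8G-to-10G growth is inferred from the two labels, since the actual commands are not in the report). Echoed rather than executed:

```shell
# Hypothetical reproducer on the bhyve host: back the guest disk with a
# zvol, then grow it so the guest sees unallocated space (EXPANDSZ).
VOL=rpool/test/root            # zvol path from the mdb command above
echo "zfs create -V 8G $VOL"
echo "zfs set volsize=10G $VOL"
```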
#1 - Updated by Toomas Soome 2 months ago

Andy Fiddaman wrote:

Attempting to expand a ZFS pool in one operation using zpool online -e does not work, causing pool access (and the machine) to hang.
If the partition is expanded first using format, however, then it succeeds.

EFI wholedisk or custom partitioning?

#2 - Updated by Andy Fiddaman 2 months ago

  • Description updated
#3 - Updated by Andy Fiddaman 2 months ago

  • Description updated
#4 - Updated by Andy Fiddaman 2 months ago

Toomas Soome wrote in #note-1:

EFI wholedisk or custom partitioning?

I've expanded the description. The reproducer I'm using is EFI whole disk, so that zpool online -e tries to expand the partition itself.

#5 - Updated by Andy Fiddaman about 2 months ago

The pool ends up suspended because ldi_get_size() returns -1 (DDI_FAILURE) when called from vdev_disk_open() in the re-open case.

dtrace -n 'fbt::ldi_get_size:entry{v = (struct ldi_handle *)arg0; print(*v); self->t++}' -n 'fbt::ldi*:entry/self->t/{}' -n 'fbt::ldi*:return/self->t/{trace(arg1)}' -F
CPU FUNCTION
  2  -> ldi_get_size                          struct ldi_handle {
    struct ldi_handle *lh_next = 0
    uint_t lh_ref = 0x1
    uint_t lh_flags = 0
    uint_t lh_type = 0x2
    struct ldi_ident *lh_ident = 0xfffffe064a4abbf8
    vnode_t *lh_vp = 0xfffffe064a4c3a40
    kmutex_t [1] lh_lock = [
        kmutex_t {
            void *[1] _opaque = [ 0 ]
        }
    ]
    struct ldi_event *lh_events = 0
}
  2   | ldi_get_size:entry
  2    -> ldi_get_otyp
  2    <- ldi_get_otyp                                        0
  2    -> ldi_prop_exists
  2    <- ldi_prop_exists                                     0
  2    -> ldi_prop_exists
  2    <- ldi_prop_exists                                     0
  2    -> ldi_prop_exists
  2    <- ldi_prop_exists                                     0
  2    -> ldi_prop_exists
  2    <- ldi_prop_exists                                     0
  2  <- ldi_get_size                                 4294967295
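The 4294967295 returned by ldi_get_size() in the trace above is just -1 (DDI_FAILURE) displayed as an unsigned 32-bit value:

```shell
printf '0x%x\n' 4294967295    # 0xffffffff == (uint32_t)-1
```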