Bug #8565

zpool dumps core trying to destroy unavail/faulted pool

Added by Igor Kozhukhov 16 days ago. Updated 15 days ago.

Status: New
Start date: 2017-08-07
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: zfs - Zettabyte File System
Target version: -
Difficulty: Medium
Tags: needs-triage

Description

DilOS with the latest merged changes from illumos.
I'm using zfs/zpool with 64-bit builds.

Steps to reproduce:
Create a zpool.
Remove the drives so that the zpool faults.
Try to destroy the zpool; it fails and dumps core.

root@con3:~# mdb core
Loading modules: [ libumem.so.1 libc.so.1 libavl.so.1 libnvpair.so.1 libuutil.so.1 ld.so.1 ]
> $C
ffffbf7fffdfb8e0 libumem.so.1`free+0xe()
ffffbf7fffdfb900 libshare.so.1`sa_init_arg+0xb(4, ffffbf7fffdfb9d0)
ffffbf7fffdfb950 libzfs.so.1`zfs_init_libshare_impl+0xac(4be550, 4, ffffbf7fffdfb9d0)
ffffbf7fffdfb980 libzfs.so.1`zfs_init_libshare_arg+0x10(4be550, 4, ffffbf7fffdfb9d0)
ffffbf7fffdfba50 libzfs.so.1`zpool_disable_datasets+0x1af(4c1dd0, 1)
ffffbf7fffdfba90 zpool_do_destroy+0xc5(3, ffffbf7fffdffb40)
ffffbf7fffdffae0 main+0x15e(4, ffffbf7fffdffb38)
ffffbf7fffdffb10 _start_crt+0x83()
ffffbf7fffdffb20 _start+0x18()
> 

This is probably related to the following change:
https://github.com/openzfs/openzfs/commit/d8eca23abd748aa0da443ec0816009f71ac10870

See lines 1055 and 1057, which call free().

History

#1 Updated by Igor Kozhukhov 16 days ago

  • Description updated (diff)

#2 Updated by Yuri Pankov 16 days ago

Yeah, it's reproducible, here's a bit better trace:

# zpool status
  pool: data
 state: UNAVAIL
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        data        UNAVAIL      0     0     0  insufficient replicas
          c4t0d0    REMOVED      0     0     0

# zpool destroy data
Segmentation Fault (core dumped)
# mdb /var/cores/core.zpool.100653.1502125643
Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 libuutil.so.1 ld.so.1 ]
> $C
08043588 libumem.so.1`process_free+0x22(baddcafe, 1, 0, fec58000)
080435a8 libumem.so.1`umem_malloc_free+0x1a(baddcafe, 815eaa0, 81993c8, 1, 0, 0)
08043a88 libshare.so.1`sa_init_impl+0x576(4, 8043b44, 8174f98, fedbc172)
08043aa8 libshare.so.1`sa_init_arg+0x11(4, 8043b44, 400, fedbc1bc, 0, fede0000)
08043ad8 libzfs.so.1`zfs_init_libshare_impl+0x5d(81410c8, 4, 8043b44, 8043b44)
08043af8 libzfs.so.1`zfs_init_libshare_arg+0x14(81410c8, 4, 8043b44, 80528f2, fef70548, fef70548)
08043b78 libzfs.so.1`zpool_disable_datasets+0x1ee(8143dc8, 0, 806730c, 8055344)
08043bb8 zpool_do_destroy+0x140(2, 8047c48, 80787c0, 801, 0, 0)
08047c08 main+0x12c(feed8147, fef53328, 8047c38, 80551a3, 3, 8047c44)
08047c38 _start+0x83(3, 8047d48, 8047d4e, 8047d56, 0, 8047d5b)
>

#3 Updated by Yuri Pankov 16 days ago

  • Category set to zfs - Zettabyte File System
  • Subject changed from zpool destroy produce segmentation fault to zpool dumps core trying to destroy unavail/faulted pool

#4 Updated by Igor Kozhukhov 16 days ago

  • Description updated (diff)

#5 Updated by Matthew Ahrens 15 days ago

It looks like sa_get_one_zfs_share() doesn’t fill in all of paths[] when it gets the SA_SYSTEM_ERR error.
We probably need to free all the paths and set `*path_len=0` in the error case.

Thanks for reporting this, Igor. We'll get on it.
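
The cleanup proposed in the comment above can be sketched roughly as follows. This is not the libshare code: collect_paths(), release_paths(), dup_string(), the fail_at parameter and the SKETCH_* codes are hypothetical stand-ins used only to show the pattern. The baddcafe argument to process_free in the second trace is the pattern libumem writes into uninitialized allocations, which fits a caller freeing entries that were never filled in. The sketch assumes the fix is to release whatever was filled in and report a zero length on failure, so the caller has nothing to free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes; the values are illustrative only. */
enum { SKETCH_OK = 0, SKETCH_SYSTEM_ERR = 1 };

/* Portable string copy helper (hypothetical, avoids relying on strdup). */
static char *
dup_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);

	if (copy != NULL)
		memcpy(copy, s, n);
	return (copy);
}

/*
 * Hypothetical collector: copies `count` strings from `src` into a new
 * array. `fail_at` simulates an error at that index (pass -1 for none).
 * On any error the partial array is freed and the outputs stay
 * NULL / 0, so the caller never sees half-filled entries.
 */
static int
collect_paths(const char *const *src, size_t count, int fail_at,
    char ***paths_out, size_t *path_len)
{
	char **paths;
	size_t i;

	assert(paths_out != NULL && path_len != NULL);
	*paths_out = NULL;
	*path_len = 0;

	if (count == 0)
		return (SKETCH_OK);

	/* calloc() leaves unfilled slots as NULL rather than garbage. */
	paths = calloc(count, sizeof (char *));
	if (paths == NULL)
		return (SKETCH_SYSTEM_ERR);

	for (i = 0; i < count; i++) {
		if ((int)i == fail_at)
			goto fail;
		paths[i] = dup_string(src[i]);
		if (paths[i] == NULL)
			goto fail;
	}

	*paths_out = paths;
	*path_len = count;
	return (SKETCH_OK);

fail:
	/* Free only the entries filled so far, then the array itself. */
	while (i > 0)
		free(paths[--i]);
	free(paths);
	return (SKETCH_SYSTEM_ERR);
}

/* Caller-side cleanup; safe after a failed collect_paths() (NULL, 0). */
static void
release_paths(char **paths, size_t path_len)
{
	size_t i;

	for (i = 0; i < path_len; i++)
		free(paths[i]);
	free(paths);
}
```

With this shape, a caller can invoke release_paths() unconditionally after collect_paths(): on failure it receives a NULL array and a length of 0, so the loop does nothing and free(NULL) is a no-op.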
