Bug #8565
zpool dumps core trying to destroy unavail/faulted pool and try export it
Status: Closed
Priority: Normal
Assignee: -
Category: zfs - Zettabyte File System
Start date: 2017-08-07
Due date: -
% Done: 100%
Estimated time: -
Difficulty: Medium
Tags: needs-triage
Gerrit CR: -
External Bug: -
Description
DilOS with the latest merged changes from illumos.
I'm using zfs/zpool with 64-bit builds.

To reproduce:
- create a zpool
- remove the drives - the pool faults
- try to destroy the pool - it fails and dumps core:
root@con3:~# mdb core
Loading modules: [ libumem.so.1 libc.so.1 libavl.so.1 libnvpair.so.1 libuutil.so.1 ld.so.1 ]
> $C
ffffbf7fffdfb8e0 libumem.so.1`free+0xe()
ffffbf7fffdfb900 libshare.so.1`sa_init_arg+0xb(4, ffffbf7fffdfb9d0)
ffffbf7fffdfb950 libzfs.so.1`zfs_init_libshare_impl+0xac(4be550, 4, ffffbf7fffdfb9d0)
ffffbf7fffdfb980 libzfs.so.1`zfs_init_libshare_arg+0x10(4be550, 4, ffffbf7fffdfb9d0)
ffffbf7fffdfba50 libzfs.so.1`zpool_disable_datasets+0x1af(4c1dd0, 1)
ffffbf7fffdfba90 zpool_do_destroy+0xc5(3, ffffbf7fffdffb40)
ffffbf7fffdffae0 main+0x15e(4, ffffbf7fffdffb38)
ffffbf7fffdffb10 _start_crt+0x83()
ffffbf7fffdffb20 _start+0x18()
>
Probably related to these changes:
https://github.com/openzfs/openzfs/commit/d8eca23abd748aa0da443ec0816009f71ac10870
specifically the free() calls at lines 1055 & 1057.
Example of a core from 'zpool export':
root@con1:~# mdb core
Loading modules: [ libumem.so.1 libc.so.1 libavl.so.1 libnvpair.so.1 libuutil.so.1 ld.so.1 ]
> $C
ffffbf7fffdfb210 libc.so.1`_lwp_kill+0xa()
ffffbf7fffdfb240 libc.so.1`raise+0x1e(6)
ffffbf7fffdfb250 libumem.so.1`umem_do_abort+0x44()
ffffbf7fffdfb350 0xffffbf7fff0288f9()
ffffbf7fffdfb3a0 libumem.so.1`process_free+0xa5(482a58, 1, 0)
ffffbf7fffdfb3c0 libumem.so.1`umem_malloc_free+0x1a(482a58)
ffffbf7fffdfb8d0 libshare.so.1`sa_init_impl+0x44b(4, ffffbf7fffdfb9c0)
ffffbf7fffdfb8f0 libshare.so.1`sa_init_arg+0xb(4, ffffbf7fffdfb9c0)
ffffbf7fffdfb940 libzfs.so.1`zfs_init_libshare_impl+0xac(484550, 4, ffffbf7fffdfb9c0)
ffffbf7fffdfb970 libzfs.so.1`zfs_init_libshare_arg+0x10(484550, 4, ffffbf7fffdfb9c0)
ffffbf7fffdfba40 libzfs.so.1`zpool_disable_datasets+0x1af(487dd0, 0)
ffffbf7fffdfbaa0 zpool_do_export+0xbb(2, ffffbf7fffdffb50)
ffffbf7fffdffaf0 main+0x15e(3, ffffbf7fffdffb48)
ffffbf7fffdffb20 _start_crt+0x83()
ffffbf7fffdffb30 _start+0x18()
>
root@con1:~# pstack core
core 'core' of 1716: zpool export tstpool3
 ffffbf7fff2ab1fa _lwp_kill () + a
 ffffbf7fff240d6e raise (6) + 1e
 ffffbf7fff0286c4 umem_do_abort () + 44
 ffffbf7fff0288f9 ???????? ()
 ffffbf7fff02b525 process_free (482a58, 1, 0) + a5
 ffffbf7fff02b66a umem_malloc_free (482a58) + 1a
 ffffbf7ffe3fefeb sa_init_impl (4, ffffbf7fffdfb9c0) + 44b
 ffffbf7ffe3ff17b sa_init_arg (4, ffffbf7fffdfb9c0) + b
 ffffbf7ffe4b8d0c zfs_init_libshare_impl (484550, 4, ffffbf7fffdfb9c0) + ac
 ffffbf7ffe4b8d60 zfs_init_libshare_arg (484550, 4, ffffbf7fffdfb9c0) + 10
 ffffbf7ffe4b9bff zpool_disable_datasets (487dd0, 0) + 1af
 000000000040a0fb zpool_do_export (2, ffffbf7fffdffb50) + bb
 0000000000411f2e main (3, ffffbf7fffdffb48) + 15e
 00000000004084c3 _start_crt () + 83
 0000000000408428 _start () + 18
Updated by Yuri Pankov over 5 years ago
Yeah, it's reproducible; here's a slightly better trace:
# zpool status
  pool: data
 state: UNAVAIL
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in
        a degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: none requested
config:

        NAME      STATE     READ WRITE CKSUM
        data      UNAVAIL      0     0     0  insufficient replicas
          c4t0d0  REMOVED      0     0     0

# zpool destroy data
Segmentation Fault (core dumped)
# mdb /var/cores/core.zpool.100653.1502125643
Loading modules: [ libumem.so.1 libc.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 libuutil.so.1 ld.so.1 ]
> $C
08043588 libumem.so.1`process_free+0x22(baddcafe, 1, 0, fec58000)
080435a8 libumem.so.1`umem_malloc_free+0x1a(baddcafe, 815eaa0, 81993c8, 1, 0, 0)
08043a88 libshare.so.1`sa_init_impl+0x576(4, 8043b44, 8174f98, fedbc172)
08043aa8 libshare.so.1`sa_init_arg+0x11(4, 8043b44, 400, fedbc1bc, 0, fede0000)
08043ad8 libzfs.so.1`zfs_init_libshare_impl+0x5d(81410c8, 4, 8043b44, 8043b44)
08043af8 libzfs.so.1`zfs_init_libshare_arg+0x14(81410c8, 4, 8043b44, 80528f2, fef70548, fef70548)
08043b78 libzfs.so.1`zpool_disable_datasets+0x1ee(8143dc8, 0, 806730c, 8055344)
08043bb8 zpool_do_destroy+0x140(2, 8047c48, 80787c0, 801, 0, 0)
08047c08 main+0x12c(feed8147, fef53328, 8047c38, 80551a3, 3, 8047c44)
08047c38 _start+0x83(3, 8047d48, 8047d4e, 8047d56, 0, 8047d5b)
>
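One detail worth noting in this trace: the first argument to process_free() is baddcafe, and 0xbaddcafe is libumem's fill pattern for allocated-but-never-written memory, so free() is being handed a pointer that was never assigned. A minimal, self-contained C illustration of that class of failure (hypothetical names, not the actual libshare code):

/*
 * Illustration only -- hypothetical names, not the libshare source.
 * An array of path pointers is allocated, a loop fills it one slot
 * at a time, and an error aborts the loop early; cleanup then walks
 * the whole array and calls free() on slots that were never written.
 * Under libumem debugging those slots read back as 0xbaddcafe.
 */
#include <stdlib.h>
#include <string.h>

#define	N_SHARES	4

/* Fill paths[]; fail partway through, like SA_SYSTEM_ERR mid-walk. */
static int
collect_share_paths(char **paths, size_t *paths_len, size_t fail_at)
{
	for (size_t i = 0; i < N_SHARES; i++) {
		if (i == fail_at)
			return (-1);	/* slots [i..N_SHARES-1] never set */
		paths[i] = strdup("/data/fs");
		(*paths_len)++;
	}
	return (0);
}

int
main(void)
{
	/* malloc(), not calloc(): unwritten slots hold garbage. */
	char **paths = malloc(N_SHARES * sizeof (char *));
	size_t paths_len = 0;

	if (paths == NULL)
		return (1);
	if (collect_share_paths(paths, &paths_len, 2) != 0) {
		/*
		 * Buggy cleanup: iterates the full array instead of the
		 * paths_len entries actually filled in -- paths[2] and
		 * paths[3] are uninitialized, and free()ing them aborts
		 * under libumem or corrupts the heap.
		 */
		for (size_t i = 0; i < N_SHARES; i++)
			free(paths[i]);
	}
	free(paths);
	return (0);
}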
Updated by Yuri Pankov over 5 years ago
- Subject changed from zpool destroy produce segmentation fault to zpool dumps core trying to destroy unavail/faulted pool
- Category set to zfs - Zettabyte File System
Updated by Matthew Ahrens over 5 years ago
It looks like sa_get_one_zfs_share() doesn't fill in all of the paths[] entries when it gets the SA_SYSTEM_ERR error.
We probably need to free all the paths and set `*path_len=0` in the error case.
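Roughly like this - a compilable sketch of the proposed error path, with illustrative names rather than the actual libshare routine:

#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the real enumeration; fails partway through. */
static int
enumerate_shares(char **paths, size_t *paths_len)
{
	paths[0] = strdup("/data/fs");	/* one slot filled ... */
	*paths_len = 1;
	return (-1);			/* ... then the SA_SYSTEM_ERR analogue */
}

/*
 * Proposed shape of the fix: on error, free what was actually filled
 * in and report an empty array, so the caller never touches (or
 * re-frees) uninitialized slots.
 */
static int
get_share_paths(char **paths, size_t *paths_len)
{
	int err = enumerate_shares(paths, paths_len);

	if (err != 0) {
		for (size_t i = 0; i < *paths_len; i++) {
			free(paths[i]);
			paths[i] = NULL;
		}
		*paths_len = 0;
	}
	return (err);
}

int
main(void)
{
	char *paths[4];
	size_t paths_len = 0;

	(void) get_share_paths(paths, &paths_len);
	/* paths_len is now 0: nothing stale for cleanup code to free. */
	return (0);
}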
Thanks for reporting this, Igor. We'll get on it.
Updated by Igor Kozhukhov over 5 years ago
- Subject changed from zpool dumps core trying to destroy unavail/faulted pool to zpool dumps core trying to destroy unavail/faulted pool and try export it
- Description updated (diff)
Updated by Electric Monk over 5 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit 569c04941e3e2bd40e738ef19af79436a7e24503
commit 569c04941e3e2bd40e738ef19af79436a7e24503
Author: Daniel Hoffman <dj.hoffman@delphix.com>
Date:   2017-10-23T23:32:26.000Z

8565 zpool dumps core trying to destroy unavail/faulted pool and try export it
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed by: Igor Kozhukhov <igor@dilos.org>
Approved by: Richard Lowe <richlowe@richlowe.net>