panic when scrub a v10 pool
While expanding stored pools, we ran into a panic using an old pool.
Steps to reproduce:
jkennedy-origin% sudo zpool create -o version=2 test c2t1d0
jkennedy-origin% sudo cp /etc/passwd /test/foo
jkennedy-origin% sudo zpool attach test c2t1d0 c2t2d0
We'll get this panic:
ffffff000fc0e5e0 unix:real_mode_stop_cpu_stage2_end+b27c ()
ffffff000fc0e6f0 unix:trap+dc8 ()
ffffff000fc0e700 unix:cmntrap+e6 ()
ffffff000fc0e860 zfs:dsl_scan_visitds+1ff ()
ffffff000fc0ea20 zfs:dsl_scan_visit+fe ()
ffffff000fc0ea80 zfs:dsl_scan_sync+1b3 ()
ffffff000fc0eb60 zfs:spa_sync+435 ()
ffffff000fc0ec20 zfs:txg_sync_thread+23f ()
ffffff000fc0ec30 unix:thread_start+8 ()
The problem is a bad trap caused by dereferencing a NULL pointer. The scan code looks up the dp_origin_snap of a dsl_pool_t, but a version 2 pool doesn't have one, so the pointer is NULL.
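The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the code from the actual fix: the sketch_* types and the helper function are hypothetical stand-ins that borrow only the dp_origin_snap field name from the report. It shows the general idea of checking for a missing origin snapshot before dereferencing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the real ZFS structures. Only the
 * fields needed for this illustration are present.
 */
typedef struct sketch_dataset {
	uint64_t ds_object;		/* object number of the dataset */
} sketch_dataset_t;

typedef struct sketch_pool {
	/* NULL on old pools that have no origin snapshot */
	sketch_dataset_t *dp_origin_snap;
} sketch_pool_t;

/*
 * Returns 1 if ds is the pool's origin snapshot and 0 otherwise.
 * The NULL check comes first, so a pool without an origin snapshot
 * is reported as "no match" instead of causing a bad dereference.
 */
static int
sketch_is_origin_snap(const sketch_pool_t *dp, const sketch_dataset_t *ds)
{
	assert(dp != NULL && ds != NULL);

	if (dp->dp_origin_snap == NULL)
		return (0);

	return (dp->dp_origin_snap->ds_object == ds->ds_object);
}
```

With a guard of this kind, a caller walking datasets on an old pool sees that there is no origin snapshot and continues, rather than trapping.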
The system will go into a reboot loop at this point, and the dump won't be accessible except by removing the cache file from within the recovery environment.
This impacts any sort of scrub or resilver on version <11 pools, e.g.:
zpool create -o version=10 test c2t1d0
zpool scrub test
Updated by Electric Monk almost 2 years ago
- % Done changed from 0 to 100
- Status changed from New to Closed
commit bb1f424574ac8e08069d0ba993c2a41ffe796794
Author: Matthew Ahrens <firstname.lastname@example.org>
Date: 2018-05-01T20:00:33.000Z

9443 panic when scrub a v10 pool
Reviewed by: Serapheim Dimitropoulos <email@example.com>
Reviewed by: George Wilson <firstname.lastname@example.org>
Reviewed by: Andriy Gapon <avg@FreeBSD.org>
Reviewed by: Igor Kozhukhov <email@example.com>
Approved by: Dan McDonald <firstname.lastname@example.org>