Bug #14003
panic: zfs dbuf code destroys buffer still held by others on encrypted pool
Description
We had a SmartOS CN panic like this:
> ::status
debugging crash dump ./vmcore.0 (64-bit) from sdc-cb8030
operating system: 5.11 joyent_20210623T002805Z (i86pc)
git branch: eait-20210623
git rev: f12a9dd262eb31f6cd4765ca82ad60d64defee9c
image uuid: (not set)
panic message: assertion failed: remove_reference(hdr, ((void *)0), tag) == 0 (0x3 == 0x0), file: ../../common/fs/zfs/arc.c, line: 3773
dump content: kernel pages only

> $C
fffffb04504c27e0 vpanic()
fffffb04504c2870 0xfffffffffba436e1()
fffffb04504c28b0 arc_buf_destroy+0xe2(fffffdc9f1eb3768, fffffdc8e9811680)
fffffb04504c2900 dbuf_issue_final_prefetch_done+0x47(fffffed81e424080, fffffed81e424080, fffffed81e424128, fffffdc9f1eb3768, fffffdc8e9811680)
fffffb04504c2990 arc_read_done+0x229(fffffed81e424080)
fffffb04504c2a30 zio_done+0x542(fffffed81e424080)
fffffb04504c2a60 zio_execute+0xa7(fffffed81e424080)
fffffb04504c2b10 taskq_thread+0x2cd(fffffdc57b003088)
fffffb04504c2b20 thread_start+0xb()
It looks like the arc_buf_hdr_t being destroyed still had 3 other arc_buf_t holding it (4 total):
> fffffdc9f1eb3768::print -t arc_buf_t
arc_buf_t {
    arc_buf_hdr_t *b_hdr = 0xfffffe68e26ecaa8
    arc_buf_t *b_next = 0xfffffe79af57c820
    kmutex_t b_evict_lock = {
        void *[1] _opaque = [ 0 ]
    }
    void *b_data = 0xfffffdc93424f000
    arc_buf_flags_t b_flags = 0x6 (ARC_BUF_FLAG_{COMPRESSED|ENCRYPTED})
}
> 0xfffffe68e26ecaa8::print -t arc_buf_hdr_t
arc_buf_hdr_t {
    dva_t b_dva = {
        uint64_t [2] dva_word = [ 0, 0 ]
    }
    uint64_t b_birth = 0
    arc_buf_contents_t b_type = 2 (ARC_BUFC_METADATA)
    arc_buf_hdr_t *b_hash_next = 0
    arc_flags_t b_flags = 0x2168010 (ARC_FLAG_{L2CACHE|PROTECTED|BUFC_METADATA|HAS_L1HDR|COMPRESSED_ARC|COMPRESS_1})
    uint16_t b_psize = 0x20
    uint16_t b_lsize = 0x20
    uint64_t b_spa = 0x579d5b1f1bf19e55
    l2arc_buf_hdr_t b_l2hdr = {
        l2arc_dev_t *b_dev = 0xfffffdc61ed15000
        uint64_t b_daddr = 0x1a8c42000
        arc_state_type_t b_arcs_state = 0 (ARC_STATE_ANON)
        list_node_t b_l2node = {
            struct list_node *list_next = 0
            struct list_node *list_prev = 0
        }
    }
    l1arc_buf_hdr_t b_l1hdr = {
        kmutex_t b_freeze_lock = {
            void *[1] _opaque = [ 0 ]
        }
        zio_cksum_t *b_freeze_cksum = 0
        arc_buf_t *b_buf = 0xfffffdca00f70a90
        uint32_t b_bufcnt = 0x4
        kcondvar_t b_cv = {
            ushort_t _opaque = 0
        }
        uint8_t b_byteswap = 0xa
        arc_state_t *b_state = ARC_anon
        multilist_node_t b_arc_node = {
            struct list_node *list_next = 0
            struct list_node *list_prev = 0
        }
        clock_t b_arc_access = 0
        zfs_refcount_t b_refcnt = {
            uint64_t rc_count = 0x3
        }
        arc_callback_t *b_acb = 0
        abd_t *b_pabd = 0
    }
    arc_buf_hdr_crypt_t b_crypt_hdr = {
        abd_t *b_rabd = 0xfffffdcde7c9fa00
        dmu_object_type_t b_ot = 0t10 (DMU_OT_DNODE)
        uint32_t b_ebufcnt = 0x3
        uint64_t b_dsobj = 0x1fc34
        uint8_t [8] b_salt = [ 0x18, 0xec, 0xe5, 0x12, 0x50, 0xa1, 0x75, 0x40 ]
        uint8_t [12] b_iv = [ 0x64, 0x9c, 0x7, 0x33, 0x4, 0xc, 0xa6, 0x7d, 0xaa, 0x73, 0xf, 0x2e ]
        uint8_t [16] b_mac = [ 0xa1, 0xef, 0xdb, 0x5d, 0x7e, 0xce, 0x4d, 0x4c, 0x98, 0x60, 0xa8, 0x19, 0xd5, 0x62, 0x64, 0xa5 ]
    }
}
It's a metadata buf, owned by a dbuf in the metadnode for a dataset on the pool. These are normally owned and managed by the dbuf cache, and with compressed ARC enabled, never make it onto the main ARC lists (and so nothing other than the dbuf cache should ever find them or use them). This one is marked ARC_anon, which is consistent with that.
However, we can see that the ARC_FLAG_PREFETCH flag is no longer on the header, even though this must have been created in dbuf_issue_final_prefetch (which sets that flag unconditionally). We usually only unset this flag in arc.c when we find a prefetched buffer on the MRU or MFU lists (which this buf should never have been on!) or in the buf_hash (which it also should never have been in!).
Also, the flags imply that this buffer was in the L2ARC, which should never happen to buffers owned by the dbuf cache (they're never on the main ARC lists where the L2ARC writer could find them).
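To make the failed assertion concrete, here is a minimal userland model of the relationship in the dump above -- one header, a list of consumer bufs, a hold count -- using made-up stand-in types, not the real arc.c structures. With b_bufcnt at 0x4 and rc_count still at 0x3 after our own reference is dropped, the equivalent of the assert below is what fires with "0x3 == 0x0".

/*
 * Minimal userland model of the invariant behind the panic message
 * "remove_reference(hdr, NULL, tag) == 0 (0x3 == 0x0)".  The types are
 * stand-ins, not the real arc.c structures.
 */
#include <assert.h>
#include <stdio.h>

typedef struct buf buf_t;

typedef struct hdr {
	buf_t	*b_buf;		/* models b_l1hdr.b_buf: list of consumers */
	int	b_bufcnt;	/* models b_l1hdr.b_bufcnt (0x4 in the dump) */
	int	b_refcnt;	/* models b_l1hdr.b_refcnt.rc_count */
} hdr_t;

struct buf {
	hdr_t	*b_hdr;
	buf_t	*b_next;
};

/* models remove_reference(): drop one hold, report how many remain */
static int
remove_reference(hdr_t *hdr)
{
	return (--hdr->b_refcnt);
}

/* models arc_buf_destroy(): only legal for the last consumer */
static void
buf_destroy(buf_t *buf)
{
	hdr_t *hdr = buf->b_hdr;
	int remaining = remove_reference(hdr);

	/*
	 * In the crash dump, three other arc_buf_t's were still linked on
	 * b_buf when this ran, so the decrement returned 3 and the kernel
	 * ASSERT fired ("0x3 == 0x0").
	 */
	assert(remaining == 0);
	hdr->b_bufcnt--;
}

int
main(void)
{
	hdr_t hdr = { .b_buf = NULL, .b_bufcnt = 1, .b_refcnt = 1 };
	buf_t buf = { .b_hdr = &hdr, .b_next = NULL };
	hdr.b_buf = &buf;

	buf_destroy(&buf);	/* fine with one consumer; fatal with four */
	(void) printf("header torn down cleanly\n");
	return (0);
}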
Updated by Alex Wilson over 2 years ago
https://github.com/openzfs/zfs/issues/6881 looks similar, but was closed by the stale bot.
same with https://github.com/openzfs/zfs/issues/10085
there are some other open issues which seem to mention similar panics in passing, but none have a resolution that I can see
Updated by Alex Wilson over 2 years ago
Got a new set of panics across 5 different machines which look closely related:
> ::status
debugging crash dump vmcore.0 (64-bit) from sdc-cb8028
operating system: 5.11 joyent_20210623T002805Z (i86pc)
git branch: eait-20210623
git rev: fae973e1f1706a5aaf6ce807bc437fe3aafa6e2b
image uuid: (not set)
panic message: BAD TRAP: type=e (#pf Page fault) rp=fffffb045398c630 addr=4 occurred in module "zfs" due to a NULL pointer dereference
dump content: kernel pages only

> $C
fffffb045398c730 abd_iter_advance+0xd(fffffb045398c748, 0)
fffffb045398c7c0 abd_iterate_func+0x4a(0, 0, 4000, fffffffff7c8dbe0, fffffb045398c7d0)
fffffb045398c800 abd_copy_to_buf_off+0x39(fffffdcc487bb000, 0, 0, 4000)
fffffb045398c880 arc_buf_fill+0x29d(fffffdca2e22abe8, fffffdc571e36000, fffffdcc49d06568, 7)
fffffb045398c900 arc_buf_alloc_impl+0x14d(fffffdcc49d0aba0, fffffdc571e36000, fffffdcc49d06568, fffffdcc55ec0c30, 1, 1, 0, fffffdc500000001, fffffdcc49d06550)
fffffb045398c990 arc_read_done+0x12a(fffffdc8a0a7a968)
fffffb045398ca30 zio_done+0x542(fffffdc8a0a7a968)
fffffb045398ca60 zio_execute+0xa7(fffffdc8a0a7a968)
fffffb045398cb10 taskq_thread+0x2cd(fffffdc61ea5f1c0)
fffffb045398cb20 thread_start+0xb()
once again we have an ARC buf belonging to an encrypted, compressed metadnode, and once again the flags look a bit weird:
> fffffdca2e22abe8::print -t arc_buf_t
arc_buf_t {
    arc_buf_hdr_t *b_hdr = 0xfffffdcc49d0aba0
    arc_buf_t *b_next = 0xfffffdcb2169b618
    kmutex_t b_evict_lock = {
        void *[1] _opaque = [ 0 ]
    }
    void *b_data = 0xfffffdcc487bb000
    arc_buf_flags_t b_flags = 0x6 (ARC_BUF_FLAG_{COMPRESSED|ENCRYPTED})
}
> 0xfffffdcc49d0aba0::print -t arc_buf_hdr_t
arc_buf_hdr_t {
    dva_t b_dva = {
        uint64_t [2] dva_word = [ 0, 0 ]
    }
    uint64_t b_birth = 0
    arc_buf_contents_t b_type = 2 (ARC_BUFC_METADATA)
    arc_buf_hdr_t *b_hash_next = 0
    arc_flags_t b_flags = 0x2168110 (ARC_FLAG_{L2CACHE|IO_IN_PROGRESS|PROTECTED|BUFC_METADATA|HAS_L1HDR|COMPRESSED_ARC|COMPRESS_1})
    uint16_t b_psize = 0x20
    uint16_t b_lsize = 0x20
    uint64_t b_spa = 0xd6b85d7b4a5b89bb
    l2arc_buf_hdr_t b_l2hdr = {
        l2arc_dev_t *b_dev = 0
        uint64_t b_daddr = 0
        arc_state_type_t b_arcs_state = 0 (ARC_STATE_ANON)
        list_node_t b_l2node = {
            struct list_node *list_next = 0
            struct list_node *list_prev = 0
        }
    }
    l1arc_buf_hdr_t b_l1hdr = {
        kmutex_t b_freeze_lock = {
            void *[1] _opaque = [ 0 ]
        }
        zio_cksum_t *b_freeze_cksum = 0
        arc_buf_t *b_buf = 0xfffffdca2e22abe8
        uint32_t b_bufcnt = 0x2
        kcondvar_t b_cv = {
            ushort_t _opaque = 0
        }
        uint8_t b_byteswap = 0xa
        arc_state_t *b_state = ARC_anon
        multilist_node_t b_arc_node = {
            struct list_node *list_next = 0
            struct list_node *list_prev = 0
        }
        clock_t b_arc_access = 0
        zfs_refcount_t b_refcnt = {
            uint64_t rc_count = 0x2
        }
        arc_callback_t *b_acb = 0xfffffdcc49d06540
        abd_t *b_pabd = 0
    }
    arc_buf_hdr_crypt_t b_crypt_hdr = {
        abd_t *b_rabd = 0
        dmu_object_type_t b_ot = 0t10 (DMU_OT_DNODE)
        uint32_t b_ebufcnt = 0x1
        uint64_t b_dsobj = 0xbe
        uint8_t [8] b_salt = [ 0x39, 0x3a, 0x87, 0x5d, 0xd6, 0xc4, 0x28, 0x8a ]
        uint8_t [12] b_iv = [ 0x97, 0xe2, 0xc3, 0xa6, 0xf5, 0xc3, 0xbc, 0x96, 0x5a, 0x64, 0x41, 0xea ]
        uint8_t [16] b_mac = [ 0x45, 0x12, 0xda, 0x72, 0x72, 0x3f, 0xfa, 0xa0, 0xd, 0x44, 0x6c, 0xdd, 0x4a, 0xeb, 0x59, 0x1 ]
    }
}
this time its b_crypt_hdr.b_rabd is NULL, though, which is what causes this other panic
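As a rough illustration of how a missing b_rabd turns into a trap at addr=4, here is a small standalone model (invented types and field layout; not the real abd_t or the real arc_buf_fill()/abd_copy_to_buf_off() code): the fill path for an encrypted buf copies out of the header's raw ABD, and if that pointer is NULL, the first field the copy routine touches sits at a small offset inside the (NULL) structure.

/*
 * Illustrative model only: invented types, not the real ABD code.  An
 * encrypted arc_buf_t is filled by copying out of the header's raw ABD
 * (b_crypt_hdr.b_rabd).  If that pointer is NULL, the first field the
 * copy routine touches lives at a small offset inside the abd_t, which
 * is how you end up with "NULL pointer dereference ... addr=4".
 */
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct abd {		/* stand-in for abd_t; not the real layout */
	int	abd_flags;
	int	abd_size;	/* a field at a small offset, like addr=4 */
	char	abd_data[64];
} abd_t;

typedef struct crypt_hdr {	/* stand-in for arc_buf_hdr_crypt_t */
	abd_t	*b_rabd;	/* raw (still-encrypted) copy of the block */
} crypt_hdr_t;

/* models the fill path copying the raw data into the arc_buf_t */
static int
fill_from_rabd(char *dst, size_t dstlen, const crypt_hdr_t *ch)
{
	const abd_t *src = ch->b_rabd;

	if (src == NULL) {
		/* in the crashed path, nothing caught the NULL before the
		 * ABD iterator dereferenced it */
		(void) fprintf(stderr,
		    "b_rabd is NULL; the fault would land at offset %zu\n",
		    offsetof(abd_t, abd_size));
		return (-1);
	}
	(void) memcpy(dst, src->abd_data,
	    (size_t)src->abd_size < dstlen ? (size_t)src->abd_size : dstlen);
	return (0);
}

int
main(void)
{
	char buf[64];
	crypt_hdr_t ch = { .b_rabd = NULL };	/* as in the second dump */

	(void) fill_from_rabd(buf, sizeof (buf), &ch);
	return (0);
}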
Updated by Alex Wilson over 2 years ago
Additional update: one of the CNs that hit this new panic is reporting permanent corruption in a dataset metadnode (looks like the one it panic'd on). It persists after multiple scrubs.
Updated by Rich Ercolani over 2 years ago
I should remark that, at least with my test reproductions on https://github.com/openzfs/zfs/issues/12257, L2ARC is never involved, so presuming this is the same issue of a dbuf getting freed while someone still had a handle to it, L2ARC is not always required to end up sad in this way.
Updated by Alex Wilson over 2 years ago
new day, new stack!
panic[cpu15]/thread=fffffb044fd2bc20:
assertion failed: remove_reference(hdr, ((void *)0), tag) == 0 (0x1 == 0x0), file: ../../common/fs/zfs/arc.c, line: 3773

Warning - stack not written to the dump buffer
fffffb044fd2b6a0 fffffffffba436e1 ()
fffffb044fd2b6e0 zfs:arc_buf_destroy+e2 ()
fffffb044fd2b740 zfs:dbuf_destroy+38 ()
fffffb044fd2b840 zfs:dbuf_rele_and_unlock+d0 ()
fffffb044fd2b880 zfs:dbuf_rele+29 ()
fffffb044fd2b940 zfs:dnode_next_offset_level+122 ()
fffffb044fd2b9d0 zfs:dnode_next_offset+95 ()
fffffb044fd2ba90 zfs:dmu_object_next+ff ()
fffffb044fd2baf0 zfs:dmu_objset_space_upgrade+b7 ()
fffffb044fd2bb20 zfs:dmu_objset_id_quota_upgrade_cb+87 ()
fffffb044fd2bb60 zfs:dmu_objset_upgrade_task_cb+55 ()
fffffb044fd2bc00 genunix:taskq_d_thread+bc ()
fffffb044fd2bc10 unix:thread_start+b ()
Updated by Alex Wilson over 2 years ago
Update: our first permanent data corruption of significance has happened due to these panics. We have now lost customer data and had to restore from backups some zones whose dataset metadnodes have experienced permanent corruption -- they can no longer be mounted or sent.
Updated by Alex Wilson over 2 years ago
For anyone else experiencing this, if we haven't figured it out and fixed it by then, get your data off encrypted ZFS ASAP before it eats your metadnodes. Don't delay -- once you see the first panic, evacuate the machine.
Updated by Alex Wilson over 2 years ago
- Subject changed from panic: dbuf_issue_final_prefetch_done destroys buffer still held by others to panic: zfs dbuf code destroys buffer still held by others
Updated by Alex Wilson over 2 years ago
- Subject changed from panic: zfs dbuf code destroys buffer still held by others to panic: zfs dbuf code destroys buffer still held by others on encrypted pool
Updated by Joshua M. Clulow over 2 years ago
I spent some time reproducing this on Friday. The conditions that seem particularly relevant are:
- an older ZFS send stream, one that would require some set of filesystem-level upgrades, was received into an encrypted dataset
- more than one clone of that received snapshot is created simultaneously or almost simultaneously
An upgrade that involves dmu_objset_id_quota_upgrade_cb() and dmu_objset_space_upgrade() (etc.) needs to write to the pool. We can't upgrade the contents of the snapshot, as those are frozen, so we end up upgrading the cloned copy. It seems like that cloned copy can/does continue to refer to some of the blocks in the original snapshot.
By using a SmartOS platform with DEBUG bits, I was able to get a much crisper panic:
panic[cpu17]/thread=fffffe00f7f1ac20:
assertion failed: !HDR_IO_IN_PROGRESS(hdr), file: ../../common/fs/zfs/arc.c, line: 6392

fffffe00f7f19eb0 fffffffffbdcf334 ()
fffffe00f7f19f60 zfs:arc_release+830 ()
fffffe00f7f19f90 zfs:dbuf_release_bp+64 ()
fffffe00f7f1a0e0 zfs:dbuf_write+b8 ()
fffffe00f7f1a200 zfs:dbuf_sync_leaf+1c8 ()
fffffe00f7f1a250 zfs:dbuf_sync_list+bb ()
fffffe00f7f1a300 zfs:dbuf_sync_indirect+172 ()
fffffe00f7f1a350 zfs:dbuf_sync_list+ac ()
fffffe00f7f1a400 zfs:dbuf_sync_indirect+172 ()
fffffe00f7f1a450 zfs:dbuf_sync_list+ac ()
fffffe00f7f1a500 zfs:dbuf_sync_indirect+172 ()
fffffe00f7f1a550 zfs:dbuf_sync_list+ac ()
fffffe00f7f1a600 zfs:dbuf_sync_indirect+172 ()
fffffe00f7f1a650 zfs:dbuf_sync_list+ac ()
fffffe00f7f1a700 zfs:dbuf_sync_indirect+172 ()
fffffe00f7f1a750 zfs:dbuf_sync_list+ac ()
fffffe00f7f1a7e0 zfs:dnode_sync+507 ()
fffffe00f7f1a920 zfs:dmu_objset_sync+1a5 ()
fffffe00f7f1a990 zfs:dsl_dataset_sync+b2 ()
fffffe00f7f1aa30 zfs:dsl_pool_sync+b2 ()
fffffe00f7f1aab0 zfs:spa_sync_iterate_to_convergence+f7 ()
fffffe00f7f1ab30 zfs:spa_sync+272 ()
fffffe00f7f1ac00 zfs:txg_sync_thread+29f ()
fffffe00f7f1ac10 unix:thread_start+b ()
Looking at select frames from the panic stack:
fffffe00f7f19f60 arc_release+0x830(
        arc_buf_t fffffeb30345d300,
        void *tag fffffeb3143da060)

fffffe00f7f19f90 dbuf_release_bp+0x64(
        dmu_buf_impl_t fffffeb3143da060)

fffffe00f7f1a0e0 dbuf_write+0xb8(
        dbuf_dirty_record_t fffffeb30bc44580,
                | > fffffeb30bc44580::print dbuf_dirty_record_t dr_dbuf
                | dr_dbuf = 0xfffffeb3143da060
                | > 0xfffffeb3143da060::dbuf
                |             addr object lvl blkid holds os
                | fffffeb3143da060  mdn    0    34    33 dynamite/plant/0000
                |
                | > 0xfffffeb3143da060::print dmu_buf_impl_t db_state db_level
                | db_state = 4 (DB_CACHED)
                | db_level = 0
                | > 0xfffffeb39a7bd860::print dnode_t dn_type
                | dn_type = 0t10 (DMU_OT_DNODE)
        arc_buf_t fffffeb30345d300,
        dmu_tx_t fffffeb3cbe57900)

...

fffffe00f7f1a7e0 dnode_sync+0x507(
        dnode_t fffffeb39a7bd860,
                | > fffffeb39a7bd860::print dnode_t
                |     dn_objset->os_dsl_dataset->ds_dir->dd_myname
                | dn_objset->os_dsl_dataset->ds_dir->dd_myname = [ "0000" ]
        dmu_tx_t fffffeb3cbe57900)

...

fffffe00f7f1ab30 spa_sync+0x272(fffffeb36e344000, 1389a)
There is also an upgrade task waiting on a ZIO:
> fffffe00f3fdcc20::findstack -v
stack pointer for thread fffffe00f3fdcc20 (tq:z_upgrade): fffffe00f3fdc730
[ fffffe00f3fdc730 _resume_from_idle+0x12b() ]
  fffffe00f3fdc760 swtch+0x18a()
  fffffe00f3fdc790 cv_wait+0x89(fffffeb35ee6b4f0, fffffeb35ee6b4e8)
  fffffe00f3fdc7d0 zio_wait+0xbb(fffffeb35ee6b120)
  fffffe00f3fdc860 dbuf_read+0x30d(fffffeb37a73ecf0, 0, 46)
  fffffe00f3fdc940 dnode_next_offset_level+0x508(fffffeb35cbb3c08, 0, fffffe00f3fdca48, 0, 20, 0)
  fffffe00f3fdc9e0 dnode_next_offset+0xb2(fffffeb35cbb3c08, 0, fffffe00f3fdca48, 0, 20, 0)
  fffffe00f3fdcaa0 dmu_object_next+0xea(fffffeb309c77880, fffffe00f3fdcab8, 0, 0)
  fffffe00f3fdcb00 dmu_objset_space_upgrade+0x71(fffffeb309c77880)
  fffffe00f3fdcb30 dmu_objset_id_quota_upgrade_cb+0xb5(fffffeb309c77880)
  fffffe00f3fdcb70 dmu_objset_upgrade_task_cb+0x51(fffffeb309c77880)
  fffffe00f3fdcc00 taskq_d_thread+0xfb(fffffeb369d8c0f8)
  fffffe00f3fdcc10 thread_start+0xb()
The first child zio in the tree that the upgrade task is waiting on has the arc_read_done() callback, with the same arc_buf_hdr_t we're trying to chuck out:
> fffffeb35ee6b120::zio -c -r
ADDRESS                 TYPE  STAGE            WAITER            TIME_ELAPSED
fffffeb35ee6b120        NULL  CHECKSUM_VERIFY  fffffe00f3fdcc20  -
 fffffeb3b9e71710 <<<   READ  VDEV_IO_START    -                 -
  fffffeb35a5a2ab0      READ  VDEV_IO_START    -                 -
  fffffeb3c27b84c8      READ  VDEV_IO_START    -                 25776ms

> fffffeb3b9e71710::print zio_t io_done io_private
io_done = arc_read_done
io_private = 0xfffffeb34d2e32e8
> 0xfffffeb34d2e32e8::print arc_buf_hdr_t b_l1hdr.b_acb
b_l1hdr.b_acb = 0xfffffeb362ce1a60
> 0xfffffeb362ce1a60::print arc_callback_t acb_done acb_private acb_next
acb_done = dbuf_read_done
acb_private = 0xfffffeb37a73ecf0
acb_next = 0
1| dbuf from spa sync dbuf_write/dbuf_release_bp/arc_release:
1|
1| > 0xfffffeb3143da060::print dmu_buf_impl_t db_blkptr |::blkptr
1| DVA[0]=<0:6c2c3bce00:4000>
1| DVA[1]=<0:1875fcd600:4000>
1| salt=9a66415f5d7e43b4 iv=5d93388c6a7ae0b5:a4d685e4
1| [L0 DNODE] FLETCHER_4 OFF encrypted LE contiguous unique double
1| size=4000L/4000P birth=1345L/1345P fill=32
1| cksum=f76500ede3bc963c:f55159c2025849f:1f5bb4f444e36704:3c2398842a4679a7
1|
1| > 0xfffffeb3143da060::dbuf
1|             addr object lvl blkid holds os
1| fffffeb3143da060  mdn    0    34    33 dynamite/plant/0000

2| dbuf from dbuf_read/zio_wait in upgrade task:
2|
2| > fffffeb37a73ecf0::print dmu_buf_impl_t db_blkptr | ::blkptr
2| DVA[0]=<0:6c2c3bce00:4000>
2| DVA[1]=<0:1875fcd600:4000>
2| salt=9a66415f5d7e43b4 iv=5d93388c6a7ae0b5:a4d685e4
2| [L0 DNODE] FLETCHER_4 OFF encrypted LE contiguous unique double
2| size=4000L/4000P birth=1345L/1345P fill=32
2| cksum=f76500ede3bc963c:f55159c2025849f:1f5bb4f444e36704:3c2398842a4679a7
2|
2| > fffffeb37a73ecf0::dbuf
2|             addr object lvl blkid holds os
2| fffffeb37a73ecf0  mdn    0    34     2 dynamite/plant/0001
Same block pointer, presumably pointing at the data in the snapshot we cannot change -- but two distinct read/write I/O operations.
It would seem that we are not correctly tracking references or holds on something here.
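As a way of stating the suspected hazard, here is a small standalone model (made-up types; none of this is the real dbuf/ARC code) of two dbufs -- one per clone objset -- resolving to the same block pointer and therefore the same cached buffer. Unless every consumer's hold is counted on that one shared buffer, the sync path of one clone can tear it down while the other clone's read is still outstanding, which matches both the refcount assertion and the later NULL-pointer panics.

/*
 * Standalone model only: made-up types, not the real ZFS code.  Two
 * "dbufs" (one per clone) look up the same block pointer and get the
 * same cached buffer back; correctness depends on the buffer's hold
 * count covering both of them before either side may destroy it.
 */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct shared_buf {
	char	dva[40];	/* models the block pointer / DVA key */
	int	holds;		/* models the ARC header's refcount */
} shared_buf_t;

static shared_buf_t *cache;	/* one cache slot for the shared block */

static shared_buf_t *
buf_hold(const char *dva)
{
	if (cache == NULL) {
		cache = calloc(1, sizeof (*cache));
		(void) strncpy(cache->dva, dva, sizeof (cache->dva) - 1);
	}
	assert(strcmp(cache->dva, dva) == 0);
	cache->holds++;			/* every consumer must be counted */
	return (cache);
}

static void
buf_rele(shared_buf_t *b)
{
	assert(b->holds > 0);
	b->holds--;
}

static void
buf_destroy(shared_buf_t *b)
{
	/* the invariant the panic assertion enforces: last holder only */
	b->holds--;
	assert(b->holds == 0);
	free(b);
	cache = NULL;
}

int
main(void)
{
	/* both clones' metadnode dbufs resolve to the same snapshot block */
	shared_buf_t *clone_a = buf_hold("0:6c2c3bce00:4000");
	shared_buf_t *clone_b = buf_hold("0:6c2c3bce00:4000");

	assert(clone_a == clone_b);	/* one buffer, two consumers */

	/*
	 * Safe order: clone B's read completes and drops its hold before
	 * clone A's sync path destroys the buffer.  Calling buf_destroy()
	 * while clone B's hold is still outstanding is the shape of the
	 * "remove_reference(...) == 0" assertion above; the same teardown
	 * happening without an assertion leaves clone B holding freed
	 * memory, which is how the later NULL/garbage panics can arise.
	 */
	buf_rele(clone_b);
	buf_destroy(clone_a);
	(void) printf("teardown ok once every hold is accounted for\n");
	return (0);
}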
Updated by Rich Ercolani almost 2 years ago
Just a quick couple notes for anyone who comes upon this...
The above reproduction steps work practically instantly, in my experience, on illumos - you clone a couple sufficiently legacy snapshots at ~the same time in an encryptionroot, and within a few seconds you're watching the machine dump and reboot.
Notably though (for my purposes, at least), while the result appears to be the same as the observed failures for OpenZFS (metadnode refcount mangling => all sorts of fun panics), this specific reproducer does not seem to apply to OpenZFS-based platforms - you can clone as much as you want, and nothing* breaks down no matter how long you wait.
Updated by Rich Ercolani 2 months ago
Btw, lest anyone think this ever got fixed.
TIME                           UUID                                 SUNW-MSG-ID
Sep 19 2023 12:18:55.900943000 209117fd-9979-63e2-b01a-c05d9465ae62 SUNOS-8000-KL

  TIME                 CLASS                                         ENA
  Sep 19 12:18:53.5679 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000

nvlist version: 0
        version = 0x0
        class = list.suspect
        uuid = 209117fd-9979-63e2-b01a-c05d9465ae62
        code = SUNOS-8000-KL
        diag-time = 1695140335 898654
        de = fmd:///module/software-diagnosis
        fault-list-sz = 0x1
        fault-list = (array of embedded nvlists)
        (start fault-list[0])
        nvlist version: 0
                version = 0x0
                class = defect.sunos.kernel.panic
                certainty = 0x64
                asru = sw:///:path=/var/crash/oivm/.209117fd-9979-63e2-b01a-c05d9465ae62
                resource = sw:///:path=/var/crash/oivm/.209117fd-9979-63e2-b01a-c05d9465ae62
                savecore-succcess = 0
                os-instance-uuid = 209117fd-9979-63e2-b01a-c05d9465ae62
                panicstr = BAD TRAP: type=e (#pf Page fault) rp=fffffe0007cef630 addr=4 occurred in module "zfs" due to a NULL pointer dereference
                panicstack = unix:die+c0 () | unix:trap+999 () | unix:cmntrap+e9 () | zfs:abd_iter_advance+1a () | zfs:abd_iterate_func+46 () | zfs:abd_copy_to_buf_off+45 () | zfs:arc_buf_fill+14a () | zfs:arc_buf_alloc_impl+1ab () | zfs:arc_read_done+122 () | zfs:zio_done+414 () | zfs:zio_execute+a7 () | genunix:taskq_thread+2a6 () | unix:thread_start+b () |
                crashtime = 1695140306
                panic-time = Tue Sep 19 12:18:26 2023 EDT
        (end fault-list[0])

        fault-status = 0x1
        severity = Major
        __ttl = 0x1
        __tod = 0x6509c9ef 0x35b34c98

rich@oivm:~$ uname -a
SunOS oivm 5.11 illumos-676abcb77c i86pc i386 i86pc
#!/bin/bash
set +e
zfs destroy -R rpool/badidea
zfs destroy -R rpool/encidea
zfs create -o version=3 -o compression=lz4 -o encryption=on -o keyformat=raw -o keylocation=file:///badkey rpool/badidea
rsync -a ~/baseline/. /rpool/badidea/.
zfs snapshot rpool/badidea@fire1
zfs create -o encryption=on -o keyformat=raw -o keylocation=file:///badkey rpool/encidea
zfs send -wp rpool/badidea@fire1 | zfs receive rpool/encidea/baseline
zfs load-key -a
for i in `seq 1 50`; do
        zfs clone rpool/encidea/baseline@fire1 rpool/encidea/explosions${i} &
done
zfs upgrade rpool/encidea/baseline
Where baseline is just a git checkout. It'll sometimes panic and sometimes just deadlock the system, on non-debug bits - nothing errors out, there's no CPU spinning, but it stops responding to existing or new logins.