ARC check for 'anon_size > arc_c/2' can stall the system
Seen in a test-suite run of checkpoint_big_rewind, which uses a nested pool. It does not appear to reproduce easily.
The current theory is that the upper pool's dirty/anon data is preventing the lower pool from assigning new writes into its open txg context. Meanwhile, the upper pool cannot make progress (and clear its dirty data) until the write I/O it has sent to the lower pool completes, so the two pools stall each other.
waiting for I/O to a vdev in testpool
lots of dirty/anon data, but syncing is stalled
spa_syncing_txg is 162
vdev I/O is getting throttled
  swtch+0x141()
  cv_wait+0x70(ffffff03385afb78, ffffff03385afb70)
  zio_wait+0xbb(ffffff03385af800)
  dsl_pool_sync+0xf9(ffffff033df2cb00, a2)
  spa_sync+0x456(ffffff0342c18000, a2)
  txg_sync_thread+0x260(ffffff033df2cb00)
  thread_start+8()
stalled in dmu_tx_assign
waiting for anon_size to shrink
spa_syncing_txg is 11,800,157 (in less than 30 minutes!)
typically a run of this test completes in under 200 TXGs
Stacks (one per file vdev used by the upper pool):
  swtch+0x141()
  cv_wait+0x70(ffffff035368549e, ffffff0353685458)
  txg_wait_open+0xcb(ffffff0353685280, b40e5f)
  dmu_tx_wait+0x1d8(ffffff0328bc87c0)
  dmu_tx_assign+0x8a(ffffff0328bc87c0, 1)
  zfs_write+0x561(ffffff0527861980, ffffff000cdf1a80, 0, ffffff0310e98db0, 0)
  fop_write+0x5b(ffffff0527861980, ffffff000cdf1a80, 0, ffffff0310e98db0, 0)
  vn_rdwr+0x27a(1, ffffff0527861980, ffffff032dfdc000, 800, ec0000, 1)
  vdev_file_io_strategy+0x65(ffffff033fd71380)
  taskq_d_thread+0xb7(ffffff0322000568)
  thread_start+8()