ZFS should not unconditionally read pool configuration
This issue was found by FreeBSD developer Alexander Motin <mav@FreeBSD.org>. The change was committed as FreeBSD r253993 (http://svnweb.freebsd.org/changeset/base/253993) with the following commit message:
Block reporting of ZFS features for suspended pools. Before executing any subcommand, the zpool tool fetches the pool configuration from the kernel. Before feature support was added, the kernel regenerated that configuration from data that is always present in memory. Unfortunately, the pool feature list and activity counters are not such data. They are stored in ZAP, which normally resides in ARC but may be evicted under heavy memory pressure. If the pool is suspended at that point, there is no way to recover it, since any zpool command will get stuck. This change has one predictable flaw: `zpool upgrade` always wishes to upgrade suspended pools, but fortunately it can't do so due to the suspension.
Quote of original email to illumos-zfs@:
no_features.patch -- I've found a way to hang ZFS by detaching a device, up to the point where `zpool status` freezes. The issue was introduced with the pool features implementation. The `zpool` tool reads the pool configuration every time a pool is opened. Previously this was not a problem, since the configuration was recreated by the kernel from data that seems to be always present in memory. But the newly introduced feature counters may not be. They are stored in ZAP and in the normal case just live in ARC. But if the system is under strong ARC pressure, that information may be evicted, and then, if we are losing devices, we are stuck. I am not sure what a proper fix for this situation would be. I've created a workaround that blocks feature reporting to user level if the pool is suspended. That helps, with one predictable flaw: `zpool upgrade` always wishes to update suspended pools, but fortunately it can't due to the same suspension.
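The workaround described here can be illustrated with a small, self-contained C model. This is only a sketch of the idea, not the committed patch: the `toy_pool_t` and `toy_config_t` types and the `toy_read_features()` and `toy_get_config()` functions are hypothetical names invented for illustration, and the "would hang" case is modeled as an error return rather than a real blocked read.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a pool; this is not the real spa_t. */
typedef struct toy_pool {
    bool suspended;       /* pool I/O is suspended, e.g. after losing devices */
    bool features_cached; /* feature ZAP data is currently held in the ARC */
    int  feature_count;   /* value that would be read from the feature ZAP */
} toy_pool_t;

/* Hypothetical stand-in for the configuration handed back to user level. */
typedef struct toy_config {
    bool has_features;    /* true if feature data was included */
    int  feature_count;
} toy_config_t;

/*
 * Models reading the feature ZAP. If the data was evicted and the pool is
 * suspended, a real read would block indefinitely; the model returns -1.
 */
int
toy_read_features(const toy_pool_t *p, int *count)
{
    if (!p->features_cached && p->suspended)
        return (-1);
    *count = p->feature_count;
    return (0);
}

/*
 * Models building the pool configuration with the workaround applied:
 * feature reporting is skipped entirely while the pool is suspended, so
 * the potentially blocking read is never attempted.
 */
int
toy_get_config(const toy_pool_t *p, toy_config_t *cfg)
{
    cfg->has_features = false;
    cfg->feature_count = 0;

    if (p->suspended)
        return (0); /* configuration without features, no ZAP access */

    if (toy_read_features(p, &cfg->feature_count) != 0)
        return (-1);
    cfg->has_features = true;
    return (0);
}
```

In this model, calling `toy_get_config()` on a suspended pool whose feature data was evicted succeeds and returns a configuration without features. That also shows the flaw mentioned above: a tool that sees no feature data for a pool may conclude that the pool needs upgrading.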