Bug #9738
9112 broke third block copy allocations within one metaslab group
Status: closed
100%
Description
metaslab_group_alloc_normal() sets the tertiary variable for some allocation requests in order to push those allocations to a different metaslab: it calls find_valid_metaslab() for them instead of reusing the cached secondary metaslab. The problem appears later, when activation_weight is still set to METASLAB_WEIGHT_SECONDARY: metaslab_activate_allocator() cannot differentiate those tertiary allocations from secondary ones and, as a result, fails, since the allocator already has a secondary metaslab.
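The failure mode above can be illustrated with a small toy model. This is only a sketch and not the ZFS source: every `toy_*` / `TOY_*` name is hypothetical, the allocator is reduced to two slots, and the only assumption taken from the report is that a tertiary request picks a fresh metaslab but still tries to activate it with the secondary weight.

```c
#include <stddef.h>

/* Toy stand-ins for the activation weights; NOT the real ZFS definitions. */
enum toy_weight { TOY_WEIGHT_PRIMARY, TOY_WEIGHT_SECONDARY };

struct toy_metaslab {
    int id;
};

/* Hypothetical per-allocator state: one primary and one secondary slot. */
struct toy_allocator {
    struct toy_metaslab *primary;
    struct toy_metaslab *secondary;
};

/*
 * Claim the slot matching the weight for this metaslab.
 * Returns 0 on success, -1 if a different metaslab already owns the slot.
 */
static int
toy_activate(struct toy_allocator *a, struct toy_metaslab *ms,
    enum toy_weight w)
{
    struct toy_metaslab **slot =
        (w == TOY_WEIGHT_PRIMARY) ? &a->primary : &a->secondary;

    if (*slot != NULL && *slot != ms)
        return (-1);
    *slot = ms;
    return (0);
}

/*
 * Models a third-copy (tertiary) allocation as described in the report:
 * a fresh metaslab was selected, but the activation weight is still
 * SECONDARY, so activation collides with the cached secondary metaslab.
 */
static int
toy_alloc_third_copy(struct toy_allocator *a, struct toy_metaslab *fresh)
{
    return (toy_activate(a, fresh, TOY_WEIGHT_SECONDARY));
}
```

In this model, once a primary and a secondary metaslab are active, any call to `toy_alloc_third_copy()` with a new metaslab returns -1 and leaves the secondary slot unchanged, which is the toy equivalent of the activation error described above.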
In practice I see this under heavy write load, when, probably due to write throttling, all 3 copies of some blocks regularly try to allocate from the same metaslab group. This causes multiple metaslab_activate() errors, which in turn cause more metaslabs to be loaded than necessary. Combined with the resulting preloader misses, this makes the pool hang regularly for several seconds while waiting for metaslabs to load on demand.