9112 broke third block copy allocations within one metaslab group
metaslab_group_alloc_normal() sets the tertiary variable for some allocation requests to push those allocations to a different metaslab: it calls find_valid_metaslab() for them instead of reusing the cached secondary metaslab. The problem appears later, because activation_weight is still set to METASLAB_WEIGHT_SECONDARY, so metaslab_activate_allocator() cannot differentiate those tertiary allocations from secondary ones and, as a result, fails, since the allocator already has one secondary metaslab.
In practice I see this under heavy write load, when, probably due to write throttling, all 3 copies of some blocks regularly try to allocate from the same metaslab group. This causes multiple metaslab_activate() errors, which in turn cause more metaslabs to be loaded than necessary. Combined with the resulting preloader misses, this makes the pool hang regularly for several seconds while waiting for a metaslab to be loaded on demand.
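The failure mode described above can be illustrated with a small stand-alone model. This is only a sketch under simplifying assumptions, not the actual ZFS code: every name prefixed with sketch_ or SKETCH_ is hypothetical, metaslabs are reduced to integer ids, and the allocator is reduced to one primary and one secondary slot. It shows only why a third copy that picks a fresh metaslab but still carries the secondary weight cannot be activated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real activation weights. */
enum sketch_weight { SKETCH_PRIMARY, SKETCH_SECONDARY };

#define SKETCH_NO_MS (-1)

/* Simplified allocator: at most one active metaslab per weight class. */
struct sketch_allocator {
	int primary_ms;    /* id of the active primary metaslab, or SKETCH_NO_MS */
	int secondary_ms;  /* id of the active secondary metaslab, or SKETCH_NO_MS */
};

/*
 * Model of activation: fails (-1) when the slot for this weight is
 * already held by a different metaslab, succeeds (0) otherwise.
 */
static int
sketch_activate(struct sketch_allocator *a, int ms, enum sketch_weight w)
{
	int *slot = (w == SKETCH_PRIMARY) ? &a->primary_ms : &a->secondary_ms;

	if (*slot != SKETCH_NO_MS && *slot != ms)
		return (-1);
	*slot = ms;
	return (0);
}

/*
 * Model of allocating copy number `copy` (0, 1, 2) of a block when all
 * copies land in the same metaslab group.
 */
static int
sketch_alloc_copy(struct sketch_allocator *a, int copy)
{
	assert(copy >= 0);

	enum sketch_weight w = (copy == 0) ? SKETCH_PRIMARY : SKETCH_SECONDARY;
	bool tertiary = (copy >= 2);
	int cached = (w == SKETCH_PRIMARY) ? a->primary_ms : a->secondary_ms;
	int ms;

	if (!tertiary && cached != SKETCH_NO_MS) {
		/* Reuse the metaslab cached for this weight class. */
		ms = cached;
	} else {
		/* Search for another metaslab; modeled as id == copy. */
		ms = copy;
	}

	/* A tertiary request still carries the secondary weight here. */
	return (sketch_activate(a, ms, w));
}
```

In this model, starting from an allocator with both slots empty, copies 0 and 1 activate metaslabs 0 and 1, while copy 2 selects metaslab 2 and is rejected because the secondary slot is already held by metaslab 1.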
Updated by Electric Monk almost 3 years ago
- Status changed from In Progress to Closed
- % Done changed from 0 to 100
commit b86e7e3f0e50748bb5bb5cc91632d72ff17f08dd
Author: Alexander Motin <mav@FreeBSD.org>
Date: 2018-10-04T14:32:20.000Z

    9738 9112 broke third block copy allocations within one metaslab group
    Reviewed by: Paul Dagnelie <firstname.lastname@example.org>
    Reviewed by: George Wilson <email@example.com>
    Approved by: Robert Mustacchi <firstname.lastname@example.org>