Bug #6988

spa_sync() spends half its time in dmu_objset_do_userquota_updates

Added by Matthew Ahrens over 3 years ago. Updated almost 3 years ago.

Status: Closed
Priority: Normal
Category: zfs - Zettabyte File System
Start date: 2016-05-22
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

Using a benchmark which creates 2 million files in one TXG, I observe that the thread running `spa_sync()` is on CPU almost the entire time we are syncing, and therefore can be a performance bottleneck. About 50% of the time in `spa_sync()` is in `dmu_objset_do_userquota_updates()`.
The problem is that `dmu_objset_do_userquota_updates()` calls `zap_increment_int(DMU_USERUSED_OBJECT)` once for every file that was modified (or created). In this benchmark, all the files are owned by the same user/group, so all 2 million calls to `zap_increment_int()` modify the same entry in the ZAP. The same issue exists for `DMU_GROUPUSED_OBJECT`.
A simple solution to reduce the number of calls to `zap_increment_int()` would be to cache the space delta for the “current” user and group, and only call `zap_increment_int()` when the “current” user/group changes. In my benchmark, all files are owned by the same user/group, so this reduces the number of calls from 2,000,000 to 1.
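The caching idea can be sketched in standalone C as follows. This is an illustration only, not the ZFS code: `fake_zap_increment()` is a hypothetical stand-in for `zap_increment_int()` that just counts calls against a toy four-slot table, and `delta_cache_t`, `cache_add()` and `cache_flush()` are names invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for zap_increment_int(); it only counts calls. */
static uint64_t zap_calls;
static int64_t zap_total[4];	/* toy "ZAP": one slot per owner id 0..3 */

static void
fake_zap_increment(uint64_t owner, int64_t delta)
{
	assert(owner < 4);
	zap_calls++;
	zap_total[owner] += delta;
}

/* One-entry cache holding the pending delta for the "current" owner. */
typedef struct delta_cache {
	int valid;
	uint64_t owner;
	int64_t delta;
} delta_cache_t;

/* Write the pending delta (if any) and empty the cache. */
static void
cache_flush(delta_cache_t *dc)
{
	if (dc->valid && dc->delta != 0)
		fake_zap_increment(dc->owner, dc->delta);
	dc->valid = 0;
	dc->delta = 0;
}

/* Accumulate a delta; flush first only if the owner changed. */
static void
cache_add(delta_cache_t *dc, uint64_t owner, int64_t delta)
{
	if (dc->valid && dc->owner != owner)
		cache_flush(dc);
	dc->valid = 1;
	dc->owner = owner;
	dc->delta += delta;
}
```

A caller would invoke `cache_add()` once per dirty file and `cache_flush()` once at the end of the sync pass. With a single owner, the stand-in is called exactly once regardless of the number of files.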
However, for other workloads, such as one where the dirty files alternate between owners, the simple solution provides no help. In this case a slightly more involved solution is required. We should keep an in-memory map from user to space delta while we are syncing, and when we finish, iterate over the in-memory map and modify the ZAP once per entry. This reduces the number of calls to `zap_increment_int()` from “number of objects modified” to “number of owners/groups of modified files”.
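A minimal sketch of the map-based approach, again in standalone C rather than ZFS code. The small open-addressed table here is an assumption chosen for brevity and stands in for whatever map structure a real implementation would use. `stub_zap_increment()` is a hypothetical placeholder for `zap_increment_int()`, and `delta_map_t`, `map_add()` and `map_flush()` are invented names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define	MAP_SLOTS	64	/* power of two; must exceed distinct owners */

typedef struct delta_entry {
	int used;
	uint64_t owner;
	int64_t delta;
} delta_entry_t;

typedef struct delta_map {
	delta_entry_t slots[MAP_SLOTS];
	uint64_t nused;
} delta_map_t;

/* Hypothetical placeholder for zap_increment_int(); counts calls only. */
static uint64_t flush_calls;

static void
stub_zap_increment(uint64_t owner, int64_t delta)
{
	(void) owner;
	(void) delta;
	flush_calls++;
}

/* Accumulate a per-owner delta in memory (linear probing). */
static void
map_add(delta_map_t *m, uint64_t owner, int64_t delta)
{
	uint64_t i = owner & (MAP_SLOTS - 1);

	while (m->slots[i].used && m->slots[i].owner != owner)
		i = (i + 1) & (MAP_SLOTS - 1);
	if (!m->slots[i].used) {
		/* Keep one slot free so probing always terminates. */
		assert(m->nused < MAP_SLOTS - 1);
		m->slots[i].used = 1;
		m->slots[i].owner = owner;
		m->nused++;
	}
	m->slots[i].delta += delta;
}

/* At the end of the sync pass: one update per distinct owner. */
static void
map_flush(delta_map_t *m)
{
	for (uint64_t i = 0; i < MAP_SLOTS; i++) {
		if (m->slots[i].used)
			stub_zap_increment(m->slots[i].owner,
			    m->slots[i].delta);
	}
	memset(m, 0, sizeof (*m));
}
```

With files alternating between two owners, `map_add()` is called once per file but `map_flush()` issues only two updates, one per owner. A separate map would be kept for groups.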
A prototype of the simple solution reduced the time spent in `spa_sync()` by ~33%, from 11 seconds to 7 seconds.

The equivalent ZFSonLinux bug is https://github.com/zfsonlinux/zfs/issues/4642

History

#1

Updated by Electric Monk almost 3 years ago

  • % Done changed from 0 to 100
  • Status changed from New to Closed

commit  8b70546b0a686e4eee3d28878728f3b3ce980b06
Author: Matthew Ahrens <mahrens@delphix.com>
Date:   2016-09-28T18:03:33.000Z

    6988 spa_sync() spends half its time in dmu_objset_do_userquota_updates
    Reviewed by: George Wilson <george.wilson@delphix.com>
    Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
    Reviewed by: Ned Bass <bass6@llnl.gov>
    Reviewed by: Jinshan Xiong <jinshan.xiong@intel.com>
    Approved by: Richard Lowe <richlowe@richlowe.net>
