Bug #5008

lock contention (rrw_exit) while running a read only load

Added by Matthew Ahrens about 5 years ago. Updated about 5 years ago.

Status: Closed
Priority: Normal
Category: zfs - Zettabyte File System
Start date: 2014-07-15
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

We see lock contention on the z_teardown_lock under a heavy cached-read workload.

I tested the performance of these rrw lock changes by reading cached data with
28 CPUs on illumos. On their own, the changes gave no improvement, because lock
contention in the ARC's add_reference()/remove_reference() slows things down so
much that the cost of the rrw lock is not noticeable. However, after removing
that source of contention, the rrw lock changes provided a 20% performance
improvement.

Details, including flame graphs of kernel CPU time:

  • reads of 8KB of cached data
  • recordsize=8k; all other properties default
  • 28 vCPUs on VMware ESX, 4 hardware sockets, each with 8 cores, each with 2 hyperthreads
  • 28 processes
  • 28 files (each process reads from its own file)
  • see fio config below

Stock: ~350,000 IOPS
http://mahrens.org/flame/cached_reads_28_cpu_stock.svg
primary problem is arc_buf_add_ref() / arc_buf_remove_ref() (>75% of all CPU)

Applying these changes to break up the rrw lock: ~350,000 IOPS
http://mahrens.org/flame/cached_reads_28_cpu_rrw.svg
primary problem is still arc_buf_add_ref() / arc_buf_remove_ref() (>75% of all CPU)

Fixing arc_buf_add/remove_ref() by creating per-cpu arcs_lists: 1,000,000 IOPS
http://mahrens.org/flame/cached_reads_28_cpu_agg.svg
rrw_enter/exit taking ~8% of all CPU

Both arc_buf_add/remove_ref() and rrw lock fixes: 1,200,000 IOPS
http://mahrens.org/flame/cached_reads_28_cpu_agg_rrw.svg
rrw_enter/exit taking ~1.5% of all CPU
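
As a quick check of how these figures relate to the 20% improvement quoted in the description, the relative gain between two runs can be computed as below. The helper name percent_gain() is hypothetical; the IOPS constants are the approximate values reported above.

```c
/* Approximate IOPS figures reported in this ticket. */
#define IOPS_STOCK    350000.0   /* stock, and rrw lock changes alone */
#define IOPS_ARC_FIX  1000000.0  /* per-cpu arcs_lists only */
#define IOPS_BOTH     1200000.0  /* arcs_lists plus rrw lock changes */

/* Percentage change in throughput going from `before` to `after`. */
static double
percent_gain(double before, double after)
{
    return (after - before) * 100.0 / before;
}
```

With these numbers, percent_gain(IOPS_ARC_FIX, IOPS_BOTH) evaluates to 20, matching the description, while percent_gain(IOPS_STOCK, IOPS_STOCK) is 0 for the rrw lock changes applied on their own.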

fio config file:
[global]

fallocate=none
ioengine=psync
numjobs=28
iodepth=1
time_based
runtime=10m
bs=8k
rw=randread
filesize=1g
size=1g
randrepeat=0
use_os_rand=1

[test]
directory=/test/fs

History

#1

Updated by Electric Monk about 5 years ago

  • % Done changed from 0 to 100
  • Status changed from New to Closed

git commit c9030f6c93613fe30ee0c16f92b96da7816ac052

commit  c9030f6c93613fe30ee0c16f92b96da7816ac052
Author: Alexander Motin <mav@freebsd.org>
Date:   2014-07-18T16:53:38.000Z

    5008 lock contention (rrw_exit) while running a read only load
    Reviewed by: Matthew Ahrens <matthew.ahrens@delphix.com>
    Reviewed by: George Wilson <george.wilson@delphix.com>
    Reviewed by: Alex Reece <alex.reece@delphix.com>
    Reviewed by: Christopher Siden <christopher.siden@delphix.com>
    Reviewed by: Richard Yao <ryao@gentoo.org>
    Reviewed by: Saso Kiselkov <skiselkov.ml@gmail.com>
    Approved by: Garrett D'Amore <garrett@damore.org>
