Feature #5313

Allow I/Os to be aggregated across ZIO priority classes

Added by Andriy Gapon almost 5 years ago. Updated over 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
zfs - Zettabyte File System
Start date:
2014-11-11
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage

Description

Combine reads and writes, irrespective of their priorities, into unified, offset-sorted trees (one per I/O type). Selection of the ZIO to issue is unchanged, but aggregation now uses the unified tree of the appropriate type, so aggregation across priority classes is possible.
One consequence of this change is that sync writes can now be aggregated with each other. This improves performance in configurations without a SLOG device and with many WR_INDIRECT ZIL write records. In that case ZIL writes are interleaved with data block writes, and both kinds of writes have sync write priority.
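
To make the idea concrete, here is a minimal sketch in C of aggregating from a single offset-sorted collection that ignores priority. It is not the code from vdev_queue.c: the sketch_io_t type, the function names, and the use of a sorted array in place of a tree are all assumptions made for illustration, and it only merges exactly contiguous I/Os up to a caller-supplied size limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a queued ZIO. */
typedef struct sketch_io {
	uint64_t offset;   /* byte offset on the device */
	uint64_t size;     /* length in bytes */
	int      priority; /* priority class; ignored when aggregating */
} sketch_io_t;

/* Order I/Os by offset only, so all priority classes share one ordering. */
static int
sketch_io_offset_compare(const void *a, const void *b)
{
	const sketch_io_t *x = a, *y = b;
	return ((x->offset > y->offset) - (x->offset < y->offset));
}

/* Sort the unified per-type queue (for example, all writes) by offset. */
static void
sketch_queue_sort(sketch_io_t *ios, size_t n)
{
	qsort(ios, n, sizeof (ios[0]), sketch_io_offset_compare);
}

/*
 * Starting from the already selected I/O at index idx, grow a run of
 * exactly contiguous neighbours in the offset-sorted array, without
 * looking at priority, while the total stays within limit bytes.
 * Returns the total size and stores the run bounds in *first and *last.
 */
static uint64_t
sketch_aggregate(const sketch_io_t *ios, size_t n, size_t idx,
    uint64_t limit, size_t *first, size_t *last)
{
	assert(idx < n);
	size_t lo = idx, hi = idx;
	uint64_t total = ios[idx].size;

	/* Extend towards lower offsets. */
	while (lo > 0 &&
	    ios[lo - 1].offset + ios[lo - 1].size == ios[lo].offset &&
	    total + ios[lo - 1].size <= limit) {
		lo--;
		total += ios[lo].size;
	}
	/* Extend towards higher offsets. */
	while (hi + 1 < n &&
	    ios[hi].offset + ios[hi].size == ios[hi + 1].offset &&
	    total + ios[hi + 1].size <= limit) {
		hi++;
		total += ios[hi].size;
	}
	*first = lo;
	*last = hi;
	return (total);
}
```

In this sketch the I/O to issue is chosen elsewhere (by priority, as before), and only the neighbour walk uses the shared ordering, which is why two adjacent sync writes, or a sync write next to an async write, can end up in the same run.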

History

#2

Updated by Justin Gibbs over 4 years ago

This change was largely motivated by a client workload consisting of 100 MB to 1 GB files being written over NFS to a 10-drive RAIDZ2 with a 128K record size. The client in this scenario performs async writes followed by an NFS commit with a ~1s period. The NFS commit converts any unscheduled asynchronous writes into synchronous writes. Prior to this change, synchronous writes were never aggregated by vdev_queue.c, either with themselves or with I/Os from other write classes. After this change, 95% of the synchronous writes issued by this workload were aggregated and the average per-drive I/O size rose from just under 16K to 120K, yielding a 50% improvement in throughput (at which point the workload hit a bottleneck elsewhere in the system). I no longer have the original results, but they were measured with the following DTrace script and FreeBSD's iostat utility.

fbt:zfs:vdev_queue_io_to_issue:return
/ args[1] != NULL /
{
	/* args[1] is the zio returned by vdev_queue_io_to_issue(). */
	@size_distribution[args[1]->io_priority] = quantize(args[1]->io_size);
	@agg_hitrate[args[1]->io_priority,
	    (args[1]->io_flags & (1 << 29)) ? "hit" : "miss"] = count();
}

The ability to aggregate across I/O classes is a "free" side effect of the implementation. I didn't explicitly measure its performance benefit, but it should be visible when combining a fully asynchronous workload with another workload that periodically fsyncs files, resulting in WR_INDIRECT records pointing to blocks adjacent to blocks allocated by the asynchronous workload.

#3

Updated by Electric Monk over 4 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit fe319232d24f4ae183730a5a24a09423d8ab4429

commit  fe319232d24f4ae183730a5a24a09423d8ab4429
Author: Justin T. Gibbs <justing@spectralogic.com>
Date:   2015-01-09T18:28:18.000Z

    5313 Allow I/Os to be aggregated across ZIO priority classes
    Reviewed by: Andriy Gapon <avg@FreeBSD.org>
    Reviewed by: Will Andrews <willa@SpectraLogic.com>
    Reviewed by: Matt Ahrens <mahrens@delphix.com>
    Reviewed by: George Wilson <george@delphix.com>
    Approved by: Robert Mustacchi <rm@joyent.com>
