Project

General

Profile

Bug #9993

zil writes can get delayed in zio pipeline

Added by Prakash Surya 8 months ago. Updated 8 months ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
zfs - Zettabyte File System
Start date:
2018-11-20
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage

Description

Looking at the data from a dtrace script that shows the cumulative time
as a zio progresses through the zio pipeline we see this:

:PIPELINE: zio_write_bp_init[12us]
          :zio_issue_async[30us]
          :zio_write_compress[132667us]
          :zio_checksum_generate[132691us]
          :zio_ready[132767us]
          :zio_vdev_io_start[132801us]
          :zio_vdev_io_done[132891us]
          :zio_vdev_io_done[134439us]
          :zio_vdev_io_assess[134467us]
          :zio_done[134502us]:

PIPELINE: zio_write_bp_init[4us]
          :zio_issue_async[13us]
          :zio_write_compress[129232us]
          :zio_checksum_generate[129244us]
          :zio_ready[129295us]
          :zio_vdev_io_start[129315us]
          :zio_vdev_io_done[129355us]
          :zio_vdev_io_done[130615us]
          :zio_vdev_io_assess[130643us]
          :zio_done[130682us]:

From the two samples above we can see that the zil zio took 134ms in case #1
and 130ms in case #2. In both cases we see that the zil zio is incurring a
delay when it switches from the zil_commit thread context to the async taskq
context as shown in this snippet:

zio_issue_async[30us]:zio_write_compress[132667us]

Gathering data on the issue taskq during this time shows that we have almost
900 tasks queued up in front of the zil zio. These are probably all lower
priority work like async writes so we get stuck behind and incur a long delay.

The ZFS high priority queues was introduced back in 2009:

commit 80eb36f241abf8c076119fb4c49a55fd61ebc710
Author: George Wilson <George.Wilson@Sun.COM>
Date: Sat Nov 21 22:51:29 2009 -0800

6826241 Sync write IOPS drops dramatically during TXG sync

But the io scheduler we inadvertently broke this functionality:
commit e80fef6bbc55f585a7cc106340ad2476ea759ade
Author: Matthew Ahrens <mahrens@delphix.com>
Date: Tue May 14 21:03:23 2013 +0000

19097 zfs i/o scheduler needs some work
19704 ZFS write throttle impedes performance
25526 ztest failure: assertion refcount_count(&tx->tx_space_written) +
      delta <= tx->tx_space_towrite

This commit made the following change in the zil:
if (lwb->lwb_zio == NULL) {
lwb->lwb_zio = zio_rewrite(zilog->zl_root_zio, zilog->zl_spa,
0, &lwb->lwb_blk, lwb->lwb_buf, BP_GET_LSIZE(&lwb->lwb_blk),
- zil_lwb_write_done, lwb, ZIO_PRIORITY_LOG_WRITE,
+ zil_lwb_write_done, lwb, ZIO_PRIORITY_SYNC_WRITE,
ZIO_FLAG_CANFAIL | ZIO_FLAG_DONT_PROPAGATE, &zb);
}

This commit also change the definition of the priorities:
/*
* ==========================================================================
- * I/O priority table
- * ==========================================================================
- /
uint8_t zio_priority_table[ZIO_PRIORITY_TABLE_SIZE] = {
0, / ZIO_PRIORITY_NOW /
- 0, / 1GZIO_PRIORITY_SYNC_READ /
- 0, / ZIO_PRIORITY_SYNC_WRITE /
- 0, / ZIO_PRIORITY_LOG_WRITE /
- 1, / ZIO_PRIORITY_CACHE_FILL /
- 1, / ZIO_PRIORITY_AGG /
- 4, / ZIO_PRIORITY_FREE /
- 4, / ZIO_PRIORITY_ASYNC_WRITE /
- 6, / ZIO_PRIORITY_ASYNC_READ /
- 10,1G / ZIO_PRIORITY_RESILVER /
- 20, / ZIO_PRIORITY_SCRUB /
- 2, / ZIO_PRIORITY_DDT_PREFETCH */
-};

The original code would map ZIO_PRIORITY_LOG_WRITE to the same value as
ZIO_PRIORITY_NOW but the new io scheduler now maps it to a unique value.
This had the side-effect of disabling the high priority taskq for the zil.

This fix will now explicitly check for ZIO_PRIORITY_SYNC_WRITE and put
dispatch them on the high priority taskq.


Related issues

Related to illumos gate - Bug #9984: zil writes can get delayed in zio pipelineNew2018-11-16

Actions

History

#1

Updated by Electric Monk 8 months ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 2258ad0b755b24a55c6173b1e6bb6188389f72dd

commit  2258ad0b755b24a55c6173b1e6bb6188389f72dd
Author: George Wilson <george.wilson@delphix.com>
Date:   2018-11-26T02:13:30.000Z

    9993 zil writes can get delayed in zio pipeline
    Reviewed by: Prakash Surya <prakash.surya@delphix.com>
    Reviewed by: Brad Lewis <brad.lewis@delphix.com>
    Reviewed by: Matt Ahrens <matt@delphix.com>
    Approved by: Dan McDonald <danmcd@joyent.com>

#2

Updated by Brad Lewis 2 months ago

  • Related to Bug #9984: zil writes can get delayed in zio pipeline added

Also available in: Atom PDF