Project

General

Profile

Bug #8997

ztest assertion failure in zil_lwb_write_issue

Added by Prakash Surya over 1 year ago. Updated over 1 year ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
zfs - Zettabyte File System
Start date:
2018-01-25
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage

Description

When dmu_tx_assign is called from zil_lwb_write_issue, it's possible
for either ERESTART or EIO to be returned.

If ERESTART is returned, this will cause an assertion to fail directly
in zil_lwb_write_issue, where the code assumes the return value is
EIO if dmu_tx_assign returns a non-zero value. This can occur if the
SPA is suspended when dmu_tx_assign is called, and most often occurs
when running zloop.

If EIO is returned, this can cause assertions to fail elsewhere in the
ZIL code. For example, zil_commit_waiter_timeout contains the
following logic:

lwb_t *nlwb = zil_lwb_write_issue(zilog, lwb);
ASSERT3S(lwb->lwb_state, !=, LWB_STATE_OPENED);

In this case, if dmu_tx_assign returned EIO from within
zil_lwb_write_issue, the lwb variable passed in will not be issued
to disk. Thus, it's lwb_state field will remain LWB_STATE_OPENED and
this assertion will fail. zil_commit_waiter_timeout assumes that after
it calls zil_lwb_write_issue, the lwb will be issued to disk, and
doesn't handle the case where this is not true; i.e. it doesn't handle
the case where dmu_tx_assign returns EIO.

History

#1

Updated by Electric Monk over 1 year ago

  • % Done changed from 0 to 100
  • Status changed from New to Closed

git commit f864f99efe57685e1762590c1a880dd16bca6da9

commit  f864f99efe57685e1762590c1a880dd16bca6da9
Author: Prakash Surya <prakash.surya@delphix.com>
Date:   2018-01-26T17:50:36.000Z

    8997 ztest assertion failure in zil_lwb_write_issue
    Reviewed by: Matt Ahrens <mahrens@delphix.com>
    Reviewed by: Andriy Gapon <avg@FreeBSD.org>
    Approved by: Robert Mustacchi <rm@joyent.com>

Also available in: Atom PDF