Bug #13118

Very slow reaping, possible deadlock in zfs_delmap

Added by Alex Wilson 3 months ago. Updated about 1 month ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

See original Joyent bug: https://smartos.org/bugview/OS-732

The missing sa_bulk_update call was fixed up in illumos in 2017 (#5379, commit 80e10fd), but the call to VOP_PUTPAGE inside zfs_delmap is still around.

This causes extreme slowdowns on systems reaping many processes with shared mappings of files on ZFS. We often see this when thousands of Samba smbd children call exit() at once: the system can process only about 1-2 exits per second, so thousands of them take hours to shut down. We have had as many as 14,000 to 18,000 smbd processes on several of our fileservers, which take 4-5 hours to shut down with that VOP_PUTPAGE in place. Without the VOP_PUTPAGE it takes minutes.
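
For anyone wanting to observe the exit pattern described above, here is a minimal userland sketch in C. It is not taken from this bug or from OS-732. The helper name reap_shared_mappers and its parameters are hypothetical. It forks a number of children that each hold a MAP_SHARED mapping of one file, releases them all at once, and times how long the parent takes to reap them. It assumes the file exists and is at least one byte long; to exercise the zfs_delmap path, the file would need to live on a ZFS dataset and the child count would likely have to be far higher than in a quick test.

```c
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() and mkstemp() */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the bug report). Forks up to `nchildren`
 * processes that each map `path` MAP_SHARED and dirty its first page, then
 * lets them all exit together and measures the time taken to reap them.
 * Returns the number of children that exited cleanly, or -1 on setup
 * failure. *elapsed_sec receives the wall-clock reap time in seconds.
 */
int
reap_shared_mappers(const char *path, int nchildren, double *elapsed_sec)
{
	int gate[2], ready[2], forked = 0, ok = 0;
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	pid_t *pids = calloc((size_t)nchildren, sizeof (pid_t));
	struct timespec t0, t1;
	char c;

	if (pids == NULL || pipe(gate) != 0 || pipe(ready) != 0) {
		free(pids);
		return (-1);
	}

	for (int i = 0; i < nchildren; i++) {
		pid_t pid = fork();
		if (pid < 0)
			break;		/* continue with the children we have */
		if (pid == 0) {
			int fd;
			volatile char *p;

			close(gate[1]);	/* so EOF arrives when the parent closes */
			close(ready[0]);
			if ((fd = open(path, O_RDWR)) < 0)
				_exit(1);
			p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			    MAP_SHARED, fd, 0);
			if (p == MAP_FAILED)
				_exit(1);
			p[0] = p[0];	/* dirty the page; contents unchanged */
			if (write(ready[1], "r", 1) != 1)
				_exit(1);
			close(ready[1]);
			if (read(gate[0], &c, 1) < 0)	/* block until released */
				_exit(1);
			_exit(0);	/* exit tears down the shared mapping */
		}
		pids[forked++] = pid;
	}
	close(gate[0]);
	close(ready[1]);

	/* Wait until every child has mapped the file (or has died early). */
	for (int n = 0; n < forked; n++) {
		if (read(ready[0], &c, 1) <= 0)
			break;
	}
	close(ready[0]);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	close(gate[1]);			/* release all children at once */
	for (int i = 0; i < forked; i++) {
		int status;
		if (waitpid(pids[i], &status, 0) == pids[i] &&
		    WIFEXITED(status) && WEXITSTATUS(status) == 0)
			ok++;
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	*elapsed_sec = (double)(t1.tv_sec - t0.tv_sec) +
	    (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	free(pids);
	return (ok);
}
```

Dividing the returned count by the elapsed time gives a rough exits-per-second figure that can be compared against the 1-2 per second reported here. The sketch only measures reaping time; it makes no attempt to detect the possible deadlock.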

According to the original analysis in OS-732 by Bryan Cantrill, it seems that this call can also be the source of a deadlock. I haven't personally observed that, so I'm not sure if that's still the case (other things may have changed in ZFS since then).

Quoting from the original bug:

"in discussion with the engineer who originally did this work, the consensus is that these semantics are simply too expensive to implement; the fix here is to restore the semantics prior to their change to accommodate the scenario mentioned in the zfs_delmap() comment, above."


Related issues

Related to illumos gate - Bug #5379: modifying a mmap()-ed file does not update its timestamps (Closed, 2014-12-02)

#1

Updated by Electric Monk 3 months ago

  • Gerrit CR set to 900
#2

Updated by Joshua M. Clulow 3 months ago

  • Related to Bug #5379: modifying a mmap()-ed file does not update its timestamps added
#3

Updated by Joshua M. Clulow 3 months ago

This behaviour appears to have been introduced in:

commit b468a217b67dc26ce21da5d5a2ca09bb6249e4fa
Author: eschrock <none@none>
Date:   Sat Apr 8 23:33:38 2006 -0700

    6407791 bringover into ZFS results in s. files newer than extracted source
    6409927 failed DKIOCFLUSHWRITECACHE ioctls should not generate ereports
    6410371 need to reserve more pool names

Some of the background is available in the historical bug reports; the most relevant here appears to be 6407791.

#4

Updated by Alex Wilson about 1 month ago

Testing:

  • This has been in SmartOS since 2011 and hasn't caused any follow-up trouble that I can find
  • We've also tested this on OmniOS, both as a source patch and hot-patch (replacing the VOP_PUTPAGE with NOPs in mdb during emergencies)
#5

Updated by Electric Monk about 1 month ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 94cc9d8febd5c99331fd191291b3b54435a1ef18

commit  94cc9d8febd5c99331fd191291b3b54435a1ef18
Author: Alex Wilson <alex@uq.edu.au>
Date:   2020-10-27T23:23:48.000Z

    13118 Very slow reaping, possible deadlock in zfs_delmap
    Portions contributed by: Bryan Cantrill <bryan@joyent.com>
    Reviewed by: Joshua M. Clulow <josh@sysmgr.org>
    Approved by: Dan McDonald <danmcd@joyent.com>
