Bug #7020

sdev_cleandir can loop forever

Added by Alex Wilson over 4 years ago. Updated over 4 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Start date: 2016-05-31
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:

Description

sdev_cleandir can currently hang forever when it encounters a child node that is busy, or when it is given a matching expr and the first entry on the list does not match.

The previous code (circa 2013) iterated over the children of the node using a for loop with SDEV_NEXT_ENTRY. It was later changed to a while ((dv = SDEV_FIRST_ENTRY(ddv))) { loop, which re-fetches the first entry on every pass. Unfortunately, the continue statements that previously skipped over an entry were left as they were. Since a skipped entry stays at the head of the list, each continue now returns to the same entry, resulting in an infinite busy-loop in the kernel.
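
The failure mode can be illustrated with a minimal user-space sketch. This is not the illumos source: node_t, dir_t, first_entry, next_entry, unlink_entry, and both clean_* functions are hypothetical stand-ins, and the max_iters cap exists only so the demonstration terminates. The first function mirrors the "re-fetch the first entry, then continue" pattern described above. The second shows one plausible way to make progress, by saving the next pointer before deciding whether to skip. The actual fix in the commit may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a directory entry; not the real sdev node. */
typedef struct node {
	struct node *next;
	bool busy;	/* entry cannot be removed right now */
	bool removed;	/* set once the entry has been unlinked */
} node_t;

typedef struct dir {
	node_t *head;
} dir_t;

static node_t *first_entry(dir_t *d) { return (d->head); }
static node_t *next_entry(node_t *n) { return (n->next); }

/* Unlink n from d's singly linked list, if present. */
static void
unlink_entry(dir_t *d, node_t *n)
{
	node_t **pp = &d->head;

	while (*pp != NULL && *pp != n)
		pp = &(*pp)->next;
	if (*pp == n) {
		*pp = n->next;
		n->removed = true;
	}
}

/*
 * Problem pattern: the loop condition re-fetches the first entry, so a
 * "continue" on a busy entry fetches that same entry again. Returns the
 * number of passes, or -1 if the (demo-only) cap max_iters is exceeded.
 */
static int
clean_refetch_first(dir_t *d, int max_iters)
{
	node_t *n;
	int iters = 0;

	while ((n = first_entry(d)) != NULL) {
		if (++iters > max_iters)
			return (-1);	/* would spin forever without the cap */
		if (n->busy)
			continue;	/* n is still the first entry */
		unlink_entry(d, n);
	}
	return (iters);
}

/*
 * One way to guarantee progress: remember the successor before the
 * skip decision, so "continue" advances. Returns the number of
 * entries skipped.
 */
static int
clean_with_next(dir_t *d)
{
	node_t *n, *next;
	int skipped = 0;

	for (n = first_entry(d); n != NULL; n = next) {
		next = next_entry(n);
		if (n->busy) {
			skipped++;
			continue;
		}
		unlink_entry(d, n);
	}
	return (skipped);
}
```

With a list whose first entry is busy, clean_refetch_first never gets past that entry and hits the cap, while clean_with_next skips it and removes the rest. The same reasoning applies when the first entry is skipped because it does not match the given expr.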

You can trigger this pretty easily by setting up an sdev exclude rule in zonecfg.

Diagnosis: look for a runaway process consuming 100% CPU in the kernel. Its spinning thread has a distinctive stack:

# mdb -k
> 0t1234::pid2proc | ::walk thread | ::findstack -v
[ ffffd001efcd3310 _resume_from_idle+0x112() ]
  ffffd001efcd3360 apix_hilevel_intr_epilog+0xc1(ffffd001efcd33d0, 0)
  ffffd001efcd33c0 apix_do_interrupt+0x34a(ffffd001efcd33d0, 0)
  ffffd001efcd33d0 _sys_rtt_ints_disabled+8()
  ffffd001efcd3550 rw_enter+0x58()
  ffffd001efcd35e0 sdev_cleandir+0x60(ffffd0631b6d75d8, 0, 0)
  ffffd001efcd3630 devzvol_prunedir+0xec(ffffd0631b6d76e8)
  ffffd001efcd36d0 devzvol_readdir+0x150(ffffd06333250e00, ffffd001efcd3790, ffffd062dc990e18, ffffd001efcd37dc, 0, 0)
  ffffd001efcd3760 fop_readdir+0x6b(ffffd06333250e00, ffffd001efcd3790, ffffd062dc990e18, ffffd001efcd37dc, 0, 0)
  ffffd001efcd3830 walk_dir+0xee(ffffd06333250e00, ffffd0669e4483c8, fffffffffbbdf410)
  ffffd001efcd3850 prof_make_names_walk+0x2e(ffffd0669e4483c8, fffffffffbbdf410)
  ffffd001efcd38b0 prof_make_names+0xfc(ffffd0669e4483c8)
  ffffd001efcd38e0 prof_filldir+0x8b(ffffd0669e4483c8)
  ffffd001efcd3940 prof_lookup+0xf8(ffffd0669e449200, ffffd084a32e47a0, ffffd001efcd3ab0, ffffd062dc990e18)
  ffffd001efcd39d0 devzvol_lookup+0x166(ffffd0669e449200, ffffd084a32e47a0, ffffd001efcd3ab0, 0, 0, 0, ffffd062dc990e18, 0, 0, 0)
  ffffd001efcd3a80 fop_lookup+0xa2(ffffd0669e449200, ffffd084a32e47a0, ffffd001efcd3ab0, 0, 0, 0, ffffd062dc990e18, 0, 0, 0)
  ffffd001efcd3b00 devzvol_create_pool_dirs+0xd0(ffffd0669e449200)
  ffffd001efcd3ba0 devzvol_readdir+0xc0(ffffd0669e449200, ffffd001efcd3d90, ffffd064ea6330e0, ffffd001efcd3ddc, 0, 0)
  ffffd001efcd3c30 fop_readdir+0x6b(ffffd0669e449200, ffffd001efcd3d90, ffffd064ea6330e0, ffffd001efcd3ddc, 0, 0)
  ffffd001efcd3c90 lxd_readdir+0xa7(ffffd07f380afa00, ffffd001efcd3d90, ffffd064ea6330e0, ffffd001efcd3ddc, 0, 0)
  ffffd001efcd3d20 fop_readdir+0x6b(ffffd07f380afa00, ffffd001efcd3d90, ffffd064ea6330e0, ffffd001efcd3ddc, 0, 0)
  ffffd001efcd3e40 lx_getdents_common+0x28f(5, 140d0, 8000, 18, fffffffff82dab80)
  ffffd001efcd3e70 lx_getdents_64+0x25(5, 140d0, 8000)
  ffffd001efcd3ef0 lx_syscall_enter+0x16f()
  ffffd001efcd3f10 sys_syscall+0x16c()

(note: this example stack comes in through our LX brand, but this is not LX-specific at all and can be triggered the same way from regular zones)

#1

Updated by Electric Monk over 4 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 45b1747515a17db45e8971501ee84a26bdff37b2

commit  45b1747515a17db45e8971501ee84a26bdff37b2
Author: Alex Wilson <alex.wilson@joyent.com>
Date:   2016-05-31T18:56:33.000Z

    7019 zfsdev_ioctl skips secpolicy when FKIOCTL is set
    7020 sdev_cleandir can loop forever
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Richard Lowe <richlowe@richlowe.net>
    Reviewed by: Matthew Ahrens <mahrens@delphix.com>
    Approved by: Dan McDonald <danmcd@omniti.com>
