Bug #14867

closed

panic in smb_request_alloc / smb_oplock_ind_break during shutdown

Added by Gordon Ross 2 months ago. Updated 10 days ago.

Status: Closed
Priority: Normal
Assignee:
Category: cifs - CIFS server and client
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:
External Bug: racktop:BSR-9838

Description

During an SMB service restart with combined NFS and SMB workloads running, we observed a panic with a stack like this:

 smb_request_alloc
 smb_oplock_ind_break
 smb_oplock_break_cmn
 smb_oplock_break_OPEN
 smb_fem_oplock_open
 vhead_open
 fop_open
 rfs4_do_open
 rfs4_do_openfh
 rfs4_op_open

There's a fix for this from nexenta@1da432d9ca360b955f549d36acdd87d52f257b6a

Actions #1

Updated by Gordon Ross 2 months ago

  • Status changed from New to In Progress

I no longer have the crash dump handy, but from my notes
(and from what one can see in the fix on github/Nexenta):

The FEM hook that triggers an oplock break tries to allocate an smb request on a session that is going away.
This appears to have happened because we're holding an ofile ref while the server is in the process of
getting rid of all of those (sessions, ofiles, nodes). That part of the shutdown path does not expect any
new smb requests to be possible on the session, but the oplock break path can try to create one.
(By this point during shutdown, no more taskq jobs should be possible.)

The fix is to make sure a running oplock break task deals correctly with failure to allocate a request
(which is normal and expected on a session or server that's shutting down), and to
make sure oplock break task jobs can be killed off promptly during shutdown.
Also make sure the FEM hooks are removed during node destruction.
(That part is done in #14866.)
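
To illustrate the first part of the fix, here is a minimal user-space sketch of a break task that tolerates a failed request allocation. It is not the smbsrv code: the types (session_t, request_t) and functions (request_alloc, oplock_break_task) are hypothetical stand-ins, and the "drop the oplock locally" fallback is an assumption about what a reasonable handler might do when there is no longer a client to notify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; NOT the illumos smbsrv types. */
typedef enum { SESS_ACTIVE, SESS_SHUTTING_DOWN } sess_state_t;

typedef struct session {
	sess_state_t	state;
	int		nrequests;	/* outstanding requests */
} session_t;

typedef struct request {
	session_t	*sess;
} request_t;

typedef enum { BRK_SENT, BRK_LOCAL_ACK } break_result_t;

/*
 * Allocation fails (returns NULL) rather than panicking when the
 * session is going away.  Callers must be prepared for NULL.
 */
static request_t *
request_alloc(session_t *s)
{
	request_t *r;

	assert(s != NULL);
	if (s->state != SESS_ACTIVE)
		return (NULL);
	r = malloc(sizeof (*r));
	if (r == NULL)
		return (NULL);
	r->sess = s;
	s->nrequests++;
	return (r);
}

static void
request_free(request_t *r)
{
	r->sess->nrequests--;
	free(r);
}

/*
 * Break task: if no request can be allocated, there is nobody to
 * notify, so (by assumption) drop the oplock locally and return.
 */
static break_result_t
oplock_break_task(session_t *s, int *oplock_level)
{
	request_t *r = request_alloc(s);

	if (r == NULL) {
		*oplock_level = 0;	/* treat the break as acknowledged */
		return (BRK_LOCAL_ACK);
	}
	/* A real server would send the break here and await the ack. */
	request_free(r);
	return (BRK_SENT);
}
```

Calling oplock_break_task() on a session in SESS_SHUTTING_DOWN returns BRK_LOCAL_ACK without touching the session's request count, which is the property the shutdown path relies on: no new requests appear on a session that is being torn down.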

Actions #2

Updated by Electric Monk 2 months ago

  • Gerrit CR set to 2273
Actions #3

Updated by Gordon Ross 2 months ago

  • Category set to cifs - CIFS server and client
Actions #4

Updated by Gordon Ross 22 days ago

Ref. BSR-9838

Actions #5

Updated by Gordon Ross 20 days ago

Tested while shutting down a server that's under heavy load.

The smbsrv-tests show no difference in output:
cd /var/tmp/test_results/smbsrv-tests
diff smbtor-smb2-20220911T163540.summary smbtor-smb2-20220911T164600.summary
(no output)

Actions #6

Updated by Gordon Ross 20 days ago

  • Status changed from In Progress to Pending RTI
Actions #7

Updated by Electric Monk 18 days ago

  • Status changed from Pending RTI to Closed
  • % Done changed from 0 to 100

git commit 525641e8e46b2b8beffe3f067f910178ffa06377

commit  525641e8e46b2b8beffe3f067f910178ffa06377
Author: Gordon Ross <gordon.ross@tintri.com>
Date:   2022-09-13T23:13:26.000Z

    14867 panic in smb_request_alloc / smb_oplock_ind_break during shutdown
    Portions contributed by: Prashanth Badari <prashanth.badari@tegile.com>
    Reviewed by: Prashanth Badari <prbadari@tintri.com>
    Reviewed by: Suresh Jayaraman <sjayaraman@tintri.com>
    Reviewed by: Matt Barden <mbarden@tintri.com>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Approved by: Robert Mustacchi <rm@fingolfin.org>

Actions #8

Updated by Gordon Ross 10 days ago

  • External Bug set to racktop:BSR-9838