Bug #10967

Deleting directory over SMB2 fails after visiting in explorer

Added by Gordon Ross 5 months ago. Updated 5 months ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Start date: 2019-05-14
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage

History

#1

Updated by Gordon Ross 5 months ago

  • Subject changed from Deleting directory over CIFS SMB2 fails after visiting in explorer to Deleting directory over SMB2 fails after visiting in explorer
  • Status changed from New to Pending RTI

OK, reproduced the problem as described. The key to reproducing it is to visit every level of the directory hierarchy in Windows Explorer, so that it runs an SMB2 notify in each directory, then back out to the top and try the delete. If, during the traversal back to the top of the hierarchy, we neglect to tell the client about some open directory handles, the delete will fail on any of those directories because handles are still open on them.

Here's what the server-side state looks like after this happens.
(There are way too many open handles for an idle client.)

# mdb -k 0
Loading modules: [ ... ]
> ::smblist
SERVER           ZONE STATE                            
ffffff015103bb00 0    RUNNING                          
  SESSION          IP_ADDR          PORT     DIALECT  STATE       
  ffffff0151847008 10.10.1.45       48971    0x210    NEGOTIATED
    USER             UID   ACCOUNT                         
    ffffff0157d6fe78 1     GWR12NS4\smb                    
    TREE             TID   SHARE NAME       RESOURCE                        
    ffffff0157db0060 2     IPC$             IPC$                            
    ffffff0157db0748 1     data_one         /volumes/data/one               
      OFILE            FID   SMB NODE         CRED            
      ffffff0157e557e8 3     ffffff0151039b48 ffffff0155807818
      ffffff0157e552a8 5     ffffff0151039b48 ffffff0155807818
      ffffff0157e55548 6     ffffff01510395a8 ffffff0155807818
      ffffff0157e55a88 8     ffffff01510395a8 ffffff0155807818
      ffffff0157e94a90 10    ffffff01510395a8 ffffff0155807818
      ffffff0157e55d28 11    ffffff015103a958 ffffff0155807818
      ffffff0157e94550 13    ffffff01510395a8 ffffff0155807818
      ffffff0157e947f0 14    ffffff015103a958 ffffff0155807818
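
To make the "too many open handles" point easier to see, here is a small sketch that groups the OFILE rows above by SMB node. It only parses the text pasted in this comment; the helper name `handles_per_node` is made up for illustration and is not part of mdb or the SMB server.

```python
from collections import defaultdict

# OFILE rows copied from the ::smblist output above:
# ofile address, FID, SMB node address, cred address.
OFILE_ROWS = """\
ffffff0157e557e8 3     ffffff0151039b48 ffffff0155807818
ffffff0157e552a8 5     ffffff0151039b48 ffffff0155807818
ffffff0157e55548 6     ffffff01510395a8 ffffff0155807818
ffffff0157e55a88 8     ffffff01510395a8 ffffff0155807818
ffffff0157e94a90 10    ffffff01510395a8 ffffff0155807818
ffffff0157e55d28 11    ffffff015103a958 ffffff0155807818
ffffff0157e94550 13    ffffff01510395a8 ffffff0155807818
ffffff0157e947f0 14    ffffff015103a958 ffffff0155807818
"""


def handles_per_node(text):
    """Map each SMB node address to the list of FIDs open on it."""
    by_node = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and anything that is not a 4-column OFILE row.
        if len(fields) != 4 or not fields[1].isdigit():
            continue
        _ofile, fid, node, _cred = fields
        by_node[node].append(int(fid))
    return dict(by_node)


if __name__ == "__main__":
    for node, fids in handles_per_node(OFILE_ROWS).items():
        print(f"{node}: {len(fids)} open handles, FIDs {fids}")
```

Every node in the listing has more than one handle open on it, which is not what you would expect from a client that is sitting idle.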

#2

Updated by Gordon Ross 5 months ago

When we turn off the "related request" flag in smb2sr_go_async, the code in smb2sr_lookup_fid drops the open file when we begin handling the "async" part of the notify request. smb2sr_lookup_fid then fails, returning STATUS_FILE_CLOSED, and the client assumes that the handle opened earlier in this compound is now gone.
Unfortunately, that handle is alive and well on the server, but the client no longer knows about it. That's a "handle leak".
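
The sequence can be sketched with a toy model. This is not the illumos code: `ToyServer`, `open_and_watch`, the request dictionary, and the `clear_related_flag` switch are all hypothetical, and the all-ones FID stands in for the placeholder a client sends in a related request. The switch exists only to contrast the two behaviors; it makes no claim about how the committed fix works.

```python
STATUS_SUCCESS = "STATUS_SUCCESS"
STATUS_FILE_CLOSED = "STATUS_FILE_CLOSED"
# Placeholder FID meaning "use the handle from the previous request" (assumed).
RELATED_FID = 0xFFFFFFFFFFFFFFFF


class ToyServer:
    def __init__(self):
        self.ofiles = {}   # FID -> path; the server's open-file table
        self.next_fid = 1

    def create(self, path):
        """Open a directory and return its new FID."""
        fid = self.next_fid
        self.next_fid += 1
        self.ofiles[fid] = path
        return fid

    def lookup_fid(self, req):
        """Toy stand-in for smb2sr_lookup_fid (hypothetical logic)."""
        if req["related"]:
            return req["compound_fid"]   # reuse the handle from the CREATE
        req["compound_fid"] = None       # not related: drop the cached open file
        return req["fid"] if req["fid"] in self.ofiles else None

    def notify(self, req, clear_related_flag):
        """Handle a notify request that goes async before its FID lookup."""
        if clear_related_flag:
            req["related"] = False       # the step described in this comment
        fid = self.lookup_fid(req)
        return STATUS_SUCCESS if fid is not None else STATUS_FILE_CLOSED


def open_and_watch(server, path, clear_related_flag):
    """Compound CREATE + related NOTIFY.

    Returns (status, FID the client still tracks, or None).
    """
    fid = server.create(path)
    req = {"related": True, "fid": RELATED_FID, "compound_fid": fid}
    status = server.notify(req, clear_related_flag)
    # On STATUS_FILE_CLOSED the client forgets the handle and never closes it.
    client_fid = fid if status == STATUS_SUCCESS else None
    return status, client_fid


if __name__ == "__main__":
    srv = ToyServer()
    status, client_fid = open_and_watch(srv, "/dir", clear_related_flag=True)
    print(status, client_fid, srv.ofiles)
```

With the flag cleared, the client ends up tracking no handle while the server's table still holds the entry, which is the leak.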

When you have leaked directory handles, delete-on-close (DoC) fails because the delete happens only after the last handle is closed, and we don't get rid of those leaked handles until the client disconnects from the share.
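
A minimal sketch of why a leaked handle blocks the delete, assuming a simple per-directory handle count. `ToyDir` and `delete_via_doc` are illustrative names, not server code.

```python
class ToyDir:
    """A directory with an open-handle count and a delete-on-close flag."""

    def __init__(self, name):
        self.name = name
        self.open_handles = 0
        self.delete_on_close = False
        self.exists = True

    def open(self):
        self.open_handles += 1

    def close(self):
        self.open_handles -= 1
        # The removal is deferred until the last handle goes away.
        if self.open_handles == 0 and self.delete_on_close:
            self.exists = False


def delete_via_doc(d):
    """Open, mark delete-on-close, close; return True if the directory is gone."""
    d.open()
    d.delete_on_close = True
    d.close()
    return not d.exists


if __name__ == "__main__":
    clean = ToyDir("clean")
    leaked = ToyDir("leaked")
    leaked.open()   # a handle the client no longer knows about
    print(delete_via_doc(clean), delete_via_doc(leaked))
    leaked.close()  # e.g. the client disconnects from the share
    print(leaked.exists)
```

The directory with the leaked handle survives the delete attempt and only goes away once that last handle is released.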

#3

Updated by Gordon Ross 5 months ago

Testing: Visit directories in explorer as described, then delete the hierarchy.
Fix has been in production since early 2016.

#4

Updated by Gordon Ross 5 months ago

  • Status changed from Pending RTI to In Progress

#5

Updated by Electric Monk 5 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit 58ccc3dc6cf31bbb97afd9f13137fe67844f1c95

commit  58ccc3dc6cf31bbb97afd9f13137fe67844f1c95
Author: Gordon Ross <gwr@nexenta.com>
Date:   2019-05-19T23:20:04.000Z

    10967 Deleting directory over SMB2 fails after visiting in explorer
    Reviewed by: Kevin Crowe <kevin.crowe@nexenta.com>
    Reviewed by: Matt Barden <Matt.Barden@nexenta.com>
    Approved by: Joshua M. Clulow <josh@sysmgr.org>
