Bug #13514

closed

smbsrv panic in smb2_dh_read_nvlist

Added by Gordon Ross over 1 year ago. Updated over 1 year ago.

Status:
Closed
Priority:
High
Assignee:
Category:
-
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

Panic after failover of a pool under user workload from 20 Windows 10 VMs on Hyper-V.

# mdb -k 0
Loading modules: [ ... ]
> ::status
debugging crash dump vmcore.0 (64-bit) from x11-61
operating system: 5.11 ...:... (i86pc)
image uuid: ...
panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffc102e50f57c0 addr=38 occurred in module "smbsrv" due to a NULL pointer dereference
dump content: kernel pages only
> ::stack
smb2_dh_read_nvlist+0x1cb(ffffc19944fcfd08, ffffc19765eb7640, ffffc102e50f5a70)
smb2_dh_import_handle+0x94(ffffc19944fcfd08, ffffc19765eb7640, e7b948ca4ed761bb)
smb2_dh_import_share+0x1a8(ffffc19944fcfd08)
taskq_d_thread+0xb7(ffffc195ef919f18)
thread_start+8()

Steps to Reproduce:
1) Configure a cluster with 2 pools (one for the iSCSI LUN used as VM storage, the second for NFS and SMB shares)
2) Run a VDBench workload from the VMs (Ubuntu 18 over NFS v3 and v4; Windows 10 over SMB 3.02)
3) After some time, perform manual failovers of the pool with the shares.

Expected Results:
All works fine

Actual Results:
panic

Actions #1

Updated by Gordon Ross over 1 year ago

  • Status changed from New to In Progress
  • Assignee set to Gordon Ross

The panic occurs while trying to print an error right after the nvlist_unpack call.
Let's look at: shr = sr->arg.tcon

> ffffc19808d45018 ::smbreq
REQUEST          MSG_ID WORKER STATE  COMMAND
ffffc19808d45018 0x0    0      ACTIVE SMB_COM_CREATE_DIRECTORY
> ffffc19808d45018 ::print smb_request_t arg.tcon
arg.tcon = {
 arg.tcon.name = 0
 arg.tcon.path = 0
 arg.tcon.service = 0
 arg.tcon.pwdlen = 0
 arg.tcon.password = 0
 arg.tcon.flags = 0
 arg.tcon.optional_support = 0
 arg.tcon.si = 0
}

Uh oh, it's all zero.

I don't think we've ever tested the error handling code path
where we have an nvlist file for which import fails.
The sr->arg member is cleared in smb2_dh_import_handle.
Need a different way to pass that in.

This is probably reproducible by importing/sharing any
ZFS dataset that contains a corrupted $CA file.

These are just error printing statements.
We can use tree->t_resource instead.
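
A minimal sketch of the idea, not the actual smbsrv code. All sketch_* types and functions are hypothetical stand-ins; only the names arg.tcon.si and t_resource come from the notes above. It shows why an error path that reads the share through a zeroed sr->arg.tcon has nothing valid to dereference, while a message built from tree->t_resource does not depend on sr->arg at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for the smbsrv structures. */
typedef struct sketch_share {
	const char *shr_name;
} sketch_share_t;

typedef struct sketch_tree {
	const char *t_resource;		/* name of the shared resource */
} sketch_tree_t;

typedef struct sketch_request {
	struct {
		sketch_share_t *si;	/* stands in for sr->arg.tcon.si */
	} tcon;
} sketch_request_t;

/*
 * Guarded version of the old lookup: after the request's arg member
 * has been cleared, tcon.si is NULL, so there is no share to name.
 * Dereferencing it unguarded is the kind of access that faults.
 */
const char *
sketch_share_name_from_arg(const sketch_request_t *sr)
{
	if (sr->tcon.si == NULL)
		return (NULL);
	return (sr->tcon.si->shr_name);
}

/*
 * Build the error message from the tree instead, which stays valid
 * for the whole import and does not depend on sr->arg.
 * Returns the snprintf() result.
 */
int
sketch_report_import_failure(const sketch_tree_t *tree, int rc,
    char *buf, size_t buflen)
{
	assert(tree != NULL && tree->t_resource != NULL);
	return (snprintf(buf, buflen,
	    "share %s: nvlist import failed, rc=%d",
	    tree->t_resource, rc));
}
```

With a zero-filled request (as seen in the ::print output above), sketch_share_name_from_arg() returns NULL, while sketch_report_import_failure() still produces a usable message from the tree.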

Actions #2

Updated by Electric Monk over 1 year ago

  • Gerrit CR set to 1227

Actions #3

Updated by Gordon Ross over 1 year ago

Testing:

Create a persistent handle storage file (and nvlist) that's intentionally corrupt.
Stop and start the SMB server so that it will import persistent handles.
Before the fix, that panics; after the fix, the corrupt file is ignored.

Actions #4

Updated by Electric Monk over 1 year ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit 174aa483b26ab13af096f2d478f7c15afdaf9784

commit  174aa483b26ab13af096f2d478f7c15afdaf9784
Author: Gordon Ross <gwr@nexenta.com>
Date:   2021-02-20T20:36:57.000Z

    13514 smbsrv panic in smb2_dh_read_nvlist
    Reviewed by: Roman Strashkin <roman.strashkin@nexenta.com>
    Reviewed by: Evan Layton <evan.layton@nexenta.com>
    Reviewed by: Matt Barden <matt.barden@nexenta.com>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Andy Fiddaman <andy@omnios.org>
    Reviewed by: Paul Winder <paul@winder.uk.net>
    Approved by: Robert Mustacchi <rm@fingolfin.org>
