Project

General

Profile

Actions

Bug #14831

closed

NULL-pointer dereference in smb1_oplock_break_notification during lease break

Added by Matt Barden 2 months ago. Updated 11 days ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
cifs - CIFS server and client
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:
External Bug:
racktop:BSR-9720

Description

The following panic was reported in SMB:

ffffd001ea154ac0 smb1_oplock_break_notification+0x3d(ffffd0fb884ea968, 3)
ffffd001ea154b20 smb_oplock_send_brk+0x124(ffffd0fb884ea968)
ffffd001ea154b60 smb_oplock_async_break+0x64(ffffd0fb884ea968)
ffffd001ea154c20 taskq_thread+0x2d0(ffffd06329423980)
ffffd001ea154c30 thread_start+8()

panic message: 
BAD TRAP: type=e (#pf Page fault) rp=ffffd001ea154990 addr=19c occurred in module "smbsrv" due to a NULL pointer dereference
dump content: kernel pages only

Oddly, the session in question is actually an SMB2 connection, and the file was leased, not oplock'd. We nevertheless ended up in smb1_oplock_break_notification() because the oplock's 'og_dialect' is 0, and that takes the 'else' branch in smb_oplock_send_brk() into the SMB1 path. That function expects the ofile's f_tree to be non-NULL, because an SMB1 ofile can never have a NULL f_tree (that only happens with SMB2 durable handles).

og_dialect() should only be 0 if the oplock in question had been zeroed some time after the taskq has been dispatched, as it is initialized when the oplock/lease is first acquired. That zeroing happens when the 'owner' of a lease is closed; as part of transferring the oplock state to a 'new' owner, the old owner is zero'd (see smb_oplock_move() checks).

This reveals a race between smb_oplock_send_brk() and smb_oplock_move(): If smb_oplock_send_brk() is dispatched (with the 'lease owner' as the argument), but the 'lease owner' is closed and its state moved before smb_oplock_send_brk() can make sufficient progress, then the argument ofile's oplock state will be 0, and we'll take the SMB1 path. Additionally, if that file is a durable handle, it may have been orphaned before smb_oplock_send_brk() executes, and f_tree will be NULL during smb1_oplock_break_notification(). This leads to a NULL-pointer derefence.

To fix this, we can check whether a file is leased and always take the leasing path if so, falling back to og_dialect if it is not. For leased ofiles, we can store the state necessary for break logic on the lease object itself, instead of the on the oplock owner, so that we know it is valid even if the owner changes.

The following steps need to occur for this race to happen:
  1. Take a lease on a file.
  2. Open another handle, on a separate connection, with the same lease.
  3. Orphan the first (owner) handle (e.g. via disconnecting).
  4. Cause a lease break on the first handle (e.g. open another handle, or write to it, etc).
  5. After the hold on the handle is taken, but before smb_oplock_send_brk() begins processing, close the ofile (such as smb2_dh_close_my_orphans())
  6. Continue smb_oplock_send_brk().

This window can be expanded via `dtrace -wn 'fbt::smb_oplock_send_brk:entry { chill(50000000);}'.`

Attached is an smbtorture patch that can be used for testing/reproduction.


Files

Actions #1

Updated by Matt Barden 2 months ago

  • Category set to cifs - CIFS server and client
Actions #2

Updated by Gordon Ross 2 months ago

  • Status changed from New to In Progress
  • Assignee changed from Matt Barden to Gordon Ross

Also ref. BSR-9720 panic in smb_oplock_send_brk

Actions #3

Updated by Electric Monk 2 months ago

  • Gerrit CR set to 2270
Actions #4

Updated by Gordon Ross about 2 months ago

Re-tested per. the description using smbtorture smb2.durable_break_h
Has been in production nearly a year.

Actions #5

Updated by Electric Monk about 2 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit 6f8336c5540e2283d7e24887cc87118206eb3e99

commit  6f8336c5540e2283d7e24887cc87118206eb3e99
Author: Gordon Ross <gwr@racktopsystems.com>
Date:   2022-08-07T15:53:04.000Z

    14831 NULL-pointer dereference in smb1_oplock_break_notification during lease break
    Reviewed by: Albert Lee <trisk@forkgnu.org>
    Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
    Reviewed by: Jerry Jelinek <gjelinek@gmail.com>
    Reviewed by: Matt Barden <mbarden@tintri.com>
    Approved by: Robert Mustacchi <rm@fingolfin.org>

Actions #6

Updated by Gordon Ross 11 days ago

  • External Bug set to racktop:BSR-9720
Actions

Also available in: Atom PDF