Bug #7483
closedSMB flush on pipe triggers NULL pointer dereference in module smbsrv
100%
Description
A machine sharing a ZFS dataset using the kernel SMB implementation stops with this panic:
> ::panicinfo cpu 3 thread ffffff007a2e6c40 message BAD TRAP: type=e (#pf Page fault) rp=ffffff007a2e6900 addr=11c occurred in module "smbsrv" due to a NULL pointer dereference rdi ffffff1a885cc560 rsi ffffff1a885cc560 rdx 0 rcx ffffff1a7027b0d0 r8 0 r9 0 rax ffffff1a61da6940 rbx ffffff1a785ae9c8 rbp ffffff007a2e6a10 r10 fffffffffb854334 r11 ffffff007a2e6c40 r12 ffffff1a885cc560 r13 ffffff1a785ae9e0 r14 0 r15 ffffff1a785aea58 fsbase fffffd7ffe8f3240 gsbase ffffff195b524a80 ds 4b es 4b fs 0 gs 0 trapno e err 0 rip fffffffff84d12b2 cs 30 rflags 10282 rsp ffffff007a2e69f0 ss 38 gdt_hi 0 gdt_lo a000ffff idt_hi 0 idt_lo 9000ffff ldt 0 task 70 cr0 8005003b cr2 11c cr3 c400000 cr4 426f8
The stack during panic is this:
> $C ffffff007a2e6a10 smb_flush_file+0x32(ffffff1a785ae9c8, ffffff1a885cc560) ffffff007a2e6a60 smb_com_flush+0x54(ffffff1a785ae9c8) ffffff007a2e6b50 smb1sr_work+0x5ad(ffffff1a785ae9c8) ffffff007a2e6b90 smb1_tq_work+0xa0(ffffff1a785ae9c8) ffffff007a2e6c20 taskq_d_thread+0xb7(ffffff1b49369d70) ffffff007a2e6c30 thread_start+8()
So the kernel needs to flush writes to an opened pipe (ofile->f_ftype == SMB_FTYPE_MESG_PIPE).
> ffffff1a885cc560::print -t struct smb_ofile struct smb_ofile { list_node_t f_lnd = { struct list_node *list_next = 0xffffff1a7027b098 struct list_node *list_prev = 0xffffff1a7027b098 } list_node_t f_nnd = { struct list_node *list_next = 0 struct list_node *list_prev = 0 } uint32_t f_magic = 0x4f464c45 kmutex_t f_mutex = { void *[1] _opaque = [ 0 ] } smb_ofile_state_t f_state = 0 (SMB_OFILE_STATE_OPEN) struct smb_server *f_server = 0xffffff198bbe0ac0 smb_session_t *f_session = 0xffffff1a13a09a78 smb_user_t *f_user = 0xffffff1a700c22b0 smb_tree_t *f_tree = 0xffffff1a7027b038 smb_node_t *f_node = 0 smb_odir_t *f_odir = 0 smb_opipe_t *f_pipe = 0xffffff19ea56c9b0 uint32_t f_uniqid = 0x6c1ac7 uint32_t f_refcnt = 0x1 uint64_t f_seek_pos = 0xa0 uint32_t f_flags = 0 uint32_t f_granted_access = 0x120087 uint32_t f_share_access = 0x7 uint32_t f_create_options = 0 uint32_t f_opened_by_pid = 0x14f4 uint16_t f_fid = 0x1 uint16_t f_ftype = 0x2 uint64_t f_llf_pos = 0 int f_mode = 0 cred_t *f_cr = 0xffffff1a61da6940 pid_t f_pid = 0 smb_attr_t f_pending_attr = { uint_t sa_mask = 0 vattr_t sa_vattr = { uint_t va_mask = 0 vtype_t va_type = 0 (VNON) mode_t va_mode = 0 uid_t va_uid = 0 gid_t va_gid = 0 dev_t va_fsid = 0 u_longlong_t va_nodeid = 0 nlink_t va_nlink = 0 u_offset_t va_size = 0 timestruc_t va_atime = { time_t tv_sec = 0 long tv_nsec = 0 } timestruc_t va_mtime = { time_t tv_sec = 0 long tv_nsec = 0 } timestruc_t va_ctime = { time_t tv_sec = 0 long tv_nsec = 0 } dev_t va_rdev = 0 uint_t va_blksize = 0 u_longlong_t va_nblocks = 0 uint_t va_seq = 0 } uint32_t sa_dosattr = 0 timestruc_t sa_crtime = { time_t tv_sec = 0 long tv_nsec = 0 } u_offset_t sa_allocsz = 0 } boolean_t f_written = 0 (0) char [256] f_quota_resume = [ '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\ 0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', '\0', ... ] smb_oplock_grant_t f_oplock_grant = { list_node_t og_lnd = { struct list_node *list_next = 0 struct list_node *list_prev = 0 } uint32_t og_magic = 0 uint8_t og_breaking = 0 uint8_t og_level = 0 uint16_t og_fid = 0 uint16_t og_tid = 0 uint16_t og_uid = 0 struct smb_session *og_session = 0 struct smb_ofile *og_ofile = 0 } }
The code that triggers the panic is this:
/* * smb_flush_file * * If writes on this file are not synchronous, flush it using the NFSv3 * commit interface. */ static void smb_flush_file(struct smb_request *sr, struct smb_ofile *ofile) { sr->user_cr = smb_ofile_getcred(ofile); if ((ofile->f_node->flags & NODE_FLAGS_WRITE_THROUGH) == 0) (void) smb_fsop_commit(sr, sr->user_cr, ofile->f_node); }
Here the ofile->f_node is NULL, so trying to read flags will trigger the panic on NULL dereference.
Other information:
1. The kernel version that was running includes the SMB2 code, but had it disabled (smbd/max_protocol => 1).
2. I can provide access to the full kernel dump if necessary for further investigation.
Files
Updated by Gordon Ross over 5 years ago
That's interesting. SMB ofiles on pipes don't have an SMB node, so it's expected that ofile->f_node is NULL here.
How was this reproduced? Do you have a capture of the SMB request that triggered it?
My guess would be that this function needs a switch on the ofile->f_ftype, as we see in many smb_ofile_functions.
On inspection, this appears to be a problem in SMB2 as well.
For a fix, I'd suggest introducing a new function: smb_ofile_flush, placed after smb_ofile_seek or thereabouts,
and have that contain a switch statement on ofile->f_type as you see in other smb_ofile_... functions.
switch (of->f_ftype) { case SMB_FTYPE_DISK: ...
For now, I think we can implement just the FTYPE_DISK case,
and have that call smb_fsop_commit as smb_flush_file did.
Then both the SMB1 handler (smb_flush_file) and the
SMB2 handler (smb2_flush) should call smb_ofile_flush.
Updated by Dan Vatca over 5 years ago
Unfortunately I don't have a network trace. This has happened just once in production.
The only information I have is that the users of this share are using Solidworks to edit CAD files.
Updated by Dan Vatca over 5 years ago
- File illumos-7483-1.patch illumos-7483-1.patch added
How about the attached patch?
Should I submit a CR?
Updated by Gordon Ross over 5 years ago
That looks pretty good, but I think in the new function
smb_ofile_flush you should use of->f_cr rather than
sr->user_cr (which is only set in SMB1 and is sort of
"the old way" for handling SMB credentials).
Oh, and as long as you have a default case in that
new function, I'd ditch the cases other than "DISK".
You'll also need a declaration, i.e. after
smb_ofile_set_flags() in smb_kproto.h
If you'd like to create a review request, use:
https://www.illumos.org/rb/r/new/
This is a terrible vulnerability! Thanks for finding it!
Updated by Dan Vatca over 5 years ago
I am creating a CR right now and will incorporate your feedback and copyright.
Updated by Dan Vatca over 5 years ago
Review board: https://www.illumos.org/rb/r/240
After publishing it, I realised I missed the f_cr -> user_cr change and will update it later. If you like, you can add your further comments on RB.
Updated by Electric Monk over 5 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit 6d1c73b5858fefc6161c7d686345f0dc887ea799
commit 6d1c73b5858fefc6161c7d686345f0dc887ea799 Author: Dan Vatca <dan.vatca@gmail.com> Date: 2016-10-21T00:21:54.000Z 7483 SMB flush on pipe triggers NULL pointer dereference in module smbsrv Reviewed by: Gordon Ross <gwr@nexenta.com> Reviewed by: Matt Barden <matt.barden@nexenta.com> Reviewed by: Evan Layton <evan.layton@nexenta.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Gordon Ross <gwr@nexenta.com>