Bug #9873

closed

SMB logon fails during 1st second after service start

Added by Gordon Ross almost 5 years ago. Updated almost 5 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Start date: 2018-10-06
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:
External Bug:

Description

There are some startup sequencing problems just after smbd learns about a new domain controller (DC).
If SMB logons happen during the first second or so after we find a new DC, those logons may fail
(typically with ACCESS_DENIED, but sometimes with DOMAIN_TRUST_INCONSISTENT).

Actions #1

Updated by Gordon Ross almost 5 years ago

From our internal bug tracker:

Discovered during SMB3 persistent handle work.
During SMB3 tests, we restart the server (and/or fail-over the pool to another cluster head).
Clients with persistent handles open will eagerly try to reconnect, and may attempt to run SMB authentication requests before smbd is entirely ready, so those requests fail.

The service startup should synchronize initialization activities so that SMB logon requests wait until the authentication service is ready before those requests are processed.
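
As an illustration only (this is not the smbd code, which is written in C), the synchronization described above can be sketched in Python with a condition variable. The names `DcInfoGate`, `publish`, and `get_info` are hypothetical and chosen for this sketch; the assumption is that a discovery thread publishes the DC info once, and that logon handlers wait a bounded time for it instead of failing immediately.

```python
import threading


class DcInfoGate:
    """Hypothetical gate: callers block until DC info has been published."""

    def __init__(self):
        self._cond = threading.Condition()
        self._dc_info = None

    def publish(self, dc_info):
        # Called by the (assumed) discovery thread once a DC has been found.
        with self._cond:
            self._dc_info = dc_info
            self._cond.notify_all()

    def get_info(self, timeout):
        # Wait up to `timeout` seconds for DC info.
        # Returns the published info, or None if the wait timed out.
        with self._cond:
            self._cond.wait_for(lambda: self._dc_info is not None,
                                timeout=timeout)
            return self._dc_info
```

A logon handler would call `gate.get_info(timeout)` before authenticating and return a "not ready" status only if the result is `None`. The bounded wait is a design choice for the sketch: it keeps a request from hanging forever if no DC is ever found.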

Oh, and from the internal change review, here are the various parts of the fix:

  • Return meaningful status codes from session setup
  • Let smb_domain_getinfo() wait for DC info
  • No longer need smb_ddiscover_wait
  • Retry named pipe opens (similar to how MS clients do)
  • Be more selective about when to mark the DC as failed
  • Rework netlogon loops (had two levels both looping)
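
The "retry named pipe opens" item in the list above can be illustrated with a minimal bounded-retry sketch in Python. This is not the actual change; `open_with_retry`, its parameters, and the use of `OSError` as the transient failure are assumptions made for the example, and the real retry count and delay are not stated in this issue.

```python
import time


def open_with_retry(open_fn, attempts=5, delay=0.1, sleep=time.sleep):
    """Call open_fn() until it succeeds or `attempts` tries are used up.

    open_fn is a hypothetical callable that returns a handle or raises
    OSError when the pipe is not available yet.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return open_fn()
        except OSError as err:
            last_error = err
            # Pause between tries, but not after the final failure.
            if attempt + 1 < attempts:
                sleep(delay)
    raise last_error
```

Keeping the loop bounded means a pipe that never becomes available still produces an error for the caller, so a persistent failure is reported rather than hidden.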

Actions #2

Updated by Vitaliy Gusev almost 5 years ago

Gordon, is it possible to add a pcap/snoop capture of the traffic? It would help to understand the problem, and maybe to recall it in the future.

Actions #3

Updated by Gordon Ross almost 5 years ago

Sorry, we fixed that way back in January. I did not find any captures.

Actions #4

Updated by Electric Monk almost 5 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

commit  975041dd3b571af240661f84d186e0cd0e36217b
Author: Gordon Ross <gwr@nexenta.com>
Date:   2018-10-20T12:42:08.000Z

    9873 SMB logon fails during 1st second after service start
    Reviewed by: Matt Barden <matt.barden@nexenta.com>
    Reviewed by: Evan Layton <evan.layton@nexenta.com>
    Reviewed by: Garrett D'Amore <garrett@damore.org>
    Approved by: Dan McDonald <danmcd@joyent.com>
