Bug #849

domain controller "hot fail over" can take forever

Added by Gordon Ross over 9 years ago. Updated over 9 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Start date:
2011-03-23
Due date:
% Done:

0%

Estimated time:
Difficulty:
Tags:
Gerrit CR:

Description

Recent testing at Nexenta uncovered a flaw in the fix for:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6779186
(need domain controller hot failover)

The new smbd_dc_monitor thread appears to get stuck,
taking a long time to give up on the current DC and
pick a new one. Failover can take hours.

History

#1

Updated by Gordon Ross over 9 years ago

  • Subject changed from domain controller "hot fail over" fails to domain controller "hot fail over" sometimes fails

#2

Updated by Gordon Ross over 9 years ago

The smbd_dc_monitor thread appears to get stuck because of
contention for the mutex smbrdr_screate_mtx in
usr/src/lib/smbsrv/libsmbrdr/common/smbrdr_session.c.
Any thread attempting to use the connection to the DC holds that mutex
for the duration of the connection setup attempt, which may take
several minutes before it fails and releases the mutex.
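
The contention pattern described above can be sketched roughly as follows. This is an illustration only, not the libsmbrdr code: create_mtx, session_create(), session_create_busy() and fake_connect() are hypothetical names, and the connect callback stands in for a setup attempt that may block for minutes.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the session-creation mutex. */
static pthread_mutex_t create_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Recorded by fake_connect(): was the mutex busy during setup? */
bool busy_during_connect = false;

/* Returns true if a connection setup attempt currently holds the mutex. */
bool
session_create_busy(void)
{
    if (pthread_mutex_trylock(&create_mtx) == 0) {
        (void) pthread_mutex_unlock(&create_mtx);
        return (false);
    }
    return (true);    /* locked: another attempt is in progress */
}

/*
 * The mutex is held across the whole connect attempt, so every other
 * caller waits for as long as do_connect() takes to succeed or fail.
 */
int
session_create(const char *server, int (*do_connect)(const char *))
{
    int rc;

    (void) pthread_mutex_lock(&create_mtx);
    rc = do_connect(server);    /* may block for minutes in real life */
    (void) pthread_mutex_unlock(&create_mtx);
    return (rc);
}

/*
 * Demo callback: fails immediately, but first records that the mutex
 * is held while "connecting". trylock on a non-recursive mutex reports
 * busy even when the caller itself is the holder.
 */
int
fake_connect(const char *server)
{
    (void) server;
    busy_during_connect = session_create_busy();
    return (-1);
}
```

Calling session_create("dc1.example.com", fake_connect) returns the callback's failure code, and busy_during_connect shows that the mutex stayed locked for the whole attempt. A thread that needs to pick a new DC is stuck behind that lock in the same way.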

#3

Updated by Gordon Ross over 9 years ago

The things most often blocking progress of smb_ddiscover_main were:
1: libmlsvc: netlogon_logon(), trying to open a pipe to the current DC,
2: smbd: smbd_dc_monitor() / dssetup_check_service(), doing the same, and
3: occasionally: mlsvc_timecheck(), opening /pipe/srvsvc to the DC.

For now, make all of these "give way" to the DC location thread,
because that thread will sometimes change which DC these other
threads try to communicate with.

http://cr.illumos.org/view/xmjid46h/
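
One way to picture the "give way" idea is a shared flag that the DC location thread sets while it runs, and that the other callers check before starting a new connection attempt. This is a minimal sketch under that assumption, not the code from the changeset: dc_locator_begin(), dc_locator_end(), dc_client_call(), DC_YIELDED and sample_op() are all hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>

#define DC_YIELDED  (-2)    /* hypothetical "gave way, retry later" code */

static pthread_mutex_t dc_mtx = PTHREAD_MUTEX_INITIALIZER;
static bool dc_locator_active = false;

/* Called by the DC location thread when it starts looking for a DC. */
void
dc_locator_begin(void)
{
    (void) pthread_mutex_lock(&dc_mtx);
    dc_locator_active = true;
    (void) pthread_mutex_unlock(&dc_mtx);
}

/* Called by the DC location thread once a DC has been chosen. */
void
dc_locator_end(void)
{
    (void) pthread_mutex_lock(&dc_mtx);
    dc_locator_active = false;
    (void) pthread_mutex_unlock(&dc_mtx);
}

/*
 * Called by the other threads (logon, monitor, time check). If the
 * locator is running, back off instead of starting a connection
 * attempt against a DC that may be about to change.
 */
int
dc_client_call(int (*op)(void))
{
    bool busy;

    (void) pthread_mutex_lock(&dc_mtx);
    busy = dc_locator_active;
    (void) pthread_mutex_unlock(&dc_mtx);

    if (busy)
        return (DC_YIELDED);
    return (op());
}

/* Demo operation standing in for "open a pipe to the current DC". */
int
sample_op(void)
{
    return (0);
}
```

In this sketch the flag is only held under the mutex long enough to read or write it, so the locator never waits on a slow connection attempt. Callers that see DC_YIELDED are expected to retry after the locator has finished.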

#4

Updated by Gordon Ross over 9 years ago

  • Subject changed from domain controller "hot fail over" sometimes fails to domain controller "hot fail over" can take forever

#5

Updated by Gordon Ross over 9 years ago

  • Status changed from New to Resolved

changeset: 13328:2f33da224406
