Bug #849
closed
domain controller "hot fail over" can take forever
Description
Recent testing at Nexenta uncovered a flaw in the fix for:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6779186
(need domain controller hot failover)
The new smbd_dc_monitor thread appears to get stuck,
taking a long time to give up on the current DC and
pick a new one. This can take hours.
Updated by Gordon Ross about 12 years ago
- Subject changed from domain controller "hot fail over" fails to domain controller "hot fail over" sometimes fails
Updated by Gordon Ross about 12 years ago
It appears the cause of the smbd_dc_monitor thread getting stuck is
contention for the mutex smbrdr_screate_mtx in
usr/src/lib/smbsrv/libsmbrdr/common/smbrdr_session.c
Anything attempting to use the connection to the current DC will
hold that mutex for the duration of the connection setup attempt,
which may take several minutes before it fails and releases the mutex.
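
To illustrate the contention described above, here is a minimal sketch in
Python rather than the C used by libsmbrdr. It is not the actual code: the
lock only stands in for smbrdr_screate_mtx, and the function names
(slow_session_setup, monitor_try_failover, demo) are hypothetical. It shows
that while one caller holds the lock across a slow connection attempt, a
second thread that needs the same lock cannot make progress.

```python
import threading

# Hypothetical stand-in for smbrdr_screate_mtx.
session_create_lock = threading.Lock()


def slow_session_setup(started, release):
    """Hold the lock for the whole (simulated) connection attempt."""
    with session_create_lock:
        started.set()    # the connection attempt is now in progress
        release.wait()   # stands in for a connect that takes minutes to fail


def monitor_try_failover(timeout):
    """Try to take the same lock, as a thread wanting to switch DCs would.

    Returns True if the lock was obtained within `timeout` seconds.
    """
    got = session_create_lock.acquire(timeout=timeout)
    if got:
        session_create_lock.release()
    return got


def demo():
    """Return (result while the lock is held, result after it is released)."""
    started, release = threading.Event(), threading.Event()
    holder = threading.Thread(target=slow_session_setup,
                              args=(started, release))
    holder.start()
    started.wait()                        # make sure the holder owns the lock
    while_held = monitor_try_failover(0.05)
    release.set()                         # the simulated connect finally fails
    holder.join()
    after_release = monitor_try_failover(0.05)
    return while_held, after_release
```

Calling demo() returns one result taken while the lock is held and one taken
after it is released. In the real daemon the holder may keep the mutex for
several minutes per attempt, which is how the delays add up.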
Updated by Gordon Ross about 12 years ago
The things most often blocking progress of smb_ddiscover_main were:
1: libmlsvc: netlogon_logon(), trying to open a pipe to the current DC,
2: smbd: smbd_dc_monitor() / dssetup_check_service(), doing the same, and
3: occasionally: mlsvc_timecheck(), opening /pipe/srvsvc to the DC.
For now, make all of these "give way" to the DC location thread,
because that thread will sometimes change which DC these other
threads will try to communicate with.
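
One way to picture the "give way" idea is the sketch below, in Python. It is
an assumption-laden illustration and not the code from the changeset: the
DcLocator class and its methods are hypothetical names. Callers that want to
talk to a DC first wait for any discovery in progress to finish, and give up
if it does not finish in time, rather than starting a connection attempt
against a DC that may be about to change.

```python
import threading


class DcLocator:
    """Hypothetical coordinator between DC discovery and its clients."""

    def __init__(self, dc):
        self._cond = threading.Condition()
        self._busy = False   # True while discovery is running
        self._dc = dc        # currently selected DC

    def begin_discovery(self):
        """Called by the DC location thread before it starts looking."""
        with self._cond:
            self._busy = True

    def end_discovery(self, new_dc):
        """Called by the DC location thread once a DC has been chosen."""
        with self._cond:
            self._dc = new_dc
            self._busy = False
            self._cond.notify_all()

    def current_dc(self, timeout=None):
        """Give way to discovery, then return the selected DC.

        Returns None if discovery is still running after `timeout` seconds,
        so the caller can skip its request instead of blocking.
        """
        with self._cond:
            if not self._cond.wait_for(lambda: not self._busy, timeout):
                return None
            return self._dc
```

A caller such as a time-check or logon path would ask current_dc() before
opening a pipe, and treat None as "try again later".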
Updated by Gordon Ross about 12 years ago
- Subject changed from domain controller "hot fail over" sometimes fails to domain controller "hot fail over" can take forever
Updated by Gordon Ross about 12 years ago
- Status changed from New to Resolved
changeset: 13328:2f33da224406