Bug #4872
closed
system crash after nlm_gc hits bogus mutex
Start date: 2014-05-20
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:
Description
While preparing to start an upgrade, we discovered that the DE was rebooting. Post reboot we found a crash dump. That crash dump shows that nlm_gc encountered a bogus mutex:

panic[cpu1]/thread=ffffff00b9695c40: mutex_enter: bad mutex, lp=ffffff19c1813d28 owner=ffffff00b9695c40 thread=ffffff00b9695c40

ffffff00b9695b70 unix:mutex_panic+73 ()
ffffff00b9695bd0 unix:mutex_vector_enter+446 ()
ffffff00b9695c20 klmmod:nlm_gc+ab ()
ffffff00b9695c30 unix:thread_start+8 ()

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel

> $c
vpanic()
mutex_panic+0x73(fffffffffb961b90, ffffff19c1813d28)
mutex_vector_enter+0x446(ffffff19c1813d28)
nlm_gc+0xab(ffffff19a746b8c0)
thread_start+8()
> ffffff19c1813d28::mutex
            ADDR  TYPE  HELD  MINSPL  OLDSPL  WAITERS
mdb: 0xffffff19c1813d28: invalid adaptive mutex (-f to dump anyway)
The diagnosis and fix are described in detail at http://blog.delphix.com/pdagnelie/2014/05/19/nlms-garbage-collection-race/
Updated by Marcel Telka over 7 years ago
- Category set to nfs - NFS server and client
Updated by Electric Monk over 7 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit 096e63b2c66f47e2a2d213edc199cdb082d8b2d6
commit 096e63b2c66f47e2a2d213edc199cdb082d8b2d6
Author: Paul Dagnelie <paul.dagnelie@delphix.com>
Date: 2015-01-19T17:58:35.000Z

    4872 system crash after nlm_gc hits bogus mutex
    Reviewed by: Adam Leventhal <ahl@delphix.com>
    Reviewed by: Christopher Siden <christopher.siden@delphix.com>
    Reviewed by: Eric Schrock <eric.schrock@delphix.com>
    Reviewed by: Jeremy Jones <jeremy@delphix.com>
    Approved by: Garrett D'Amore <garrett@damore.org>