Bug #8870
open
CPU hung for ever at 100 percent system time on kcs_wait_for_obf()
0%
Description
Hello,
On some of our OmniOS r151014 boxes, we are experiencing the following issue:
- losing communication on a specific VLAN link
- mpstat shows one vcore locked on 100% system time
- stack traces show that a thread was preempted with the RNRN flag set (even though its PRI is 60 < 99) and is stuck on the CPU
- it looks like the thread is waiting forever in "kcs_wait_for_obf()", which seems to come from "ipmitool" invocations (used in production)
- invoking "psradm -f <addLockedCpuID>" caused the locked thread to jump to another CPU; invoking "psradm -i <addLockedCpuID>" did not solve the problem
FreeBSD committer "bsdjhb" already encountered the issue and fixed it via https://github.com/freebsd/freebsd/commit/8ad8a2c4a4abef9af95f1033b18eecffe4860896#diff-3b4913a6b21dfc65efab7f1fdc2258e4
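For illustration only: the symptom is consistent with a loop that polls the KCS status register for the OBF (output buffer full) bit and never gives up when the BMC stops answering. The sketch below is not the illumos or FreeBSD driver code and does not claim to reproduce the linked commit. It only shows a poll bounded by a retry budget. The names kcs_poll_obf_bounded, fake_bmc and fake_read_status are hypothetical; the only fact taken from the IPMI specification is that OBF is bit 0 of the KCS status register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the KCS status register: output buffer full (IPMI spec). */
#define KCS_STATUS_OBF 0x01

/* Hypothetical accessor standing in for the driver's real port I/O. */
typedef uint8_t (*kcs_read_status_fn)(void *ctx);

/*
 * Poll the status register at most max_polls times.
 * Returns true as soon as OBF is seen, false once the budget is used up,
 * so a BMC that never answers cannot pin the CPU indefinitely.
 */
bool
kcs_poll_obf_bounded(kcs_read_status_fn read_status, void *ctx,
    unsigned max_polls)
{
	for (unsigned i = 0; i < max_polls; i++) {
		if (read_status(ctx) & KCS_STATUS_OBF)
			return (true);
	}
	return (false);
}

/* Simulated BMC, for demonstration only (an assumption, not real hardware). */
struct fake_bmc {
	unsigned reads;		/* number of status reads so far */
	unsigned ready_after;	/* reads that return "not ready" first */
	bool wedged;		/* true: OBF is never raised */
};

uint8_t
fake_read_status(void *ctx)
{
	struct fake_bmc *bmc = ctx;

	bmc->reads++;
	if (bmc->wedged)
		return (0);
	return ((bmc->reads > bmc->ready_after) ? KCS_STATUS_OBF : 0);
}
```

With a wedged simulated BMC, the call returns false after exactly max_polls reads instead of spinning. A real driver would more likely bound the wait by elapsed time and yield or delay between reads; the retry count here just keeps the example self-contained.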
Updated by Benjamin MONTHOUËL over 4 years ago
lockstat -A does not seem to display interrupts. When the next case occurs, I'll try to provide an additional export.
Updated by Benjamin MONTHOUËL over 4 years ago
Possibly linked cases from the Ubuntu and Red Hat communities:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1383921
https://bugzilla.redhat.com/show_bug.cgi?id=1090619