Bug #8870

CPU hung forever at 100 percent system time in kcs_wait_for_obf()

Added by Benjamin MONTHOUËL over 2 years ago. Updated over 2 years ago.

Status: New
Priority: High
Category: driver - device drivers
Start date: 2017-11-29
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

Hello,

On some of our OmniOS r151014 boxes, we are experiencing the following issue:
- communication is lost on a specific VLAN link
- mpstat shows one vcore locked at 100% system time
- stack traces show that a thread was pre-empted with the RNRN flag (even though it is at PRI 60 < 99) and is stuck on the CPU
- the thread appears to be waiting forever in "kcs_wait_for_obf()", which seems to come from "ipmitool" invocations (used in production)
- invoking "psradm -f <addLockedCpuID>" caused the locked thread to jump to another CPU
- invoking "psradm -i <addLockedCpuID>" did not solve the problem

FreeBSD committer "bsdjhb" already encountered this issue and fixed it in https://github.com/freebsd/freebsd/commit/8ad8a2c4a4abef9af95f1033b18eecffe4860896#diff-3b4913a6b21dfc65efab7f1fdc2258e4
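
To illustrate the general idea of treating a wait on the OBF bit as something that can time out, here is a minimal sketch. It is not the attached patch or the FreeBSD commit. The function name kcs_wait_for_obf_bounded(), the KCS_MAX_POLLS budget, and the fake BMC helpers are hypothetical names chosen for illustration; only the OBF/IBF bit positions come from the IPMI KCS status register layout. A real driver would read the status port and delay between polls instead of calling a callback.

```c
#include <stdbool.h>
#include <stdint.h>

/* KCS status register bits (IPMI KCS interface). */
#define KCS_STATUS_OBF 0x01 /* output buffer full: the BMC has a byte for us */
#define KCS_STATUS_IBF 0x02 /* input buffer full: the BMC has not consumed our byte */

/* Hypothetical retry budget; the value is an assumption, not from any driver. */
#define KCS_MAX_POLLS 1000

/* Hypothetical stand-in for reading the hardware status port. */
typedef uint8_t (*kcs_read_status_fn)(void *ctx);

/*
 * Poll the status register until OBF reaches the wanted state or the
 * retry budget is exhausted. Returns true on success (and stores the
 * last status read), false on timeout, so the caller can abort the
 * transaction instead of spinning on the CPU forever.
 */
static bool
kcs_wait_for_obf_bounded(kcs_read_status_fn read_status, void *ctx,
    bool want_set, uint8_t *status_out)
{
	for (int i = 0; i < KCS_MAX_POLLS; i++) {
		uint8_t status = read_status(ctx);
		bool is_set = (status & KCS_STATUS_OBF) != 0;

		if (is_set == want_set) {
			*status_out = status;
			return (true);
		}
		/* A real driver would delay briefly here between polls. */
	}
	return (false);
}

/* Fake BMC used only to exercise the loop without hardware. */
struct fake_bmc {
	int polls_until_obf; /* negative: the BMC never sets OBF */
	int polls_seen;
};

static uint8_t
fake_read_status(void *ctx)
{
	struct fake_bmc *bmc = ctx;

	bmc->polls_seen++;
	if (bmc->polls_until_obf >= 0 &&
	    bmc->polls_seen > bmc->polls_until_obf)
		return (KCS_STATUS_OBF);
	return (0);
}
```

With a fake BMC that never sets OBF, the bounded wait gives up after KCS_MAX_POLLS reads and returns false, which is the behaviour missing from an unbounded wait loop.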


Files

lockstatA.out (926 KB): lockstat -A sleep 10. Benjamin MONTHOUËL, 2017-11-29 05:01 PM
mdb.txt (5.97 KB): mdb live when a VLAN is not reachable. Benjamin MONTHOUËL, 2017-11-29 05:16 PM
0001-Explicitly-treat-timeouts-when-waiting-for-IBF-or-OB.patch (2.07 KB). Benjamin MONTHOUËL, 2017-11-30 09:33 AM
