Bug #875

missing lwp_exit() in kcfpool_svc() induces panic in prchoose()

Added by Bryan Cantrill over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Normal
Category: kernel
Start date: 2011-03-31
Due date:
% Done: 100%
Estimated time: 2.00 h
Difficulty:
Tags:
Gerrit CR:

Description

We have recently seen a spate of panics in prchoose():

> $c
prchoose+0x72(ffffff0d3761e008)
prgetpsinfo32+0x2b(ffffff0d3761e008, ffffff006a372b00)
pr_read_psinfo_32+0x4e(ffffff0d420b8640, ffffff006a372e20)
prread+0x5c(ffffff0d420b4c80, ffffff006a372e20, 0, ffffff0d52489480, 0)
fop_read+0xc9(ffffff0d420b4c80, ffffff006a372e20, 0, ffffff0d52489480, 0)
read+0x2b8(4, 8047af0, 150)
read32+0x22(4, 8047af0, 150)
_sys_sysenter_post_swapgs+0x149()

We seem to be dying on a stale p_tlist, which should generally be impossible. Interestingly, the proc_t in question is always kcfpoold:

> ffffff0d3761e008::ps
S PID PPID PGID SID UID FLAGS ADDR NAME
R 4 0 0 0 0 0x00020001 ffffff0d3761e008 kcfpoold

And indeed, kcfpool and the kcfpoold proc_t have wildly divergent ideas of how many threads are associated with kcfpoold:

> *kcfpool::print kcf_pool_t kp_threads
kp_threads = 0x1
> ::pgrep kcfpoold | ::print proc_t p_lwpcnt
p_lwpcnt = 0x11

The problem appears to be in kcfpool_svc(), which is what the in-kernel (i.e., synthetic) kcfpoold sets its LWPs to run: this routine simply returns when the size of the thread pool exceeds available work – but it in fact needs to grab its own p_lock and call lwp_exit(), lest the LWP state associated with the process become stale.


Files

kcf-fix.patch (789 Bytes) Garrett D'Amore, 2011-04-01 10:48 AM
cryptotest.c (4.55 KB) Garrett D'Amore, 2011-04-01 10:48 AM

History

#1

Updated by Bryan Cantrill over 9 years ago

  • Assignee set to Garrett D'Amore


#2

Updated by Garrett D'Amore over 9 years ago

  • Project changed from site to illumos gate

#3

Updated by Garrett D'Amore over 9 years ago

See the attachments for the fix and for a test program to exercise kcf. (The test program is really a kernel module: just compile it and run modload ./cryptotest. It will fire off a large number, 1000 in this case, of kernel jobs which submit crypto jobs for processing.)

Note that you have to be root to see kcf stats.

#4

Updated by Garrett D'Amore over 9 years ago

  • Category set to kernel
  • Status changed from New to Resolved
  • % Done changed from 0 to 100
  • Estimated time set to 2.00 h

Resolved in:

changeset: 13317:bd2d2a5ed3e4
tag: tip
user: Garrett D'Amore <>
date: Sun Apr 03 07:44:01 2011 -0700
description:
875 missing lwp_exit() in kcfpool_svc() induces panic in prchoose()
Reviewed by: Bryan Cantrill <>
Reviewed by: Dan McDonald <>
Approved by: Gordon Ross <>
