Bug #10055

recursive mutex enter in ahci

Added by Michal Nowak 7 months ago. Updated 2 months ago.

Status: Closed
Priority: Normal
Category: kernel
Start date: 2018-12-09
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

Suspend to RAM on Lenovo X220 running illumos-863275a46b failed with:

panic[cpu3]/thread=ffffff05649ec080:
recursive mutex_enter, lp=ffffff0552906560 owner=ffffff05649ec080 thread=ffffff05649ec080

ffffff001798d900 unix:mutex_panic+58 ()
ffffff001798d970 unix:mutex_vector_enter+2b7 ()
ffffff001798d9a0 ahci:ahci_em_quiesce+26 ()
ffffff001798d9d0 ahci:ahci_em_suspend+23 ()
ffffff001798da10 ahci:ahci_detach+15c ()
ffffff001798da80 genunix:devi_detach+a7 ()
ffffff001798db30 cpr:cpr_suspend_devices+c5 ()
ffffff001798dbe0 cpr:cpr_suspend_devices+40 ()
ffffff001798dc90 cpr:cpr_suspend_devices+40 ()
ffffff001798dcc0 cpr:cpr_suspend+19c ()
ffffff001798dd40 cpr:cpr_main+112 ()
ffffff001798dd70 cpr:cpr+1a1 ()
ffffff001798ddb0 unix:stubs_common_code+51 ()
ffffff001798de40 genunix:kadmin+198 ()
ffffff001798dec0 genunix:uadmin+16d ()
ffffff001798df10 unix:brand_sys_sysenter+1c6 ()

dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
stack pointer for thread ffffff05649ec080 (uadmin/1): ffffff001798d7c0
  ffffff001798d890 apix_get_pending_spl+0x21()
  ffffff001798d8a0 pf_fini+0xa1(fffffffff78d5141, ffffff001798d8a0)
  ffffff001798d8d0 0xfffffffffb96df61()
  ffffff001798d960 0xfffffffffb96df44()
  ffffff001798d9a0 0xffffff0552906560()
  ffffff001798d9d0 ahci_em_suspend+0x23(ffffff0552906280)
  ffffff001798da10 ahci_detach+0x15c(ffffff05526847f8, 1)
  ffffff001798da80 devi_detach+0xa7(ffffff05526847f8, 1)
  ffffff001798db30 cpr_suspend_devices+0xc5(ffffff05513e4558)
  ffffff001798dbe0 cpr_suspend_devices+0x40(ffffff05527c6000)
  ffffff001798dc90 cpr_suspend_devices+0x40(ffffff05513e4d50)
  ffffff001798dcc0 cpr_suspend+0x19c(3)
  ffffff001798dd40 cpr_main+0x112(3)
  ffffff001798dd70 cpr+0x1a1(14, 0)
  ffffff001798ddb0 stubs_common_code+0x51()
  ffffff001798de40 kadmin+0x198(3, 14, 0, ffffff0561c4b5d8)
  ffffff001798dec0 uadmin+0x16d(3, 14, 0)
  ffffff001798df10 _sys_sysenter_post_swapgs+0x149()

Further information is in the attached log. (The information was gathered according to https://wiki.illumos.org/display/illumos/How+To+Report+Problems.)


Files

crash.0 (612 KB), Michal Nowak, 2018-12-09 02:06 PM
crash.1 (619 KB), Michal Nowak, 2019-04-24 06:50 AM

Related issues

Related to illumos gate - Bug #11049: XHCI runtime reset required in xhci (New, 2019-05-18)

History

#1

Updated by Hans Rosenfeld 4 months ago

  • Subject changed from "Suspend to RAM fails" to "recursive mutex enter in ahci"
#2

Updated by Hans Rosenfeld 4 months ago

  • Assignee set to Hans Rosenfeld
#3

Updated by Hans Rosenfeld 4 months ago

Both ahci_em_quiesce() and ahci_em_suspend() try to grab ahcictl_mutex. The code in ahci_detach() already holds it when it calls ahci_em_suspend().

Webrev: https://grumpf.hope-2000.org/illumos-10055/
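
The locking pattern described above can be reproduced in userland. The following is a minimal sketch, not the driver code and not the actual fix (see the webrev for that): it uses POSIX threads instead of the kernel mutex API, and every name in it (demo_ctl_t, demo_detach(), demo_em_suspend_locking(), demo_em_suspend_locked()) is hypothetical. An error-checking mutex returns EDEADLK on a relock by the owner, which stands in for the recursive mutex_enter panic.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the controller soft state (not ahci_ctl_t). */
typedef struct demo_ctl {
	pthread_mutex_t ctl_mutex;	/* plays the role of ahcictl_mutex */
	int em_quiesced;		/* set once the "EM" state is quiesced */
} demo_ctl_t;

/*
 * Initialize the state with an error-checking mutex, so that a relock by
 * the owning thread returns EDEADLK instead of hanging.
 */
int
demo_ctl_init(demo_ctl_t *ctl)
{
	pthread_mutexattr_t attr;
	int rc;

	assert(ctl != NULL);
	ctl->em_quiesced = 0;
	if ((rc = pthread_mutexattr_init(&attr)) != 0)
		return (rc);
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (rc == 0)
		rc = pthread_mutex_init(&ctl->ctl_mutex, &attr);
	(void) pthread_mutexattr_destroy(&attr);
	return (rc);
}

/* Problem shape: the callee takes the mutex itself. */
int
demo_em_suspend_locking(demo_ctl_t *ctl)
{
	int rc = pthread_mutex_lock(&ctl->ctl_mutex);

	if (rc != 0)
		return (rc);	/* EDEADLK when the caller already owns it */
	ctl->em_quiesced = 1;
	(void) pthread_mutex_unlock(&ctl->ctl_mutex);
	return (0);
}

/* One possible shape of a fix: the caller is required to hold the mutex. */
int
demo_em_suspend_locked(demo_ctl_t *ctl)
{
	ctl->em_quiesced = 1;
	return (0);
}

/* Detach path: holds the mutex across the call into the suspend routine. */
int
demo_detach(demo_ctl_t *ctl, int (*suspend)(demo_ctl_t *))
{
	int rc;

	assert(ctl != NULL && suspend != NULL);
	if ((rc = pthread_mutex_lock(&ctl->ctl_mutex)) != 0)
		return (rc);
	rc = suspend(ctl);
	(void) pthread_mutex_unlock(&ctl->ctl_mutex);
	return (rc);
}
```

In this sketch, demo_detach(&ctl, demo_em_suspend_locking) fails with EDEADLK because the mutex is taken twice by the same thread, while demo_detach(&ctl, demo_em_suspend_locked) succeeds. Dropping the mutex before the call would be another way to avoid the relock; which approach the real change takes is not stated here.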

#4

Updated by Michal Nowak 3 months ago

Thanks for the patch, Hans.

With the patch from #3 I got a bit further (to some other problem):

System is being suspended
WARNING: Unable to suspend device pci17aa,21fa@14.
WARNING: Device is busy or does not support suspend/resume.
WARNING: xhci0: abort command timed out: resetting device

panic[cpu2]/thread=ffffff001e695c40:
XHCI runtime reset required

ffffff001e695b60 xhci:xhci_soft_state+37d7a47a ()
ffffff001e695c20 genunix:taskq_thread+2d0 ()
ffffff001e695c30 unix:thread_start+8 ()

Rest of the info attached.

#5

Updated by Electric Monk 2 months ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

commit  a3380248e34d78eb55b8f65ccf1f0d8a6f7e7bbf
Author: Hans Rosenfeld <hans.rosenfeld@joyent.com>
Date:   2019-05-15T21:37:25.000Z

    10055 recursive mutex enter in ahci
    Reviewed by: Dan McDonald <danmcd@joyent.com>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Randy Fishel <randyf@sibernet.com>
    Approved by: Gordon Ross <gordon.w.ross@gmail.com>

#6

Updated by Michal Nowak about 2 months ago

  • Related to Bug #11049: XHCI runtime reset required in xhci added
