Bug #14185

open

fmtopo pauses for a long time interrogating ipmi parts of a motherboard

Added by Michael Hicks about 2 months ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Start date:
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags: topology
Gerrit CR:

Description

Running

/usr/lib/fm/fmd/fmtopo -dpV

appears to hang. Running it with debug options:

TOPO_DEBUG=all pfexec /usr/lib/fm/fmd/fmtopo -dpV

shows that it stalls in the IPMI part of the motherboard interrogation.

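The BMC in this Dell system may be faulty or just hung. The stacks below show fmtopo blocked in poll while the ipmi driver's KCS thread is on CPU in kcs_wait_for_ibf: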

$ mdb -ke '::pgrep fmtopo | ::walk thread | ::findstack -v; ::stacks -m ipmi'
stack pointer for thread fffffe25e7449840 (fmtopo/1): fffffcc27545fc40
[ fffffcc27545fc40 _resume_from_idle+0x12b() ]
  fffffcc27545fc70 swtch+0x141()
  fffffcc27545fd00 cv_wait_sig_swap_core+0x1b9(ffffffd70eeda20a, ffffffd70eeda1d0, 0)
  fffffcc27545fd20 cv_wait_sig_swap+0x17(ffffffd70eeda20a, ffffffd70eeda1d0)
  fffffcc27545fd50 cv_timedwait_sig_hrtime+0x35(ffffffd70eeda20a, ffffffd70eeda1d0, ffffffffffffffff)
  fffffcc27545fdf0 poll_common+0x288(ffffffcd699ac510, 8044fd0, 1, 0, fffffcc27545fe5c)
  fffffcc27545feb0 pollsys+0x2dc(8044fd0, 1, 0, 0)
  fffffcc27545ff10 _sys_sysenter_post_swapgs+0x153()
THREAD           STATE    SOBJ                COUNT
fffffcc26fe1ec20 ONPROC   <NONE>                  1
                 tsc_read+3
                 gethrtime+0xa
                 drv_usecwait+0x47
                 kcs_wait_for_ibf+0x9e
                 kcs_start_write+0x51
                 kcs_polled_request+0x2f
                 kcs_loop+0x61
                 taskq_thread+0x2d0
                 thread_start+8

Eventually, a long time later, the topo output continued with an error, followed by the rest of the output:

libtopo DEBUG: failed to refresh IPMI sdr repository: failed to read response from /dev/ipmi0: I/O error
libtopo DEBUG: facility enumeration method failed on node motherboard=0 (method failed)
libtopo DEBUG: step_child: walk through node motherboard=0 to 3.3V PG=0
...
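
The attached debug log is large (385 KB), so it can help to pull out only the lines that report a failure. A minimal sketch in Python, assuming every relevant line starts with `libtopo DEBUG:` as in the excerpt above; the name `find_failures` is hypothetical and not part of any illumos tool.

```python
import re

# Matches libtopo debug lines that mention a failure, such as
# "libtopo DEBUG: failed to refresh IPMI sdr repository: ...".
FAILURE_RE = re.compile(r"^libtopo DEBUG: .*\bfailed\b")


def find_failures(lines):
    """Return (line_number, text) for each libtopo debug line that mentions a failure."""
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if FAILURE_RE.search(text):
            hits.append((number, text))
    return hits
```

For example, `find_failures(open("topo_debug_out_hung_ipmi.log", errors="replace"))` would list the failing lines with their line numbers. Plain `grep failed` on the log does much the same; the sketch is only useful if the results feed further processing.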

This long run time cascades into diskinfo, which took more than 100 minutes to finish in the run below:
[mhicks@headnode ~]$ time pfexec diskinfo -cH
SCSI    c0t5000C5008623447Fd0   SEAGATE ST1200MM0088    Z400N5K1        1117.81 GiB     ----    [0] Drive-Slot-01
SCSI    c0t5000C5008623D4A3d0   SEAGATE ST1200MM0088    Z400PSLZ        1117.81 GiB     ----    [0] Drive-Slot-02
SCSI    c0t5000C50086236EBBd0   SEAGATE ST1200MM0088    Z400PTM1        1117.81 GiB     ----    [0] Drive-Slot-03
...
-       c1t5000CCA0496F8511d0   HGST    HUSMH8010BSS204 0HWZA90A          93.16 GiB     ---S    [0] Drive-Slot-00
SCSI    c2t0d0  Kingston        DataTraveler 2.0        -         14.44 GiB     ??R-    -

real    103m37.307s
user    0m0.225s
sys     0m0.923s
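
Until the underlying stall is understood, a script that calls diskinfo or fmtopo can at least bound how long it waits. A rough sketch in Python using the standard `subprocess` timeout; `run_with_limit` and the limit value are illustrative assumptions. This only limits the caller's wait and does nothing about the IPMI stall itself.

```python
import subprocess


def run_with_limit(argv, limit_seconds):
    """Run argv and return (finished, output).

    finished is False when the command did not exit within limit_seconds.
    """
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=limit_seconds,
            text=True,
        )
    except subprocess.TimeoutExpired as expired:
        # subprocess.run kills the child when the timeout expires;
        # keep whatever output was captured before that point.
        partial = expired.stdout or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return False, partial
    return True, result.stdout
```

A caller could then use `finished, output = run_with_limit(["pfexec", "diskinfo", "-cH"], 60)` and treat `finished == False` as a sign that the topology walk is stuck on IPMI.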


Files

topo_debug_out_hung_ipmi.log (385 KB), Michael Hicks, 2021-10-20 06:57 PM
Actions #1

Updated by Michael Hicks about 2 months ago

  • Description updated (diff)