Bug #14279
diskinfo sits quietly for up to a minute before exiting
Description
It would seem that libdiskmgt responds to device addition sysevents by triggering a device tree walk to look at the new devices. This makes sense for long-running consumers, but perhaps does not interact well with one-shot use such as in diskinfo or format.
Sometimes, especially on a system using the iSCSI initiator, disks do arrive while diskinfo is running. This triggers walk_devtree() which creates a thread to run the walker() routine in lib/libdiskmgt/common/events.c. Unfortunately this routine begins with an immediate sleep of WALK_WAIT_TIME -- 60 seconds! -- ostensibly to allow events to coalesce and reduce the number of walks we need to do to converge.
The sleep is effectively uninterruptible from another thread, so when we go to exit, libdiskmgt_fini() blocks waiting for it to complete. This can take up to a minute, during which diskinfo appears to be hung:
grandcentral # pstack 100813
100813: diskinfo
--------------------- thread# 1 / lwp# 1 ---------------------
feca6b0b lwp_park (0, 0, 0)
fec9f794 cond_wait_queue (fee463a0, fee463b0, 0) + 5f
fec9faf1 cond_wait_common (fee463a0, fee463b0, 0) + 2ed
fec9ff08 __cond_wait (fee463a0, fee463b0) + 78
feca000c cond_wait (fee463a0, fee463b0) + 2e
fee2bc97 libdiskmgt_fini () + 93
fee347bb _fini (f477dbe0, 0, fee70018, f, f477daec, 0) + 1b
f4750a13 call_fini (f477dbe0, fd900018, 0) + 94
f4750c16 atexit_fini () + 125
fec0ea10 __cxa_finalize (0) + 69
fec0eaac _exithandle () + 37
fec02a82 exit (1, 803d840, f4750af1, 0, 0, 0) + 12
0805197a _start (1, 803d90c, 0, 803d915, 803d929, 803d945) + 1a
--------------------- thread# 2 / lwp# 2 ---------------------
fecab890 door (fe40f7f8, 4, 0, fe40fe00, f5f00, a)
fef34eb9 event_deliver_service (8a34168, fe40f840, 5c0, 0, 0) + 122
fecab8b0 __door_return () + 40
--------------------- thread# 4 / lwp# 4 ---------------------
fecab890 door (fdb6f7f8, 4, 0, fdb6fe00, f5f00, a)
fef34eb9 event_deliver_service (8a34168, fdb6f840, 5c0, 0, 0) + 122
fecab8b0 __door_return () + 40
--------------------- thread# 5 / lwp# 5 ---------------------
fecaa827 nanosleep (fdc9af98, fdc9af90)
fec9411c sleep (3c) + 31
fee2b75a walker (0) + 20
feca6821 _thrp_setup (fe771a40) + 81
feca6ad0 _lwp_start (fe771a40, 0, 0, 0, 0, 0)
--------------------- thread# 7 / lwp# 7 ---------------------
feca6b0b lwp_park (0, fd5aef28, 0)
fec9f794 cond_wait_queue (fea2a818, fea2a838, fd5aef28) + 5f
fec9faf1 cond_wait_common (fea2a818, fea2a838, fd5aef28) + 2ed
fec9fbaf __cond_timedwait (fea2a818, fea2a838, fd5aefa4) + 66
fec9fc8a cond_timedwait (fea2a818, fea2a838, fd5aefa4) + 35
fe9f5559 umem_update_thread (0) + 20c
feca6821 _thrp_setup (fe772a40) + 81
feca6ad0 _lwp_start (fe772a40, 0, 0, 0, 0, 0)
Updated by Andy Fiddaman 9 months ago
See also https://github.com/omniosorg/illumos-omnios/commit/0307587db7989cb2ee157c6e2aeccb79e7bdb553 which was contributed to OmniOS, but not yet upstreamed.