Bug #5297
closedmptsas refhash replacement on reset can cause hang
100%
Description
Want to integrate the one outstanding bugfix in the SmartOS tree for mpt_sas that hasn't made it upstream.
The relevant commit can be found at https://github.com/joyent/illumos-joyent/commit/d8b8daca0baacf10f4535dddb56338d0afc3b335; I'll try to ask il-dev to review a (basically unmodified) respin of the patch against il-gate HEAD, and then kick it off for RTI.
To verbatim copy from its bug report (in the event that one of these becomes unavailable before the other):
From Zhiwen Zhang: > The scenario may be the following: > > During mptsas_restart_ioc(), we already hold mpt->m_mutex and > mpt->m_in_reset set to TRUE, > so new commands cannot be added to active queue and wait queue, but can be > added to tx wait queue, > which does not require to hold mpt->m_mutex to be manipulated. So during > ioc restart, the mptsas_target_t s > can still be actively used, e.g. cmd->cmd_tgt_addr = ptgt in > mptsas_scsi_init_pkt(). The IOC reset path does not need to replace the refhash. In addition to causing a memory leak, this approach leads to potential hangs in the case described above. Instead, we should allocate this once in the attach path and then leave it alone.
Updated by Rich Ercolani over 8 years ago
http://cr.illumos.org/~webrev/rincebrain/refhash
Keith's patch just directly applied sanely, so that was simple enough.
Full build running now, then I'll ask -dev for reviewers, try to find an advocate, and then hopefully RTI quickly enough...
Updated by Igor Kozhukhov over 8 years ago
Rich Ercolani wrote:
http://cr.illumos.org/~webrev/rincebrain/refhash
Keith's patch just directly applied sanely, so that was simple enough.
Full build running now, then I'll ask -dev for reviewers, try to find an advocate, and then hopefully RTI quickly enough...
my +1 here - i have tested with my tree.
thanks for this work.
i thought about it too, but i just wanted to test it first on my env with LSI 9211-8i
Updated by Electric Monk over 8 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit abdbe11c7877311202d2870b53d7c76264121b2c
commit abdbe11c7877311202d2870b53d7c76264121b2c Author: Keith M Wesolowski <wesolows@foobazco.org> Date: 2014-12-05T19:34:17.000Z 5297 mptsas refhash replacement on reset can cause hang Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Reviewed by: Andy Giles <illumos@ang.homedns.org> Approved by: Richard Lowe <richlowe@richlowe.net>