Bug #7392

remove event channel support from lofi and implement lofi_devlink_cache.

Added by Toomas Soome about 3 years ago. Updated about 3 years ago.

Status: Closed
Priority: High
Assignee:
Category: driver - device drivers
Start date: 2016-09-18
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

As it turned out, event channels are local-zone aware, and since local zones load the lofi module automatically, using them from lofi creates a race in event channel teardown on zone halt. Therefore we cannot use events directly from the lofi driver. Instead, we have to extract the device ADD/REMOVE events early enough to make sure the extraction happens in the global zone, and let the lofi module access this information through a simple query.

Since devfsadm events are received by modctl, I implemented the device link cache in modctl (as a private interface) and allowed lofi to access it.

https://www.illumos.org/rb/r/218/
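
To illustrate the idea described above (one component records device ADD/REMOVE events, and the driver later looks the result up with a simple query instead of subscribing to events itself), here is a minimal userland sketch. It is not the code from the actual change: every name in it (`devlink_cache_add`, `devlink_cache_remove`, `devlink_cache_lookup`, the `devlink_entry_t` layout) is hypothetical, it uses pthreads rather than kernel primitives, and the real modctl interface may look quite different.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* All names here are hypothetical; this is not the illumos interface. */
typedef struct devlink_entry {
	char *dle_devpath;		/* device path reported by the event */
	char *dle_link;			/* link name created for that device */
	struct devlink_entry *dle_next;
} devlink_entry_t;

static devlink_entry_t *cache_head = NULL;
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

static char *
dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);
	return (copy);
}

/* Unlink and free the entry for devpath; caller holds cache_lock. */
static int
remove_locked(const char *devpath)
{
	devlink_entry_t **pp, *e;

	for (pp = &cache_head; (e = *pp) != NULL; pp = &e->dle_next) {
		if (strcmp(e->dle_devpath, devpath) == 0) {
			*pp = e->dle_next;
			free(e->dle_devpath);
			free(e->dle_link);
			free(e);
			return (1);
		}
	}
	return (0);
}

/* Event side: record an ADD. Returns 0 on success, -1 on allocation failure. */
int
devlink_cache_add(const char *devpath, const char *link)
{
	devlink_entry_t *e;

	assert(devpath != NULL && link != NULL);
	if ((e = malloc(sizeof (*e))) == NULL)
		return (-1);
	e->dle_devpath = dup_string(devpath);
	e->dle_link = dup_string(link);
	if (e->dle_devpath == NULL || e->dle_link == NULL) {
		free(e->dle_devpath);
		free(e->dle_link);
		free(e);
		return (-1);
	}
	pthread_mutex_lock(&cache_lock);
	(void) remove_locked(devpath);	/* a newer ADD replaces a stale entry */
	e->dle_next = cache_head;
	cache_head = e;
	pthread_mutex_unlock(&cache_lock);
	return (0);
}

/* Event side: record a REMOVE. Returns 1 if an entry was dropped, else 0. */
int
devlink_cache_remove(const char *devpath)
{
	int found;

	assert(devpath != NULL);
	pthread_mutex_lock(&cache_lock);
	found = remove_locked(devpath);
	pthread_mutex_unlock(&cache_lock);
	return (found);
}

/* Driver side: simple query. Copies the link into buf; 0 if found, else -1. */
int
devlink_cache_lookup(const char *devpath, char *buf, size_t buflen)
{
	devlink_entry_t *e;
	int rv = -1;

	assert(devpath != NULL && buf != NULL);
	pthread_mutex_lock(&cache_lock);
	for (e = cache_head; e != NULL; e = e->dle_next) {
		if (strcmp(e->dle_devpath, devpath) == 0 &&
		    strlen(e->dle_link) < buflen) {
			strcpy(buf, e->dle_link);
			rv = 0;
			break;
		}
	}
	pthread_mutex_unlock(&cache_lock);
	return (rv);
}
```

The point of the split is that the consumer only ever calls the lookup function and holds no event subscription of its own, so it has nothing to tear down when a zone halts.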

History

#1

Updated by Toomas Soome about 3 years ago

  • Subject changed from remove event channel support from lofi and implement mod_devlink_cache. to remove event channel support from lofi and implement lofi_devlink_cache.

#2

Updated by Electric Monk about 3 years ago

  • Status changed from New to Closed
  • % Done changed from 90 to 100

git commit 8ae05c101a3c849364fa53a66ec87aa59823326a

commit  8ae05c101a3c849364fa53a66ec87aa59823326a
Author: Toomas Soome <tsoome@me.com>
Date:   2016-09-30T01:16:59.000Z

    7392 remove event channel support from lofi and implement lofi_devlink_cache.
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
    Approved by: Dan McDonald <danmcd@omniti.com>

#3

Updated by Yuri Pankov about 3 years ago

How did the problem manifest itself so we could make sure we are (were) seeing the same? Was there a panic (what stack)?

#4

Updated by Toomas Soome about 3 years ago

Yuri Pankov wrote:

How did the problem manifest itself so we could make sure we are (were) seeing the same? Was there a panic (what stack)?

The scenario I had was a panic or hung system when a lofi unload and a zone halt happened at the same time. The stack from the panic had ev_*-related bits; once it was a bad mutex, but it was a bit random, depending on the exact sequence and timing, and always related to zone shutdown/reboot. Since lofi no longer uses events directly, the root cause is removed.
