Bug #176

assertion failed: kept_info, file: ../../common/os/sunpm.c, line: 5219

Added by Garrett D'Amore over 3 years ago. Updated over 3 years ago.

Status: Resolved
Start date: 2010-09-09
Priority: High
Due date:
Assignee: Garrett D'Amore
% Done: 50%
Category: kernel
Spent time: 1.00 hour
Target version: -
Estimated time: 2.00 hours
Difficulty: Medium
Tags: needs-triage

Description

This is from 6956016, and I just hit it on my Thinkpad with illumos. Notably, I don't see this problem under b145. So something is different.

Category
kernel
Sub-Category
pm-devfs
Description
When I boot the resulting BE on a Lenovo ThinkPad T61, it panics with an assertion failure:
panic[cpu1]/thread=ffffff0005d6dc40:
assertion failed: kept_info, file: ../../common/os/sunpm.c, line: 5219
ffffff0005d6d800 genunix:assfail+7e ()
ffffff0005d6d880 genunix:pm_set_keeping+374 ()
ffffff0005d6d8e0 genunix:pm_kept+393 ()
ffffff0005d6d910 genunix:pm_kept_walk+3b ()
ffffff0005d6d980 genunix:walk_devs+4f ()
ffffff0005d6d9f0 genunix:walk_devs+ff ()
ffffff0005d6da60 genunix:walk_devs+ff ()
ffffff0005d6dad0 genunix:walk_devs+ff ()
ffffff0005d6db40 genunix:walk_devs+ff ()
ffffff0005d6db90 genunix:ddi_walk_devs+7f ()
ffffff0005d6dbc0 genunix:pm_process_dep_request+231 ()
ffffff0005d6dc20 genunix:pm_dep_thread+116 ()
ffffff0005d6dc30 unix:thread_start+8 ()
Frequency
Always
Regression
solaris_10
Steps to Reproduce
Build ON as described above, create new BE and boot that.
Expected Result
Works.
Actual Result
Assertion failure.
Error Message(s)

Test Case

Workaround

Additional configuration information
I've just built ON as of changeset 7bae494c02d3 with debug closed and crypto
binaries as of 20100522 and upgraded from the current BE running snv_134 using
onu.
I'm marking this incomplete: a SunSolve search shows no other reports, and the
issue has only been seen with bits built by a non-ON submitter. Hopefully the
submitter will investigate further soon and provide more info and a core file. See
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/sys/epm.h#651
for all pm_debug flags.

(excerpt from on-discuss)
-------- Original Message --------
Subject: Re: [on-discuss] Assertion failure in sunpm.c booting snv_142+ on
Date: Fri, 28 May 2010 00:08:19 +0200

Randy Fishel < > writes:

This is part of the PM dependency check (don't PM components that are required by some other component). To get more info, in mdb (at boot) set:

pm_debug/W0x0800000

This will likely generate a lot of information, but make sure to add at least the last several lines to the CR.

Unfortunately, it didn't: at first I only got the message

pm debug output will be to log only

and nothing in ::msgbuf output. Even after I set pm_debug_to_console to
1, nothing either (provided it didn't scroll away too fast).

Please note that I won't be able to investigate further before Wednesday
next week since I'm going to be away for an extended weekend.

Thanks.
Rainer

-

Workaround from bugs.opensolaris.org:

The most likely scenario for this assertion to occur is that there is a device dependency on the removable-media property. So commenting out this line in /etc/power.conf:

device-dependency-property removable-media /dev/fb

will likely keep the offending driver quiet.
Non-debug builds seem to work fine.
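
A likely explanation for the debug/non-debug difference, offered as an assumption rather than something confirmed in this report: ASSERT() in the kernel is only compiled in for DEBUG builds, and the diff further down shows kept_info being used for nothing except that ASSERT. The sketch below models this in user space. SKETCH_DEBUG, SKETCH_ASSERT, sketch_assfail and sketch_check_kept are hypothetical names for illustration, not kernel interfaces.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef SKETCH_DEBUG
/* Debug flavor: report the failed expression and stop, loosely like a panic. */
static void
sketch_assfail(const char *expr, const char *file, int line)
{
    fprintf(stderr, "assertion failed: %s, file: %s, line: %d\n",
        expr, file, line);
    abort();
}
#define SKETCH_ASSERT(ex) \
    ((void)((ex) || (sketch_assfail(#ex, __FILE__, __LINE__), 0)))
#else
/* Non-debug flavor: the check disappears entirely. */
#define SKETCH_ASSERT(ex) ((void)0)
#endif

/*
 * Hypothetical model of the pre-fix code path: the looked-up PM info is
 * asserted on and never used again, so without the assertion a NULL
 * value simply passes through.
 */
int
sketch_check_kept(const void *kept_info)
{
    (void)kept_info;    /* avoids an unused warning when the assert is compiled out */
    SKETCH_ASSERT(kept_info != NULL);
    return (1);
}
```

Built without -DSKETCH_DEBUG, sketch_check_kept(NULL) returns 1; built with it, the same call aborts after printing the failed expression.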

History

Updated by Garrett D'Amore over 3 years ago

  • Category set to kernel
  • Status changed from New to In Progress
  • Assignee set to Garrett D'Amore
  • % Done changed from 0 to 50
  • Estimated time set to 2.00

I'm about 99% sure the problem is that some devices (such as blkdev) are being treated as "power manageable", but lack suitable power(9E) entry points. (In fact, I'm about 99% sure the problem is tied to blkdev... it exposes SDcard slots as removable media, and the framework then decides that these ought to be PM-capable devices. I have some corroborating e-mail from Randy F. from about 3 or 4 months ago.)

Essentially, blkdev (or other devices) cannot be "kept up" by another device if they don't support device power management.

Rather than fix blkdev (which we should also do separately), we should just provide an escape hatch to refuse the set_keeping call if the kept device isn't itself power manageable. (We already do that for "keeping" devices.)

The diff looks like this:

diff -r e072bd4baed8 usr/src/uts/common/os/sunpm.c
--- a/usr/src/uts/common/os/sunpm.c    Mon Oct 11 16:24:58 2010 -0700
+++ b/usr/src/uts/common/os/sunpm.c    Tue Oct 12 23:43:33 2010 -0700
@@ -5196,7 +5196,6 @@
 pm_set_keeping(dev_info_t *keeper, dev_info_t *kept)
 {
     PMD_FUNC(pmf, "set_keeping")
-    pm_info_t *kept_info;
     int j, up = 0, circ;
     void prdeps(char *);

@@ -5215,8 +5214,15 @@
             "power managed\n", pmf, PM_DEVICE(keeper)))
         return (0);
     }
-    kept_info = PM_GET_PM_INFO(kept);
-    ASSERT(kept_info);
+    if (PM_GET_PM_INFO(kept) == NULL) {
+        cmn_err(CE_CONT, "!device %s@%s(%s#%d) keeps up device "
+            "%s@%s(%s#%d), but the latter is not power managed",
+            PM_DEVICE(keeper), PM_DEVICE(kept));
+        PMD((PMD_FAIL | PMD_KEEPS), ("%s: kept %s@%s(%s#%d) is not"
+            "power managed\n", pmf, PM_DEVICE(kept)))
+        return (0);
+    }
+
     PM_LOCK_POWER(keeper, &circ);
     for (j = 0; j < PM_NUMCMPTS(keeper); j++) {
         if (PM_CURPOWER(keeper, j)) {
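
For readers without the kernel source at hand, here is a minimal user-space sketch of the behavior change: refuse the dependency when either side lacks PM info instead of asserting. Everything in it is hypothetical (sketch_dev_t, sketch_set_keeping, the has_pm_info flag, and the return value of 1 on success). In particular, counting the keeper components at non-zero power is only a guess at what the loop following the new check does with `up`.

```c
#include <stddef.h>

#define SKETCH_MAX_CMPTS 4

/* Hypothetical stand-in for a device node; not the kernel's dev_info_t. */
typedef struct sketch_dev {
    const char *name;
    int has_pm_info;                 /* nonzero if the device is power managed */
    int ncmpts;                      /* number of power-manageable components */
    int curpower[SKETCH_MAX_CMPTS];  /* current power level per component */
} sketch_dev_t;

/*
 * Returns 0 and leaves *up untouched if the dependency is refused because
 * either device is not power managed. Otherwise stores the number of
 * keeper components at non-zero power in *up and returns 1.
 */
int
sketch_set_keeping(const sketch_dev_t *keeper, const sketch_dev_t *kept,
    int *up)
{
    int j, count = 0;

    if (keeper == NULL || kept == NULL || up == NULL)
        return (0);

    /* Check that already existed for the keeping device. */
    if (!keeper->has_pm_info)
        return (0);

    /* The escape hatch: refuse quietly where the old code asserted. */
    if (!kept->has_pm_info)
        return (0);

    for (j = 0; j < keeper->ncmpts && j < SKETCH_MAX_CMPTS; j++) {
        if (keeper->curpower[j])
            count++;
    }
    *up = count;
    return (1);
}
```

With a power-managed keeper and a kept device that has no PM info (the blkdev-like case), the call returns 0 rather than tripping an assertion.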

Updated by Garrett D'Amore over 3 years ago

  • Status changed from In Progress to Resolved

I just integrated this change.
