Bug #5060

closed

Assertion failure in iprb during watchdog reset

Added by Richard PALO almost 8 years ago. Updated 6 months ago.

Status: Closed
Priority: Normal
Assignee:
Category: driver - device drivers
Start date: 2014-08-01
Due date:
% Done: 100%
Estimated time:
Difficulty: Bite-size
Tags:
Gerrit CR:

Description

possibly exposed via 4123?

I'm experiencing assertion failure panics on a debug kernel on line 978 of iprb.c in the iprb_start routine:

     /* Send a NOP.  This will be the first command seen by the device. */
     cb = iprb_cmd_next(ip);
     ASSERT(cb);
     if (iprb_cmd_submit(ip, CB_CMD_NOP) != DDI_SUCCESS)
         return (DDI_FAILURE);

Here is an extract from the panic output:
 panic[cpu0]/thread=ffffff003d0e3c40:
 assertion failed: cb, file: ../../common/io/iprb/iprb.c, line: 978

 ffffff003d0e3ab0 genunix:process_type+1626c0 ()
 ffffff003d0e3ae0 iprb:iprb_start+2c8 ()
 ffffff003d0e3b20 iprb:iprb_periodic+115 ()
 ffffff003d0e3b60 genunix:periodic_execute+c9 ()
 ffffff003d0e3c20 genunix:taskq_thread+318 ()
 ffffff003d0e3c30 unix:thread_start+8 ()

This Compaq NC3131 (2x Intel 82555) is a secondary card that has no connections yet, only the interfaces created via ipadm create-if...

I'm noticing numerous "iprb: [ID 614678 kern.info] CU stalled, resetting."
messages for about 15 minutes prior to the panic.

Had to remove the card.


Related issues

Has duplicate illumos gate - Bug #8358: NULL pointer dereference in iprb module (Duplicate, 2017-06-09)
Has duplicate illumos gate - Bug #14078: null pointer dereference crashes from iprb nics (Duplicate)

Actions #1

Updated by Juan Jose Presa Rodal over 5 years ago

Hi,

exactly the same problem here.

Is there any news?
Can I help you in any way?

Actions #2

Updated by Marcel Telka almost 5 years ago

  • Related to Bug #8358: NULL pointer dereference in iprb module added
Actions #3

Updated by Andy Fiddaman 8 months ago

  • Related to deleted (Bug #8358: NULL pointer dereference in iprb module)
Actions #4

Updated by Andy Fiddaman 8 months ago

  • Has duplicate Bug #8358: NULL pointer dereference in iprb module added
Actions #5

Updated by Andy Fiddaman 8 months ago

  • Has duplicate Bug #14078: null pointer dereference crashes from iprb nics added
Actions #6

Updated by Andy Fiddaman 8 months ago

From the information posted in the duplicate #14078:

fffffe2ce7ddc0b8 uint16_t cmd_head = 0x1
fffffe2ce7ddc0ba uint16_t cmd_last = 0
fffffe2ce7ddc0bc uint16_t cmd_tail = 0
fffffe2ce7ddc0be uint16_t cmd_count = 0x81

The cmd_count has overflowed. What's happening is that the periodic task has detected a hung card (watchdog timeout) and is trying to reset it. As part of that it resets the circular command buffer, but fails to reset cmd_count to 0.
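
To make the failure mode concrete, here is a minimal, self-contained sketch of a command ring with the same four counters. It is a simplified model, not the driver's code: the `ring_t` type, the `ring_*` function names and the `RING_SIZE` value are all hypothetical, and the full-ring check in `ring_next` is an assumption about how a NULL `cb` could arise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	RING_SIZE	128	/* hypothetical capacity, not taken from iprb */

/* Simplified stand-in for the driver's command ring state. */
typedef struct ring {
	int		slots[RING_SIZE];	/* placeholder command blocks */
	uint16_t	cmd_head;	/* next slot to hand out */
	uint16_t	cmd_last;	/* most recently submitted slot */
	uint16_t	cmd_tail;	/* oldest slot not yet reclaimed */
	uint16_t	cmd_count;	/* slots currently in flight */
} ring_t;

/* Return the next free slot, or NULL when the count says the ring is full. */
static int *
ring_next(ring_t *r)
{
	if (r->cmd_count >= RING_SIZE)
		return (NULL);
	return (&r->slots[r->cmd_head]);
}

/* Account for one submitted command. */
static void
ring_submit(ring_t *r)
{
	r->cmd_last = r->cmd_head;
	r->cmd_head = (uint16_t)((r->cmd_head + 1) % RING_SIZE);
	r->cmd_count++;
}

/* Fill the ring the way a stalled device leaves it: nothing is reclaimed. */
static void
ring_fill(ring_t *r)
{
	while (ring_next(r) != NULL)
		ring_submit(r);
}

/* Reset as described above: indices are cleared, the count is left stale. */
static void
ring_reset_buggy(ring_t *r)
{
	r->cmd_head = r->cmd_tail = r->cmd_last = 0;
}

/* Reset that also clears the in-flight count. */
static void
ring_reset_fixed(ring_t *r)
{
	r->cmd_head = r->cmd_tail = r->cmd_last = 0;
	r->cmd_count = 0;
	/* Postcondition: a freshly reset ring must have a free slot. */
	assert(ring_next(r) != NULL);
}
```

In this model, calling `ring_fill()` followed by `ring_reset_buggy()` leaves the indices at 0 while `cmd_count` still reports a full ring, so `ring_next()` returns NULL. That corresponds to the condition `ASSERT(cb)` trips on in `iprb_start`. After `ring_reset_fixed()`, `ring_next()` returns the first slot again.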

Actions #7

Updated by Andy Fiddaman 8 months ago

  • Subject changed from > assertion failed: cb, file: ../../common/io/iprb/iprb.c, line: 978 to Assertion failure in iprb during watchdog reset
  • Category set to driver - device drivers
  • Status changed from New to In Progress
  • Assignee set to Andy Fiddaman
  • Difficulty changed from Medium to Bite-size
  • Tags deleted (needs-triage)
Actions #8

Updated by Andy Fiddaman 8 months ago

  • Gerrit CR set to 1712
Actions #9

Updated by Andy Fiddaman 6 months ago

An OmniOS user with the same problem was able to verify that the attached patch stops the panic. The network interface is still resetting very often for some reason, but at least the system is now staying up.

Actions #10

Updated by Electric Monk 6 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit 9ca05893faec45ccbe9cfa6b59b1a79960d9f7a7

commit  9ca05893faec45ccbe9cfa6b59b1a79960d9f7a7
Author: Andy Fiddaman <omnios@citrus-it.co.uk>
Date:   2021-12-09T21:49:56.000Z

    5060 Assertion failure in iprb during watchdog reset
    Reviewed by: Yuri Pankov <ypankov@tintri.com>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Approved by: Dan McDonald <danmcd@joyent.com>
