Bug #5060
closedAssertion failure in iprb during watchdog reset
100%
Description
possibly exposed via 4123?
I'm experiencing assertion failure panics on a debug kernel on line 978 of iprb.c in the iprb_start routine:
/* Send a NOP. This will be the first command seen by the device. */ cb = iprb_cmd_next(ip); ASSERT(cb); if (iprb_cmd_submit(ip, CB_CMD_NOP) != DDI_SUCCESS) return (DDI_FAILURE);
here is an extract:
panic[cpu0]/thread=ffffff003d0e3c40: assertion failed: cb, file: ../../common/io/iprb/iprb.c, line: 978 ffffff003d0e3ab0 genunix:process_type+1626c0 () ffffff003d0e3ae0 iprb:iprb_start+2c8 () ffffff003d0e3b20 iprb:iprb_periodic+115 () ffffff003d0e3b60 genunix:periodic_execute+c9 () ffffff003d0e3c20 genunix:taskq_thread+318 () ffffff003d0e3c30 unix:thread_start+8 ()
This Compaq NC3131 (2x Intel 82555) is a secondary card that has no connections yet, only the interfaces created via ipadm create-if...
I'm noticing numerous "iprb: [ID 614678 kern.info] CU stalled, resetting."
messages for about 15 minutes prior to the panic.
Had to remove the card.
Related issues
Updated by Juan Jose Presa Rodal almost 7 years ago
Hi,
exactly the same problem here.
Is there any news?
Can I help you in any way?
Updated by Marcel Telka over 6 years ago
- Related to Bug #8358: NULL pointer dereference in iprb module added
Updated by Andy Fiddaman about 2 years ago
- Related to deleted (Bug #8358: NULL pointer dereference in iprb module)
Updated by Andy Fiddaman about 2 years ago
- Has duplicate Bug #8358: NULL pointer dereference in iprb module added
Updated by Andy Fiddaman about 2 years ago
- Has duplicate Bug #14078: null pointer dereference crashes from iprb nics added
Updated by Andy Fiddaman about 2 years ago
From the information posted in the duplicate #14078:
fffffe2ce7ddc0b8 uint16_t cmd_head = 0x1 fffffe2ce7ddc0ba uint16_t cmd_last = 0 fffffe2ce7ddc0bc uint16_t cmd_tail = 0 fffffe2ce7ddc0be uint16_t cmd_count = 0x81
The cmd_count has overflowed. What's happening is that the periodic task has detected a hung card (watchdog timeout) and is trying to reset it. As part of that it resets the circular command buffer, but fails to reset cmd_count to 0.
Updated by Andy Fiddaman about 2 years ago
- Subject changed from > assertion failed: cb, file: ../../common/io/iprb/iprb.c, line: 978 to Assertion failure in iprb during watchdog reset
- Category set to driver - device drivers
- Status changed from New to In Progress
- Assignee set to Andy Fiddaman
- Difficulty changed from Medium to Bite-size
- Tags deleted (
needs-triage)
Updated by Andy Fiddaman almost 2 years ago
An OmniOS user with the same problem was able to verify that the attached patch stops the panic. The network interface is still resetting very often for some reason, but at least the system is now staying up.
Updated by Electric Monk almost 2 years ago
- Status changed from In Progress to Closed
- % Done changed from 0 to 100
git commit 9ca05893faec45ccbe9cfa6b59b1a79960d9f7a7
commit 9ca05893faec45ccbe9cfa6b59b1a79960d9f7a7 Author: Andy Fiddaman <omnios@citrus-it.co.uk> Date: 2021-12-09T21:49:56.000Z 5060 Assertion failure in iprb during watchdog reset Reviewed by: Yuri Pankov <ypankov@tintri.com> Reviewed by: Toomas Soome <tsoome@me.com> Approved by: Dan McDonald <danmcd@joyent.com>