iprb transmit watchdog somewhat overzealous
iprb driver attempts to detect a transmit stall by keeping the transmit time of the last outstanding frame submitted to the controller. Each time we submit a command to the controller, we update
tx_wdog with the current time. We also update
tx_wdog each time we reclaim a command that the controller has completed. In
iprb_periodic(), we check to make sure this value is not more than 15 seconds in the past, and if it is we reset the controller.
Unfortunately, we only seem to trigger the reclaim routine when we're trying to transmit another packet. If the link is not particularly busy, and no data is transmitted for 15 seconds, it seems like we will spuriously reset the controller which causes a variety of other issues on a system that would have otherwise been working correctly.
A report from the field suggests that by pinging the computer with
iprb from another computer, the problem goes away; presumably because we are sending an echo reply each second and keeping the watchdog at bay.
One way to fix this is to call
iprb_cmd_reclaim() in the periodic routine, right before we check the watchdog value, to ensure that things are moving along. Indeed, an
iprb with this patch applied improves the behaviour.
Another question is: in
iprb_cmd_submit(), we set the
CB_CMD_I flag on the submitted command iff we believe it is the last available command; i.e., if we have dispatched 128 commands to the controller and heard nothing back. I can't see how this would trigger a reclaim, though, unless there was already another packet about to be sent out -- which in some cases on idle systems there may not be.
Updated by Joshua M. Clulow 7 months ago
I built the
iprb module with this change and gave it to a user on IRC who confirms that it makes networking work as they expect on their machine.
I have also done the regular RTI build with the same diff, though I have not installed the bits on a machine with the hardware as it is a rare and old part, and I don't have one available myself.
Updated by Electric Monk 7 months ago
- Status changed from New to Closed
- % Done changed from 0 to 100
commit ab3f6e90e6b1d8edee27c66beb8e53bc6033fb2a Author: Joshua M. Clulow <email@example.com> Date: 2022-01-25T20:49:19.000Z 14419 iprb transmit watchdog somewhat overzealous Reviewed by: Andy Fiddaman <firstname.lastname@example.org> Reviewed by: Garrett D'Amore <email@example.com> Reviewed by: Robert Mustacchi <firstname.lastname@example.org> Approved by: Dan McDonald <email@example.com>