Bug #7572

vioif panic: qe->qe_indirect_next < qe->qe_queue->vq_indirect_num

Added by Daniel Kimmel almost 3 years ago. Updated over 2 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: driver - device drivers
Start date: 2016-11-11
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

While testing a fix to enable MSI-X interrupts on KVM virtio devices, I ran into the following panic:

panic[cpu0]/thread=ffffff0007acbc40:
assertion failed: qe->qe_indirect_next < qe->qe_queue->vq_indirect_num, file: ../../common/io/virtio/virtio.c, line: 657

The stack trace is:

::stack

vpanic()
0xfffffffffba8bd94()
virtio_ve_add_indirect_buf+0xc5(ffffff026625a680, 109deb7a0, 40, 1)
virtio_ve_add_cookie+0x54()
vioif_send+0x288(ffffff02552fe900, ffffff025381a1a0)
vioif_tx+0x4d(ffffff02552fe900, ffffff025381a1a0)
mac_tx+0x594(ffffff024e9645c8, ffffff025381a1a0, e04b50e9, 1, 0)
str_mdata_fastpath_put+0x53(ffffff0250b4abe8, ffffff025381a1a0, e04b50e9, 1)
ip_xmit+0x9ed(ffffff025381a1a0, ffffff02508fccc0, b80036061, 2888, e04b50e9, 0)
ire_send_wire_v4+0x401(ffffff024f4862d8, ffffff025381a1a0, ffffff025381b510, ffffff0256d6b0c0, ffffff024ece6dc0)
conn_ip_output+0x2bc(ffffff025381a1a0, ffffff0256d6b0c0)
tcp_send_data+0x80(ffffff025a8755c0, ffffff025381a1a0)
tcp_send+0x6d8(ffffff025a8755c0, 51c, 34, 20, 0, ffffff0007acaed4)
tcp_wput_data+0x686(ffffff025a8755c0, 0, 0)
tcp_input_data+0xa54(ffffff025a874dc0, ffffff026c89d360, ffffff024ec5ea40, ffffff0007acb500)
squeue_enter+0x963(ffffff024ec5ea40, ffffff026c89d360, ffffff026c89d360, 1, ffffff0007acb500, 4)
ip_fanout_v4+0xd8f(ffffff026c89d360, ffffff026c89b17e, ffffff0007acb500)
ip_input_local_v4+0x16e(ffffff024f4866f8, ffffff026c89d360, ffffff026c89b17e, ffffff0007acb500)
ire_recv_local_v4+0x172(ffffff024f4866f8, ffffff026c89d360, ffffff026c89b17e, ffffff0007acb500)
ill_input_short_v4+0x568(ffffff026c89d360, ffffff026c89b17e, ffffff026c89b18e, ffffff0007acb500, ffffff0007acb690)
ip_input_common_v4+0x3ba(ffffff024eab91a8, ffffff024f140040, ffffff026c89d360, 0, 0, 0)
ip_input+0x2b(ffffff024eab91a8, ffffff024f140040, ffffff026c89d360, 0)
mac_rx_soft_ring_process+0x19a(ffffff024e9645c8, ffffff025367f040, ffffff026c89d360, ffffff026c89d360, 1, 0)
mac_rx_srs_proto_fanout+0x29a(ffffff0267842300, ffffff026c89d360)
mac_rx_srs_drain+0x34c(ffffff0267842300, 800)
mac_rx_srs_process+0x3ce(ffffff0250b6abb0, ffffff0267842300, ffffff026c89d360, 0)
mac_rx_common+0x143(ffffff0250b6abb0, 0, ffffff026c89d360)
mac_rx+0xb6(ffffff0250b6abb0, 0, ffffff026c89d360)
vioif_process_rx+0x111(ffffff02552fe900)
vioif_rx_handler+0x20(ffffff02552fe908, 0)
av_dispatch_autovect+0x91(1b)
dispatch_hardint+0x36(1b, 0)
ffffff0007a05a70 [stack frame pointer is invalid]

This seems to be very reproducible through the following test case:

ssh delphix@<ip addr of vioif device>
yes

The problem appears to be in this function:

unsigned int
virtio_ve_indirect_available(struct vq_entry *qe)
{
    return (qe->qe_queue->vq_indirect_num - (qe->qe_indirect_next - 1));
}

When qe_indirect_next equals vq_indirect_num, this function returns 1, which presumably tells the caller that one slot is still free in the array. That is incorrect: an index equal to the array's length is out of bounds, so there is no room left.

The only caller of this function is vioif_tx_external, which is inlined into vioif_send in the stack trace above. Immediately after that call comes the call to virtio_ve_add_cookie, which eventually triggers the panic.
I used mdb to patch the off-by-one error in the function and could no longer reproduce the panic. We need to change virtio_ve_indirect_available to return one less than it does now, so that the result accurately reflects the number of slots remaining between the next index to be used and the length of the array.
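
The off-by-one can be illustrated with a minimal user-space sketch. This is not the driver code: the structures are cut down to the two fields involved, and the names indirect_available_buggy, indirect_available_fixed and add_indirect_buf are hypothetical stand-ins. The sketch also assumes that qe_indirect_next starts at 0 and is incremented once per buffer added, and that the fix is the simple subtraction described above.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-ins for the driver's structures. */
struct vq {
    unsigned int vq_indirect_num;   /* length of the indirect array */
};

struct vq_entry {
    struct vq *qe_queue;
    unsigned int qe_indirect_next;  /* next index to be filled */
};

/* Calculation as quoted in this bug: reports one slot too many. */
static unsigned int
indirect_available_buggy(const struct vq_entry *qe)
{
    return (qe->qe_queue->vq_indirect_num - (qe->qe_indirect_next - 1));
}

/* Assumed form of the fix: one less than the value above. */
static unsigned int
indirect_available_fixed(const struct vq_entry *qe)
{
    return (qe->qe_queue->vq_indirect_num - qe->qe_indirect_next);
}

/*
 * Stand-in for adding one indirect buffer. The assert mirrors the
 * condition from the panic message; the real function also fills in
 * a descriptor, which is omitted here.
 */
static void
add_indirect_buf(struct vq_entry *qe)
{
    assert(qe->qe_indirect_next < qe->qe_queue->vq_indirect_num);
    qe->qe_indirect_next++;
}
```

With an array of length 4 and qe_indirect_next equal to 4, the buggy calculation yields 4 - (4 - 1) = 1 while the fixed one yields 0. A caller that keeps adding buffers while the buggy value is nonzero would therefore make a fifth call and trip the assertion.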

History

#1

Updated by Electric Monk over 2 years ago

  • % Done changed from 0 to 100
  • Status changed from New to Closed

git commit 20ee95858720e9df048b9d31b30aeb303e0685c9

commit  20ee95858720e9df048b9d31b30aeb303e0685c9
Author: Dan Kimmel <dan.kimmel@delphix.com>
Date:   2017-04-14T04:23:04.000Z

    7572 vioif panic: qe->qe_indirect_next < qe->qe_queue->vq_indirect_num
    Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
    Reviewed by: Prachetaa Raghavan <prachetaa.raghavan@delphix.com>
    Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
    Reviewed by: George Wilson <george.wilson@delphix.com>
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
    Approved by: Dan McDonald <danmcd@omniti.com>
