Bug #7186

xnf: panic on Xen 4.x

Added by Andreas Pflug over 1 year ago. Updated 2 months ago.

Status: Closed
Priority: Normal
Assignee: Yuri Pankov
Category: driver - device drivers
Target version: -
Difficulty: Medium
Tags:
Start date: 2016-07-17
Due date:
% Done: 100%

Description

I'm trying to install OI as a DomU under Xen 4.4.1 (Debian 8). If I configure a VIF at installation time (vif=['bridge=lan']), the kernel panics in xnf:

panic[cpu0]/thread=ffffff0003ea7c40: BAD TRAP: type=e (#pf Page fault) rp=ffffff0003ea7980 addr=40 occurred in module "xnf" due to a NULL pointer dereference

sched: #pf Page fault
Bad kernel fault at addr=0x40
pid=0, pc=0xfffffffff8071bc7, sp=0xffffff0003ea7a70, eflags=0x10206
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 2660<vmxe,xmme,fxsr,mce,pae>
cr2: 40
        rdi:              286 rsi:                0 rdx:                8
        rcx: ffffff0149e6c058  r8:                0  r9:                0
        rax:              150 rbx:                2 rbp: ffffff0003ea7ac0
        r10:                0 r11:                0 r12: ffffff0149fed000
        r13:                0 r14:               15 r15:                7
        fsb:                0 gsb: fffffffffbc60020  ds:               4b
         es:               4b  fs:                0  gs:              1c3
        trp:                e err:                0 rip: fffffffff8071bc7
         cs:             e030 rfl:            10206 rsp: ffffff0003ea7a70
         ss:             e02b

Warning - stack not written to the dump buffer
ffffff0003ea7860 unix:die+df ()
ffffff0003ea7970 unix:trap+dd8 ()
ffffff0003ea7980 unix:_cmntrap+12b ()
ffffff0003ea7ac0 xnf:xnf_tx_clean_ring+c7 ()
ffffff0003ea7b30 xnf:tx_slots_get+95 ()
ffffff0003ea7b70 xnf:xnf_intr+15b ()
ffffff0003ea7be0 unix:av_dispatch_autovect+91 ()
ffffff0003ea7c20 unix:dispatch_hardint+36 ()
ffffff0003e05970 unix:switch_sp_and_call+13 ()
ffffff0003e059d0 unix:do_interrupt+fa ()
ffffff0003e05a80 unix:xen_callback_handler+373 ()
ffffff0003e05a90 unix:xen_callback+cb ()
ffffff0003e05ba0 unix:__hypercall2+a ()
ffffff0003e05bb0 unix:HYPERVISOR_block+10 ()
ffffff0003e05bc0 unix:mach_cpu_idle+1d ()
ffffff0003e05bf0 unix:cpu_idle+fe ()
ffffff0003e05c00 unix:cpu_idle_adaptive+13 ()
ffffff0003e05c20 unix:idle+a7 ()
ffffff0003e05c30 unix:thread_start+8 ()

Same happens with 151a8.

I tried the installation without network and it completed without a panic; the VM also boots fine with networking enabled.
I then tried to enable networking manually, but as soon as I assign an IP address (ifconfig xnf0 192.168.x.y) I get the same panic.

xnf_crash.0 - mdb crash dump infos (262 KB) Michal Nowak, 2017-03-21 07:33 PM

History

#1 Updated by Aurélien Larcher over 1 year ago

Given the nature of this issue, I would suggest that you post your bug report to and CC .
I will reference the corresponding illumos bug when it is opened.
Thank you.

#2 Updated by Alexander Pyhalov 11 months ago

  • Project changed from OpenIndiana Distribution to illumos gate

#4 Updated by Michal Nowak 9 months ago

I can reproduce this issue with custom OI ISO created from current packages according to this XML: http://buildzone.oi-build.r61.net/text_mode_x86.xml on openSUSE 42.2 with Xen 4.8 from this repo: https://build.opensuse.org/package/show/Virtualization/qemu.

The DilOS ISO mentioned above ends like this on first boot (installation went fine):

> sudo virsh console DilOSpv
Connected to domain DilOSpv
Escape character is ^]
v4.8.0_05-482 chgset ''
DilOS Version 1.3.7.168-9d-gcc6 64-bit
Copyright (c) 2011-2017, DilOS. All rights reserved.
DEBUG enabled
NOTICE: vdev_disk_open("/dev/dsk/c1t1d0s0"): fallback to DKIOCGMEDIAINFO

NOTICE: vdev_disk_open("/dev/dsk/c1t1d0s0"): fallback to DKIOCGMEDIAINFO

Configuring devices.
dcpc: unable to resolve dependency, cannot load module 'drv/cpc'
NOTICE: vdev_disk_open("/dev/dsk/c1t1d0s0"): fallback to DKIOCGMEDIAINFO

NOTICE: vdev_disk_open("/dev/dsk/c1t1d0s0"): fallback to DKIOCGMEDIAINFO

...

NOTICE: vdev_disk_open("/dev/dsk/c1t1d0s0"): fallback to DKIOCGMEDIAINFO

panic[cpu1]/thread=ffffff0004b63c40: assertion failed: tidp->next == INVALID_TX_ID, file: ../../common/xen/io/xnf.c, line: 1268

ffffff0004b63a60 genunix:process_type+1900ad ()
ffffff0004b63ad0 xnf:xnf_tx_clean_ring+120 ()
ffffff0004b63b40 xnf:tx_slots_get+e0 ()
ffffff0004b63b80 xnf:xnf_intr+139 ()
ffffff0004b63be0 unix:av_dispatch_autovect+81 ()
ffffff0004b63c20 unix:dispatch_hardint+36 ()
ffffff0004b12950 unix:switch_sp_and_call+13 ()
ffffff0004b129b0 unix:do_interrupt+146 ()
ffffff0004b12a70 unix:xen_callback_handler+42e ()
ffffff0004b12a80 unix:xen_callback+18f ()
ffffff0004b12b90 unix:__hypercall2+a ()
ffffff0004b12ba0 unix:HYPERVISOR_block+10 ()
ffffff0004b12bb0 unix:mach_cpu_idle+1d ()
ffffff0004b12bf0 unix:cpu_idle+d1 ()
ffffff0004b12c00 unix:cpu_idle_adaptive+13 ()
ffffff0004b12c20 unix:idle+a2 ()
ffffff0004b12c30 unix:thread_start+8 ()

dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel + curproc
dumping:  0:34 100% done
100% done: 80825 pages dumped, dump succeeded
rebooting...

Any directions what might be done next?

#5 Updated by Igor Kozhukhov 9 months ago

Michal Nowak wrote:

I can reproduce this issue with custom OI ISO created from current packages according to this XML: http://buildzone.oi-build.r61.net/text_mode_x86.xml on openSUSE 42.2 with Xen 4.8 from this repo: https://build.opensuse.org/package/show/Virtualization/qemu.

The DilOS ISO mentioned above ends like this on first boot (installation went fine):

[...]

Any directions what might be done next?

Well, I need some time to debug DilOS on the new Xen.
I'm using DilOS PV guests under a dilos-xen-3.4 dom0.
It is old, but still working.
DilOS probably needs updates for the new Xen, but I'm not ready for that yet.
Based on the log, one module is missing:
dcpc: unable to resolve dependency, cannot load module 'drv/cpc'
I can take a look at why it is missing on the installed system.
I have plans to update DilOS to the new Xen, but I have no estimate when I can start; I need to port the build dependencies to the new system first and I'm working on that. You can ping me on #dilos IRC (FreeNode).

#6 Updated by Michal Nowak 9 months ago

Thanks Igor.

Attaching mdb crash dump info per https://wiki.illumos.org/display/illumos/How+To+Report+Problems. Hopefully it's of some use (it's aligned to 80 characters as I got it via serial line).

#7 Updated by Yuri Pankov 3 months ago

  • Subject changed from Hipster and 151a8 fail to install as Xen DomU to xnf: panic on Xen 4.x
  • Description updated (diff)

Apparently, something changed in Xen 4.x, so now the xnf driver simply panics. Applying the following changes from delphix-os seems to fix the panic (along with bringing a lot of other nice stuff):

https://github.com/delphix/delphix-os/commit/d45e6bd76f28f4b9f5216000f4599f2ef41653f5
https://github.com/delphix/delphix-os/commit/e3e5e90849aac2fbc4fb0cb150aa97876f486e1b

Wonder if Delphix guys could upstream the changes.

#8 Updated by Yuri Pankov 2 months ago

  • Category set to driver - device drivers
  • Tags deleted (needs-triage)
  • % Done changed from 0 to 50
  • Assignee set to Yuri Pankov
  • Status changed from New to In Progress
  • Subject changed from xnf: panic on Xen 4.x to xnf: panic on Xen 4.7

I'll try to upstream Delphix changes.

#9 Updated by Yuri Pankov 2 months ago

  • Subject changed from xnf: panic on Xen 4.7 to xnf: panic on Xen 4.x

#10 Updated by Dan McDonald 2 months ago

From the Delphix reviews:

Improvements

  • Support LSO
  • Support Jumbo Frames

LSO

In order to support LSO, we first needed to support sending scatter-gathered packets.
In Xen, each tx and rx buffer can hold up to one page of data (4096 bytes). Since one
packet can consist of several mblks, we must DMA-map each one of those and retrieve
the list of pages that they consist of. One tx slot will be used for each of those pages.
Once the packet has been transmitted and acknowledged by the host, resources must be
freed in a very specific order:

  1. Release the grants (a grant is a memory access permission given from the guest to the host).
  2. Unbind the DMA buffers.
  3. Free the mblk.

Moreover, each time we process a response for a tx slot, we must release its txid. This
has to be done right away, as the number of free txids must always equal the number of
free tx slots.

Jumbo Frames

To support jumbo frames we must support scatter-gather on both the tx and the rx side.
To understand how the rx side works, we need to grasp the concept of requests and
responses, as it is somewhat different from the tx side. On the rx side, a request
actually means that we have put a page-sized buffer on the rx ring and the host is
free to use it. When there is data incoming for the guest, the host will put it into
one of the buffers provided and put a response on the rx ring. The response contains
the length and offset of the data inside the provided buffer, as well as some flags. To
simplify matters, the host always puts the response on the same rx ring slot that
held the request. In some cases, metadata will be put on the ring instead of a
typical response; this also means that the underlying buffer has not been consumed.

Comments

I had to modify the pseudo checksum calculation code to support chained mblks instead
of a contiguous buffer. Ideally, we would not have to calculate the checksum ourselves;
however, there is currently no way to tell the GLD layer that we want it to generate
pseudo checksums for IPv4 only, and full checksums for IPv6. Hence, for IPv4 only, we
tell the GLD not to generate any checksums, and do the pseudo header checksum
calculation ourselves.

We previously thought that the size of the tx and rx descriptor rings is configurable.
It is not (I learned that the hard way). Right now each ring is limited to one page in
size, which represents 256 descriptors.
See https://wiki.xen.org/wiki/Xen_network_TODO.
Linux imposes a limit of 18 segments for a single packet. If we are provided with a
single mblk from the GLD, this is not an issue, as it will map to at most 17
one-page segments. However, some workloads provide multiple mblks, and if those would
map to more than 18 segments, we currently perform a pullup before the mapping. If the
pullup induces a big penalty, we might want to consider lowering the maximum LSO'd
packet size.

Additional Work Done

While implementing support for scatter-gather on the rx side, I also implemented support
for receiving in-band metadata, called extras. Those are required to implement LRO.

Future Work

  • Support LRO (as of writing this review I already have a prototype for LRO, and it
    does show CPU usage improvements on xenserver-01; however, it seems it is not
    enabled in AWS).
  • Get the OS to generate pseudo header checksums.

#11 Updated by Dan McDonald 2 months ago

ALSO from the Delphix reviews:

ran perf tests on AWS, ran sanity tests on xenserver-01
nicdrv tests done by @jkennedy

mblk mangler: hacked the mac layer to fragment mblks before passing them to the xnf driver.
Didn't see any issues with miscalculated checksums, with or without the mangler, according to snoop.
Some packets were dropped during a high-throughput test, but this appeared to be related to the drop-on-tx-ring-full policy in the mac layer.

Performance numbers on AWS (r3.8xlarge, Delphix Engine to Delphix Engine) are:
- ~8Gbps for 9000 MTU single stream
- >9Gbps for 9000 MTU multi-stream
- ~3.5Gbps for 1500 MTU single stream
- ~4Gbps for 1500 MTU multi-stream

Performance numbers for Linux (Centos 7) on AWS (r3.8xlarge, Linux to Linux) are:
- 7.5Gbps for 9000 MTU single stream
- 7Gbps for 9000 MTU multi-stream
- 2Gbps for 1500 MTU

NOT TESTED
- multicast add/remove
- DDI_SUSPEND/DDI_RESUME logic from within the guest (probably not something we support anyway).

#12 Updated by Electric Monk 2 months ago

  • % Done changed from 50 to 100
  • Status changed from In Progress to Closed

git commit 9276b3991ba20d5a5660887ba81b0bc7bed25a0c

commit  9276b3991ba20d5a5660887ba81b0bc7bed25a0c
Author: Yuri Pankov <yuri.pankov@nexenta.com>
Date:   2017-10-13T19:41:47.000Z

    7186 xnf: panic on Xen 4.x
    Contributed by: Frank Salzmann <frank@delphix.com>
    Contributed by: Pavel Zakharov <pavel.zakharov@delphix.com>
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Ken Mays <maybird1776@yahoo.com>
    Reviewed by: Igor Kozhukhov <igor@dilos.org>
    Approved by: Dan McDonald <danmcd@joyent.com>
