Feature #3847
DLPI should allow DL_NO_TX_LOOPBACK on promiscuous interfaces
% Done: 80%
Description
QEMU uses promiscuous interfaces to bridge guests into host-provided networking. The DLPI interface it uses for this creates a promiscuous attachment with TX loopback, so the sender sees its own packets on the interface.
This leads to packet storms in several IPv6 cases and even with ordinary IPv4 ARP traffic.
History
Updated by Theo Schlossnagle almost 7 years ago
VNICs:
# dladm show-link
LINK     CLASS      MTU   STATE    BRIDGE  OVER
bnx0     phys       1500  down     --      --
bnx1     phys       1500  up       --      --
ips0     vnic       1500  up       --      bnx1
boot0    vnic       1500  up       --      bnx1
stub1    etherstub  9000  unknown  --      --
stub2    etherstub  9000  unknown  --      --
stub3    etherstub  9000  unknown  --      --
lab0_1   vnic       9000  unknown  --      stub1
lab0_2   vnic       9000  unknown  --      stub2
lab0_3   vnic       9000  unknown  --      stub3
kvm guest command line:
# pargs 7722
7722: /usr/bin/qemu-system-x86_64 -name lab -boot d -enable-kvm -smp 1 -vnc 0:0 -m 25
argv[0]: /usr/bin/qemu-system-x86_64
argv[1]: -name
argv[2]: lab
argv[3]: -boot
argv[4]: d
argv[5]: -enable-kvm
argv[6]: -smp
argv[7]: 1
argv[8]: -vnc
argv[9]: 0:0
argv[10]: -m
argv[11]: 256
argv[12]: -no-hpet
argv[13]: -drive
argv[14]: file=/dev/zvol/rdsk/test/lab0,if=ide,index=0
argv[15]: -drive
argv[16]: file=/home/ltirkkon/lab/debian-7.1.0-amd64-netinst.iso,media=cdrom,if=ide,index=2
argv[17]: -net
argv[18]: nic,name=net0,model=virtio,macaddr=2:8:20:f0:7d:d
argv[19]: -net
argv[20]: vnic,name=net0,ifname=lab0_1,macaddr=2:8:20:f0:7d:d
argv[21]: -net
argv[22]: nic,name=net1,model=virtio,macaddr=2:8:20:1a:59:e9
argv[23]: -net
argv[24]: vnic,name=net1,ifname=lab0_2,macaddr=2:8:20:1a:59:e9
argv[25]: -net
argv[26]: nic,name=net2,model=virtio,macaddr=2:8:20:56:aa:b9
argv[27]: -net
argv[28]: vnic,name=net2,ifname=lab0_3,macaddr=2:8:20:56:aa:b9
With the Debian installer as the guest OS, running 'ip link set up eth0' causes
the guest to freeze (or perhaps it just slows down so much that it might as
well be frozen).
It also causes kvmstat to show almost nothing but zeroes:
pid   vcpu |  exits :  haltx   irqx  irqwx    iox  mmiox |   irqs   emul   eptv
7722     0 |      0 :      0      0      0      0      0 |      0      0      0
7722     0 |      0 :      0      0      0      0      0 |      0      0      0
7722     0 |      0 :      0      0      0      0      0 |      0      0      0
[...]
Snooping on any of the VNICs in question shows a lot of traffic like this:
UNSPECIFIED -> ff02::16 ICMPv6 Group membership report - MLDv2
and there is indeed quite a bit of traffic:
# dlstat -i 1 | egrep 'LINK|lab0_'
LINK      IPKTS   RBYTES     OPKTS   OBYTES
lab0_1        0        0    78.49M    7.06G
lab0_2        0        0    78.58M    7.07G
lab0_3        0        0    52.46M    4.72G
lab0_1        0        0   250.79K   22.57M
lab0_2        0        0   219.48K   19.75M
lab0_3        0        0    98.18K    8.84M
lab0_1        0        0   284.85K   25.64M
lab0_2        0        0   168.20K   15.14M
lab0_3        0        0   115.94K   10.43M
lab0_1        0        0   103.13K    9.28M
lab0_2        0        0   288.30K   25.95M
This doesn't happen when using non-global zones instead of KVM guests.
Using only two interfaces still generates traffic, but not enough for the guest
to completely die under the pressure.
# dlstat -i 1|egrep 'LINK|lab0_'
LINK      IPKTS   RBYTES     OPKTS   OBYTES
lab0_1       52    2.23K   175.43M   15.46G
lab0_2        0        0   175.30M   15.45G
lab0_3        0        0   109.08M    9.82G
lab0_1        0        0    14.08K    1.10M
lab0_2        0        0     6.10K  476.48K
lab0_3        0        0         0        0
lab0_1        0        0    13.87K    1.08M
lab0_2        0        0    10.93K  853.61K
LINK      IPKTS   RBYTES     OPKTS   OBYTES
lab0_3        0        0         0        0
lab0_1        0        0    20.66K    1.61M
lab0_2        0        0    18.18K    1.42M
lab0_3        0        0         0        0
lab0_1        0        0     8.09K  630.88K
lab0_2        0        0     7.70K  600.67K
lab0_3        0        0         0        0
With two NICs, 'snoop -v lab0_1' shows traffic from both Ethernet
addresses:
# snoop -vd lab0_1|grep $(dladm show-vnic -po macaddress lab0_2)
Using device lab0_1 (promiscuous mode)
ETHER: Source = 2:8:20:1a:59:e9,
[...]
Booting Debian with ipv6.disable=1 makes the immediate issue go away: running
'ip link set up eth0' no longer generates traffic. However, I was able to reproduce the same symptoms with:
ip link set up eth0
ip addr add 192.168.10.1/24 dev eth0
ping 192.168.10.2
# (just to generate some arp traffic)
# hit ^C and wait for dlstat to go down to zero
ip link set up dev eth1
boom, ARP everywhere:
LINK      IPKTS   RBYTES     OPKTS   OBYTES
lab0_1        0        0   166.65K    7.00M
lab0_2        0        0   173.59K    7.29M
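As a rough sanity check on the dlstat figures in this comment, the average frame size can be derived from an OPKTS/OBYTES pair. The sketch below assumes dlstat's K/M/G suffixes are decimal (10^3, 10^6, 10^9), which is not stated anywhere in this ticket; the helper names are made up for illustration.

```python
# Illustrative arithmetic only. ASSUMPTION: dlstat suffixes are decimal.
SUFFIXES = {"K": 1e3, "M": 1e6, "G": 1e9}


def parse_count(text):
    """Convert a dlstat-style figure such as '78.49M' into a float."""
    if text[-1] in SUFFIXES:
        return float(text[:-1]) * SUFFIXES[text[-1]]
    return float(text)


def bytes_per_packet(opkts, obytes):
    """Average frame size implied by one OPKTS/OBYTES pair."""
    return parse_count(obytes) / parse_count(opkts)


# Figures copied from the dlstat output in this comment (lab0_1 rows).
mld_storm = bytes_per_packet("78.49M", "7.06G")    # three-NIC IPv6 case
arp_storm = bytes_per_packet("166.65K", "7.00M")   # IPv4 ARP case

print(round(mld_storm), round(arp_storm))  # prints "90 42"
```

Under that assumption, 42 bytes per packet matches an unpadded ARP packet in an Ethernet frame (14-byte header plus 28-byte ARP payload), and 90 bytes is consistent with an MLDv2 report carrying a single multicast address record. That fits the snoop observations: the counters appear to be dominated by one small packet type repeated at a very high rate.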
Updated by Robert Mustacchi about 6 years ago
- Assignee set to Robert Mustacchi
- % Done changed from 0 to 80
- Tags deleted (needs-triage)
I've implemented DL_PROMISC_RX_ONLY as a means to achieve this. In my current work it is available only through raw DLPI; it is not yet plumbed up in libdlpi.
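To illustrate the intended difference in behaviour, here is a toy model in Python. It is not the illumos MAC layer or the DLPI API: the Link and PromiscClient classes and the rx_only flag are hypothetical names that only mimic the idea of an RX-only promiscuous attachment, where the sender's own transmissions are not handed back to it.

```python
# Toy model only: class names and the rx_only flag are hypothetical and do not
# correspond to real illumos or libdlpi interfaces.
class PromiscClient:
    def __init__(self, name, rx_only):
        self.name = name
        self.rx_only = rx_only   # True mimics an RX-only promiscuous attachment
        self.received = []       # frames delivered to this client


class Link:
    def __init__(self):
        self.clients = []

    def attach(self, client):
        self.clients.append(client)

    def transmit(self, sender, frame):
        """Deliver a frame to every promiscuous client attached to the link."""
        for client in self.clients:
            if client is sender and client.rx_only:
                continue  # RX-only: the sender does not see its own frame
            client.received.append(frame)


def own_frames_seen(rx_only):
    """Count how many of its own frames a single sender gets back."""
    link = Link()
    guest = PromiscClient("qemu", rx_only=rx_only)
    link.attach(guest)
    link.transmit(guest, "ARP who-has 192.168.10.2")
    return len(guest.received)


print(own_frames_seen(rx_only=False))  # TX loopback: the frame comes back
print(own_frames_seen(rx_only=True))   # RX-only: nothing comes back
```

In this model the only thing the flag changes is whether the sender's own frames are echoed to it; frames from other clients on the link would still be delivered as before.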
Updated by Lauri Tirkkonen over 4 years ago
In case it isn't clear, this also completely hoses IPv6 in guests even with one NIC, because DAD always fails.
Updated by Lauri Tirkkonen over 4 years ago
Lauri Tirkkonen wrote:
In case it isn't clear, this also completely hoses IPv6 in guests even with one NIC, because DAD always fails.
Actually, this is not true for an OmniOS guest, but it is for Linux.
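A minimal sketch of why TX loopback can break duplicate address detection (DAD): the guest probes for its tentative address with a neighbor solicitation sent from the unspecified address, and if that probe is echoed back and the guest does not recognise it as its own, it looks like another node is claiming the same address. The function below is a toy model, not any real network stack. The ignores_own_probes flag is an assumption used to represent the Linux versus OmniOS difference reported above (the actual mechanism in either guest is not described in this ticket), and the address is a made-up example.

```python
# Toy model of DAD under TX loopback. ASSUMPTIONS: the address is invented and
# 'ignores_own_probes' is a stand-in for guest behaviour not detailed here.
TENTATIVE = "fe80::1"  # hypothetical tentative link-local address


def dad_succeeds(tx_loopback, ignores_own_probes):
    """Return True if the tentative address survives one DAD probe."""
    probe = {"type": "NS", "src": "::", "target": TENTATIVE, "own": True}

    # With TX loopback the guest's own probe is delivered back to it.
    inbound = [probe] if tx_loopback else []

    for frame in inbound:
        if frame["own"] and ignores_own_probes:
            continue  # guest recognises and discards its own probe
        if frame["type"] == "NS" and frame["src"] == "::" \
                and frame["target"] == TENTATIVE:
            return False  # looks like another node probing the same address
    return True


print(dad_succeeds(tx_loopback=True, ignores_own_probes=False))   # False
print(dad_succeeds(tx_loopback=True, ignores_own_probes=True))    # True
print(dad_succeeds(tx_loopback=False, ignores_own_probes=False))  # True
```

The first case corresponds to the failure described for Linux guests; the last shows that without the echoed probe there is nothing to trigger a false duplicate in this model.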