Bug #5699

Missing tcp/udp checksum with hardware offloading and tagged vlan

Added by Jorge Schrauwen over 5 years ago. Updated about 5 years ago.

Status: Feedback
Priority: Low
Assignee: -
Category: networking
Start date: 2015-03-09
Due date:
% Done: 0%
Estimated time:
Difficulty: Hard
Tags: needs-triage
Gerrit CR:

Description

Spent most of my weekend debugging this. Opening an issue for future people who stumble on the same problem.

Not sure where the best place to fix it is; this is more of an FYI plus a "please fix".

hardware setup:
- 4-link aggr between switch and host
- (using LACP, but that should be irrelevant)
- 4 tagged VLANs going over said aggr

The host is running SmartOS; a KVM VM is running on it with 4 NICs (one in each VLAN).
Traffic from elsewhere to the VM and back flows fine (IP packets have checksums, etc.).

The GZ (global zone) has a vnic hanging off the same aggr; traffic from elsewhere to the GZ and back flows fine (IP packets have checksums, etc.).

Traffic from the GZ to the VM does not flow for TCP/UDP; ARP and ICMP are OK.
The VM was discarding the traffic because the TCP or UDP checksum was blank.

This is because the traffic flow in that case goes like this:
vnic_gz -> aggr -> vnic_vm
and a checksum is never added along the way.

My VM is running OpenBSD; other operating systems may be less picky.
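For anyone landing here unsure why a blank checksum gets the packet dropped, here is a minimal Python sketch of the standard Internet checksum (RFC 1071) as it applies to a TCP segment over IPv4. It is only an illustration of what the receiver verifies, not code from illumos or OpenBSD; the addresses, ports, and helper names (`build_tcp_segment`, `fill_checksum`, `checksum_ok`) are made up for the example.

```python
import socket
import struct

IPPROTO_TCP = 6


def internet_checksum(data: bytes) -> int:
    """RFC 1071: one's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, proto: int, seg_len: int) -> bytes:
    """IPv4 pseudo-header that TCP and UDP include in their checksum."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, proto, seg_len))


def build_tcp_segment(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Hypothetical helper: minimal 20-byte TCP header with a blank checksum."""
    header = struct.pack("!HHIIBBHHH", src_port, dst_port,
                         0, 0,          # sequence and ack numbers
                         5 << 4, 0x18,  # data offset = 5 words, flags PSH|ACK
                         65535, 0, 0)   # window, checksum (blank), urgent ptr
    return header + payload


def fill_checksum(src_ip: str, dst_ip: str, segment: bytes) -> bytes:
    """What the sender (or the offloading NIC) is supposed to do."""
    blank = segment[:16] + b"\x00\x00" + segment[18:]
    csum = internet_checksum(
        pseudo_header(src_ip, dst_ip, IPPROTO_TCP, len(blank)) + blank)
    return blank[:16] + struct.pack("!H", csum) + blank[18:]


def checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """Receiver-side check: the sum over everything must come out as zero."""
    return internet_checksum(
        pseudo_header(src_ip, dst_ip, IPPROTO_TCP, len(segment)) + segment) == 0


blank = build_tcp_segment(40000, 22, b"hello")
filled = fill_checksum("10.0.0.1", "10.0.0.2", blank)
print(checksum_ok("10.0.0.1", "10.0.0.2", filled))  # True
print(checksum_ok("10.0.0.1", "10.0.0.2", blank))   # False
```

A receiver that verifies strictly will drop the second segment, which matches what the VM was doing with the GZ traffic.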

#1

Updated by Robert Mustacchi over 5 years ago

The issue here is that the vnd driver which was driving this is taking the outgoing traffic via a promiscuous callback and thus it wasn't passing through the normal mac_tx loopback paths which would have caused it to be properly checksummed for loopback. The solution is probably to ensure that all the traffic destined for the mac in vnd doesn't rely on the promisc callbacks and to potentially simplify the code and have it simply consume things from mac directly rather than via dls.
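To make the difference between the two delivery paths concrete, here is a toy Python model. It is not illumos code and does not reflect the real mac, dls, or vnd interfaces; every name in it (`Packet`, `needs_hw_checksum`, `loopback_deliver`, `promisc_deliver`, `toy_checksum`) is hypothetical, and the checksum is a stand-in. It only illustrates the idea described above: a loopback path that finishes the checksum in software versus a promiscuous tap that hands the packet over as-is.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    payload: bytes
    checksum: int = 0                # 0 stands in for a "blank" checksum
    needs_hw_checksum: bool = False  # sender expects the NIC to fill it in


def toy_checksum(payload: bytes) -> int:
    # Stand-in for the real TCP/UDP checksum; never returns 0.
    return (sum(payload) % 0xFFFF) + 1


def loopback_deliver(pkt: Packet) -> Packet:
    """Toy loopback path: no NIC is involved, so finish the checksum in software."""
    if pkt.needs_hw_checksum:
        pkt = replace(pkt, checksum=toy_checksum(pkt.payload),
                      needs_hw_checksum=False)
    return pkt


def promisc_deliver(pkt: Packet) -> Packet:
    """Toy promiscuous tap: the outgoing packet is handed over unchanged."""
    return pkt


def receiver_accepts(pkt: Packet) -> bool:
    return pkt.checksum == toy_checksum(pkt.payload)


outgoing = Packet(b"hello", needs_hw_checksum=True)
print(receiver_accepts(loopback_deliver(outgoing)))  # True
print(receiver_accepts(promisc_deliver(outgoing)))   # False
```

In this model the packet that arrives via the tap never gets its checksum completed, which is the symptom reported in the description.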

#2

Updated by Jorge Schrauwen about 5 years ago

I no longer have the exact setup from back then, but I spun up a custom SmartOS image with the https://smartos.org/bugview/OS-4600 commit.

Seems to be fixed :)

Feel free to close this one, as I can't.
