Want UDP src port hashing for VXLAN
From the original Joyent bug:
In general, hardware switches and OSes perform fanout based on some set of L2, L3, and L4 information. For TCP and UDP traffic, the L2 bits are the source and destination mac address, for L3, it's the source and destination MAC addresses, and for L4, it's the source and destination ports.
Outside of an encapsulation, normally different flows of packets would have different sets of these, for example, if you consider iperf and when it has multiple threads, each thread has a different source port, even though all the other information is the same, which allows for different flows to be hashed to different devices or different paths.
However, if you consider something like VXLAN, it forces all of the traffic between two hosts to be encapsulated in a UDP frame and unless we do something, all of the different bits that the devices being encapsulated used are now useless. By default in our implementation, we sent from the address that we bound to in the overlay device, in other words because we bound to port 4789 -- the default VXLAN port -- we were stuck.
We're not the first to think of this. The VXLAN spec actually says:
- Source Port: It is recommended that the UDP source port number be calculated using a hash of fields from the inner packet -- one example being a hash of the inner Ethernet frame's headers. This is to enable a level of entropy for the ECMP/load- balancing of the VM-to-VM traffic across the VXLAN overlay. When calculating the UDP source port number in this manner, it is RECOMMENDED that the value be in the dynamic/private port range 49152-65535 [RFC6335].
What we've done is basically implement something that does this based on the combination of the L2, L3, and L4 hashes. Based on the data in OS-4008, we've seen that this does actually allow us to fanout and gives things a bit more breathing room, allowing for the different UDP packets to be processed by different MAC soft rings, which means that different flows can be delivered up to the overlay device in parallel and we won't block the entire soft ring.
Work by Robert Mustacchi