Bug #12849
closedixgbe can do unnecessary allocations during tx
100%
Description
In ixgbe_ring_tx
, when LSO is active it may allocate and move around new mblk_ts based on the dblk_t alignment in order to get all of the headers (ethernet, IP, TCP/UDP) into a single aligned mblk_t. However, it then sets the copy threshold to the size of all the combined headers. This means the dblk_ts containing the headers (after all the shuffling) are not mapped to the device, but instead are copied into an already mapped buffer.
All of this shuffling also greatly complicates the TX code, making it hard to analyze or make changes, and is ultimately unnecessary -- we can copy the headers directly from their existing dblk_ts into an already mapped buffer and forgo the allocb()
calls, and then proceed as we currently do with the remaining contents of the packet. It also allows us to drastically simplify the logic in ixgbe_ring_tx
.
Related issues
Updated by Jason King over 1 year ago
Besides a smoke test of running a kernel with the changes and passing normal traffic without incident (sjorge has been running with this patch for several months w/o incident).
Additionally, Jorge was able to find a workload that explicitly triggered the 'need to msgpullup() due to too many descriptors being used for this packet' (basically taking the if statement starting at usr/src/uts/common/io/ixgbe_tx.c:332 -- verified using dtrace) and verified that things operated normally.
Additionally, manta load tests were run against a SmartOS PI with this change. There was a slight slowdown that was seen, however it was believed to be due to a known issue with the manta metadata tier and the consensus was that the PI tests didn't appear to present any alarming findings.
The change was designed to only simplify how the TX descriptors were populated, it shouldn't have changed the contents of the descriptors (or how the mblk_t contents was distributed across descriptors), though this seems like it would be difficult to prove.
Updated by Jason King about 1 year ago
For completeness, I ran some iperf3 between combinations of unmodified and modified (with the PI) systems:
A(fix pi)->B (existing PI) [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-60.00 sec 23.3 GBytes 3.34 Gbits/sec 0.000 ms 0/17277970 (0%) sender [ 5] 0.00-60.00 sec 10.2 GBytes 1.45 Gbits/sec 0.015 ms 9745069/17277145 (56%) receiver B (existing PI)->A (fix PI) [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-60.00 sec 19.9 GBytes 2.85 Gbits/sec 0.000 ms 0/14751440 (0%) sender [ 5] 0.00-60.01 sec 17.5 GBytes 2.50 Gbits/sec 0.001 ms 1794418/14751402 (12%) receiver A(existing PI)->B(existing PI) [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-60.00 sec 23.4 GBytes 3.35 Gbits/sec 0.000 ms 0/17372630 (0%) sender [ 5] 0.00-60.00 sec 13.4 GBytes 1.92 Gbits/sec 0.003 ms 7442430/17371395 (43%) receiver B->(existing PI)->A (existing PI) [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-60.00 sec 16.2 GBytes 2.32 Gbits/sec 0.000 ms 0/12018300 (0%) sender [ 5] 0.00-60.01 sec 16.0 GBytes 2.29 Gbits/sec 0.001 ms 175189/12018297 (1.5%) receiver A (fix pi) -> B (fix pi) [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-60.00 sec 23.2 GBytes 3.32 Gbits/sec 0.000 ms 0/17172380 (0%) sender [ 5] 0.00-60.00 sec 16.6 GBytes 2.38 Gbits/sec 0.001 ms 4865355/17172347 (28%) receiver B (fix pi) -> A (fix pi) [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams [ 5] 0.00-60.00 sec 17.3 GBytes 2.48 Gbits/sec 0.000 ms 0/12841090 (0%) sender [ 5] 0.00-60.01 sec 16.8 GBytes 2.41 Gbits/sec 0.001 ms 358693/12841076 (2.8%) receiver
While generally better with the new PI, they are not noticeably worse (which was the goal).
Updated by Electric Monk about 1 year ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit e39a7b5b5afda54edb90e52709591832137db39f
commit e39a7b5b5afda54edb90e52709591832137db39f Author: Jason King <jason.king@joyent.com> Date: 2021-02-26T23:17:31.000Z 12849 ixgbe can do unnecessary allocations during tx Reviewed by: Robert Mustacchi <rm@fingolfin.org> Approved by: Richard Lowe <richlowe@richlowe.net>
Updated by Dan McDonald about 1 year ago
- Related to Bug #13644: ixgbe_ring_tx() has an invalid VERIFY under memory pressure added