Bug #12582


CUBIC module should react immediately to CC_RTO congestion signal

Added by Paul Winder about 2 years ago. Updated about 2 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: -
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

This is related to #12581 and fixes one of the issues described there.

Analysis courtesy of Cody Mello

The CUBIC congestion control module waits for multiple retransmission timeouts before it shrinks the congestion window:

case CC_RTO:
    /*
     * Grab the current time and record it so we know when the
     * most recent congestion event was. Only record it when the
     * timeout has fired more than once, as there is a reasonable
     * chance the first one is a false alarm and may not indicate
     * congestion.
     */
    if (CCV(ccv, tcp_timer_backoff) >= 2) {
        cubic_data->num_cong_events++;
        cubic_data->t_last_cong = gethrtime();
        cubic_ssthresh_update(ccv);
        cubic_data->max_cwnd = cwin;
        CCV(ccv, tcp_cwnd) = mss;
    }
    break;

This means that when the packet loss was caused by a large congestion window, CUBIC doesn't shrink the window, so TCP retransmits at the same window size after timing out. That causes further congestion and further packet loss, which backs the retransmission timer off even further. For example, running iperf over a WiFi network:
[root@test1 ~]# date && iperf3 -c 37.153.108.173 -p 11845 -t 60
Mon Aug 12 18:09:23 UTC 2019
Connecting to host 37.153.108.173, port 11845
[  4] local 10.88.88.52 port 53886 connected to 37.153.108.173 port 11845
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   269 KBytes  2.21 Mbits/sec
[  4]   1.00-2.00   sec   679 KBytes  5.57 Mbits/sec
[  4]   2.00-3.00   sec   866 KBytes  7.09 Mbits/sec
[  4]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   4.00-5.00   sec   127 KBytes  1.04 Mbits/sec
[  4]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec
[  4]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec
[  4]  10.00-11.00  sec  0.00 Bytes  0.00 bits/sec
[  4]  11.00-12.00  sec  0.00 Bytes  0.00 bits/sec
[  4]  12.00-13.00  sec  0.00 Bytes  0.00 bits/sec
[  4]  13.00-14.00  sec  0.00 Bytes  0.00 bits/sec
[  4]  14.00-15.00  sec  0.00 Bytes  0.00 bits/sec
[  4]  15.00-16.00  sec   764 KBytes  6.25 Mbits/sec
[  4]  16.00-17.00  sec   722 KBytes  5.91 Mbits/sec
^C

Watching it with connstat (edited to remove extraneous headers and the iperf control connection for clarity):
[root@test1 ~]# connstat -T d -o laddr,lport,raddr,rport,rto,rtt,rttvar,cwnd,unsent,outbytes,retransbytes -i 1 -F rport=11845,state=established
Mon Aug 12 18:09:24 UTC 2019
Mon Aug 12 18:09:25 UTC 2019
LADDR                     LPORT RADDR                     RPORT RTO     RTT     RTTVAR  CWND       UNSENT     OUTBYTES   RETRANSBYTES
10.88.88.52               53886 37.153.108.173            11845 650     241719  102244  8725       132520     13069      0
Mon Aug 12 18:09:26 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 635     235519  100000  404781     36200      805181     0
Mon Aug 12 18:09:27 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 1312    256447  100000  608912     136864     1430021    524928
Mon Aug 12 18:09:28 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 1312    256447  100000  608912     135416     1722573    524928
Mon Aug 12 18:09:29 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 2624    256447  100000  608912     117288     1871021    1132448
Mon Aug 12 18:09:30 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 2624    256447  100000  608912     117288     1871021    1132448
Mon Aug 12 18:09:31 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 2624    256447  100000  608912     117288     1871021    1132448
Mon Aug 12 18:09:32 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 5248    256447  100000  23168      117288     1871021    1178784
Mon Aug 12 18:09:33 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 5248    256447  100000  228112     117288     1871021    1700032
Mon Aug 12 18:09:34 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 5248    256447  100000  228112     117288     1871021    1700032
Mon Aug 12 18:09:35 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 5248    256447  100000  228112     117288     1871021    1700032
Mon Aug 12 18:09:36 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 5248    256447  100000  228112     117288     1871021    1700032
Mon Aug 12 18:09:37 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 5248    256447  100000  228112     117288     1871021    1700032
Mon Aug 12 18:09:38 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 5248    256447  100000  228112     117288     1871021    1700032
Mon Aug 12 18:09:39 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 10496   256447  100000  20856      117288     1871021    1742608
Mon Aug 12 18:09:40 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 610     210170  100000  182960     130320     2379269    1847456
Mon Aug 12 18:09:41 UTC 2019
10.88.88.52               53886 37.153.108.173            11845 661     261258  100000  340280     101360     3280677    1876416

At 18:09:29, almost a whole congestion window's worth of packets gets retransmitted, but CWND doesn't shrink. At 18:09:32, after another timeout, the window finally shrinks, and 521248 bytes are retransmitted, which don't get acknowledged. The congestion window then grows again while the connection waits for the RTO to fire again, at 18:09:39. At this point, because we've backed off three times, we again shrink the congestion window and retransmit, and this time the data gets acknowledged.


Related issues

Related to illumos gate - Bug #12581: sockets using cubic congestion control can block (Closed) - Paul Winder
