Project

General

Profile

Actions

Bug #12581

closed

sockets using cubic congestion control can block

Added by Paul Winder about 2 years ago. Updated about 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
-
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

This seemed to readily happen on a long distance ssh session, Eg in my case from the UK to Maryland. If the remote ssh shell is printing bursts of data, the output stops mid-stream. A good example is terminal output from a iperf client or server. It tends to have an interval with no printing then a burst as it prints its statistics. At times it would not print all of its output, and from that point on no more would be presented to the xterm. Analysis would show that the underlying iperf was still running and would respond to input (Eg ctrl-c), but the xterm session would appear dead.

Analysis of a dump taken showed the stream between the iperf output to sshd was fully backed up and marked QFULL. Moving forward into the sshd socket connection to the remote peer showed that the tcp connection had rafts of mblks held ready for transmission, but were not sent. Network traffic analysis also showed there to be message retransmits and the last message was an ACK from the client to the sshd server.

I noticed the CC algorithm in use was cubic, switching to sunreno prevented the issue form occurring.

There are some very handy dtrace probes around the congestion control algorithms, I used those to discover there were 2 cases which would result in the transmission stopping:

Case 1

Congestion found because of RTO (roundtrip time out)

  9  12502 cc_cong_signal:cwnd-cc-cong-signal 0xaad5 : old_cwnd: 0x1e0 new_cwnd: 0x1e0 old_ssthresh: 0x5fe6 new_ssthresh: 0x5fe6 type: 2
 16  12503 cc_ack_received:cwnd-cc-ack-received 0xaad5 : old_cwnd: 0x1e0 new_cwnd: 0x3c0
 19  12514 cc_after_idle:cwnd-cc-after-idle 0xaad5 : old_cwnd: 0x3c0 new_cwnd: 0x3c0
 16  12503 cc_ack_received:cwnd-cc-ack-received 0xaad5 : old_cwnd: 0x3c0 new_cwnd: 0x4a4
  9  12514 cc_after_idle:cwnd-cc-after-idle 0xaad5 : old_cwnd: 0x4a4 new_cwnd: 0x4a4
  9  12514 cc_after_idle:cwnd-cc-after-idle 0xaad5 : old_cwnd: 0x4a4 new_cwnd: 0x4a4
  9  12514 cc_after_idle:cwnd-cc-after-idle 0xaad5 : old_cwnd: 0x4a4 new_cwnd: 0x4a4
.
.
.

The first line shows congestion signal type 2 (RTO). Before calling the cubic congestion signal handler the congestion window (cwnd) was 0x1e0 and on return it was unchanged.
Following this there was an ACK which increased the cwnd, an idle notification (no effect) and then another ACK which increased the cwnd to 0x4a4.
After this there were no more ACKs, and just calls to the after idle callback which left the cwnd stuck at 0x4a4.
It appears significantly the cwnd is less than one MSS (TCP max segment size) of 0x4e0, which stops the connection from transmitting, and as there is nothing else happening which is increasing the cwnd, the transmission is blocked.

This issue is alleviated by applying SmartOS fix OS-7951

Case 2

Congestion found because of too many DUP ACKs

 16  13826 cc_cong_signal:cwnd-cc-cong-signal 0x7cec : old_cwnd: 0x438af05 new_cwnd: 0x5d2b77 old_ssthresh: 0x88e new_ssthresh: 0x5d2b77 type: 8
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x5d2b77 new_cwnd: 0x5d2b77
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x24fc new_cwnd: 0x24fc
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x24fc new_cwnd: 0x24fc
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x24fc new_cwnd: 0x24fc
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x24fc new_cwnd: 0x24fc
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x24fc new_cwnd: 0x24fc
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x24fc new_cwnd: 0x24fc
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x24fc new_cwnd: 0x25f8
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x2400 new_cwnd: 0x2cdc
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x1b24 new_cwnd: 0x23dc
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x126c new_cwnd: 0x1adc
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x9fc new_cwnd: 0x12d8
 16  13825 cc_post_recovery:cwnd-cc-post-recovery 0x7cec : old_cwnd: 0x120 new_cwnd: 0x120
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x120 new_cwnd: 0x240
  1  13838 cc_after_idle:cwnd-cc-after-idle 0x7cec : old_cwnd: 0x240 new_cwnd: 0x240
 16  13827 cc_ack_received:cwnd-cc-ack-received 0x7cec : old_cwnd: 0x240 new_cwnd: 0x32c
  1  13838 cc_after_idle:cwnd-cc-after-idle 0x7cec : old_cwnd: 0x32c new_cwnd: 0x32c
  1  13838 cc_after_idle:cwnd-cc-after-idle 0x7cec : old_cwnd: 0x32c new_cwnd: 0x32c
  1  13838 cc_after_idle:cwnd-cc-after-idle 0x7cec : old_cwnd: 0x32c new_cwnd: 0x32c
.
.
.

Congestion signal is received for exceeding DUP ACKs. The cwnd is driven down and eventually post recovery callback is called. On entry to post recover the cwnd is 0x120 and unchanged on exit. As in the previous case the cwnd never recovers above the TCP MSS and transmission is blocked.


Related issues

Related to illumos gate - Bug #12582: CUBIC module should react immediately to CC_RTO congestion signalClosedPaul Winder

Actions
Related to illumos gate - Bug #12583: Import FreeBSD congestion control updatesClosedPaul Winder

Actions
Actions #1

Updated by Paul Winder about 2 years ago

Following changes to the cubic post recovery routine, here is a dtrace example which would previously have blocked:

15  14122 cc_cong_signal:cwnd-cc-cong-signal 0xa5dc : old_cwnd: 0x3178 new_cwnd: 0x276b old_ssthresh: 0x2896 new_ssthresh: 0x276b type: 8
 15  14123 cc_ack_received:cwnd-cc-ack-received 0xa5dc : old_cwnd: 0x276b new_cwnd: 0x276b
 15  14123 cc_ack_received:cwnd-cc-ack-received 0xa5dc : old_cwnd: 0x3a84 new_cwnd: 0x3a84
 15  14123 cc_ack_received:cwnd-cc-ack-received 0xa5dc : old_cwnd: 0x3a84 new_cwnd: 0x3a84
 15  14123 cc_ack_received:cwnd-cc-ack-received 0xa5dc : old_cwnd: 0x3a84 new_cwnd: 0x3a84
.
.
 15  14123 cc_ack_received:cwnd-cc-ack-received 0xa5dc : old_cwnd: 0xa08 new_cwnd: 0x1318
 15  14121 cc_post_recovery:cwnd-cc-post-recovery 0xa5dc : old_cwnd: 0xf8 new_cwnd: 0x4e0
 15  14123 cc_ack_received:cwnd-cc-ack-received 0xa5dc : old_cwnd: 0x4e0 new_cwnd: 0x5d8
 16  14134 cc_after_idle:cwnd-cc-after-idle 0xa5dc : old_cwnd: 0x5d8 new_cwnd: 0x5d8
 15  14123 cc_ack_received:cwnd-cc-ack-received 0xa5dc : old_cwnd: 0x5d8 new_cwnd: 0x6bc
 15  14123 cc_ack_received:cwnd-cc-ack-received 0xa5dc : old_cwnd: 0x6bc new_cwnd: 0x8d8
.
.

Actions #2

Updated by Paul Winder about 2 years ago

  • Related to Bug #12582: CUBIC module should react immediately to CC_RTO congestion signal added
Actions #3

Updated by Paul Winder about 2 years ago

  • Related to Bug #12583: Import FreeBSD congestion control updates added
Actions #4

Updated by Electric Monk about 2 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 70 to 100

git commit f1ee6ec32e6dc51b77619a27816d945ad64a5378

commit  f1ee6ec32e6dc51b77619a27816d945ad64a5378
Author: Paul Winder <paul@winders.demon.co.uk>
Date:   2020-04-24T07:09:39.000Z

    12581 sockets using cubic congestion control can block
    Reviewed by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
    Reviewed by: Dan McDonald <danmcd@joyent.com>
    Reviewed by: Robert Mustacchi <rm@fingolfin.org>
    Approved by: Garrett D'Amore <garrett@damore.org>

Actions

Also available in: Atom PDF