Feature #7062

Connections remain in TIME_WAIT too long

Added by Robert Mustacchi over 4 years ago. Updated over 4 years ago.

We had a customer who tuned their TCP time wait interval by running:

ndd -set /dev/tcp tcp_time_wait_interval 20000

However, when they run the following command:

netstat -naf inet -P tcp

they see connections sitting in TIME_WAIT longer than the 20-second interval they set. tcp_time_wait_interval can be set from within the local zone, but changing it has not reduced the time that TCP connections spend in TIME_WAIT.

Looking at the mentioned zone I was able to capture the described behavior.

I first confirmed via mdb that tcp_time_wait_interval was set appropriately in their netstack.
Using the following DTrace script, I then gathered data on the actual lifetimes of connections in the TIME_WAIT state:

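fbt::tcp_time_wait_append:entry   /* probe name assumed; missing from the original */
/args[0]->tcp_tcps->tcps_netstack->netstack_stackid == 598/ {
        tw[arg0] = timestamp;
}
fbt::tcp_time_wait_remove:entry   /* probe name and predicate assumed */
/tw[arg0]/ {
        this->time = (timestamp - tw[arg0]) / 1000000;  /* ns to ms */
        tw[arg0] = 0;
        @nsq = lquantize(this->time, 5000, 60000, 5000);
}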

Despite the proper setting, a large number of connections linger longer than the 10-15 seconds that would be expected (values are in milliseconds):

           value  ------------- Distribution ------------- count
          < 5000 |                                         7
            5000 |                                         1
           10000 |@                                        226
           15000 |                                         82
           20000 |@@                                       278
           25000 |@@@@@@@@@@@@                             1985
           30000 |@@@@@@@@@@@@@@@@@@@                      3193
           35000 |@@                                       371
           40000 |@@                                       352
           45000 |@                                        248
           50000 |@                                        110
           55000 |                                         0
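
To put a number on "a number of connections", the bucket counts above can be summed directly. The following is a small sketch of that arithmetic only; the function name share_at_or_above is made up for illustration, and the "< 5000" bucket is keyed as 0 here.

```python
# Bucket lower bounds (ms) and counts copied from the lquantize output above.
# The "< 5000" bucket is stored under key 0 (an illustrative choice).
histogram = {
    0: 7, 5000: 1, 10000: 226, 15000: 82, 20000: 278, 25000: 1985,
    30000: 3193, 35000: 371, 40000: 352, 45000: 248, 50000: 110, 55000: 0,
}


def share_at_or_above(hist, threshold_ms):
    """Fraction of samples in buckets whose lower bound is >= threshold_ms."""
    total = sum(hist.values())
    late = sum(count for bound, count in hist.items() if bound >= threshold_ms)
    return late / total


total = sum(histogram.values())
print(total)                                            # 6853
print(round(share_at_or_above(histogram, 20000), 3))    # 0.954
print(round(share_at_or_above(histogram, 25000), 3))    # 0.913
```

By this count, roughly 95% of the 6853 sampled connections spent at least 20 seconds in TIME_WAIT, and roughly 91% spent at least 25 seconds.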

Assuming that the timer initiating calls to tcp_time_wait_collector is firing at the appropriate 5-second intervals, I went looking for another explanation. I believe it has to do with how connections are appended to the TIME_WAIT queue for later cleanup. This is done on a per-squeue basis, meaning that the connections on a given queue can have differing tcp_time_wait_interval values. Given the current logic used to walk the list, a connection with a short tcp_time_wait_interval has to wait behind one with a long interval.

In order to prevent tenants with longer tcp_time_wait_interval values from delaying connection cleanup for those with a shorter value, the connection traversal logic requires restructuring.
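
The head-of-line effect described above can be illustrated with a toy model. This is only a sketch and not the illumos implementation: it assumes a single shared FIFO queue, a collector that runs every 5 seconds, and a walk that reaps from the head and stops at the first unexpired entry. The simulate function and the connection names are hypothetical.

```python
from collections import deque


def simulate(arrivals, tick_ms=5000, end_ms=120000):
    """Return {name: ms spent in TIME_WAIT} for one shared FIFO queue.

    arrivals: iterable of (name, arrival_ms, interval_ms) tuples.
    Assumption: the collector only inspects the head of the queue.
    """
    pending = deque(sorted(arrivals, key=lambda a: a[1]))
    queue = deque()  # entries are (name, arrival_ms, expire_ms)
    lifetimes = {}
    for now in range(0, end_ms + 1, tick_ms):
        # Append everything that entered TIME_WAIT since the last tick,
        # preserving arrival order.
        while pending and pending[0][1] <= now:
            name, arrival, interval = pending.popleft()
            queue.append((name, arrival, arrival + interval))
        # Reap from the head only; stop at the first unexpired entry.
        while queue and queue[0][2] <= now:
            name, arrival, _ = queue.popleft()
            lifetimes[name] = now - arrival
    return lifetimes


# A 20 s connection on its own is reaped at the first tick after it expires.
alone = simulate([("short", 1000, 20000)])
print(alone["short"])   # 24000

# The same connection queued behind a 60 s one waits for the head to expire.
mixed = simulate([("long", 0, 60000), ("short", 1000, 20000)])
print(mixed["short"])   # 59000
```

In this model the short-interval connection stays queued for 59 seconds instead of 24, purely because of its position in the list. A traversal that did not stop at the first unexpired entry would avoid the delay.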


Updated by Electric Monk over 4 years ago

  • Status changed from New to Closed

git commit 2404c9e6b54f427b32dd0a2d46940d6a4c5299bc

commit  2404c9e6b54f427b32dd0a2d46940d6a4c5299bc
Author: Patrick Mooney
Date:   2016-06-09T20:31:42.000Z

    7062 Connections remain in TIME_WAIT too long
    7061 local TCP connections should be expediently purged from TIME_WAIT
    Reviewed by: Jerry Jelinek
    Reviewed by: Robert Mustacchi
    Reviewed by: Garrett D'Amore
    Approved by: Dan McDonald
