Bug #7749

setsockopt(TCP_KEEPCNT) can return EINVAL spuriously

Added by Robert Mustacchi over 4 years ago. Updated over 4 years ago.

Status:
Closed
Priority:
Normal
Category:
networking
Start date:
2017-01-09
Due date:
% Done:
100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

In illumos, TCP keep-alive aborts a connection after TCP_KEEPALIVE_ABORT_THRESHOLD milliseconds of unresponsiveness to a keep-alive probe, and probing itself starts after TCP_KEEPALIVE_THRESHOLD milliseconds of idle time. The interval we use for the keep-alive probes is an implementation detail: it defaults to the RTO and is doubled until it reaches the maximum RTO. In Linux, however, this interval is set explicitly: after TCP_KEEPIDLE idle seconds, the system sends TCP_KEEPCNT probes at a spacing of TCP_KEEPINTVL seconds. As soon as an ACK is received, the TCP_KEEPIDLE timer restarts (that is, TCP_KEEPINTVL is not further considered until TCP_KEEPIDLE idle seconds have again elapsed).

The problem is that TCP_KEEPCNT and TCP_KEEPINTVL are set by separate setsockopt() calls rather than at once, and we need to somehow deal with the intermediate state (or what should be the intermediate state, anyway) where one has been set but the other is unset. We basically do the best we can by considering the abort threshold to be the product of the two values, and inferring one from the other.

That is, if we're setting the interval and haven't yet set the count, we'll divide the (default) abort threshold by the specified interval to derive a count. Likewise, if we set the count and haven't yet set the interval, we'll assume the product to be the abort threshold and divide it by the specified count to derive an interval. But here's the problem: in this latter case, we check the derived interval against the maximum RTO, returning EINVAL if we find that the interval exceeds the maximum RTO. This is problematic because with the default abort threshold of 480 seconds and the default maximum RTO of 60 seconds, any count less than 8 will result in EINVAL, even if the program was about to set an interval that would make the count entirely valid.
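
The arithmetic behind that failure can be sketched in user space. This is a simplified model rather than the illumos kernel code: the function name derive_interval_strict() and the two macros are hypothetical, and the constants are the defaults quoted above (480-second abort threshold, 60-second maximum RTO), expressed in milliseconds.

```c
#include <errno.h>

/* Assumed defaults from the description above, in milliseconds. */
#define	ABORT_THRESHOLD_MS	(480 * 1000)
#define	MAX_RTO_MS		(60 * 1000)

/*
 * Hypothetical model of setting TCP_KEEPCNT before TCP_KEEPINTVL:
 * derive the interval from the abort threshold and reject it if it
 * exceeds the maximum RTO. Returns 0 on success, EINVAL otherwise.
 */
int
derive_interval_strict(int count, int *interval_ms)
{
	int derived;

	if (count <= 0)
		return (EINVAL);

	derived = ABORT_THRESHOLD_MS / count;

	/* The problematic check: the caller's count is rejected outright. */
	if (derived > MAX_RTO_MS)
		return (EINVAL);

	*interval_ms = derived;
	return (0);
}
```

Under these assumptions, a count of 3 derives an interval of 160 seconds and is rejected, while a count of 8 derives exactly 60 seconds and is accepted.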

Here's a program that demonstrates this:

#include <stdarg.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>

#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

#include <netinet/in.h>
#include <netinet/tcp.h>

char *g_cmd = "kacnt";

static void
fatal(char *fmt, ...)
{
    va_list ap;
    int error = errno;

    va_start(ap, fmt);

    (void) fprintf(stderr, "%s: ", g_cmd);
    (void) vfprintf(stderr, fmt, ap);
    va_end(ap);

    if (fmt[strlen(fmt) - 1] != '\n')
        (void) fprintf(stderr, ": %s\n", strerror(error));

    exit(EXIT_FAILURE);
}

int
main(void)
{
    struct protoent *pp;
    int keepalive = 1;
    int keepidle = 60;
    int keepcnt = 3;
    int keepintvl = 5;
    int sock, p, sz = sizeof (int);

    if ((pp = getprotobyname("tcp")) == NULL)
        fatal("couldn't find 'tcp'");

    if ((sock = socket(PF_INET, SOCK_STREAM, p = pp->p_proto)) < 0)
        fatal("couldn't create socket");

    if (setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &keepalive, sz) != 0)
        fatal("couldn't set SO_KEEPALIVE");

    if (setsockopt(sock, p, TCP_KEEPIDLE, &keepidle, sz) != 0)
        fatal("couldn't set TCP_KEEPIDLE");

    if (setsockopt(sock, p, TCP_KEEPCNT, &keepcnt, sz) != 0)
        fatal("couldn't set TCP_KEEPCNT");

    if (setsockopt(sock, p, TCP_KEEPINTVL, &keepintvl, sz) != 0)
        fatal("couldn't set TCP_KEEPINTVL");

    printf("sock is %d\n", sock);
    return (0);
}

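This program spuriously fails with EINVAL at the TCP_KEEPCNT call: with the default abort threshold of 480 seconds, a count of 3 yields a derived interval of 160 seconds, which exceeds the default maximum RTO of 60 seconds, even though the program is about to set a perfectly valid 5-second interval. Fortunately, in this case we track the actual intent of the user (namely, the specified count), so it's actually adequate to simply clamp the derived interval at the maximum RTO when TCP_KEEPCNT is set before TCP_KEEPINTVL.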
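
The clamping approach can be sketched the same way. Again, this is a user-space illustration under assumed defaults, not the code from the actual commit: derive_interval_clamped() and the macros are hypothetical names, and the constants are the default values quoted in the description.

```c
#include <errno.h>

/* Assumed defaults from the description above, in milliseconds. */
#define	ABORT_THRESHOLD_MS	(480 * 1000)
#define	MAX_RTO_MS		(60 * 1000)

/*
 * Hypothetical model of the fix: when the count is set before the
 * interval, clamp the derived interval at the maximum RTO instead of
 * failing. The caller's count is kept as specified.
 */
int
derive_interval_clamped(int count, int *interval_ms)
{
	int derived;

	if (count <= 0)
		return (EINVAL);

	derived = ABORT_THRESHOLD_MS / count;

	/* Clamp rather than reject; a later TCP_KEEPINTVL can override. */
	if (derived > MAX_RTO_MS)
		derived = MAX_RTO_MS;

	*interval_ms = derived;
	return (0);
}
```

With a count of 3, this model succeeds and leaves a provisional interval of 60 seconds in place until the program sets its own interval.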

Actions #1

Updated by Electric Monk over 4 years ago

  • Status changed from New to Closed

git commit a41f965a2f911f4f56617a2e6ceaeef4e1c58e70

commit  a41f965a2f911f4f56617a2e6ceaeef4e1c58e70
Author: Bryan Cantrill <bryan@joyent.com>
Date:   2017-01-30T18:32:33.000Z

    7749 setsockopt(TCP_KEEPCNT) can return EINVAL spuriously
    Reviewed by: Dave Pacheco <dap@joyent.com>
    Reviewed by: Dan McDonald <danmcd@omniti.com>
    Approved by: Richard Lowe <richlowe@richlowe.net>
