Feature #1403

Support deferring TCP accept()s

Added by Theo Schlossnagle about 8 years ago. Updated over 4 years ago.

Status: New
Priority: Normal
Assignee:
Category: kernel
Start date: 2011-08-23
Due date:
% Done: 70%
Estimated time: 50.00 h
Difficulty: Medium
Tags: needs-triage

Description

This is a performance request. Many TCP-based services use protocols in which the client speaks first. A server for such a protocol first calls accept() and then read(), which either blocks or returns EAGAIN because no data is available yet.
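To make the cost concrete, here is a minimal sketch of the pattern described above, run entirely over loopback. The function name `read_right_after_accept` is made up for this illustration; the socket calls are the standard Python `socket` API. The client connects but sends nothing, so the server's first non-blocking read has nothing to return.

```python
import socket

def read_right_after_accept():
    """Accept a connection from a silent client, then try a non-blocking read."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    srv.listen(1)
    srv.settimeout(5)               # safety net so the sketch cannot hang

    # The client completes the handshake but does not speak yet.
    cli = socket.create_connection(srv.getsockname())

    conn, _ = srv.accept()          # returns as soon as the handshake is done
    conn.setblocking(False)
    try:
        conn.recv(1024)             # wasted call: nothing has arrived
        outcome = "data"
    except BlockingIOError:         # Python's mapping of EAGAIN / EWOULDBLOCK
        outcome = "EAGAIN"
    finally:
        conn.close()
        cli.close()
        srv.close()
    return outcome
```

The server has now spent an accept() and a read() without making progress, and still has to wait for readability before it can do any work.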

Deferred accept() lets the application ask the kernel to percolate the accept() return (or readability of the listening socket, in the case of a non-blocking accept) only once the connection has data available for reading.

Linux uses the TCP_DEFER_ACCEPT socket option, set with setsockopt(), to toggle this behavior. FreeBSD provides a more flexible mechanism: kernel modules can "filter" the accept by analyzing the TCP payload in-kernel. This allows for things like deferring the return of the accept() system call until a full HTTP request is detected and readable.
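For reference, this is roughly how an application opts in on Linux today. It is a sketch, not a statement about how illumos would expose the option: `make_listener` and its parameters are hypothetical names, and the code assumes the platform's Python build exposes `socket.TCP_DEFER_ACCEPT` (it falls back to a plain listener otherwise).

```python
import socket

def make_listener(host="127.0.0.1", port=0, defer_seconds=5):
    """Create a listening TCP socket, enabling deferred accept where available.

    Returns (socket, deferred), where deferred says whether the option was set.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

    deferred = False
    opt = getattr(socket, "TCP_DEFER_ACCEPT", None)  # absent on many platforms
    if opt is not None:
        try:
            # On Linux the value is a timeout in seconds: how long the kernel
            # holds a data-less connection before giving up on the deferral.
            srv.setsockopt(socket.IPPROTO_TCP, opt, defer_seconds)
            deferred = True
        except OSError:
            pass  # option known to Python but rejected by this kernel

    srv.bind((host, port))
    srv.listen(16)
    return srv, deferred
```

With the option set, accept() on this socket should not return a connection until the client has sent its first bytes, so the read that follows can be expected to find data.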

This request is to implement a FreeBSD-like pluggable accept-deferral mechanism that supports a "dataready" filter (meaning some data is available) and a proof-of-concept HTTP accept filter. It also covers the TCP_DEFER_ACCEPT setsockopt() capability (turning on the "dataready" accept filter).
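The two filters differ only in the condition they wait for. The user-space sketch below illustrates those conditions as predicates over the bytes received so far. It is not kernel code and not the proposed interface: the function names are invented here, and treating "a full HTTP request" as "the header terminator has arrived" is a simplifying assumption.

```python
def dataready(buf: bytes) -> bool:
    """'dataready' condition: at least one byte of payload has arrived."""
    return len(buf) > 0

def http_request_ready(buf: bytes) -> bool:
    """Simplified HTTP condition: the blank line ending the headers is present.

    Assumption for illustration only; a real filter would also need to handle
    request bodies, malformed input, and a bound on how much it buffers.
    """
    return b"\r\n\r\n" in buf

# A request that has only partly arrived satisfies the first condition
# but not the second.
partial = b"GET / HTTP/1.1\r\nHost: example\r\n"
complete = partial + b"\r\n"
```

Under these assumptions, a "dataready" filter would release the connection to accept() as soon as `partial` arrives, while an HTTP filter would hold it until `complete` is buffered.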

History

#1

Updated by Dan McDonald over 7 years ago

  • Assignee set to Dan McDonald
  • % Done changed from 0 to 70
  • Difficulty changed from Hard to Medium

Theo and I sussed out that the socket filtering infrastructure was already in place. Here's a webrev:

http://kebe.com/~danmcd/webrevs/sockf/

It's in the review-and-fine-tuning stage.

#2

Updated by Ryan Zezeski almost 5 years ago

I've rebased this webrev against latest master and incorporated changes made in illumos-omnios.

http://zinascii.com/pub/illumos/webrevs/1403/

I'm currently in the process of writing tests and reviewing.

#3

Updated by Garrett D'Amore almost 5 years ago

I'm curious if any performance measurements have been made. It seems like it would save at most about one system call per connection. For workloads that are entirely connect/accept driven (short-lived HTTP) that might make a difference, but I thought a lot of work had already been done to minimize the impact of this (at the HTTP protocol level, by connection reuse, etc.).

#4

Updated by Garrett D'Amore over 4 years ago

More justification written here:

http://www.techrepublic.com/article/take-advantage-of-tcp-ip-options-to-optimize-data-transmission/

In particular, the implication for the 3-way handshake, avoiding one extra packet transmission, is important. This wasn't clear from my first reading of the description above. Still, this only affects connection setup, but I can see how on busy webservers it can make a substantial difference.
