Support deferring TCP accept()s
This is a performance request. Many TCP-based services use protocols in which the client speaks first. A server must therefore first accept() the connection and then call read(), which either blocks or returns EAGAIN because no data is available yet.
Deferred accept() allows the application to request that the kernel percolate the accept() return (or readability of the listening socket, in the case of a non-blocking accept) only when the connection has data available for reading.
Linux uses the TCP_DEFER_ACCEPT setsockopt option to toggle this behavior. FreeBSD provides a more flexible method of controlling this: kernel modules can "filter" the accept by analyzing the TCP payload in-kernel. This allows for things like deferring the return of the accept() system call until a full HTTP request is detected and readable.
This request is to implement a FreeBSD-like pluggable accept-defer mechanism that supports "dataready" (meaning some data is available), plus a proof-of-concept HTTP accept filter. It should also provide the TCP_DEFER_ACCEPT setsockopt capability (turning on the "dataready" accept filter).
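For reference, here is a rough sketch of how an application turns on the existing Linux and FreeBSD interfaces described above. The helper name enable_deferred_accept() and its timeout_secs parameter are made up for illustration. This shows the calling side only and is not the proposed implementation.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/*
 * Hypothetical helper: ask the kernel to hold back accept() on a
 * listening socket until the client has sent some data.
 * Call it after listen(). Returns 0 on success, -1 with errno set.
 */
int
enable_deferred_accept(int listen_fd, int timeout_secs)
{
#if defined(TCP_DEFER_ACCEPT)
	/* Linux: the option value is a time bound in seconds. */
	return (setsockopt(listen_fd, IPPROTO_TCP, TCP_DEFER_ACCEPT,
	    &timeout_secs, sizeof (timeout_secs)));
#elif defined(SO_ACCEPTFILTER)
	/* FreeBSD: attach the "dataready" accept filter by name. */
	struct accept_filter_arg afa;

	(void) timeout_secs;	/* not used by this interface */
	memset(&afa, 0, sizeof (afa));
	strcpy(afa.af_name, "dataready");
	return (setsockopt(listen_fd, SOL_SOCKET, SO_ACCEPTFILTER,
	    &afa, sizeof (afa)));
#else
	/* Neither interface is available on this platform. */
	(void) listen_fd;
	(void) timeout_secs;
	errno = ENOTSUP;
	return (-1);
#endif
}
```

A server would call this once on its listening socket. On Linux the deferral is bounded by the timeout, so the first read() after accept() should still be prepared to handle EAGAIN or to block.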
Updated by Dan McDonald almost 8 years ago
- Assignee set to Dan McDonald
- % Done changed from 0 to 70
- Difficulty changed from Hard to Medium
Theo and I sussed out that the socket filtering infrastructure was already in place. Here's a webrev:
It's in the review-and-fine-tuning stage.
Updated by Garrett D'Amore about 5 years ago
I'm curious if any performance measurements have been made. It seems like it would save at most about one system call per connection. For workloads that are entirely connect/accept driven (short-lived HTTP) that might make a difference, but I thought a lot of work had already been done to minimize the impact of this (at the HTTP protocol level, by connection reuse, etc.).
Updated by Garrett D'Amore almost 5 years ago
More justification written here:
In particular, the implication for the 3-way handshake, avoiding one extra packet transmission, is important. This wasn't clear from my first reading of the description above. It still only affects connection setup, but I can see how on busy webservers it can make a substantial difference.