epoll should better detect fd reassignment
Several customers, internal and otherwise, have reported seeing nginx periodically spew errors related to
EPERM. This is a little bit odd, as there isn't much in the way of access checking in the guts of epoll/devpoll itself. That logic does, however, allow errors from
VOP_POLLhandlers to trickle up. With the merging of OS-5941, attempts to epoll on regular files and directories will result in
EPERMis the expectation if a normal file end up in the set, why then is nginx stumbling over the problem. Certainly it wouldn't add a plain fd intentionally, only to complain about it later. Another possibility does exist in this case: An active fd was closed, without first being removed from the set, followed by it being quickly reassigned when nginx went to open a regular file. This touches on one of the wrinkles in the core design of epoll: it expects to act on the underlying
vnode_twould likely be the closes analog) rather than the fd itself. This places a premium in detecting when the resource backing of an fd changes.
I believe the trouble detailed in illumos-joyent#139 is also related to this. In that case, it wasn't
EPERMproblems due to
VREGresources sneaking into the set but rather unexpected successful events for freshly
accept(2)-ed sockets which occupy previously polled fd entries. The framework in question (https://github.com/oktal/pistache/) maintains a list of active sockets which it compares events against. When the
accept(2)succeeds, placing a new resource into that fd, it's possible for the
epoll_wait()to win the race against the framework socket registration code, emitting an event which won't match against the internal list of known sockets.
Updated by Patrick Mooney 3 months ago
I've re-run the epoll test suite to check for unexprected regressions and found none. The behavior in question was from tight races on fd reuse, so reproducing it is a challenge. The patched illumos epoll matches (sans #12910) the SmartOS epoll which functioned well in the face of those same conditions when tested by others.
Updated by Electric Monk 3 months ago
- Status changed from In Progress to Closed
- % Done changed from 0 to 100
commit 086d96878f5f62a25a6d90e5b03a1ef9ba352231 Author: Patrick Mooney <email@example.com> Date: 2020-07-07T14:56:47.000Z 12909 epoll should better detect fd reassignment Portions contributed by: Bryan Cantrill <firstname.lastname@example.org> Portions contributed by: Jerry Jelinek <email@example.com> Reviewed by: Jerry Jelinek <firstname.lastname@example.org> Reviewed by: Dan McDonald <email@example.com> Reviewed by: Jason King <firstname.lastname@example.org> Approved by: Robert Mustacchi <email@example.com>