Bug #5397
closedpfiles slow with many open sockets
100%
Description
pfiles(1) takes a long time to run against a process that has many open sockets.
Using the following node.js code we can simulate this problem by opening 1000 sockets.
#!/usr/bin/env node var net = require('net'); var num = +process.argv[2] || 1000; for (var i = 0; i < num; i++) net.connect(80, 'daveeddy.com'); console.log('opened %d sockets', num);
dave - sjc1-dave-01 sunos ~/pfiles $ ./open-sockets.js & [1] 75375 dave - sjc1-dave-01 sunos ~/pfiles $ opened 1000 sockets ptime pfiles 75375 > /dev/null real 45.522214487 user 0.334410765 sys 44.691393983
Here we can see that it takes 45 seconds to pfiles(1) a process with 1000 open sockets... almost all of which is spent in the kernel.
Using DTrace we can see that a majority of the time is spent in readlink(2).
dave - sjc1-dave-01 sunos ~/pfiles $ node open-sockets.js & [1] 81200 dave - sjc1-dave-01 sunos ~/pfiles $ opened 1000 sockets sudo dtrace -n 'syscall:::entry /pid == $target/ {self->ts = timestamp} syscall:::return /self->ts/ { @[probefunc] = sum(timestamp - self->ts); self->ts = 0}' -c 'pfiles 81200' ... pfiles output snipped ... privsys 3371 sigpending 4520 zone 6441 ioctl 7535 fstat64 8560 systeminfo 9087 openat 12413 llseek 13445 getpid 14175 sysconfig 15086 getrlimit 15339 lseek 18407 sysi86 19677 access 21017 lstat 21465 sigaction 23405 setcontext 23453 unlink 29045 lwp_cond_broadcast 48491 schedctl 65762 fstat 131493 resolvepath 136590 getdents 144807 close 147894 stat 157936 mmap 161735 fcntl 192865 open 421342 read 554952 memcntl 567726 mmapobj 746945 brk 1116175 exece 1203990 lwp_sigmask 88373188 writev 273709103 pwrite 355720528 write 842822775 pread 1979864413 readlink 43978668288
Almost the entire time is spent blocked in the kernel from readlink(2).
Grabbing userland stacks at the time of readlink(2) entry reveals:
8 14722 readlink:entry libc.so.1`__systemcall6+0x1e libproc.so.1`load_fdinfo+0x17c libproc.so.1`Pfdinfo_iter+0x32 pfiles`show_files+0x4b pfiles`main+0x1b6 pfiles`_start+0x6c
Profiling the kernel during that time we can see this flamegraph (flamegraph attached also)
https://us-east.manta.joyent.com/bahamas10/public/pfiles-kernel-sample.html
And, as rmustacc anticipated, it is blocked in dnlc_reverse_lookup
. This will have to do a linear scan of the entire table as there is no v_path
associated with this vnode
.
To test avoiding readlink(2) with pfiles(1) I custom compiled a shared object to LD_PRELOAD
in that circumvents readlink(2) entirely.
#include <stdlib.h> #include <errno.h> size_t readlink(const char *path, char *buf, size_t bufsize) { errno = ENOSYS; return -1; }
$ cc -shared -fPIC -o readlink.so readlink.c $ node open-1k-sockets.js & [1] 88758 $ LD_PRELOAD=./readlink.so /usr/bin/amd64/ptime /usr/bin/amd64/pfiles 88758 ... 1000 open sockets ... real 4.061913621 user 0.354056032 sys 2.692370925
As we can see, this severely (~1/11) decreased the time it takes for pfiles(1) to run against a process with a large number of open sockets.
On to the actual solution.
I've attached a patch which I believe (untested, still trying to figure out how to do that) circumvents the readlink(2) calls for both doors and sockets.
richlowe in #illumos also suggested that those links shouldn't be present in /proc/<pid>/path, because they are broken by design.
Thanks to rmustacc, wesolows, papertigers, and richlowe in #illumos.
raw unsorted data: https://gist.github.com/bahamas10/98bd876ece6ad3abce81
Files
Updated by Dave Eddy over 8 years ago
tested and working (on SmartOS) following the instructions here: https://wiki.smartos.org/display/DOC/Building+SmartOS+on+SmartOS
It works as expected.
Updated by Richard PALO over 8 years ago
Dave Eddy wrote:
tested and working (on SmartOS) following the instructions here: https://wiki.smartos.org/display/DOC/Building+SmartOS+on+SmartOS
It works as expected.
This sounds reasonable, although Cstyle may prefer to have the multiple case labels aligned like:
switch (info->pr_mode & S_IFMT) { case S_IFSOCK: case S_IFDOOR: /* not applicable */ len = -1; break; default: ...
Updated by Dave Eddy over 8 years ago
Richard PALO wrote:
Dave Eddy wrote:
tested and working (on SmartOS) following the instructions here: https://wiki.smartos.org/display/DOC/Building+SmartOS+on+SmartOS
It works as expected.
This sounds reasonable, although Cstyle may prefer to have the multiple case labels aligned like:
[...]
Thanks Richard, I've changed the style of the case statement to match the style guide.
Updated by Dave Eddy over 8 years ago
- File 5397-pfiles.tar.gz 5397-pfiles.tar.gz added
Created webrev (also attached)
http://us-east.manta.joyent.com/bahamas10/public/webrev/5397-pfiles/index.html
Updated by Electric Monk over 8 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit d907f8b938aec9d8b57fdb15c241b98641b8b052
commit d907f8b938aec9d8b57fdb15c241b98641b8b052 Author: Dave Eddy <dave@daveeddy.com> Date: 2015-02-26T02:23:10.000Z 5397 pfiles slow with many open sockets Reviewed by: Robert Mustacchi <rm@joyent.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Garrett D'Amore <garrett@damore.org>