Bug #14788


FDINFO misbehaves in multiple ways

Added by Dan McDonald about 1 month ago. Updated 8 days ago.

Start date:
Due date:
% Done:


Estimated time:
Gerrit CR:


The FDINFO support in procfs needs a few fixes to survive under stress.

1.) procfs should take a reference on the file_t during it pr_getf() routine.

2.) procfs should, in the case of prgetattr() check to see if the held process is exit()-ing or not.
(alternatively, it should use prlock() instead of p_pr_lock(), but checking for exit is all that's strictly needed).

Testing this should confirm-or-deny if #14762 is necessary or not, given that a fix here will eliminate uf_refcnt bumps.

Most of this fix has been discovered and developed by Gordon Ross, but the testing and cleanup was done by Dan McDonald, who misfiled two symptom bugs ( #14685 and #14775 ).


netstat-loop.csh (55 Bytes) netstat-loop.csh Dan McDonald, 2022-07-11 09:05 PM
open-close-loop.c (1.04 KB) open-close-loop.c Dan McDonald, 2022-07-11 09:05 PM
14788-monitor.csh (118 Bytes) 14788-monitor.csh Dan McDonald, 2022-07-11 09:05 PM
full-semantics.txt (1.94 KB) full-semantics.txt Dan McDonald, 2022-07-11 10:01 PM
fail-fast-semantics.txt (2.81 KB) fail-fast-semantics.txt Dan McDonald, 2022-07-11 10:01 PM
14788-dtrace.d (895 Bytes) 14788-dtrace.d Dan McDonald, 2022-07-26 09:10 PM
14788-17hrs-dtrace.txt (84.3 KB) 14788-17hrs-dtrace.txt Dan McDonald, 2022-07-27 02:07 PM
Actions #1

Updated by Electric Monk about 1 month ago

  • Gerrit CR set to 2222
Actions #2

Updated by Dan McDonald 29 days ago

Testing regimen...

One needs:
- A suffciently fast AND sufficiently number of cores machine. I used an 8-way VMware VM on a 6-core (12-if-you-count hyperthreading) Mac Mini.

Run tests from the attachments as follows:
- 6 copies of the attached open-close-loop.c w/o an argument (AF_UNIX)
- 6 copies of open-close-loop.c with 6 different ports (AF_INET/SOCK_DGRAM)
- 12 copies of the netstat-loop.csh file
- The 14788-monitor.csh file on a 35-or-more row terminal window
- The following DTrace invocation:

dtrace -n 'pr_getf:entry {self->trace = 1; } pr_getf:return /self->trace/ { if (arg1 != NULL) @returns[stack(), "pr_getf success"] = count(); else @returns[stack(), "pr_getf fail"] = count(); self->trace = 0;} mutex_vector_tryenter:return / self->trace && arg1 == 0/ {@returns[stack(), "mutex_tryenter fail"] = count(); } tick-30s {printf("\n\n\n\n\n\n\n"); printa(@returns);} '

On a stock kernel, this will induce a panic in minutes.

Actions #3

Updated by Dan McDonald 29 days ago

Non-debug test results from the DTrace script over 3hours or more of runs for both possible solutions (full-semantics & fail-fast-semantics) are attached.

Actions #4

Updated by Dan McDonald 19 days ago

Since last update I've:

- Seen that blocking in pr_getf() (proc->p_lock dropped, but P_PR_LOCK flag is set) can cause deadlock (where one participant is looping in cv_wait() for a condition that will never clear), convincing me that only performing mutex_trylock(f_tlock) is safest. (Multiple kernel dumps to back up this assertion.)

- Added a kernel-global-tunable number of retries on f_tlock. If a retry fails pr_getf() effectively restarts, which can result in either success, another trylock failure (leading to returning NULL or retry, depending on retry count), or that the file table entry was emptied by process thread closing.

- Added DTrace probes to detect number-of-retries for success, and trylock failures including the try number.

Here is output from this D script:

dtrace -n 'pr_getf_trysuccess { @trysuccess[ "success, tries == ", arg1 ] = count(); } pr_getf_tryfail { @tryfailure [ "Failure upon try #", arg1 ] = count(); } pr_getf:return / arg1 != NULL / { @success = count(); } pr_getf:return /arg1 == NULL/ { @failure = count(); } tick-30s { printa(@trysuccess) ; printa(@tryfailure); printa(@success) ; printa(@failure)} '

over a 24-hour (no seizure) period of the same testing regimen above:

  success, tries ==       3     81454
  success, tries ==       2   6091162
  success, tries ==       1      10077964631

  Failure upon try #      3     60548
  Failure upon try #      2    142003
  Failure upon try #      1   6233175



Actions #5

Updated by Dan McDonald 14 days ago

New DTrace probe introduced, and new DTrace script attached. Letting the stress-test run for more hours, and then will provide a very verbose output (including file types and pathnames for mutex_trylock() failures, 1, 2, and 3-time failures).

Actions #6

Updated by Dan McDonald 13 days ago

Attaching a 17-ish hour test with the new DTrace script's output. As mentioned earlier, it's verbose.

Actions #7

Updated by Dan McDonald 8 days ago

By our measurements in 14788-17hrs-dtrace.txt, only 0.5% of pr_getf() failures were caused by 3rd-tryenter failures. And out of all pr_getf() calls, only 0.07% of pr_getf() calls were failures. Multiplying the two percentages together, only 0.0003% of pr_getf() failures were caused by three mutex_tryenter() failures.


Also available in: Atom PDF