Project

General

Profile

Bug #10179

Race condition in fifo_open

Added by Martijn Dekker 9 months ago.

Status:
New
Priority:
Urgent
Assignee:
-
Category:
kernel
Start date:
2019-01-05
Due date:
% Done:

0%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage

Description

The following script, that tests named pipes (FIFOs), breaks consistently on all SunOS kernels, including illumos, when executed by ksh93, bash or dash. All it does is make 100 FIFOs and read a line from each -- it should be trivial. If the system has one CPU, or has more than one CPU and is under heavy load, attempts to read from a FIFO block incorrectly. This means FIFOs are fundamentally broken.

#! /bin/sh
# On systems with more than 1 CPU, the bug is only triggered under
# heavy system load, so max all CPUs minus one while this script runs.
numCPUs=$(psrinfo | wc -l)
pids=
i=0; while test "$((i+=1))" -lt "$numCPUs"; do
    while :; do :; done &
    pids="$pids $!" 
done
# Make 100 FIFOs, reading a line from each.
# With the bug, it hangs on 'read' after the first few.
tmpdir=/tmp/FIFOs$$
trap "kill $pids; exec rm -rf $tmpdir" EXIT INT PIPE TERM
mkdir -m700 "$tmpdir" || exit
i=0; while test "$((i+=1))" -le 100; do
    fifo=$tmpdir/FIFO$i
    mkfifo "$fifo" || exit
    echo "this is FIFO $i" >"$fifo" &
    read foo <"$fifo" && echo "$foo" 
done

A bug was filed in the internal Solaris bug tracking system. Apparently the bug exists as of Solaris 2.5 (1995) so it was also inherited by illumos. Further details on the fix are as follows (quoting Casper H.S. Dik):

The function fifo_open() uses cv_wait_sig_swap(); it returns
"0" when it finds that a signal was received and "1" when
no signal was received but when the cv was signalled.

But if both a signal (SIGCHLD in our test case) and a cv was
signalled, it still returns "0" and the code returns EINTR; it
should verify if the condition is actually met and return 0
because otherwise we have consumed the peer.

Generally, it would wakeup the sleeping thread immediately so
you would either wakeup on the signal or on the cv_signal but
if the wakeup is delayed, you get both cv_signal & the signal
and the cv_wait_sig_* then returns that it received a signal.

I'm not sure if that is actually the proper behaviour of
cv_wait_sig*(), but generally a caller of cv_wait*() should
work on the assumption that a wakeup can be spurious and it
should check the condition.

Now generally a system call can return EINTR and it can be
restarted; however in the case of fifo_open() it cannot be
restarted: the peer has been consumed.

Casper

More background info from here :

Looking at the code, it seems it was broken since Solaris 2.5

The history of cv_wait_sig*() makes clear that the intention to return
a signal was received even when the cv_signal() was also seen.
(And that cv_wait_sig*() was changed in Solaris 2.8 to make sure that
the cv_signal() was not consumed when it returned a signal was received.

To be honest, I think this is backward. If the code find that both
a unix signal and a cv_signal have been received, it should pretend
the signal did not happen; cv_wait_sig*() would be called again, anyway,
and the next call would return finally return a unix signal was found.
(Or in the case of fifo_open(), it would just succeed and won't return
EINTR)

But changing cv_wait_sig*() semantics is much more risky than fixing
fifo_open().

Casper

Also available in: Atom PDF