Bug #12316
hald_runner dies getting SIGPIPE
100%
Description
Perform the following actions:
1) Mount a USB stick in MATE
2) Unmount it from the CLI (umount)
If you are unlucky enough, hald-runner dies, and SMF doesn't notice. If you run hald with --daemon=no --verbose=yes, you can see either
23:57:49.342 [I] osspec.c:213: mnttab event
Run started hal-storage-cleanup-mountpoint (0) (1)
! full path is '/usr/lib/hal/hal-storage-cleanup-mountpoint', program_dir is '/usr/lib/hal'
2560: XYA attempting to get lock on /media/.hal-mtab-lock
2560: XYA got lock on /media/.hal-mtab-lock
in hal-storage-cleanup-mountpoint for mount point '/media/Ubuntu 18.04.2 LTS amd64'
hal_mtab = '/dev/dsk/c0t0d0p0 101 0 hsfs nosuid /media/Ubuntu 18.04.2 LTS amd64 '
line = '/dev/dsk/c0t0d0p0 101 0 hsfs nosuid /media/Ubuntu 18.04.2 LTS amd64'
devfile = '/dev/dsk/c0t0d0p0'
uid = '101'
session id = '0'
fs = 'hsfs'
options = 'nosuid'
mount_point = '/media/Ubuntu 18.04.2 LTS amd64'
Found entry for mount point '/media/Ubuntu 18.04.2 LTS amd64' in /media/.hal-mtab2560: XYA released lock on /media/.hal-mtab-lock
*** [DIE] hald_runner.c:runner_died():168 : Runner died
when you are unlucky or
23:56:24.220 [I] osspec.c:213: mnttab event
Run started hal-storage-cleanup-mountpoint (0) (1)
! full path is '/usr/lib/hal/hal-storage-cleanup-mountpoint', program_dir is '/usr/lib/hal'
2494: XYA attempting to get lock on /media/.hal-mtab-lock
2494: XYA got lock on /media/.hal-mtab-lock
in hal-storage-cleanup-mountpoint for mount point '/media/Ubuntu 18.04.2 LTS amd64'
hal_mtab = '/dev/dsk/c0t0d0p0 101 0 hsfs nosuid /media/Ubuntu 18.04.2 LTS amd64 '
line = '/dev/dsk/c0t0d0p0 101 0 hsfs nosuid /media/Ubuntu 18.04.2 LTS amd64'
devfile = '/dev/dsk/c0t0d0p0'
uid = '101'
session id = '0'
fs = 'hsfs'
options = 'nosuid'
mount_point = '/media/Ubuntu 18.04.2 LTS amd64'
Found entry for mount point '/media/Ubuntu 18.04.2 LTS amd64' in /media/.hal-mtab2494: XYA released lock on /media/.hal-mtab-lock
pid 2494: rc=0 signaled=0: /usr/lib/hal/hal-storage-cleanup-mountpoint
23:56:24.276 [I] devinfo_storage.c:1436: Cleaned up mount point '/media/Ubuntu 18.04.2 LTS amd64'
if you are lucky.
truss shows the following pattern:
hald-runner spawns /usr/lib/hal/hal-storage-cleanup-mountpoint and writes '\n' to its stdin. If it's unlucky and hal-storage-cleanup-mountpoint has already finished its work, it gets SIGPIPE.
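The pattern above can be made deterministic in a small sketch that uses plain POSIX calls instead of glib. This is an illustration under stated assumptions, not hald-runner's actual code: the names runner() and signal_that_killed_runner() are hypothetical, the helper is simulated by a child that exits without reading its pipe, and waitpid() stands in for the "unlucky" timing where the helper has already finished.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for hald-runner: spawn a helper that exits without
 * reading its stdin pipe, then write "\n" to that pipe.
 */
static void
runner(void)
{
    int fds[2];
    pid_t helper;

    signal(SIGPIPE, SIG_DFL);       /* default action: terminate the process */
    if (pipe(fds) != 0)
        _exit(2);

    helper = fork();
    if (helper < 0)
        _exit(2);
    if (helper == 0)
        _exit(0);                   /* helper: done before anyone writes */

    close(fds[0]);                  /* runner keeps only the write end */
    waitpid(helper, NULL, 0);       /* helper is gone: no reader remains */

    if (write(fds[1], "\n", 1) != 1)    /* SIGPIPE is raised here */
        _exit(1);
    _exit(1);                       /* reached only if SIGPIPE did not kill us */
}

/*
 * Runs runner() in a child process and returns the number of the signal
 * that terminated it, 0 if it exited normally, or -1 on error.
 */
int
signal_that_killed_runner(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return (-1);
    if (pid == 0)
        runner();                   /* does not return */
    if (waitpid(pid, &status, 0) != pid)
        return (-1);
    return (WIFSIGNALED(status) ? WTERMSIG(status) : 0);
}
```

Calling signal_that_killed_runner() from a main() and comparing the result with SIGPIPE shows the same outcome as the "Runner died" log: the writer is terminated by the signal rather than seeing a failed write().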
Updated by Alexander Pyhalov about 1 year ago
A simple test case that reproduces the hal behavior (with both glib 2.62.4 and glib 2.58.3). The sleep() makes the failure more reproducible by giving the child time to exit before the write.
#include <glib.h>
#include <unistd.h>
#include <stdio.h>

int
main(void)
{
    GError *error = NULL;
    char *argv[2];
    gint stdin_v;   /* write end of the child's stdin pipe */
    GPid pid;

    argv[0] = "/usr/bin/true";
    argv[1] = NULL;

    if (g_spawn_async_with_pipes("/", argv, NULL, G_SPAWN_DO_NOT_REAP_CHILD,
        NULL, NULL, &pid, &stdin_v, NULL, NULL, &error)) {
        /* Let the child exit first, so the write hits a pipe with no reader. */
        sleep(10);
        // if (write(stdin_v, "", 0) != 0) // Fixed behavior
        if (write(stdin_v, "\n", 1) != 1)
            printf("Warning: Error while writing r->input (%s) to stdin_v.\n", "\n");
        close(stdin_v);
    }
    return 0;
}
Updated by Alexander Pyhalov about 1 year ago
SIGPIPE in the test case seems to be the intended behavior: it is consistent across Linux and OI.
Updated by Alexander Pyhalov about 1 year ago
Suggested fix: http://buildzone.oi-build.r61.net/webrev-12316/
I've tested this by mounting and unmounting USB disks and watching hal's behavior.
After the fix I see:
child (/usr/lib/hal/hal-storage-cleanup-mountpoint):
...
1864: rmdir("/media/Ubuntu 18.04.2 LTS amd64") = 0
1864: rename("/media/.hal-mtab~", "/media/.hal-mtab") = 0
1864: fcntl(3, F_SETLK, 0x08040DDC) = 0
1864: close(3) = 0
1864: getpid() = 1864 [1706]
1864: write(1, " F o u n d e n t r y ".., 130) = 130
1864: _exit(0)
parent (hald-runner):
...
1706/1: write(7, 0x084E6F50, 0) = 0
1706/1: close(7) = 0
1706/1: waitid(P_PID, 1864, 0x08039260, WEXITED|WTRAPPED|WNOHANG) = 0
1706/1: pollsys(0x084E3EA8, 2, 0x08039578, 0x00000000) = 0
1706/1: write(1, " p i d 1 8 6 4 : r c".., 71) = 71
1706/1: fstat(8, 0x080393F8) = 0
1706/1: fcntl(8, F_GETFL) = 2
1706/1: read(8, 0x084F0DD8, 1024) = 0
1706/1: close(8) = 0
Before the fix, this write to the pipe, 1706/1: write(7,), could fail (and did fail in half of the cases). After the fix it doesn't, because it is a 0-byte write.
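The difference between the two writes can be checked in isolation with a short sketch. It assumes nothing about hald-runner itself: write_to_readerless_pipe() is a hypothetical helper, and SIGPIPE is ignored only so that the failure is visible as an errno value instead of killing the caller. POSIX leaves the result of a zero-length write() on a non-regular file unspecified, so the zero-byte result below is observed behavior (it matches the truss output above and Linux), not a portable guarantee.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Writes len bytes (0 or 1) to a pipe whose read end is already closed.
 * SIGPIPE is ignored for the duration, so a failed write reports EPIPE
 * through errno instead of terminating the process.
 * Returns the result of write(), or -2 on a setup error.
 */
ssize_t
write_to_readerless_pipe(size_t len)
{
    int fds[2];
    int saved_errno;
    ssize_t rc;
    void (*old)(int);

    if (len > 1 || pipe(fds) != 0)
        return (-2);

    old = signal(SIGPIPE, SIG_IGN);
    close(fds[0]);                  /* no reader is left on the pipe */

    rc = write(fds[1], "\n", len);
    saved_errno = errno;            /* keep write()'s errno across cleanup */

    close(fds[1]);
    signal(SIGPIPE, old);
    errno = saved_errno;
    return (rc);
}
```

With this helper, write_to_readerless_pipe(1) is expected to return -1 with errno set to EPIPE, while write_to_readerless_pipe(0) returns 0 on the systems mentioned above, which is why the zero-byte write no longer trips over a helper that has already exited.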
Updated by Electric Monk about 1 year ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit b877e47f88a401dbef6fff48940d38855c01fcbc
commit b877e47f88a401dbef6fff48940d38855c01fcbc
Author: Alexander Pyhalov <apyhalov@gmail.com>
Date: 2020-02-21T19:01:41.000Z

    12316 hald_runner dies getting SIGPIPE
    Reviewed by: Toomas Soome <tsoome@me.com>
    Approved by: Dan McDonald <danmcd@joyent.com>