Bug #14098

closed

handle failure when muxing non-device streams

Added by Joshua M. Clulow about 1 month ago. Updated about 1 month ago.

Status: Closed
Priority: Normal
Assignee:
Category: kernel
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

A panic from a process using I_LINK to arrange STREAMS multiplexing:

panic message: BAD TRAP: type=e (#pf Page fault) rp=fffffe0040e6c8a0 addr=20 occurred in module "genunix" due to a NULL pointer dereference
dump content: kernel pages only
> $C
fffffe0040e6c9b0 handle_release+0xf(0)
fffffe0040e6ca20 ldi_munlink_fp+0xd4(fffffe2cc429f930, fffffe30c030a2d0, 1)
fffffe0040e6caf0 munlink+0x2f0(fffffe2cc429f930, fffffe2cba51cc08, 5, fffffe312f312e10, fffffe0040e6cba4, fffffe30c098e3c8)
fffffe0040e6cb60 munlinkall+0x4f(fffffe2cc429f930, 5, fffffe312f312e10, fffffe0040e6cba4, fffffe30c098e3c8)
fffffe0040e6cc00 strclose+0x509(fffffe30aa73b880, 3, fffffe312f312e10)
fffffe0040e6cc80 sotpi_close+0xab(fffffe30892802e8, 3, fffffe312f312e10)
fffffe0040e6ccb0 socket_close_internal+0x18(fffffe30892802e8, 3, fffffe312f312e10)
fffffe0040e6cd30 socket_vop_close+0xf6(fffffe30aa73b880, 3, 1, 0, fffffe312f312e10, 0)
fffffe0040e6cdb0 fop_close+0x66(fffffe30aa73b880, 3, 1, 0, fffffe312f312e10, 0)
fffffe0040e6cdf0 closef+0x68(fffffe312a51bef8)
fffffe0040e6ce30 closeall+0x57(fffffe3095647110)
fffffe0040e6cec0 proc_exit+0x429(1, 0)
fffffe0040e6cee0 exit+0xb(1, 0)
fffffe0040e6cf00 rexit+0x15(0)
fffffe0040e6cf10 sys_syscall+0x1a8()

Related issues

Related to illumos gate - Bug #14133: rlogind: ioctl I_LINK of tcp connection failed (Closed, Rich Lowe)

Actions #1

Updated by Joshua M. Clulow about 1 month ago

Notes from Rich

I_LINK will fail when attempting to link a (device) stream and a non-device file of some kind, such as a socket. We check that the lower stream is a device (in mlink_fp), and handle it correctly if it is not. In the non-persistent I_LINK case, we also end up checking that the upper stream is a device (in ldi_ident_from_stream). If that second check fails, we either trip an ASSERT (and crash immediately), or keep going (which appears to work) and crash later, when we close the file descriptor and fail to unlink the streams.

Conservatively, we should handle errors propagating up from ldi_mlink_fp into mlink_fp and mlink, and return EINVAL when one occurs.
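
The conservative handling described above can be illustrated with a small user-space model. This is only a sketch and not the code from the fix: model_stream_t, model_ldi_mlink and model_mlink are hypothetical names invented for illustration, the inner error value is arbitrary, and the real kernel structures and signatures are considerably more involved. It shows the shape of the change: the inner failure is reported to the caller, the partial link is undone, and EINVAL is returned instead of asserting or carrying on.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a stream; not the kernel's stream head. */
typedef struct model_stream {
	bool ms_is_device;	/* true if the stream is backed by a device */
	bool ms_linked;		/* true while linked underneath a multiplexor */
} model_stream_t;

/*
 * Models the bookkeeping step that needs a device identity for the upper
 * stream. The error value returned for a non-device is arbitrary here.
 */
static int
model_ldi_mlink(const model_stream_t *upper)
{
	return (upper->ms_is_device ? 0 : ENXIO);
}

/*
 * Models the link operation. The lower stream is checked first; a failure
 * from the inner step is then propagated as EINVAL after undoing the
 * tentative link, so no half-linked state is left behind for close to
 * trip over.
 */
int
model_mlink(const model_stream_t *upper, model_stream_t *lower)
{
	if (!lower->ms_is_device)
		return (EINVAL);

	lower->ms_linked = true;	/* tentatively establish the link */

	if (model_ldi_mlink(upper) != 0) {
		lower->ms_linked = false;	/* undo the partial link */
		return (EINVAL);
	}

	return (0);
}
```

In this model a device upper stream links successfully, while a socket-like upper stream gets EINVAL and the lower stream is left unlinked, so a later close has nothing stale to unlink.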

It is possible that muxing in a socket is actually OK, but the comments in ldi_mlink_fp lead us to believe that a non-persistent mux with only a major number to differentiate it (which is what we'd get by using an unbound socket to reach ip(7D), rather than opening the clone device) is potentially a problem.

Actions #2

Updated by Joshua M. Clulow about 1 month ago

Testing Notes

Tested using the modified version of wireguard from Nahum that exposed this bug; it now fails rather than panicking the system. A minimal reproduction was also created to isolate the specific failure.

Also tested using a modified version of wireguard from Nahum that instead does what this bug determines to be correct; it works, and establishes links over the VPN correctly.

Actions #3

Updated by Electric Monk about 1 month ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 489c4c5ee49bb6dca6f69e0c79b358fd79799b73

commit  489c4c5ee49bb6dca6f69e0c79b358fd79799b73
Author: Richard Lowe <richlowe@richlowe.net>
Date:   2021-09-22T18:22:46.000Z

    14098 handle failure when muxing non-device streams
    Reviewed by: Dan McDonald <danmcd@joyent.com>
    Approved by: Joshua M. Clulow <josh@sysmgr.org>

Actions #4

Updated by Joshua M. Clulow 24 days ago

  • Related to Bug #14133: rlogind: ioctl I_LINK of tcp connection failed added