Bug #14971


`mac_fini_ops` can lead to NULL-pointer dereferences

Added by Benjamin Naecker 3 months ago. Updated 3 months ago.


It's currently possible for mac_fini_ops to cause a NULL-pointer dereference. That routine calls dld_fini_ops unconditionally, and dld_fini_ops dereferences a pointer, specifically the dev_ops->devo_cb_ops->cb_str field. Non-STREAMS drivers are supposed to set this field to NULL, per the man page for cb_ops(9S). In the normal case, dld_init_ops sets the pointer to newly allocated memory. However, mac_init_ops may not actually call dld_init_ops: when ddi_name_to_major returns DDI_MAJOR_T_NONE, mac_init_ops returns early and never calls dld_init_ops. The memory is therefore never allocated, and when dld_fini_ops is later called, it dereferences a NULL pointer.

To be safe, mac_fini_ops could check that the pointer that will be dereferenced in dld_fini_ops is actually non-NULL, and only call dld_fini_ops when the pointer is valid.
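
As a rough illustration of that guard, here is a minimal user-space sketch, not the actual kernel change. The structures are cut down to the single field involved. The major_known parameter is a hypothetical stand-in for the ddi_name_to_major lookup, and the bodies of dld_init_ops and dld_fini_ops are simplified assumptions that only model "allocate" and "dereference and free".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel types; only cb_str matters here. */
struct streamtab { int st_placeholder; };
struct cb_ops { struct streamtab *cb_str; };
struct dev_ops { struct cb_ops *devo_cb_ops; };

/* Counts calls to the modelled dld_fini_ops, for demonstration only. */
static int dld_fini_calls;

/* Model of dld_init_ops: allocates memory and points cb_str at it. */
static void
dld_init_ops(struct dev_ops *ops)
{
	ops->devo_cb_ops->cb_str = calloc(1, sizeof (struct streamtab));
}

/* Model of dld_fini_ops: requires cb_str to be non-NULL. */
static void
dld_fini_ops(struct dev_ops *ops)
{
	struct streamtab *stp = ops->devo_cb_ops->cb_str;

	assert(stp != NULL);	/* in the kernel this would be the page fault */
	free(stp);
	ops->devo_cb_ops->cb_str = NULL;
	dld_fini_calls++;
}

/*
 * Model of mac_init_ops. major_known == 0 (hypothetical parameter) models
 * ddi_name_to_major returning DDI_MAJOR_T_NONE: return early, leaving
 * cb_str untouched.
 */
void
mac_init_ops(struct dev_ops *ops, int major_known)
{
	if (!major_known)
		return;
	dld_init_ops(ops);
}

/* Guarded teardown: only call dld_fini_ops when cb_str was set up. */
void
mac_fini_ops(struct dev_ops *ops)
{
	if (ops->devo_cb_ops->cb_str != NULL)
		dld_fini_ops(ops);
}
```

With the guard in place, calling mac_fini_ops after an early return from mac_init_ops is a no-op in this model, while the normal init/fini pairing still releases the allocation.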

Here is some mdb output showing a stack trace and the issue itself:

> ::status
debugging crash dump vmcore.7 (64-bit) from feldspar
operating system: 5.11 helios-1.0.21180 (i86pc)
build version: remotes/ci-build/xde-0-gd365ee6a2b

image uuid: 48e1aecb-c5ea-6258-b020-d8e0e3948f39
panic message: BAD TRAP: type=e (#pf Page fault) rp=fffffe0083329b70 addr=0 occurred in module "dld" due to a NULL pointer dereference
dump content: kernel pages only
> $C
fffffe0083329c90 dld_fini_ops+0x25(ffffffffc00fe0a8)
fffffe0083329cd0 stubs_common_code+0x59()
fffffe0083329cf0 mac_fini_ops+0xe(ffffffffc00fe0a8)
fffffe0083329d80 _init+0x10e()
fffffe0083329dc0 modinstall+0x92(fffffe59e67dfaa0)
fffffe0083329e20 mod_hold_installed_mod+0x77(fffffe59e94e4580, 0, 0, fffffe0083329e30)
fffffe0083329e80 modctl_modload+0xb8(0, fffffc7fffdf2040, fffffc7fffdf203c)
fffffe0083329f00 modctl+0x36e(0, 0, fffffc7fffdf2040, fffffc7fffdf203c, fffffc7fe2c28690, 0)
fffffe0083329f10 sys_syscall+0x17d()
> ffffffffc00fe0a8::print -t struct dev_ops
struct dev_ops {
    int devo_rev = 0x4
    int devo_refcnt = 0
    int (*)() devo_getinfo = nodev_getinfo
    int (*)() devo_identify = nulldev_identify
    int (*)() devo_probe = nulldev_identify
    int (*)() devo_attach = xde_attach
    int (*)() devo_detach = xde_detach
    int (*)() devo_reset = nodev_reset
    struct cb_ops *devo_cb_ops = xde_cb_ops
    struct bus_ops *devo_bus_ops = 0
    int (*)() devo_power = nodev_power
    int (*)() devo_quiesce = ddi_quiesce_not_needed
> xde_cb_ops::print -t struct cb_ops
struct cb_ops {
    int (*)() cb_open = nulldev_open
    int (*)() cb_close = nulldev_close
    int (*)() cb_strategy = nodev
    int (*)() cb_print = nodev
    int (*)() cb_dump = nodev
    int (*)() cb_read = nodev_read
    int (*)() cb_write = nodev_read
    int (*)() cb_ioctl = xde_ioctl
    int (*)() cb_devmap = nodev
    int (*)() cb_mmap = nodev
    int (*)() cb_segmap = nodev
    int (*)() cb_chpoll = nochpoll
    int (*)() cb_prop_op = ddi_prop_op
    struct streamtab *cb_str = 0
    int cb_flag = 0x20
    int cb_rev = 0x1
    int (*)() cb_aread = nodev
    int (*)() cb_awrite = nodev

Because cb_str is NULL, dld_fini_ops traps when it dereferences that pointer.

Actions #1

Updated by Benjamin Naecker 3 months ago

I've found a way to reproduce this, and therefore to test the changes I'm making. I hit this while working on a driver called `xde`, which is a mac provider. During its _init(9E) routine, it calls mac_init_ops(9F) and, if any of the later setup code fails, mac_fini_ops(9F) as well. I can reliably panic my machine by running rem_drv xde followed by modload /kernel/drv/amd64/xde. The initial rem_drv seems to ensure that the call to ddi_name_to_major returns DDI_MAJOR_T_NONE. I can see the setup failing because the driver itself prints a warning that mod_install failed:

Sep 19 19:36:59 feldspar xde: [ID 125844 kern.warning] WARNING: mod_install failed: 6

That 6 is ENXIO. I believe that comes from trying to uninstall modules in the dependency graph of this driver that don't exist, though I've not proven that.

In any case, running that rem_drv and modload sequence reliably panics my machine. With the small fix I've developed, running modload instead shows:

bnaecker@feldspar : ~ $ pfexec modload /kernel/drv/amd64/xde
can't load module: No such device or address
bnaecker@feldspar : ~ $

Actions #2

Updated by Electric Monk 3 months ago

  • Gerrit CR set to 2370
