Bug #12801
closedlibdiskmgt leaks PROM device information handles like a sieve
100%
Description
In the findevs()
routine, we call di_prom_init()
to get a PROM device information handle. This handle is stored in the struct search_args
, but does not appear to be cleaned up afterwards.
For potentially flimsy reasons, /dev/openprom
allows at most 32 handles to be opened at once; a persistent process which uses libdiskmgmt
without exiting can end up hanging on to a lot of these handles. While it's conceivable that /dev/openprom
should be more flexible, it's assuredly not correct to leak these handles.
Related issues
Updated by Joshua M. Clulow about 2 years ago
Note that if you're hitting this condition and all /dev/openprom
handles are in use, prtconf -d
will fail to work:
$ prtconf -d System Configuration: i86pc Memory size: 32658 Megabytes System Peripherals (Software Nodes): prtconf: di_prom_init() failed.: Resource temporarily unavailable
Updated by Joshua M. Clulow about 2 years ago
- Related to Bug #8709: teach libdiskmgt about nvme, sata, and xen added
Updated by Joshua M. Clulow about 2 years ago
This appears to be a regression introduced by #8709, which removed the original cleanup code.
Updated by Garrett D'Amore about 2 years ago
I remember running across that limit many years ago and being confounded by it. There is no real reason for it, and it's a shame that it's set so low.
Having said that -- we should definitely fix the bug. But we could probably change the code to support a vastly larger number of cloned file handles. I expect it would be almost trivial to fix that limit.
Updated by Joshua M. Clulow about 2 years ago
Yes I suspect it could just allocate the object it needs and keep it in a list with some less restrictive maximum number of handles. Found the leak, though!
Updated by Joshua M. Clulow almost 2 years ago
You can see how many of these handles are currently open with:
pfexec mdb -ke 'oprom_state::array | ::print "struct oprom_state" already_open | ::grep ".!=0" ! wc -l'
e.g., on one system that does not have the fix, I see the value 16. Upon inspection with pfiles
I can see these are held open by modules in fmd
and syseventd
.
Updated by Joshua M. Clulow almost 2 years ago
For some reason dtrace
is getting caught on a forward declaration for struct search_args
in the CTF for libdiskmgt.so.1
on my system, so I am fishing out the dev_walk_status
value this way:
newcastle # ldd $(which diskinfo) | grep libdisk libdiskmgt.so.1 => /usr/lib/libdiskmgt.so.1 newcastle # mdb -e '::print -ta struct search_args' /usr/lib/libdiskmgt.so.1 0 struct search_args { 0 disk_t *disk_listp 4 controller_t *controller_listp 8 bus_t *bus_listp c di_devlink_handle_t handle 10 di_prom_handle_t ph 14 di_node_t node 18 di_minor_t minor 1c int dev_walk_status } newcastle # dtrace -n 'pid$target::findevs:entry { self->x = (userland int *)(arg0 + 0x1C); } pid$target::findevs:return /self->x/ { printf("dev_walk_status = %d\n", *self->x); self->x = 0; }' \ -c diskinfo dtrace: description 'pid$target::findevs:entry ' matched 2 probes TYPE DISK VID PID SIZE RMV SSD - c5t0d0 - - 0.00 GiB yes no NVME c6t0025385791B4A2BBd0 Samsung SSD 970 EVO 500GB 465.76 GiB no yes NVME c9t0025385891B28EEFd0 Samsung SSD 970 EVO 500GB 465.76 GiB no yes dtrace: pid 21494 has exited CPU ID FUNCTION:NAME 1 82007 findevs:return dev_walk_status = 0
Updated by Joshua M. Clulow almost 2 years ago
Testing Notes¶
After updating to bits with the change, there is just one standing handle open:
$ pfexec mdb -ke 'oprom_state::array | ::print "struct oprom_state" already_open | ::grep ".!=0" ! wc -l' 1
This is held by fmd
, because of the handle we create and hold onto in topo_snap_create()
.
In addition, it seems that diskinfo
is successfully creating and freeing the handle, and returning success from findevs()
:
newcastle # dtrace -n 'pid$target::findevs:entry { self->x = (userland int *)(arg0 + 0x1C); } pid$target::findevs:return /self->x/ { printf("dev_walk_status = %d\n", *self->x); self->x = 0; } pid$target::di_prom_init:return { printf("-> %p\n", arg1); ustack(); } pid$target::di_prom_fini:entry { ustack() }' -c diskinfo dtrace: description 'pid$target::findevs:entry ' matched 4 probes TYPE DISK VID PID SIZE RMV SSD - c5t0d0 - - 0.00 GiB yes no NVME c6t0025385791B4A2BBd0 Samsung SSD 970 EVO 500GB 465.76 GiB no yes NVME c9t0025385891B28EEFd0 Samsung SSD 970 EVO 500GB 465.76 GiB no yes dtrace: pid 1283 has exited CPU ID FUNCTION:NAME 6 73086 di_prom_init:return -> 9013008 libdevinfo.so.1`di_prom_init+0x68 libdiskmgt.so.1`initialize+0x3b libdiskmgt.so.1`make_descriptors+0x17 libdiskmgt.so.1`cache_get_descriptors+0x1d libdiskmgt.so.1`media_get_descriptors+0x22 libdiskmgt.so.1`dm_get_descriptors+0x8d diskinfo`enumerate_disks+0x3a diskinfo`main+0x11f diskinfo`_start_crt+0x96 diskinfo`_start+0x1a 6 73087 di_prom_fini:entry libdevinfo.so.1`di_prom_fini libdiskmgt.so.1`findevs+0xf0 libdiskmgt.so.1`initialize+0x3b libdiskmgt.so.1`make_descriptors+0x17 libdiskmgt.so.1`cache_get_descriptors+0x1d libdiskmgt.so.1`media_get_descriptors+0x22 libdiskmgt.so.1`dm_get_descriptors+0x8d diskinfo`enumerate_disks+0x3a diskinfo`main+0x11f diskinfo`_start_crt+0x96 diskinfo`_start+0x1a 6 73085 findevs:return dev_walk_status = 0 6 73086 di_prom_init:return -> 9013008 libdevinfo.so.1`di_prom_init+0x68 libtopo.so.1`topo_snap_hold+0xb3 diskinfo`enumerate_disks+0x8e diskinfo`main+0x11f diskinfo`_start_crt+0x96 diskinfo`_start+0x1a 6 73087 di_prom_fini:entry libdevinfo.so.1`di_prom_fini libtopo.so.1`topo_snap_destroy+0x17b libtopo.so.1`topo_snap_release+0x28 diskinfo`enumerate_disks+0xe0 diskinfo`main+0x11f diskinfo`_start_crt+0x96 diskinfo`_start+0x1a
Updated by Electric Monk almost 2 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
git commit be235d17490f6a5ad01c6a5319530fce66d88389
commit be235d17490f6a5ad01c6a5319530fce66d88389 Author: Joshua M. Clulow <josh@sysmgr.org> Date: 2020-08-06T05:21:59.000Z 12801 libdiskmgt leaks PROM device information handles like a sieve Reviewed by: Garrett D'Amore <garrett@damore.org> Reviewed by: Robert Mustacchi <rm@fingolfin.org> Reviewed by: Yuri Pankov <yuripv@yuripv.dev> Approved by: Richard Lowe <richlowe@richlowe.net>