Bug #5976
closed
e1000g use after free on start failure
100%
Description
On a VM with very little available memory we can get a panic triggered by 'ifconfig e1000g0 plumb'. It's because e1000g improperly unwinds resources allocated during a failed mac start. (The system in question had ~6MB free, with no pagecache and minimal ARC use.)
e1000g_free_rx_packets+0x60(ffffff019fea8900, 0)
e1000g_free_packets+0x2d(ffffff01664fc000)
e1000g_release_dma_resources+0x20(ffffff01664fc000)
e1000g_start+0x14d(ffffff01664fc000, 1)
e1000g_m_start+0x55(ffffff01664fc000)
mac_start+0x8e(ffffff016656a9f8)
dls_open+0x83(ffffff0160e7fe68, ffffff0160e80d88, ffffff01536fae20)
dld_str_attach+0x1b0(ffffff01536fae20, 0)
dld_str_open+0xad(ffffff0191256010, ffffff0004c4fa68, 0)
dld_open+0x37(ffffff0191256010, ffffff0004c4fa68, 3, 0, ffffff016172e1b8)
qattach+0x12b(ffffff01912562b8, ffffff0004c4fa68, 3, ffffff016172e1b8, 0, 0)
stropen+0x34f(ffffff01911e2200, ffffff0004c4fa68, 3, ffffff016172e1b8)
spec_open+0x281(ffffff0004c4fc78, 3, ffffff016172e1b8, 0)
fop_open+0x89(ffffff0004c4fc78, 3, ffffff016172e1b8, 0)
vn_openat+0x235(8046f90, 0, 3, 6c, ffffff0004c4fdf0, 0)
copen+0x20c(ffd19553, 8046f90, 3, fec6406c)
openat32+0x27(ffd19553, 8046f90, 2, fec6406c)
open32+0x25(8046f90, 2, fec6406c)
_sys_sysenter_post_swapgs+0x237()
This happens to be a use-after-free in e1000g_free_rx_packets:
> ffffff01664fc000::print struct e1000g rx_ring[0].rx_data | ::print e1000g_rx_data_t packet_area
packet_area = 0xffffff016ed45d50
> ffffff01664fc000::print struct e1000g rx_ring[0].rx_data | ::print e1000g_rx_data_t packet_area | ::whatis
ffffff016ed45d50 is freed from kmem_alloc_112:
            ADDR          BUFADDR        TIMESTAMP           THREAD
                            CACHE          LASTLOG         CONTENTS
ffffff0157d59de8 ffffff016ed45d50        ba0820ccd ffffff0161efc060
                 ffffff0146629008 ffffff01483637c0 ffffff014b3cd818
                 kmem_cache_free_debug+0x10f
                 kmem_cache_free+0x153
                 kmem_free+0x55
                 e1000g_free_rx_sw_packet+0x53
                 e1000g_free_rx_packets+0x7b
                 e1000g_alloc_rx_packets+0xa9
                 e1000g_alloc_packets+0xb8
                 e1000g_alloc_dma_resources+0x90
                 e1000g_start+0x101
                 e1000g_m_start+0x55
                 mac_start+0x8e
                 dls_open+0x83
                 dld_str_attach+0x1b0
                 dld_str_open+0xad
                 dld_open+0x37
e1000g_free_rx_sw_packet can optionally free the memory associated with an rx data packet. (Passing B_TRUE as the second argument causes the packet structure itself to be freed.)
When starting a mac, e1000g_start gets control. Then the following calls happen (-> indicates a function call):
e1000g_start -> e1000g_alloc_dma_resources
e1000g_alloc_dma_resources -> e1000g_alloc_packets
e1000g_alloc_packets -> e1000g_alloc_rx_packets
If e1000g_alloc_rx_packets fails to allocate a packet, it frees all already allocated packets by calling e1000g_free_rx_packets(..., B_TRUE). The B_TRUE argument is passed along to e1000g_free_rx_sw_packet to kmem_free the packet structures. Then, e1000g_alloc_rx_packets returns an error code which propagates all the way up to e1000g_start which continues:
e1000g_start -> e1000g_release_dma_resources
e1000g_release_dma_resources -> e1000g_free_packets
e1000g_free_packets -> e1000g_free_rx_packets(..., B_FALSE)
e1000g_free_rx_packets loops over rx_data->packet_area which now points to freed data. We get a panic.
e1000g_free_rx_packets needs to clear rx_data->packet_area after processing the linked list if kmem_free was requested. This will prevent the second iteration from using freed memory.
The only other case when packet_area is freed is when e1000g_start succeeds and e1000g_stop is tearing things down. There, we're dealing with this sequence of events:
e1000g_stop -> e1000g_release_dma_resources
e1000g_release_dma_resources -> e1000g_free_packets
e1000g_free_packets -> e1000g_free_rx_packets(..., B_FALSE)
e1000g_free_rx_packets loops over rx_data->packet_area without freeing the packet structures. Eventually, we return to e1000g_stop and it continues:
e1000g_stop -> e1000g_free_rx_pending_buffers
e1000g_free_rx_pending_buffers -> e1000g_free_rx_sw_packet(..., B_TRUE)
e1000g_free_rx_pending_buffers is called only if rx_data->pending_count is 0. Note that in this case, we iterate over the list twice. To actually free the packets the second time around, we must not clear the packet_area pointer.
So, we must clear the packet_area if and only if we freed the packets via kmem_free.
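The rule above can be illustrated with a small userland model. This is only a sketch under stated assumptions, not the committed change: the types and functions (sw_packet_t, rx_data_t, free_rx_packets and so on) are hypothetical stand-ins for the driver's structures, malloc/free stand in for kmem_alloc/kmem_free, and the fail_at parameter exists only to simulate an allocation failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real e1000g structures. */
typedef struct sw_packet {
	struct sw_packet *next;
	void *dma_buf;		/* models the packet's DMA resources */
} sw_packet_t;

typedef struct rx_data {
	sw_packet_t *packet_area;	/* head of the packet list */
} rx_data_t;

/* Release a packet's resources; free the structure only if asked to. */
static void
free_rx_sw_packet(sw_packet_t *p, bool full_release)
{
	free(p->dma_buf);
	p->dma_buf = NULL;
	if (full_release)
		free(p);
}

/*
 * Walk the list. Clear packet_area if and only if the packet
 * structures themselves were freed, so that a later walk never
 * touches freed memory.
 */
static void
free_rx_packets(rx_data_t *rx, bool full_release)
{
	sw_packet_t *p = rx->packet_area;

	while (p != NULL) {
		sw_packet_t *next = p->next;	/* read before p may be freed */
		free_rx_sw_packet(p, full_release);
		p = next;
	}
	if (full_release)
		rx->packet_area = NULL;
}

/* Allocate `want` packets; simulate a failure at index `fail_at`. */
static int
alloc_rx_packets(rx_data_t *rx, size_t want, size_t fail_at)
{
	for (size_t i = 0; i < want; i++) {
		sw_packet_t *p = (i == fail_at) ? NULL : calloc(1, sizeof (*p));

		if (p == NULL) {
			/* Unwind everything allocated so far. */
			free_rx_packets(rx, true);
			assert(rx->packet_area == NULL);
			return (-1);
		}
		p->dma_buf = malloc(64);
		p->next = rx->packet_area;
		rx->packet_area = p;
	}
	return (0);
}
```

In this model, a failed alloc_rx_packets leaves packet_area NULL, so the caller's later free_rx_packets(rx, false) is a no-op instead of a walk over freed memory. On the stop path, free_rx_packets(rx, false) leaves packet_area intact, so a second pass with true can still find and free the packet structures.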
Updated by Electric Monk almost 7 years ago
- Status changed from New to Closed
- % Done changed from 50 to 100
git commit bcfab0594401266bd287f71573312d8af05de184
commit bcfab0594401266bd287f71573312d8af05de184
Author: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Date: 2015-06-04T19:57:36.000Z

    5976 e1000g use after free on start failure
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Marcel Telka <marcel.telka@nexenta.com>
    Reviewed by: Hans Rosenfeld <hans.rosenfeld@nexenta.com>
    Reviewed by: Kevin Crowe <kevin.crowe@nexenta.com>
    Approved by: Dan McDonald <danmcd@omniti.com>