Bug #13096
xnf asleep at wheel while freemem smashes into the ground
Status: closed, 100% done
Description
The xnf driver (for Xen virtual network interfaces) performs a variety of exciting allocations within xnf_send(). When the system is under memory pressure, those allocations can block waiting for pages to become available. Also while the system is under memory pressure, various kmem caches need to be reaped to free up pages; at least one of those (ip_ire_reclaim()) appears to need to execute in an squeue. The reap can get stuck waiting to enter the squeue forever, preventing any subsequent reaping and all but ensuring deadlock.
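To make the squeue hazard concrete, here is a minimal Python sketch of a single-owner serialization queue. ToySqueue, enter(), and demo() are hypothetical names used only for illustration; this is a simplified stand-in, not the illumos squeue implementation. It shows that work submitted while another caller owns the queue is merely queued, and runs only once the owner returns.

```python
from collections import deque


class ToySqueue:
    """Single-owner serialization queue (a simplified stand-in, not squeue_t)."""

    def __init__(self):
        self.busy = False        # True while some caller is processing
        self.pending = deque()   # work queued behind the current owner

    def enter(self, work):
        """Run work now if the queue is idle, else queue it. Return True if it ran now."""
        if self.busy:
            self.pending.append(work)
            return False
        self.busy = True
        try:
            work()
            # The owner drains anything that was queued while it was processing.
            while self.pending:
                self.pending.popleft()()
        finally:
            self.busy = False
        return True


def demo():
    """Show that cleanup queued behind a transmit runs only after the transmit returns."""
    sq = ToySqueue()
    log = []

    def cleanup():
        log.append("cleanup ran")

    def transmit():
        log.append("transmit started")
        # While transmit owns the queue, the reaper's request can only be queued.
        ran_now = sq.enter(cleanup)
        log.append(f"cleanup ran immediately: {ran_now}")
        log.append("transmit finished")

    sq.enter(transmit)
    return log


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In this toy the transmit returns, so the cleanup eventually runs. If the transmit instead blocked indefinitely inside an allocation, as described above, the queued cleanup would never get its turn.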
In a dump from the firing of the pageout deadman, we can see the kmem_taskq is active and about halfway through a reap:

> ::taskq -n kmem_taskq
ADDR             NAME                             ACT/THDS Q'ED  MAXQ INST
fffffe0bd3416c68 kmem_taskq                         1/   1  236   482    0
It is stuck in ip_ire_reclaim():

> ::taskq -n kmem_taskq | ::walk taskq_thread | ::stacks
THREAD           STATE    SOBJ                COUNT
fffffe000f417c20 SLEEP    CV                      1
                 swtch+0x133
                 cv_wait+0x68
                 tcp_ixa_cleanup_wait_and_finish+0x53
                 conn_ixa_cleanup+0x1a3
                 ipcl_walk+0x91
                 ip_ire_reclaim_stack+0x56
                 ip_ire_reclaim+0x36
                 kmem_cache_reap+0x3e
                 taskq_thread+0x2cd
                 thread_start+0xb
The conn_ixa_cleanup() routine kicks some work into an squeue and waits for it to complete. Where are we on squeues:

> ::squeue -v
ADDR             STATE CPU FIRST            LAST             WORKER
fffffe0bd3d05d00 00329   1 fffffe0bdceede60 fffffe0bdecd81c0 fffffe0010bd7c20
    |
    +-->  SQS_PROC      being processed
          SQS_FAST      ... in fast-path mode
          SQS_BOUND     worker thread bound to CPU

ADDR             STATE CPU FIRST            LAST             WORKER
fffffe0bd3d05dc0 00820   1 0000000000000000 0000000000000000 fffffe000fcc5c20
    |
    +-->  SQS_BOUND     worker thread bound to CPU

ADDR             STATE CPU FIRST            LAST             WORKER
fffffe0bd3d05e80 00820   0 0000000000000000 0000000000000000 fffffe000f58bc20
    |
    +-->  SQS_BOUND     worker thread bound to CPU
Ah, one is running. Let's see what it is doing:
> fffffe0bd3d05d00::print squeue_t sq_run | ::stacks
THREAD           STATE    SOBJ                COUNT
fffffe0bdd9920c0 SLEEP    CV                      1
                 swtch+0x133
                 cv_wait+0x68
                 page_create_throttle+0x174
                 page_create_va+0x598
                 segkmem_page_create+0x97
                 segkmem_xalloc+0x13f
                 segkmem_alloc_vn+0x3b
                 segkmem_alloc+0x17
                 vmem_xalloc+0x629
                 vmem_alloc+0x190
                 kmem_slab_create+0x7c
                 kmem_slab_alloc+0x10b
                 kmem_cache_alloc+0x15b
                 xnf_data_txbuf_alloc+0x17
                 xnf_mblk_map+0x69
                 xnf_send+0x23f
                 mac_ring_tx+0x43
                 mac_provider_tx+0x85
                 mac_tx+0x295
                 str_mdata_fastpath_put+0x8e
                 ip_xmit+0x843
                 ire_send_wire_v4+0x345
                 conn_ip_output+0x1d4
                 tcp_send_data+0x58
                 tcp_output+0x582
                 squeue_enter+0x3f9
                 tcp_sendmsg+0x16c
                 so_sendmsg+0x24a
                 socket_sendmsg+0x62
                 socket_vop_write+0x61
                 fop_write+0x60
                 write+0x2c6
Ah.
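Taken together, the two stacks can be read as a wait-for cycle. The following is a toy Python model of that reading, not anything extracted from the dump: the node labels and the find_cycle() helper are illustrative, and the edge from "free pages" back to the reap rests on the description's statement that reaping kmem caches is what would free up pages.

```python
# Toy wait-for graph: each actor waits on exactly one other actor.
# Node names are illustrative labels, not kernel identifiers.
WAITS_ON = {
    "kmem reap (ip_ire_reclaim)": "squeue",
    "squeue": "xnf_send allocation",
    "xnf_send allocation": "free pages",
    "free pages": "kmem reap (ip_ire_reclaim)",
}


def find_cycle(waits_on, start):
    """Follow wait-for edges from start; return the cycle reached, or None."""
    path = []    # nodes in the order visited
    seen = {}    # node -> index in path
    node = start
    while node in waits_on:
        if node in seen:
            # We came back to a node already on the path: that suffix is the cycle.
            return path[seen[node]:]
        seen[node] = len(path)
        path.append(node)
        node = waits_on[node]
    # The chain ended at something that waits on nothing: no deadlock.
    return None


if __name__ == "__main__":
    cycle = find_cycle(WAITS_ON, "kmem reap (ip_ire_reclaim)")
    print(" -> ".join(cycle + cycle[:1]))
```

Removing any one edge, for example by making the transmit-path allocation not wait for pages, leaves find_cycle() returning None in this model.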