Bug #13096

xnf asleep at wheel while freemem smashes into the ground

Added by Joshua M. Clulow almost 2 years ago. Updated almost 2 years ago.

Status: Closed
Priority: Normal
Category: driver - device drivers
% Done: 100%
Difficulty: Medium

Description

The xnf driver (for Xen virtual network interfaces) performs a variety of exciting allocations within xnf_send(). When the system is under memory pressure, those allocations can block waiting for pages to become available. Also while the system is under memory pressure, various kmem caches need to be reaped to free up pages; at least one of those (ip_ire_reclaim()) appears to need to execute in an squeue. The reap can get stuck waiting to enter the squeue forever, preventing any subsequent reaping and all but ensuring deadlock.
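The dependency chain described above can be pictured as a wait-for graph with a cycle in it. The following is only a simplified sketch of that reasoning, not kernel code: the node labels and the `find_cycle` helper are names invented for this illustration, and the graph assumes each party waits on exactly one thing, which is cruder than the real system.

```python
def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle as a list, or None."""
    path = []
    seen = {}  # node -> index in path where it was first visited
    node = start
    while node is not None:
        if node in seen:
            # We came back to a node already on the path: that is the cycle.
            return path[seen[node]:]
        seen[node] = len(path)
        path.append(node)
        node = waits_for.get(node)  # None when the node waits on nothing
    return None


# Hypothetical labels, one edge per "X cannot proceed until Y" in the description.
WAITS_FOR = {
    "xnf_send() allocation": "free pages",
    "free pages": "kmem cache reap",
    "kmem cache reap": "ip_ire_reclaim()",
    "ip_ire_reclaim()": "squeue",
    "squeue": "xnf_send() allocation",
}

if __name__ == "__main__":
    cycle = find_cycle(WAITS_FOR, "kmem cache reap")
    print(" -> ".join(cycle + cycle[:1]))
```

Every node in this model sits on the cycle, so no party can make progress without outside help, which matches the description of the reap being stuck forever.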

In a dump from the firing of the pageout deadman, we can see that the kmem_taskq is active and about halfway through a reap:

> ::taskq -n kmem_taskq
ADDR             NAME                             ACT/THDS Q'ED  MAXQ INST
fffffe0bd3416c68 kmem_taskq                         1/   1  236   482    0

It is stuck in ip_ire_reclaim():

> ::taskq -n kmem_taskq | ::walk taskq_thread | ::stacks
THREAD           STATE    SOBJ                COUNT
fffffe000f417c20 SLEEP    CV                      1
                 swtch+0x133
                 cv_wait+0x68
                 tcp_ixa_cleanup_wait_and_finish+0x53
                 conn_ixa_cleanup+0x1a3
                 ipcl_walk+0x91
                 ip_ire_reclaim_stack+0x56
                 ip_ire_reclaim+0x36
                 kmem_cache_reap+0x3e
                 taskq_thread+0x2cd
                 thread_start+0xb

The conn_ixa_cleanup() routine kicks some work into an squeue and waits for it to complete. Where are we on squeues?

> ::squeue -v
            ADDR STATE CPU            FIRST             LAST           WORKER
fffffe0bd3d05d00 00329   1 fffffe0bdceede60 fffffe0bdecd81c0 fffffe0010bd7c20
                 |
                 +-->  SQS_PROC     being processed
                       SQS_FAST     ... in fast-path mode
                       SQS_BOUND    worker thread bound to CPU

            ADDR STATE CPU            FIRST             LAST           WORKER
fffffe0bd3d05dc0 00820   1 0000000000000000 0000000000000000 fffffe000fcc5c20
                 |
                 +-->  SQS_BOUND    worker thread bound to CPU

            ADDR STATE CPU            FIRST             LAST           WORKER
fffffe0bd3d05e80 00820   0 0000000000000000 0000000000000000 fffffe000f58bc20
                 |
                 +-->  SQS_BOUND    worker thread bound to CPU

Ah, one is running. Let's see what it is doing:

> fffffe0bd3d05d00::print squeue_t sq_run | ::stacks
THREAD           STATE    SOBJ                COUNT
fffffe0bdd9920c0 SLEEP    CV                      1
                 swtch+0x133
                 cv_wait+0x68
                 page_create_throttle+0x174
                 page_create_va+0x598
                 segkmem_page_create+0x97
                 segkmem_xalloc+0x13f
                 segkmem_alloc_vn+0x3b
                 segkmem_alloc+0x17
                 vmem_xalloc+0x629
                 vmem_alloc+0x190
                 kmem_slab_create+0x7c
                 kmem_slab_alloc+0x10b
                 kmem_cache_alloc+0x15b
                 xnf_data_txbuf_alloc+0x17
                 xnf_mblk_map+0x69
                 xnf_send+0x23f
                 mac_ring_tx+0x43
                 mac_provider_tx+0x85
                 mac_tx+0x295
                 str_mdata_fastpath_put+0x8e
                 ip_xmit+0x843
                 ire_send_wire_v4+0x345
                 conn_ip_output+0x1d4
                 tcp_send_data+0x58
                 tcp_output+0x582
                 squeue_enter+0x3f9
                 tcp_sendmsg+0x16c
                 so_sendmsg+0x24a
                 socket_sendmsg+0x62
                 socket_vop_write+0x61
                 fop_write+0x60
                 write+0x2c6

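Ah. The thread processing the squeue is itself asleep in page_create_throttle(), under an allocation made from xnf_send(), waiting for the very pages the stalled reap is supposed to free.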
