Project

General

Profile

Bug #3620

Corruption of the `xprt-ready' queue in svc_xprt_qdelete()

Added by Marcel Telka over 7 years ago. Updated over 7 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
nfs - NFS server and client
Start date:
2013-03-11
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

The `xprt-ready' queue in rpcmod could get corrupted by svc_xprt_qdelete(). This could lead to a situation when a request is queued in a pool, but there is missing hint for it in the `xprt-ready' queue. As a result a thread will spin in the svc_poll() trying to get a hint from the `xprt-ready' queue, but since the hint is not there it will spin forever.

#1

Updated by Marcel Telka over 7 years ago

Root Cause 1 (less likely)

The svc_xprt_qdelete() works on the `xprt-ready' queue and tries to clean some entries there. Since such work is unprotected by a mutex, it can happen that the `xprt-ready' queue changes during the svc_xprt_qdelete() run (in svc_xprt_qget()/svc_xprt_qput()).

987static void
988svc_xprt_qdelete(SVCPOOL *pool, SVCMASTERXPRT *xprt)
989{
990    __SVCXPRT_QNODE *q = pool->p_qend;
991    __SVCXPRT_QNODE *qtop = pool->p_qtop;
992
993    /*
994     * Delete all the references to xprt between the current
995     * position of pool->p_qend and current pool->p_qtop.
996     */
997    for (;;) {
998        if (q->q_xprt == xprt)
999            q->q_xprt = NULL;
1000        if (q == qtop)
1001            return;
1002        q = q->q_next;
1003    }

In extreme case it can happen that the qtop might overtake the current q. If that happens, and if the line 998 is executed before the overtake, and line 999 is executed after the overtake, we could unintentionally overwrite the valid q_xprt by NULL.

Root Cause 2 (more likely)

The another problem in svc_xprt_qdelete() is that it could overwrite pool->p_qtop->q_xprt. Since this slot in the `xprt-ready' queue is considered free - see svc_xprt_qput(), this might overwrite a newly added entry if the entry addition happens just between lines 998 and 999 above.

Either scenario described above will cause that we will have a request queued, but we'll miss a hint for it. So a thread will cycle in svc_poll().

#2

Updated by Marcel Telka over 7 years ago

  • Status changed from New to Pending RTI
#3

Updated by Rich Lowe over 7 years ago

  • Status changed from Pending RTI to Resolved
  • % Done changed from 0 to 100
  • Tags deleted (needs-triage)

Resolved in 8cc2da6

Also available in: Atom PDF