Project

General

Profile

Actions

Bug #4908

closed

rpcbind NULL ptr dereference at forward_destroy+0x22()

Added by Hans Rosenfeld over 9 years ago. Updated about 9 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
cmd - userland programs
Start date:
2014-06-03
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage
Gerrit CR:
External Bug:

Description

rpcbind aborts repeatedly on my system (OI hipster illumos-adf3407).

From the core:

> $C
08037b98 forward_destroy+0x22(80c0ac8, 80576ba, 8037bcc, 8037bd8, 0, a)
08047cb8 rpcbproc_callit_com+0xe4d(80c0a98, 80c0980, 5, 2)
08047cd8 pmap_service+0x109(80c0a98, 80c0980, 8047d0c, feb67185, fedf2a40, 810fa8c)
08047d28 libnsl.so.1`_svc_prog_dispatch+0x153(80c0980, 80c0a60, 80c0a98, feb6d189, 80525c2, fee10440)
08047d98 libnsl.so.1`_svc_run_mt+0x593(805a2eb, feece353, feef85e5, 8070a58, 8047e38, 8047e70)
08047db8 libnsl.so.1`svc_run+0x48(0, 3, fef6bac0, 2328, 1, 1)
08047df8 main+0x3af(feef8627, fef6f888, 8047e2c, 8054853, 1, 8047e38)
08047e2c _start+0x83(1, 8047ee0, 0, 8047ef2, 8047f0c, 8047f1d)
> forward_destroy+0x22::dis
forward_destroy+4:              subl   $0x4,%esp
forward_destroy+7:              movl   0x8(%ebp),%ebx
forward_destroy+0xa:            movl   0x806d67c,%eax   <rpcbind`fihead>
forward_destroy+0xf:            cmpl   %ebx,%eax
forward_destroy+0x11:           jne    +0xa     <forward_destroy+0x1d>
forward_destroy+0x13:           movl   0x4(%eax),%eax
forward_destroy+0x16:           movl   %eax,0x806d67c   <rpcbind`fihead>
forward_destroy+0x1b:           jmp    +0x8     <forward_destroy+0x25>
forward_destroy+0x1d:           movl   (%ebx),%eax
forward_destroy+0x1f:           movl   0x4(%ebx),%edx
forward_destroy+0x22:           movl   %edx,0x4(%eax)
forward_destroy+0x25:           movl   0x806d680,%eax   <rpcbind`fitail>
forward_destroy+0x2a:           cmpl   %ebx,%eax
forward_destroy+0x2c:           jne    +0x9     <forward_destroy+0x37>
forward_destroy+0x2e:           movl   (%eax),%eax
forward_destroy+0x30:           movl   %eax,0x806d680   <rpcbind`fitail>
forward_destroy+0x35:           jmp    +0x7     <forward_destroy+0x3e>
forward_destroy+0x37:           movl   0x4(%ebx),%eax
forward_destroy+0x3a:           movl   (%ebx),%edx
forward_destroy+0x3c:           movl   %edx,(%eax)
forward_destroy+0x3e:           subl   $0xc,%esp
> ::regs
%cs = 0x0043            %eax = 0x00000000
%ds = 0x004b            %ebx = 0x080c0ac8
%ss = 0x004b            %ecx = 0x00000000
%es = 0x004b            %edx = 0x00000000
%fs = 0x0000            %esi = 0x080c0ac8
%gs = 0x01c3            %edi = 0x08037c04

Related issues

Related to illumos gate - Bug #4575: Single threaded rpcbind is not scalableResolvedMarcel Telka2014-02-05

Actions
Actions #1

Updated by Marcel Telka over 9 years ago

  • Category set to cmd - userland programs
Actions #3

Updated by Sebastien Roy about 9 years ago

We've collected a handful of core dumps from rpcbind due to this bug in the past week. From a cursory glance at one of these dumps, the issue is in this code:

static void
forward_destroy(struct finfo *fi)
{
        assert(MUTEX_HELD(&finfo_lock));
        assert(fi->flag & FINFO_ACTIVE);

        if (fihead == fi) {
                assert(fi->prev == NULL);
                fihead = fi->next;
        } else {
                fi->prev->next = fi->next;
        }

The finfo structure has typical linked list members at the start:

struct finfo {
        struct finfo    *prev;
        struct finfo    *next;
        int             flag;
#define FINFO_ACTIVE    0x1
        ulong_t         caller_xid;
...

So, let's look at the finfo passed in:

> 80c0a48/2p
0x80c0a48:      0               0

Hmm, it must be the head of the list. What is fihead?

> fihead::print -i uintptr_t               
0x806d67c

It's not the pointer passed in. What is fihead pointing at, then?

> fihead/2p
rpcbind`fihead:
rpcbind`fihead: 0x810fbb0       0x80c0a48

Interesting, so the next pointer points at the finfo structure passed in to
forward_destroy(), but how on earth could the head of this list have a non-NULL
prev pointer?

In any case, when we execute the line:

        fi->prev->next = fi->next;

We'll dereference the NULL next pointer, which is how we faulted on address 4.

The mystery is how the list became so corrupt.

Actions #4

Updated by Marcel Telka about 9 years ago

  • Status changed from New to In Progress
  • Assignee set to Marcel Telka
Actions #5

Updated by Marcel Telka about 9 years ago

  • Subject changed from rpcbind NULL ptr dereference to rpcbind NULL ptr dereference at forward_destroy+0x22()
Actions #6

Updated by Marcel Telka about 9 years ago

Root cause

When the new (2nd, 3rd, etc., not the 1st one) finfo struct is being added to the fihead/fitail list in forward_register(), the prev pointer of the existing fihead is not updated to point to the newly added finfo, leaving it NULL.

Actions #7

Updated by Marcel Telka about 9 years ago

  • Status changed from In Progress to Pending RTI
Actions #8

Updated by Electric Monk about 9 years ago

  • Status changed from Pending RTI to Closed
  • % Done changed from 0 to 100

git commit e9b15d2325fa000cefe14bd7e131ea138867c7f7

commit  e9b15d2325fa000cefe14bd7e131ea138867c7f7
Author: Marcel Telka <marcel.telka@nexenta.com>
Date:   2014-07-23T14:03:43.000Z

    4908 rpcbind NULL ptr dereference at forward_destroy+0x22()
    Reviewed by: Dan McDonald <danmcd@omniti.com>
    Approved by: Gordon Ross <gwr@nexenta.com>

Actions

Also available in: Atom PDF