Project

General

Profile

Actions

Bug #14778

open

NULL deref in log_sendmsg

Added by Ryan Zezeski about 1 month ago. Updated about 1 month ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
kernel
Start date:
Due date:
% Done:

0%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

While testing a custom kernel driver I encountered a NULL pointer dereference in log_sendmsg() while it was attempting to print the message of a panic initiated by said custom kernel driver. The deref appears to be for b_wptr on a NULL mblk_t *; I explain below.

First, the panic and stack.

> ::status
debugging crash dump /var/crash/kalm/vmcore.37 (64-bit) from kalm
operating system: 5.11 helios-1.0.21050 (i86pc)
build version: remotes/ci-build/xde-0-g7aeff7448e

image uuid: 15647d6d-6461-ee93-c8d4-fd51bd779cd1
panic message: BAD TRAP: type=e (#pf Page fault) rp=fffffe007ce7e070 addr=20 occurred in module "genunix" due to a NULL pointer dereference
dump content: kernel pages only

> $C
fffffe007ce7e1e0 log_sendmsg+0x450(fffffe587072e340, 0)
fffffe007ce7e3b0 cprintf+0x39d(fffffe5966852f00, fffffe007ce7e430, 30, fffffffffbcb3554, fffffffffbcb356c, fffffffff85d69d0, fffffe0000000000, ffffffff00000000, 0, 0, 0)
fffffe007ce7e420 vzdcmn_err+0x6f(0, fffffffff85d69d0, 2, fffffe5966852f00, fffffe007ce7e430, 0)
fffffe007ce7e4a0 cmn_err+0x61(2, fffffe5966852f00)
fffffe007ce7e520 rust_begin_unwind+0x80()
fffffe007ce7e560 core::panicking::panic_fmt::h7d7971299bdd0756+0x2d()
fffffe007ce7e600 opte::engine::packet::Packet<opte::engine::packet::Initialized>::wrap::h7db70314098cdb88+0x220()
fffffe007ce7f130 xde_rx+0x43()
fffffe007ce7f180 mac_promisc_dispatch_one+0xac(fffffe58beae1778, fffffe59502f46c0, 0, 0)
fffffe007ce7f200 mac_promisc_dispatch+0x82(fffffe5896062010, fffffe59502f46c0, fffffe5895edcb18, 0)
fffffe007ce7f250 mac_provider_tx+0x57(fffffe5896062010, 0, fffffe59502f46c0, fffffe5895edcb18)
fffffe007ce7f2e0 mac_bcast_send+0x28a(fffffe5898d97be8, fffffe5895edcb18, fffffe59502f46c0, 1)
fffffe007ce7f390 mac_tx_send+0x15a(fffffe5895edcb18, 0, fffffe59502f46c0, fffffe007ce7f3b8)
fffffe007ce7f440 mac_tx_single_ring_mode+0x82(fffffe5898da7040, fffffe59502f46c0, 0, 0, 0)
fffffe007ce7f500 mac_tx+0x1a9(fffffe5895edcb18, fffffe59502f46c0, 0, 0, 0)
fffffe007ce7f550 str_mdata_fastpath_put+0x8e(fffffe5898e6ce48, fffffe59502f46c0, 0, 0)
fffffe007ce7f5a0 dld_wput+0x1ac(fffffe5893575df8, fffffe59502f46c0)
fffffe007ce7f610 putnext+0x233(fffffe5895702b78, fffffe59502f46c0)
fffffe007ce7f720 ip_xmit+0x632(fffffe59502f46c0, fffffe5893d41950, 806000, 30, cb1e743a, 0, fffffe5800000000, fffffe5942135830)
fffffe007ce7f7b0 ip_postfrag_loopcheck+0x9c(fffffe59502f46c0, fffffe5893d41950, 806000, 30, cb1e743a, 0, ffffffff00000000, fffffe5942135830)
fffffe007ce7fa00 ire_send_wire_v6+0x126(fffffe5898e60010, fffffe59502f46c0, fffffe5931ae51d0, fffffe59421356c0, fffffe5890babef0)
fffffe007ce7fa90 ire_send_multicast_v6+0xa6(fffffe5898e60010, fffffe59502f46c0, fffffe5931ae51d0, fffffe59421356c0, fffffe5890babef0)
fffffe007ce7fb10 conn_ip_output+0x1d4(fffffe59502f46c0, fffffe59421356c0)
fffffe007ce7fc20 icmp_output_newdst+0x644(fffffe58f3a1d000, fffffe59502f46c0, 0, fffffe59368eaee8, fffffe58a14a5c70, 2b1, fffffe59421356c0)
fffffe007ce7fcb0 rawip_send+0x4b7(fffffe58f3a1d000, fffffe59502f46c0, fffffe007ce7fe50, fffffe58a14a5c70)
fffffe007ce7fd40 so_sendmsg+0x24a(fffffe593afe8d20, fffffe007ce7fe50, fffffe007ce7fe80, fffffe58a14a5c70)
fffffe007ce7fda0 socket_sendmsg+0x62(fffffe593afe8d20, fffffe007ce7fe50, fffffe007ce7fe80, fffffe58a14a5c70)
fffffe007ce7fe40 sendit+0x1ab(d, fffffe007ce7fe50, fffffe007ce7fe80, 8000)
fffffe007ce7ff00 sendto+0x88(d, 1e53d58, 8, 8000, fffffc7feafff308, 20)
fffffe007ce7ff10 sys_syscall+0x17d()

Looking at the mblk chain passed to `log_sendmsg() we find the following.

> fffffe587072e340::list mblk_t b_cont | ::mblk
            ADDR FL TYPE    LEN   BLEN              RPTR             DBLK
fffffe587072e340 0  proto   24    80    fffffe587072ac30 fffffe587072abc0
fffffe587072e300 0  data    142   208   fffffe5870729e30 fffffe5870729dc0

The log_sendmsg() routine expects the message to be in the 2nd mblk, and it checks the string for a specific pattern to determine if a "facility" is provided.

    src = (char *)mp2->b_rptr;
    dst = strstr(src, "FACILITY_AND_PRIORITY] ");
    if (dst != NULL) {
        facility = dst - src;
        body = facility + 23; /* strlen("FACILITY_AND_PRIORITY] ") */
    }

In this case that string does exist.

> fffffe5870729e30/S
0xfffffe5870729e30:             xde: [ID 964519 FACILITY_AND_PRIORITY] WARNING: panicked at 'no bytes found', /home/rpz/oxidecomputer/opte/opte/src/e
ngine/packet.rs:1214:13\n

Eventually this ends up in the following code, which lines up with the asm around the failure site.

                if ((mp2 = copymsg(mp)) == NULL)
                    break;
                if (facility != 0) {
                    src = (char *)mp2->b_cont->b_rptr;
                    dst = src + facility;
                    fac = (lc->pri & LOG_FACMASK) >> 3;
                    dst += snprintf(dst,
                        LOG_FACSIZE + LOG_PRISIZE, "%s.%s",
                        log_fac[MIN(fac, LOG_NFACILITIES)],
                        log_pri[lc->pri & LOG_PRIMASK]);
                    src += body - 2; /* copy "] " too */
                    while (*src != '\0')
                        *dst++ = *src++;
                    *dst++ = '\0';
                    mp2->b_cont->b_wptr = (uchar_t *)dst;
                }

Let's deconstruct the asm and see if we can figure out what's going on. Some keys points to keep in mind:

  • The mblk_t->b_cont is offset 0x10.
  • The mblk_t->b_wptr is offset 0x20.
  • The null pointer deref is for address 0x20, at log_sendmsg+0x450.
> ::print -ta mblk_t
0 mblk_t {
    0 struct msgb *b_next
    8 struct msgb *b_prev
    10 struct msgb *b_cont
    18 unsigned char *b_rptr
    20 unsigned char *b_wptr
    28 struct datab *b_datap
    30 unsigned char b_band
    31 unsigned char b_tag
    32 unsigned short b_flag
    38 queue_t *b_queue
}

> log_sendmsg+0x450::dis
log_sendmsg+0x430:              incq   %rdx
log_sendmsg+0x433:              incq   %r11
log_sendmsg+0x436:              movb   %al,-0x1(%r11)
log_sendmsg+0x43a:              movzbl (%rdx),%eax
log_sendmsg+0x43d:              testb  %al,%al
log_sendmsg+0x43f:              jne    -0x11    <log_sendmsg+0x430>
log_sendmsg+0x441:              movq   -0x48(%rbp),%rax
log_sendmsg+0x445:              movb   $0x0,(%r11)
log_sendmsg+0x449:              incq   %r11
log_sendmsg+0x44c:              movq   0x10(%rax),%rax

log_sendmsg+0x450:              movq   %r11,0x20(%rax)  <--- panic here

log_sendmsg+0x454:              jmp    -0x113   <log_sendmsg+0x346>
log_sendmsg+0x459:              movq   -0x50(%rbp),%rdi
log_sendmsg+0x45d:              call   +0x789ae <copymsg>
log_sendmsg+0x462:              testq  %rax,%rax
log_sendmsg+0x465:              movq   %rax,%rbx
log_sendmsg+0x468:              je     -0x25e   <log_sendmsg+0x210>
log_sendmsg+0x46e:              movq   0x10(%rax),%rax
log_sendmsg+0x472:              movq   -0x70(%rbp),%rdi
log_sendmsg+0x476:              addq   %rdi,0x18(%rax)
log_sendmsg+0x47a:              movq   +0x2b21b7(%rip),%rdi     <log_recentq>

> ::regs
%rax = 0x0000000000000000                 %r9  = 0x0000000000000000
%rbx = 0xfffffffffbd0b080      log_global %r10 = 0xfffffe5891685eb0
%rcx = 0x0000000000000032                 %r11 = 0xfffffe5891685f35
%rdx = 0xfffffe5891685f3d                 %r12 = 0xfffffe587072ac30
%rsi = 0x0000000000000000                 %r13 = 0x0000000000000010
%rdi = 0xfffffffffb95f0f6                 %r14 = 0x0000000000000001
%r8  = 0xfffffe007ce7e03f                 %r15 = 0xfffffffffbd0b080 log_global

%rip = 0xfffffffffba10390 log_sendmsg+0x450
%rbp = 0xfffffe007ce7e1e0
%rsp = 0xfffffe007ce7e160
%rflags = 0x00010286
  id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0
  status=<of,df,IF,tf,SF,zf,af,PF,cf>

%cs = 0x0030    %ds = 0x004b    %es = 0x004b    %fs = 0x0000
%gs = 0x0000    %gsbase = 0xfffffe58915ff000    %kgsbase = 0x0
%trapno = 0xe   %err = 0x2      %cr2 = 0x20     %cr3 = 0xfdc907000

Okay, so rax is 0x0, and I'm assuming by the 0x20(%rax) we are at line mp2->b_cont->b_wptr = (uchar_t *)dst; in the source code. Where does rax's value come from?

log_sendmsg+0x44c:              movq   0x10(%rax),%rax

This looks like a mblk_t->b_cont deref. We need to travel further up the asm to see rax populated.

log_sendmsg+0x441:              movq   -0x48(%rbp),%rax

Its value came from position -0x48 in the stack. Given the source code this is presumably variable mp2. In fact, looking at the asm proves this point (remember mblk_t * is first arg, which is passed via %rdi).

log_sendmsg+0x16:               movq   0x10(%rdi),%rsi

log_sendmsg+0x25:               movq   %rsi,-0x48(%rbp)

The mp2 variable is overwritten by a copy of the original mblk, via copymsg(mp). This is done so that the routine can modify the message payload being sent. So in this case mp2 is really a copy of mp, which means it will have a non-NULL b_cont. We can verify this by checking the mblk_t * at -0x48(%rbp).

> <rbp - 0x48/J
0xfffffe007ce7e198:             fffffe589349f440

> fffffe589349f440::list mblk_t b_cont |::mblk
            ADDR FL TYPE    LEN   BLEN              RPTR             DBLK
fffffe589349f440 0  proto   24    80    fffffe5931b6dc70 fffffe5931b6dc00
fffffe587f99cae0 0  data    56    208   fffffe587f9911d8 fffffe587f991140

So how in the hell is %rax NULL? Did Ozymandias send a beam of tachyons through my CPU? I don't have the answer to this question. If anyone has ideas please let me know.

There's one other interesting twist. Remember, this happened as my custom kernel driver (xde) was trying panic. The panic message is for a supposedly zero-sized mblk. However, if you look at the mblk passed to it, it's not zero-sized.

fffffe007ce7e560 core::panicking::panic_fmt::h7d7971299bdd0756+0x2d()
fffffe007ce7e600 opte::engine::packet::Packet<opte::engine::packet::Initialized>::wrap::h7db70314098cdb88+0x220()
fffffe007ce7f130 xde_rx+0x43()
fffffe007ce7f180 mac_promisc_dispatch_one+0xac(fffffe58beae1778, fffffe59502f46c0, 0, 0)
fffffe007ce7f200 mac_promisc_dispatch+0x82(fffffe5896062010, fffffe59502f46c0, fffffe5895edcb18, 0)

> fffffe59502f46c0::mblk
            ADDR FL TYPE    LEN   BLEN              RPTR             DBLK
fffffe59502f46c0 0  data    62    80    fffffe5931ae51c2 fffffe5931ae5140

Seriously, what is happening?

Actions #1

Updated by Ryan Zezeski about 1 month ago

Also, just one more confirmation that log_sendmsg+450 lines up with source code line mp2->b_cont->b_wptr = (uchar_t *)dst;. If I print r11 as a string it shows me the end of the panic message set by xde.

log_sendmsg+0x450:              movq   %r11,0x20(%rax)  

> <r11/S
0xfffffe5891685f35:             1214:13\n

And the original message sent to log_sendmsg.

> fffffe5870729e30/S                                                                                                                        [118/686]
0xfffffe5870729e30:             xde: [ID 964519 FACILITY_AND_PRIORITY] WARNING: panicked at 'no bytes found', /home/rpz/oxidecomputer/opte/opte/src/e
ngine/packet.rs:1214:13\n
Actions

Also available in: Atom PDF