Bug #15883: DMA buffer allocation gets redzone page while dumping

Added by Thirteen Oxide 18 days ago.

Status: New
Priority: Normal
Assignee: -
Category: kernel
% Done: 0%
Difficulty: Hard

Description

While testing a fix for an unrelated bug, I encountered a secondary panic while dumping, thankfully with kmdb loaded. This machine has an NVMe dump device, and I was running reboot -d in a loop. The secondary panic is as follows:

Sep  5 13:52:27 gimlet-sn15 reboot: initiated by root

panic[cpu35]/thread=fffffdcb103ec880: forced crash dump initiated at user request

Warning - stack not written to the dump buffer
fffffb045572be20 genunix:kadmin+62f ()
fffffb045572beb0 genunix:uadmin+11d ()
fffffb045572bf00 unix:brand_sys_syscall32+290 ()

dumping to /dev/dsk/c1t00A0750130082CFBd0s4, offset 219902377984, content: kernel + curproc

panic[cpu35]/thread=fffffdcb103ec880: BAD TRAP: type=e (#pf Page fault) rp=fffffffffbe9d970 addr=fffffdcbdc695000

panic: entering debugger (continue to reboot)

Welcome to kmdb
Loaded modules: [ scsi_vhci crypto mac cpc ipcc neti ptm ufs unix zfs krtld sata apix hook lofs genunix ip logindmux nsmb specfs random mm cpu.generic arp sockfs smbsrv ]
[35]> $C
fffffffffbe9d508 kmdb_enter+0xb()
fffffffffbe9d538 debug_enter+0x75(fffffffffbdb1900)
fffffffffbe9d628 panicsys+0x597(fffffffffbdaa4f0, fffffffffbe9d798, fffffffffbe9d638, 0)
fffffffffbe9d780 vpanic+0x15c()
fffffffffbe9d7f0 0xfffffffffbcc9d41()
fffffffffbe9d880 die+0x115(e, fffffffffbe9d970, fffffdcbdc695000, 23)
fffffffffbe9d960 trap+0xa45(fffffffffbe9d970, fffffdcbdc695000, 23)
fffffffffbe9d970 cmntrap_pushed+0xf6()
fffffffffbe9da80 copy_pattern+0x24(baddcafebaddcafe, fffffdcbdc695000, 1000)
fffffffffbe9db10 kmem_slab_create+0xde(fffffdca78f80008, 1)
fffffffffbe9db80 kmem_slab_alloc+0x18a(fffffdca78f80008, 1)
fffffffffbe9dbd0 kmem_cache_alloc+0x163(fffffdca78f80008, 1)
fffffffffbe9dc50 rootnex`rootnex_coredma_allochdl+0x3f(fffffdca3eb2ad18, fffffdca78ff1d10, fffffffffbe9de38, 0, 0, fffffdcb185679f8)
fffffffffbe9dcc0 rootnex`rootnex_dma_allochdl+0xc5(fffffdca3eb2ad18, fffffdca78ff1d10, fffffffffbe9de38, 0, 0, fffffdcb185679f8)
fffffffffbe9dd40 ddi_dma_allochdl+0x7e(fffffdca3eb298c0, fffffdca78ff1d10, fffffffffbe9de38, 0, 0, fffffdcb185679f8)
fffffffffbe9dda0 pcieb`pcieb_dma_allochdl+0x3e(fffffdca3eb298c0, fffffdca78ff1d10, fffffffffbe9de38, 0, 0, fffffdcb185679f8)
fffffffffbe9de20 ddi_dma_allochdl+0x7e(fffffdca78ff1d10, fffffdca78ff1d10, fffffffffbe9de38, 0, 0, fffffdcb185679f8)
fffffffffbe9ded0 ddi_dma_alloc_handle+0x8c(fffffdca78ff1d10, fffffdca78fe9988, 0, 0, fffffdcb185679f8)
fffffffffbe9df20 blkdev`bd_xfer_ctor+0x78(fffffdcb185679e8, fffffdca78fe9780, 1)
fffffffffbe9dfb0 kmem_cache_alloc_debug+0x18a(fffffdcaeced4008, fffffdcb185679e8, 1, 0, fffffffff803a576)
fffffffffbe9e000 kmem_cache_alloc+0x138(fffffdcaeced4008, 1)
fffffffffbe9e0a0 blkdev`bd_xfer_alloc+0x56(fffffdca78fe9780, fffffdcb039dcd40, fffffffff8034be0, 1)
fffffffffbe9e140 blkdev`bd_dump+0x124(1100000044, fffffdcb2fd22000, 19999a08, 5f8)
fffffffffbe9e170 bdev_dump+0x37(1100000044, fffffdcb2fd22000, 19999a08, 5f8)
fffffffffbe9e1e0 specfs`spec_dump+0x66(fffffdcb0878e800, fffffdcb2fd22000, 19999a08, 5f8, 0)
fffffffffbe9e250 fop_dump+0x4b(fffffdcb0878e800, fffffdcb2fd22000, 19999a08, 5f8, 0)
fffffffffbe9e2a0 dumpvp_flush+0xd7()
fffffffffbe9e2e0 dumpvp_write+0xac(fffffdca3eb93746, b)
fffffffffbe9e310 dumpvp_ksyms_write+0x1c(fffffdca3eb93746, 0, b)
fffffffffbe9e360 ksyms_emit+0x47(fffffffffbe9e628, fffffdca3eb93746, b, 8)
fffffffffbe9e400 ksyms_walk_one+0xdb(fffffffffbe9e628, fffffdca3eb4f000, 564f3)
fffffffffbe9e570 vmem_walk+0xf4(fffffdca0856d000, 1, fffffffffc0d74f0, fffffffffbe9e628)
fffffffffbe9e610 ksyms_walk+0xe4(fffffffffbe9e628, 0, 7ffffffffff0eb27, fffffffffc07ca70, 0, 8)
fffffffffbe9e8b0 ksyms_snapshot+0x2cd(fffffffffc07ca70, 0, 7fffffffffffffff)
fffffffffbe9e950 dumpsys+0x228()
fffffffffbe9ea40 panicsys+0x241(fffffffffc27ccb0, fffffb045572bd38, fffffffffbe9ea50, 1)
fffffb045572bd20 vpanic+0x15c()
fffffb045572bd90 0xfffffffffbcc9d41()
fffffb045572be20 kadmin+0x62f(5, 1, 0, fffffdcbbb743a00)
fffffb045572beb0 uadmin+0x11d(5, 1, 0)
fffffb045572bf00 sys_syscall32+0x20f()
[35]>

Because this fault occurred during dumping, it cannot be debugged post-mortem, so I started looking at it from kmdb in the hope that we'll have enough to go on.
[35]> fffffdcbdc695000::whatis
fffffdcbdc695000 is allocated from the kmem_firewall vmem arena:
            ADDR TYPE            START              END             SIZE
                                THREAD        TIMESTAMP
fffffdcb32575c18 ALLC fffffdcbdc695000 fffffdcbdc696000             4096
[35]> fffffdcbdc695000::vtop
Address fffffdcbdc695000 is unmapped.
kmdb: failed to get physical mapping: no mapping for address
[35]> fffffdca78f80008::kmem_cache
ADDR             NAME                      FLAG  CFLAG  BUFSIZE  BUFTOTL
fffffdca78f80008 rootnex_dmahdl            0269 000000     2744     2578
[35]> fffffffffbe9de38::print ddi_dma_attr_t
{
    dma_attr_version = 0
    dma_attr_addr_lo = 0
    dma_attr_addr_hi = 0xffffffffffffffff
    dma_attr_count_max = 0xfff
    dma_attr_align = 0x1000
    dma_attr_burstsizes = 0x7ff
    dma_attr_minxfer = 0x1000
    dma_attr_maxxfer = 0x20000
    dma_attr_seg = 0xfff
    dma_attr_sgllen = 0xffffffff
    dma_attr_granular = 0x200
    dma_attr_flags = 0x200
}
[35]>

This doesn't seem particularly promising by itself: the buffer is nominally allocated from the firewall arena, yet its page is unmapped. That suggests the possibility of memory corruption elsewhere.
[35]> ::kmem_verify
Cache Name                      Addr             Cache Integrity
...
blkdev1_xfer                    fffffdcaeced4008 1 corrupt buffer
...
[35]> fffffdcaeced4008::kmem_verify
Summary for cache 'blkdev1_xfer'
  buffer fffffdcb2417e548 (allocated) has a corrupt buftag
[35]> fffffdcb2417e548::whatis
fffffdcb2417e548 is allocated from blkdev1_xfer:
            ADDR          BUFADDR        TIMESTAMP           THREAD
                            CACHE          LASTLOG         CONTENTS
fffffdcb186cfd28 fffffdcb2417e548       2269710667 fffffb045071ec20
                 fffffdcaeced4008 fffffdc77b0929c0 fffffdc901326a50
                 kmem_cache_free_debug+0x114
                 kmem_cache_free+0x89
                 blkdev`bd_xfer_free+0x4d
                 blkdev`bd_xfer_done+0x1c7
                 nvme`nvme_bd_xfer_done+0x6e
                 taskq_thread+0x2ee
                 thread_start+0xb

That's interesting, and possibly related, as blkdev is once again on the scene. It looks like this buffer has been freed, but kmem still thinks it's allocated.
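
For context, here (paraphrased from the illumos kmem debug support; the real definitions live in kmem_impl.h) is roughly the per-buffer state that ::kmem_verify checks:

typedef struct kmem_buftag {
	uint64_t	bt_redzone;	/* constant redzone pattern */
	kmem_bufctl_t	*bt_bufctl;	/* bufctl for this buffer */
	intptr_t	bt_bxstat;	/* bufctl ^ (alloc/free token) */
} kmem_buftag_t;

A free that has started but not yet finished updating this state is indistinguishable from corruption to the verifier. Let's see who else might be around: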
[35]> ::stacks -m blkdev
THREAD           STATE    SOBJ                COUNT
fffffb045071ec20 ONPROC   <NONE>                  1
                 apix`apix_hilevel_intr_prolog+0xa0
                 apix`apix_do_interrupt+0x565
                 _interrupt+0x1f2
                 mutex_delay_default+0xa
                 dumpsys_helper+0x2b
                 panic_idle+0x25
                 xc_serv+0xd5
                 xc_common+0x241
                 xc_call+0x3d
                 hat_tlb_inval_range+0x2f9
                 hat_tlb_inval+0x29
                 unlink_ptp+0xb5
                 htable_release+0x89
                 hat_unload_callback+0x375
                 hat_unload+0x66
                 segkmem_xfree+0x75
                 segkmem_free+0x23
                 vmem_xfree+0x1c2
                 vmem_free+0x3d
                 kmem_slab_destroy+0x64
                 kmem_slab_free+0x3d7
                 kmem_slab_free_constructed+0x87
                 kmem_cache_free+0x118
                 rootnex`rootnex_coredma_freehdl+0x3e
                 rootnex`rootnex_dma_freehdl+0x75
                 ddi_dma_freehdl+0x56
                 ddi_dma_freehdl+0x56
                 ddi_dma_free_handle+0x1b
                 blkdev`bd_xfer_dtor+0x27
                 kmem_cache_free_debug+0x221
                 kmem_cache_free+0x89
                 blkdev`bd_xfer_free+0x4d
                 blkdev`bd_xfer_done+0x1c7
                 nvme`nvme_bd_xfer_done+0x6e
                 taskq_thread+0x2ee
                 thread_start+0xb

Is this the thread from the "corrupt" bufctl?
[35]> fffffb045071ec20::findstack -v
stack pointer for thread fffffb045071ec20 (tq:nvme0_cmd_taskq1): fffffb045071deb0
  fffffb045071df20 apix`apix_hilevel_intr_prolog+0xa0(fffffdcafb8b0000, f, a, fffffb045071dfa0)
  fffffb045071df90 apix`apix_do_interrupt+0x565(fffffb045071dfa0, fffffffffbe41ae0)
  fffffb045071dfa0 _interrupt+0x1f2()
  fffffb045071e0d0 mutex_delay_default+0xa()
  fffffb045071e0f0 dumpsys_helper+0x2b()
  fffffb045071e100 panic_idle+0x25()
  fffffb045071e180 xc_serv+0xd5(0, 0)
  fffffb045071e200 xc_common+0x241(fffffffffbc31fb0, fffffdc5198e5f70, fffffb045071e268, 0, fffffb045071e2a0, 2)
  fffffb045071e250 xc_call+0x3d(fffffdc5198e5f70, fffffb045071e268, 0, fffffb045071e2a0, fffffffffbc31fb0)
  fffffb045071e310 hat_tlb_inval_range+0x2f9(fffffdc5198e5f70, fffffb045071e328)
  fffffb045071e350 hat_tlb_inval+0x29(fffffdc5198e5f70, fffffdcbdc600000)
  fffffb045071e3b0 unlink_ptp+0xb5(fffffdcbbf53f088, fffffdcb183ef400, fffffdcbdc600000)
  fffffb045071e410 htable_release+0x89(fffffdcb183ef400)
  fffffb045071e580 hat_unload_callback+0x375(fffffdc5198e5f70, fffffdcbdc68d000, 1000, 4, 0)
  fffffb045071e5e0 hat_unload+0x66(fffffdc5198e5f70, fffffdcbdc68d000, 1000, 4)
  fffffb045071e670 segkmem_xfree+0x75(fffffdc519807000, fffffdcbdc68d000, 1000, fffffffffc301900, 0)
  fffffb045071e6a0 segkmem_free+0x23(fffffdc519807000, fffffdcbdc68d000, 1000)
  fffffb045071e710 vmem_xfree+0x1c2(fffffdc519808000, fffffdcbdc68d000, 1000)
  fffffb045071e740 vmem_free+0x3d(fffffdc519808000, fffffdcbdc68d000, 1000)
  fffffb045071e780 kmem_slab_destroy+0x64(fffffdca78f80008, fffffdcbdb6a2cd0)
  fffffb045071e7d0 kmem_slab_free+0x3d7(fffffdca78f80008, fffffdcbdc68d540)
  fffffb045071e810 kmem_slab_free_constructed+0x87(fffffdca78f80008, fffffdcbdc68d540, 1)
  fffffb045071e850 kmem_cache_free+0x118(fffffdca78f80008, fffffdcbdc68d540)
  fffffb045071e890 rootnex`rootnex_coredma_freehdl+0x3e(fffffdca3eb2ad18, fffffdca78ff1d10, fffffdcbdc68d540)
...

It sure is! The blkdev taskq thread was here, trying to free a completed I/O buffer:
static void
bd_xfer_free(bd_xfer_impl_t *xi)
{
	if (xi->i_dmah) {
		(void) ddi_dma_unbind_handle(xi->i_dmah);
	}
	if (xi->i_dfl != NULL) {
		dfl_free((dkioc_free_list_t *)xi->i_dfl);
		xi->i_dfl = NULL;
	}
	kmem_cache_free(xi->i_bd->d_cache, xi);
}

The last call here takes us to the dtor; as the stack above shows, with kmem debugging enabled, kmem_cache_free_debug() invokes it directly:
static void
bd_xfer_dtor(void *buf, void *arg)
{
	bd_xfer_impl_t	*xi = buf;

	_NOTE(ARGUNUSED(arg));

	if (xi->i_dmah)
		ddi_dma_free_handle(&xi->i_dmah);
	xi->i_dmah = NULL;
}

The dtor is trying to free the associated DMA handle, which in turn frees a buffer allocated out of the rootnex DMA handle cache, the same cache we panicked allocating from. While that thread was waiting for its TLB shootdowns to complete, it took a priority xcall itself -- to enter panic_idle. It gets a bit fuzzy here, but it seems clear that when we next allocate from the firewall arena, while dumping, we get back a redzone page: one whose mapping the interrupted free had already torn down. It's unclear whether this is a problem with the kmem implementation or with the thread that's shooting down the mappings being interruptible at this point. The buftag is presumably not actually corrupt; we're just seeing inconsistent state because the free is in progress.
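
To make the faulting frame concrete, here is a sketch of copy_pattern(), paraphrasing the illumos kmem implementation: it fills a new buffer with a 64-bit debug pattern. Per the panic stack, it was writing 0xbaddcafebaddcafe over the fresh 0x1000-byte slab at fffffdcbdc695000, and the very first store faulted because that page has no mapping:

static void
copy_pattern(uint64_t pattern, void *buf_arg, size_t size)
{
	uint64_t *bufend = (uint64_t *)((char *)buf_arg + size);
	uint64_t *buf = buf_arg;

	/* Fill the buffer with the debug pattern, 8 bytes at a time. */
	while (buf < bufend)
		*buf++ = pattern;	/* first store takes the #pf */
}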
