Project

General

Profile

Bug #8100

8021 seems to cause random BAD TRAP: type=d (#gp General protection)

Added by Toomas Soome over 2 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
zfs - Zettabyte File System
Start date:
2017-04-24
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage

Description

My supermicro system is getting random BAD TRAP: type=d (#gp General protection) at about the stage where ZFS filesystems are mounted - usually console login prompt is already present but the services are still starting. After backing out 8021, the boot is completed and no panics do occur.

Machine does dump, however savecore fails:

savecore: bad magic number baddcafe

I can get more data out with boot -k, if needed.

# psrinfo -vp
The physical processor has 4 cores and 8 virtual processors (0-7)
  The core has 2 virtual processors (0 4)
  The core has 2 virtual processors (1 5)
  The core has 2 virtual processors (2 6)
  The core has 2 virtual processors (3 7)
    x86 (GenuineIntel 306C3 family 6 model 60 step 3 clock 3500 MHz)
      Intel(r) Xeon(r) CPU E3-1246 v3 @ 3.50GHz

# prtconf -m
32657

$ zpool status
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c3t0d0  ONLINE       0     0     0
            c3t1d0  ONLINE       0     0     0
            c3t3d0  ONLINE       0     0     0
            c3t4d0  ONLINE       0     0     0

errors: No known data errors

Files

beastie.txt (67.2 KB) beastie.txt Toomas Soome, 2017-04-24 05:22 PM
abd-raidz-fixes.patch (1.79 KB) abd-raidz-fixes.patch Daniel Kimmel, 2017-05-06 03:46 PM

Related issues

Related to illumos gate - Bug #8021: ARC buf data scatter-izationClosed2017-03-31

Actions
Has duplicate illumos gate - Bug #8107: Bad trap in "genunix" due to NULL pointer dereference on RAIDZClosed2017-04-25

Actions

History

#1

Updated by Toomas Soome over 2 years ago

  • Description updated (diff)
#2

Updated by Toomas Soome over 2 years ago

  • Description updated (diff)
#3

Updated by Marcel Telka over 2 years ago

  • Related to Bug #8021: ARC buf data scatter-ization added
#4

Updated by Dan McDonald over 2 years ago

savecore: bad magic number baddcafe

Ooooh, 0xbaddcafe == uninitialized allocated memory with a debug kernel or kmem_flags = 0xf.

#5

Updated by Daniel Kimmel over 2 years ago

A crash dump or backtrace would be really handy here. Let me know how I can help.

#6

Updated by Toomas Soome over 2 years ago

Not sure how much it helps, but attached one sample session.

#7

Updated by Toomas Soome over 2 years ago

Better trace:

panic[cpu3]/thread=ffffff003d70ac40: 
BAD TRAP: type=d (#gp General protection) rp=ffffff003d70a890 addr=ffffff003d70aad0

zpool-rpool: 
#gp General protection
addr=0xffffff003d70aad0
pid=6, pc=0xfffffffff7a33b45, sp=0xffffff003d70a980, eflags=0x10286
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 1406f8<smep,osxsav,xmme,fxsr,pge,mce,pae,pse,de>
cr2: 628000
cr3: 38ea000
cr8: 0

        rdi: ffffff0d56e02cd0 rsi: ffffff0d824b1888 rdx:                0
        rcx: baddcafebaddcaee  r8:                0  r9:                0
        rax: baddcafebaddcaee rbx: ffffff003d70a9f8 rbp: ffffff003d70a9a0
        r10:                1 r11:                1 r12: baddcafebaddcafe
        r13: ffffff003d70a9f8 r14: ffffff0d56e02f40 r15: ffffff003d70aad0
        fsb:                0 gsb: ffffff0d5d119580  ds:               4b
         es:               4b  fs:                0  gs:              1c3
        trp:                d err:                0 rip: fffffffff7a33b45
         cs:               30 rfl:            10286 rsp: ffffff003d70a980
         ss:                0         

ffffff003d70a770 unix:real_mode_stop_cpu_stage2_end+b2c3 ()
ffffff003d70a880 unix:trap+a30 ()
ffffff003d70a890 unix:cmntrap+e6 ()
ffffff003d70a9a0 zfs:zio_walk_parents+35 ()
ffffff003d70aa40 zfs:zio_done+443 ()
ffffff003d70aa70 zfs:zio_execute+7f ()
ffffff003d70ab30 genunix:taskq_thread+2d0 ()
ffffff003d70ab40 unix:thread_start+8 ()

[3]> ::stack
kmdb_enter+0xb()
debug_enter+0x59(fffffffffb9361b8)
panicsys+0x5ce(fffffffffb934948, ffffff003d70a680, fffffffffbc7e260, 1)
vpanic+0x15c()
param_preset()
0xfffffffffb84a79c()
trap+0xa30(ffffff003d70a890, ffffff003d70aad0, 3)
0xfffffffffb8001d6()
zfs`zio_walk_parents+0x35(ffffff0d56e02bd0, ffffff003d70a9f8)
zfs`zio_done+0x443(ffffff0d56e02bd0)
zfs`zio_execute+0x7f(ffffff0d56e02bd0)
taskq_thread+0x2d0(ffffff0d56acc870)
thread_start+8()
[3]>

#8

Updated by Gernot Strasser over 2 years ago

Same issue here. Smartos latest bits, built from scratch, non-debug Version.
Dell Poweredge and Microserver N36L
No savecore.

#9

Updated by Robert Mustacchi over 2 years ago

With kmem_flags I was able to get the following stack trace:

kernel memory allocator: redzone violatio
panic[cpu7]/thread=ffffff0dc4d52c20: kernel heap corruption detected

ffffff003ea75b10 fffffffffba39684 ()
ffffff003ea75ba0 genunix:kmem_cache_alloc_debug+338 ()
ffffff003ea75c00 genunix:kmem_cache_alloc+d4 ()
ffffff003ea75c40 genunix:kmem_alloc+4b ()
ffffff003ea75c90 genunix:console_vprintf+b1 ()
ffffff003ea75d00 genunix:console_printf+40 ()
ffffff003ea75d90 genunix:log_sendmsg+41f ()
ffffff003ea75f70 genunix:cprintf+3b6 ()
ffffff003ea76030 genunix:printf+91 ()
ffffff003ea760e0 genunix:kmem_error+537 ()
ffffff003ea76170 genunix:kmem_cache_alloc_debug+338 ()
ffffff003ea761d0 genunix:kmem_cache_alloc+d4 ()
ffffff003ea76210 genunix:kmem_alloc+4b ()
ffffff003ea76260 genunix:console_vprintf+b1 ()
ffffff003ea762d0 genunix:console_printf+40 ()
ffffff003ea76360 genunix:log_sendmsg+41f ()
ffffff003ea76540 genunix:cprintf+3b6 ()
ffffff003ea76600 genunix:printf+91 ()
ffffff003ea766b0 genunix:kmem_error+537 ()
ffffff003ea76740 genunix:kmem_cache_alloc_debug+338 ()
ffffff003ea767a0 genunix:kmem_cache_alloc+d4 ()
ffffff003ea767e0 genunix:kmem_alloc+4b ()
ffffff003ea76830 genunix:console_vprintf+b1 ()
ffffff003ea768a0 genunix:console_printf+40 ()
ffffff003ea76930 genunix:log_sendmsg+41f ()
ffffff003ea76b10 genunix:cprintf+3b6 ()
ffffff003ea76bd0 genunix:printf+91 ()
ffffff003ea76c80 genunix:kmem_error+537 ()
ffffff003ea76d10 genunix:kmem_cache_alloc_debug+338 ()
ffffff003ea76d70 genunix:kmem_cache_alloc+d4 ()
ffffff003ea76db0 genunix:kmem_alloc+4b ()
ffffff003ea76e00 genunix:console_vprintf+b1 ()
ffffff003ea76e70 genunix:console_printf+40 ()
ffffff003ea76f00 genunix:log_sendmsg+41f ()
ffffff003ea770e0 genunix:cprintf+3b6 ()
ffffff003ea771a0 genunix:printf+91 ()
ffffff003ea77250 genunix:kmem_error+537 ()
ffffff003ea77290 genunix:kmem_free+1a8 ()
ffffff003ea772c0 zfs:abd_put+83 ()
ffffff003ea77300 zfs:vdev_raidz_map_free+aa ()
ffffff003ea773b0 zfs:vdev_raidz_physio+288 ()
ffffff003ea77450 zfs:zvol_dumpio_vdev+20b ()
ffffff003ea774f0 zfs:zvol_dumpio+12d ()
ffffff003ea77590 zfs:zvol_strategy+2e5 ()
ffffff003ea775c0 genunix:bdev_strategy+61 ()
ffffff003ea77620 specfs:spec_startio+8e ()
ffffff003ea77710 specfs:spec_getapage+1e9 ()
ffffff003ea77800 genunix:pvn_getpages+161 ()
ffffff003ea778b0 specfs:spec_getpage+d2 ()
ffffff003ea77970 genunix:fop_getpage+7e ()
ffffff003ea77b60 genunix:vpm_map_pages+2eb ()
ffffff003ea77c50 genunix:vpm_data_copy+8a ()
ffffff003ea77cf0 specfs:spec_read+15e ()
ffffff003ea77d90 genunix:fop_read+f3 ()
ffffff003ea77f00 genunix:pread+1a3 ()
ffffff003ea77f10 unix:brand_sys_syscall+21d ()

n: write past end of buffer

panic: entering debugger (continue to save dump)
Loaded modules: [ scsi_vhci stmf_sbd crypto mac cpc uppc neti sd ptm ufs mpt zfs
 sppp sata apix uhci fctl hook lofs idm ip logindmux usba xhci specfs pcplusmp 
nfs random qlc mm cpu.generic arp mpt_sas stmf sockfs smbsrv ]
kmdb: target stopped at:
kmdb_enter+0xb: movq   %rax,%rdi

Unfortunately kmdb blew up before I could get more info.

#10

Updated by Robert Mustacchi over 2 years ago

New one on the next boot:

f4ee^ bxstat = 0, should be f4ee

panic[cpu8]/thread=ffffff0ee71c10a0: vmem_hash_delete(ffffff0cfea0d000, ffffff0dd711ac10, 3857904): bad free

ffffff003fbf2190 genunix:vmem_hash_delete+9b ()
ffffff003fbf21f0 genunix:vmem_xfree+4b ()
ffffff003fbf2220 genunix:vmem_free+23 ()
ffffff003fbf2260 genunix:kmem_free+128 ()
ffffff003fbf22a0 zfs:abd_free_scatter+e2 ()
ffffff003fbf22c0 zfs:abd_free+16 ()
ffffff003fbf2300 zfs:vdev_raidz_map_free+4e ()
ffffff003fbf23b0 zfs:vdev_raidz_physio+288 ()
ffffff003fbf2450 zfs:zvol_dumpio_vdev+20b ()
ffffff003fbf24f0 zfs:zvol_dumpio+12d ()
ffffff003fbf2590 zfs:zvol_strategy+2e5 ()
ffffff003fbf25c0 genunix:bdev_strategy+61 ()
ffffff003fbf2620 specfs:spec_startio+8e ()
ffffff003fbf2710 specfs:spec_getapage+1e9 ()
ffffff003fbf2800 genunix:pvn_getpages+161 ()
ffffff003fbf28b0 specfs:spec_getpage+d2 ()
ffffff003fbf2970 genunix:fop_getpage+7e ()
ffffff003fbf2b60 genunix:vpm_map_pages+2eb ()
ffffff003fbf2c50 genunix:vpm_data_copy+8a ()
ffffff003fbf2cf0 specfs:spec_read+15e ()
ffffff003fbf2d90 genunix:fop_read+f3 ()
ffffff003fbf2f00 genunix:pread+1a3 ()
ffffff003fbf2f10 unix:brand_sys_syscall+21d ()
#11

Updated by Robert Mustacchi over 2 years ago

If we look at the abd_t that we attempted to free, in this most recent case, it is uninitialized:

[8]> $C
fffffffffbc82ec0 kmdb_enter+0xb()
fffffffffbc82ef0 debug_enter+0x59(fffffffffb934e08)
fffffffffbc82fd0 panicsys+0x5ce(fffffffffbb74138, ffffff003fbf2100, 
fffffffffbc82fe0, 1)
ffffff003fbf20f0 vpanic+0x15c()
ffffff003fbf2160 param_preset()
ffffff003fbf2190 vmem_hash_delete+0x9b(ffffff0cfea0d000, ffffff0dd711ac10, 
3addf0)
ffffff003fbf21f0 vmem_xfree+0x4b(ffffff0cfea0d000, ffffff0dd711ac10, 3addf0)
ffffff003fbf2220 vmem_free+0x23(ffffff0cfea0d000, ffffff0dd711ac10, 3addf0)
ffffff003fbf2260 kmem_free+0x128(ffffff0dd711ac10, 3addf0)
ffffff003fbf22a0 zfs`abd_free_scatter+0xe2(ffffff0dd711ac10)
ffffff003fbf22c0 zfs`abd_free+0x16(ffffff0dd711ac10)
ffffff003fbf2300 zfs`vdev_raidz_map_free+0x4e(ffffff0d68c8ed00)
ffffff003fbf23b0 zfs`vdev_raidz_physio+0x288(ffffff0d90010c00, fffffe06405d7000
, 1000, 5ffabb000, 5ffaab000, 40, ffffff0000000000)
ffffff003fbf2450 zfs`zvol_dumpio_vdev+0x20b(ffffff0d90010c00, fffffe06405d7000, 
5ffabb000, 5ffaab000, 1000, 40, ffffff0d00000000)
ffffff003fbf24f0 zfs`zvol_dumpio+0x12d(ffffff0d578bb780, fffffe06405d7000, 
3ff8f0000, 1000, 40, 0)
ffffff003fbf2590 zfs`zvol_strategy+0x2e5(ffffff0dc85e5180)
ffffff003fbf25c0 bdev_strategy+0x61(ffffff0dc85e5180)
ffffff003fbf2620 specfs`spec_startio+0x8e(ffffff0d5d0dbb00, ffffff002e70a808, 
3ff8f0000, 1000, 40)                  
ffffff003fbf2710 specfs`spec_getapage+0x1e9(ffffff0d5d0dbb00, 3ff8f0000, 1000, 
ffffff003fbf2afc, ffffff003fbf29f0, 1000, fffffffffbc3d460, 8f5000, 1, 
ffffff0d5d379b70)
ffffff003fbf2800 pvn_getpages+0x161(fffffffffbbe06c0, ffffff0d5d0dbb00, 
3ff8f0000, 1000, ffffff003fbf2afc, ffffff003fbf29f0, 1000, fffffffffbc3d460, 
8f5000, ffffff0000000001, ffffff0d5d379b70)
ffffff003fbf28b0 specfs`spec_getpage+0xd2(ffffff0d5d0dbb00, 3ff8f0000, 1000, 
ffffff003fbf2afc, ffffff003fbf29f0, 1000, fffffffffbc3d460, 8f5000, 
ffffff0000000001, ffffff0d5d379b70, 0)
ffffff003fbf2970 fop_getpage+0x7e(ffffff0d5d0dbb00, 3ff8f0000, 1000, 
ffffff003fbf2afc, ffffff003fbf29f0, 1000, fffffffffbc3d460, 8f5000, 1, 
ffffff0d5d379b70, 0)
ffffff003fbf2b60 vpm_map_pages+0x2eb(ffffff0d5d0dbb00, 3ff8f0000, 778, 1, 
ffffff003fbf2ba0, 3, ffffff003fbf2bec, ffffff0000000001)
ffffff003fbf2c50 vpm_data_copy+0x8a(ffffff0d5d0dbb00, 3ff8f0000, 778, 
ffffff003fbf2e60, 1, 0, ffffff0d00000000, 1)
ffffff003fbf2cf0 specfs`spec_read+0x15e(ffffff0d57724100, ffffff003fbf2e60, 0, 
ffffff0d5d379b70, 0)
ffffff003fbf2d90 fop_read+0xf3(ffffff0d57724100, ffffff003fbf2e60, 0, 
ffffff0d5d379b70, 0)
ffffff003fbf2f00 pread+0x1a3(4, 426ac0, 778, 3ff8f0000)
ffffff003fbf2f10 sys_syscall+0x1a2()
[8]> ffffff0dd711ac10::print abd_t
{
    abd_flags = 0xbaddcafe (ABD_FLAG_{OWNER|META}|0xffffffffbaddcaf8)
    abd_size = 0xbaddcafe
    abd_parent = 0xbaddcafebaddcafe
    abd_children = {
        rc_count = 0
    }
    abd_u = {
        abd_scatter = {
            abd_offset = 0xbaddcafe
            abd_chunk_size = 0xbaddcafe
            abd_chunks = [ ... ]
        }
        abd_linear = {
            abd_buf = 0xbaddcafebaddcafe
        }
    }
}

Note that 0xbaddcafe means that the data has been allocated, but not initialized. This comes from the raidz map. If we look at it, we can figure out which abd_t this is. It can either be the copy abd on the radiz map or it can be one related to one of the columns:

[8]> ffffff0d68c8ed00::print raidz_map_t
{
    rm_cols = 0x3
    rm_scols = 0x3
    rm_bigcols = 0
    rm_asize = 0x30000
    rm_missingdata = 0
    rm_missingparity = 0
    rm_firstdatacol = 0x1
    rm_nskip = 0
    rm_skipstart = 0
    rm_abd_copy = 0
    rm_reports = 0
    rm_freed = 0
    rm_ecksuminjected = 0
    rm_col = [
        {
            rc_devidx = 0
            rc_offset = 0x1ffe39000
            rc_size = 0x10000
            rc_abd = 0xffffff0dd711ac10
            rc_gdata = 0
            rc_error = 0
            rc_tried = 0
            rc_skipped = 0            
        },
    ]
}
[8]> ffffff0d68c8ed00::print raidz_map_t rm_firstdatacol
rm_firstdatacol = 0x1

Note here that the rm_firstdatacol is 0x1 and that the copy abd is NULL. So we must be freeing one of the other ones. If we look here, it turns out its the first one that's uninitialized:

[8]> ffffff0d68c8ed00::print raidz_map_t rm_firstdatacol
rm_firstdatacol = 0x1
[8]> ffffff0d68c8ed00::print raidz_map_t rm_col[0]      
rm_col[0] = {
    rm_col[0].rc_devidx = 0
    rm_col[0].rc_offset = 0x1ffe39000
    rm_col[0].rc_size = 0x10000
    rm_col[0].rc_abd = 0xffffff0dd711ac10
    rm_col[0].rc_gdata = 0
    rm_col[0].rc_error = 0
    rm_col[0].rc_tried = 0
    rm_col[0].rc_skipped = 0
}
[8]> ffffff0d68c8ed00::print raidz_map_t rm_col[0].rc_abd
rm_col[0].rc_abd = 0xffffff0dd711ac10
[8]> ffffff0d68c8ed00::print raidz_map_t rm_col[0].rc_abd[]
rm_col[0].rc_abd = {
    rm_col[0].rc_abd->abd_flags = 0xbaddcafe (ABD_FLAG_{OWNER|META}|
        0xffffffffbaddcaf8)
    rm_col[0].rc_abd->abd_size = 0xbaddcafe
    rm_col[0].rc_abd->abd_parent = 0xbaddcafebaddcafe
    rm_col[0].rc_abd->abd_children = {
        rc_count = 0
    }
    rm_col[0].rc_abd->abd_u = {
        abd_scatter = {
            abd_offset = 0xbaddcafe
            abd_chunk_size = 0xbaddcafe
            abd_chunks = [ ... ]
        }
        abd_linear = {
            abd_buf = 0xbaddcafebaddcafe
        }
    }
}

How could this happen? Well, if we look at the code in vdev_raidz_map_alloc it does seem to properly initialize this abd_t, but somehow it's been uninitialized. Just to verify:

[8]> ffffff0d68c8ed00::print -t raidz_map_t rm_col[0].rc_abd
abd_t *rm_col[0].rc_abd = 0xffffff0dd711ac10

That does match the abd_t that we have here. It's not clear if all of these are going to be similar in nature or not. But clearly once we have the uninitialized abd_t, it's pretty clear that this is a problem.

It's not clear exactly how this would be uninitialized, as it does seem to be allocated, but that abd seems to have been clobbered along the way, or some other memory corruption happened and someone allocated something even larger surrounding this. Given the vmem span corruption here, that's not surprising.

#12

Updated by Igor Kozhukhov over 2 years ago

  • Related to Bug #8107: Bad trap in "genunix" due to NULL pointer dereference on RAIDZ added
#13

Updated by Michal Nowak over 2 years ago

  • Related to deleted (Bug #8107: Bad trap in "genunix" due to NULL pointer dereference on RAIDZ)
#14

Updated by Michal Nowak over 2 years ago

  • Has duplicate Bug #8107: Bad trap in "genunix" due to NULL pointer dereference on RAIDZ added
#15

Updated by Jorgen Lundman over 2 years ago

Just wanted to make sure that the message sent to the openzfs list arrives here, where we discovered abd_copy() will be called by raidz where src and dst are the same. Adding a test (to skip work) for it resolved the panic issues we were experiencing.

#16

Updated by Gernot Strasser over 2 years ago

Where can I get the fixed version for another test?

#17

Updated by Michal Nowak over 2 years ago

Gernot Strasser wrote:

Where can I get the fixed version for another test?

You may try these images: http://dlc.openindiana.org/isos/hipster/20170502/.

#18

Updated by Gernot Strasser over 2 years ago

sorry I meant the source, I am building myself.
thanks anyway

#19

Updated by Daniel Kimmel over 2 years ago

There isn't a real fix yet because we don't have a root cause, but some folks had success using the patch mentioned here:
https://github.com/openzfs/openzfs/commit/4ee0199ec059ca9417a72125beae8d89fbc5a9df

#20

Updated by Daniel Kimmel over 2 years ago

Sorry, I should clarify -- that commit is just backing out ABD, but is not what I'm talking about :). See the comments at the very bottom of the page.

#21

Updated by Daniel Kimmel over 2 years ago

Thanks to the generous folks at Joyent, I was able to get access to a system that was reproducing the problem and root caused it to this line from the original integration (https://github.com/openzfs/openzfs/commit/2c9df17e7833028517263307ab877871c829e9a2#diff-826708a55165ad963711ccae15edaad6R1850), where for I/O to RAID-Z dump devices I passed an abd_t where I meant to pass the data buf associated with it. The Linux version of ABD doesn't have this mistake in it, because they never pulled over the illumos patch to allow RAID-Z dump devices.

I also audited the rest of the ABD / RAID-Z interactions, since that's the only part of the patch that Delphix hasn't been using for months. The full diff includes a couple of small cleanups elsewhere in that code but only the one correctness fix above. I'm attaching the patch file here in case anyone wants to give it a shot on their local system.

I'll attempt to reintegrate ABD after I hear back that this seems to be working for folks whose systems were borked before.

#22

Updated by Gernot Strasser over 2 years ago

This fixes the problem also for me. So far no issues found.
Thanks!

#23

Updated by Electric Monk over 2 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 770499e185d15678ccb0be57ebc626ad18d93383

commit  770499e185d15678ccb0be57ebc626ad18d93383
Author: Dan Kimmel <dan.kimmel@delphix.com>
Date:   2017-05-18T17:15:51.000Z

    8021 ARC buf data scatter-ization
    8100 8021 seems to cause random BAD TRAP: type=d (#gp General protection)
    Reviewed by: Matthew Ahrens mahrens@delphix.com
    Reviewed by: George Wilson george.wilson@delphix.com
    Reviewed by: Paul Dagnelie pcd@delphix.com
    Reviewed by: John Kennedy john.kennedy@delphix.com
    Reviewed by: Prakash Surya prakash.surya@delphix.com
    Reviewed by: Prashanth Sreenivasa pks@delphix.com
    Reviewed by: Pavel Zakharov pavel.zakharov@delphix.com
    Reviewed by: Chris Williamson chris.williamson@delphix.com
    Approved by: Richard Lowe <richlowe@richlowe.net>

Also available in: Atom PDF