Bug #13100

zdb rpool crash on raidz

Added by Toomas Soome almost 3 years ago. Updated about 2 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: zfs - Zettabyte File System
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Bite-size
Tags:
Gerrit CR:
External Bug:

Description

An attempted run of zdb rpool crashed. The stack from the resulting core file:

# mdb core
Loading modules: [ libumem.so.1 libc.so.1 libzpool.so.1 libnvpair.so.1 libavl.so.1 libcmdutils.so.1 ld.so.1 ]
> ::stack
libzpool.so.1`raidz_copy_abd_cb+0x1c(81d9000, 81ae2d0, 1000, 0)
libzpool.so.1`abd_iterate_func2+0x1c3(7264280, 8188a00, 0, 0, b000, fffffc7fedcbaf00)
libzpool.so.1`avx2_gen_p+0x49(7fe18c0)
libzpool.so.1`vdev_raidz_math_generate+0x4f(7fe18c0)
libzpool.so.1`vdev_raidz_generate_parity+0x16(7fe18c0)
libzpool.so.1`raidz_parity_verify+0x9c(70959e0, 7fe18c0)
libzpool.so.1`vdev_raidz_io_done+0x48b(70959e0)
libzpool.so.1`zio_vdev_io_done+0x90(70959e0)
libzpool.so.1`zio_execute+0xe4(70959e0)
libfakekernel.so.1`taskq_thread+0x9d(cb5680)
libc.so.1`_thrp_setup+0x6c(fffffc7fe6237a40)
libc.so.1`_lwp_start()
> 

Running this with mdb:

> ::run rpool
mdb: stop on SIGSEGV
mdb: target stopped at:
libzpool.so.1`raidz_copy_abd_cb+0x2c:   vmovdqa (%rdx),%ymm0
mdb: You've got symbols!
Loading modules: [ ld.so.1 libc.so.1 libzpool.so.1 libnvpair.so.1 libavl.so.1 libcmdutils.so.1 ]
> ::regs    
%rax = 0x0000000000000000       %r8  = 0x0000000000000000
%rbx = 0x000000000000b000       %r9  = 0xfffffc7fedebe1a0
%rcx = 0x0000000000000080       %r10 = 0x0000000000001000
%rdx = 0x000000000870e290       %r11 = 0xfffffc7feec95cf0
%rsi = 0x000000000870e310       %r12 = 0xfffffc7fd5739be0
%rdi = 0x000000000873e080       %r13 = 0xfffffc7fd5739bc0
                                %r14 = 0x00000000086eda00
                                %r15 = 0x0000000000001000

%cs = 0x0053    %fs = 0x0000    %gs = 0x0000
%ds = 0x004b    %es = 0x004b    %ss = 0x004b

%rip = 0xfffffc7fedebe1cc libzpool.so.1`raidz_copy_abd_cb+0x2c
%rbp = 0xfffffc7fd5739b90
%rsp = 0xfffffc7fd5739b90

%rflags = 0x00010246
  id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0
  status=<of,df,IF,tf,sf,ZF,af,PF,cf>

%gsbase = 0x0000000000000000
%fsbase = 0xfffffc7fef0f1240
%trapno = 0xd
   %err = 0x0
>

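The faulting instruction, vmovdqa with a %ymm register, requires a 32-byte-aligned memory operand, but the pointer in %rdx is aligned to 16 bytes, not 32: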

ok 0x000000000870e290 32 /mod
ok .s
[data stack has 2 entries, top at 0x000000000134ce00]
[0x000000000134ce00   0]:              4425492 (0x0000000000438714)
[0x000000000134cdf8   1]:                   16 (0x0000000000000010)
[data stack base at 0x000000000134cdf8]
ok 
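
The same alignment check can be reproduced outside the loader prompt. The following is a minimal sketch in Python, used here only as a calculator; the helper names misalignment and is_aligned are hypothetical and are not part of zdb, mdb, or libzpool. The address is the %rdx value from the ::regs output above.

```python
def misalignment(addr: int, boundary: int) -> int:
    """Return how many bytes addr lies past the previous multiple of boundary."""
    # Hypothetical helper: boundary must be a power of two so the mask works.
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a positive power of two")
    return addr & (boundary - 1)


def is_aligned(addr: int, boundary: int) -> bool:
    """True when addr is an exact multiple of boundary."""
    return misalignment(addr, boundary) == 0


RDX = 0x870E290  # %rdx from the ::regs output in this report

# Same computation as the "/mod" line: quotient and remainder of RDX / 32.
quotient, remainder = divmod(RDX, 32)
print(hex(quotient), remainder)  # 0x438714 16

print(is_aligned(RDX, 16))  # True: fine for a 16-byte aligned load
print(is_aligned(RDX, 32))  # False: not valid for a 32-byte aligned load
```

The remainder of 16 matches the second entry on the data stack shown above, and the quotient matches the top entry (0x438714).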

Testing done: build/install/boot, zdb does not crash with raidz any more.


Related issues

Related to illumos gate - Bug #12668: ZFS support for vectorized algorithms on x86 (initial support) | Closed | Jerry Jelinek

Actions #1

Updated by Dan McDonald over 2 years ago

I have a dump that's likely this bug.

::status in mdb reports a bad pointer that looks like a kernel pointer, so something here is using kernel pointers when it shouldn't be. It's related to the vectorized additions (#12668).

Actions #2

Updated by Dan McDonald over 2 years ago

  • Related to Bug #12668: ZFS support for vectorized algorithms on x86 (initial support) added
Actions #3

Updated by Toomas Soome about 2 years ago

  • Description updated (diff)
Actions #4

Updated by Electric Monk about 2 years ago

  • Gerrit CR set to 1448
Actions #5

Updated by Toomas Soome about 2 years ago

  • Status changed from New to In Progress
  • Assignee set to Toomas Soome
  • % Done changed from 0 to 90
  • Difficulty changed from Medium to Bite-size
Actions #6

Updated by Toomas Soome about 2 years ago

  • Description updated (diff)
Actions #7

Updated by Electric Monk about 2 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 90 to 100

git commit 9178578f07d5330b7bf2b7b699ec04ea6635297a

commit  9178578f07d5330b7bf2b7b699ec04ea6635297a
Author: Toomas Soome <tsoome@me.com>
Date:   2021-04-30T06:56:25.000Z

    13100 zdb rpool crash on raidz
    Reviewed by: Igor Kozhukhov <igor@dilos.org>
    Reviewed by: Andy Fiddaman <Andy@omnios.org>
    Reviewed by: Jason King <jason.brian.king@gmail.com>
    Approved by: Dan McDonald <danmcd@joyent.com>
