Bug #13100
Closed
zdb rpool crash on raidz
% Done: 100%
Difficulty: Bite-size
Description
An attempted run of zdb rpool crashed and left a core file:
# mdb core
Loading modules: [ libumem.so.1 libc.so.1 libzpool.so.1 libnvpair.so.1 libavl.so.1 libcmdutils.so.1 ld.so.1 ]
> ::stack
libzpool.so.1`raidz_copy_abd_cb+0x1c(81d9000, 81ae2d0, 1000, 0)
libzpool.so.1`abd_iterate_func2+0x1c3(7264280, 8188a00, 0, 0, b000, fffffc7fedcbaf00)
libzpool.so.1`avx2_gen_p+0x49(7fe18c0)
libzpool.so.1`vdev_raidz_math_generate+0x4f(7fe18c0)
libzpool.so.1`vdev_raidz_generate_parity+0x16(7fe18c0)
libzpool.so.1`raidz_parity_verify+0x9c(70959e0, 7fe18c0)
libzpool.so.1`vdev_raidz_io_done+0x48b(70959e0)
libzpool.so.1`zio_vdev_io_done+0x90(70959e0)
libzpool.so.1`zio_execute+0xe4(70959e0)
libfakekernel.so.1`taskq_thread+0x9d(cb5680)
libc.so.1`_thrp_setup+0x6c(fffffc7fe6237a40)
libc.so.1`_lwp_start()
>
Running zdb under mdb:
> ::run rpool
mdb: stop on SIGSEGV
mdb: target stopped at:
libzpool.so.1`raidz_copy_abd_cb+0x2c: vmovdqa (%rdx),%ymm0
mdb: You've got symbols!
Loading modules: [ ld.so.1 libc.so.1 libzpool.so.1 libnvpair.so.1 libavl.so.1 libcmdutils.so.1 ]
> ::regs
%rax = 0x0000000000000000 %r8 = 0x0000000000000000
%rbx = 0x000000000000b000 %r9 = 0xfffffc7fedebe1a0
%rcx = 0x0000000000000080 %r10 = 0x0000000000001000
%rdx = 0x000000000870e290 %r11 = 0xfffffc7feec95cf0
%rsi = 0x000000000870e310 %r12 = 0xfffffc7fd5739be0
%rdi = 0x000000000873e080 %r13 = 0xfffffc7fd5739bc0
%r14 = 0x00000000086eda00
%r15 = 0x0000000000001000
%cs = 0x0053 %fs = 0x0000 %gs = 0x0000
%ds = 0x004b %es = 0x004b %ss = 0x004b
%rip = 0xfffffc7fedebe1cc libzpool.so.1`raidz_copy_abd_cb+0x2c
%rbp = 0xfffffc7fd5739b90
%rsp = 0xfffffc7fd5739b90
%rflags = 0x00010246
id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0
status=<of,df,IF,tf,sf,ZF,af,PF,cf>
%gsbase = 0x0000000000000000
%fsbase = 0xfffffc7fef0f1240
%trapno = 0xd
%err = 0x0
>
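The faulting instruction, vmovdqa (%rdx),%ymm0, requires a 32-byte-aligned memory operand and raises a general protection fault (%trapno = 0xd) otherwise. But the pointer in %rdx is aligned to 16 bytes, not 32: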
ok 0x000000000870e290 32 /mod
ok .s
[data stack has 2 entries, top at 0x000000000134ce00]
[0x000000000134ce00 0]: 4425492 (0x0000000000438714)
[0x000000000134cdf8 1]: 16 (0x0000000000000010)
[data stack base at 0x000000000134cdf8]
ok
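The alignment check on %rdx can also be written in C. This is only a minimal sketch for illustration, not code from the fix: the helper names misalignment, is_aligned and check_reported_rdx are hypothetical, and the only value taken from this report is the %rdx address.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of bytes by which addr lies past the previous align-byte
 * boundary. align must be a power of two.
 */
static uintptr_t
misalignment(uintptr_t addr, uintptr_t align)
{
	return (addr & (align - 1));
}

/* Non-zero when addr sits exactly on an align-byte boundary. */
static int
is_aligned(uintptr_t addr, uintptr_t align)
{
	return (misalignment(addr, align) == 0);
}

/* Self-check using the %rdx value from the ::regs output. */
static void
check_reported_rdx(void)
{
	const uintptr_t rdx = 0x870e290;

	assert(is_aligned(rdx, 16));		/* fine for 16-byte loads */
	assert(!is_aligned(rdx, 32));		/* too weak for a %ymm vmovdqa */
	assert(misalignment(rdx, 32) == 16);	/* same remainder as /mod */
}
```

Calling check_reported_rdx() from a main() passes silently: the address is a multiple of 16 but leaves a remainder of 16 when divided by 32.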
Testing done: build, install and boot; zdb no longer crashes on raidz pools.
Updated by Dan McDonald over 2 years ago
I have a dump that's likely this bug.
::status in mdb reports a bad pointer that looks like a KERNEL pointer, so something here is using kernel pointers when it shouldn't be. It's related to the vectorized additions (#12668).
Updated by Dan McDonald over 2 years ago
- Related to Bug #12668: ZFS support for vectorized algorithms on x86 (initial support) added
Updated by Toomas Soome about 2 years ago
- Status changed from New to In Progress
- Assignee set to Toomas Soome
- % Done changed from 0 to 90
- Difficulty changed from Medium to Bite-size
Updated by Electric Monk about 2 years ago
- Status changed from In Progress to Closed
- % Done changed from 90 to 100
git commit 9178578f07d5330b7bf2b7b699ec04ea6635297a
commit 9178578f07d5330b7bf2b7b699ec04ea6635297a
Author: Toomas Soome <tsoome@me.com>
Date: 2021-04-30T06:56:25.000Z

    13100 zdb rpool crash on raidz
    Reviewed by: Igor Kozhukhov <igor@dilos.org>
    Reviewed by: Andy Fiddaman <Andy@omnios.org>
    Reviewed by: Jason King <jason.brian.king@gmail.com>
    Approved by: Dan McDonald <danmcd@joyent.com>