Bug #4820

open

Panic in zfs_ereport_start_checksum when checksum error is EINVAL

Added by Dan Vatca over 7 years ago. Updated over 6 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Start date: 2014-04-28
Due date:
% Done: 0%
Estimated time:
Difficulty: Bite-size
Tags: needs-triage
Gerrit CR:

Description

Crash stack is the following:

ffffff005ca12cc0 zfs_ereport_start_checksum+0xa6(ffffff0d34a0f000, 0, ffffff11106134b0, 0, 18200, 0, ffffff005ca12ce0)
ffffff005ca12d50 zio_checksum_verify+0x7b(ffffff11106134b0)
ffffff005ca12d90 zio_execute+0x88(ffffff11106134b0)
ffffff005ca12dd0 zio_wait+0x37(ffffff11106134b0)
ffffff005ca12ec0 arc_read+0x5bd(0, ffffff0d34a0f000, ffffff0d8f559980, fffffffff799fde0, ffffff005ca12f10, 2, ffffff0000000040, ffffff005ca12f1c, ffffff005ca13080)
ffffff005ca12f80 backup_cb+0x119(ffffff0d34a0f000, 0, ffffff0d8f559980, ffffff005ca13080, ffffff0d8f559800, ffffff0e02ca53c0)
ffffff005ca13060 traverse_visitbp+0x231(ffffff005ca138b0, ffffff0d8f559800, ffffff0d8f559980, ffffff005ca13080)
ffffff005ca130f0 traverse_dnode+0xf6(ffffff005ca138b0, ffffff0d8f559800, 119a, 64)
ffffff005ca131d0 traverse_visitbp+0x55e(ffffff005ca138b0, ffffff0dbb114000, ffffff0e09832180, ffffff005ca13230)
ffffff005ca132b0 traverse_visitbp+0x43a(ffffff005ca138b0, ffffff0dbb114000, ffffff0d7bde7000, ffffff005ca13310)
ffffff005ca13390 traverse_visitbp+0x43a(ffffff005ca138b0, ffffff0dbb114000, ffffff0e098a9000, ffffff005ca133f0)
ffffff005ca13470 traverse_visitbp+0x43a(ffffff005ca138b0, ffffff0dbb114000, ffffff0d8f4fd000, ffffff005ca134d0)
ffffff005ca13550 traverse_visitbp+0x43a(ffffff005ca138b0, ffffff0dbb114000, ffffff0da8479000, ffffff005ca135b0)
ffffff005ca13630 traverse_visitbp+0x43a(ffffff005ca138b0, ffffff0dbb114000, ffffff0db577c000, ffffff005ca13690)
ffffff005ca13710 traverse_visitbp+0x43a(ffffff005ca138b0, ffffff0dbb114000, ffffff0dbb114040, ffffff005ca13730)
ffffff005ca137a0 traverse_dnode+0x8b(ffffff005ca138b0, ffffff0dbb114000, 119a, 0)
ffffff005ca13880 traverse_visitbp+0x625(ffffff005ca138b0, 0, ffffff0d5b076280, ffffff005ca13900)
ffffff005ca139b0 traverse_impl+0x158(ffffff0d34a0f000, ffffff0d5e8c9cc0, 119a, ffffff0d5b076280, 0, 0, ffffff000000000d, fffffffff79b0040, ffffff0e02ca53c0)
ffffff005ca13a10 traverse_dataset+0x54(ffffff0d5e8c9cc0, 0, d, fffffffff79b0040, ffffff0e02ca53c0)
ffffff005ca13ab0 dmu_send_impl+0x2ea(fffffffff7a57726, ffffff0d5a4b8600, ffffff0d5e8c9cc0, 0, 0, 1, ffffff0d360d6480, ffffff005ca13b98)
ffffff005ca13b70 dmu_send_obj+0x175(ffffff0e08507000, 119a, 0, 1, ffffff0d360d6480, ffffff005ca13b98)
ffffff005ca13bd0 zfs_ioc_send+0xc1(ffffff0e08507000)
ffffff005ca13c80 zfsdev_ioctl+0x4a7(8600000000, 5a1c, 8044c40, 100003, ffffff0d662ef910, ffffff005ca13e68)
ffffff005ca13cc0 cdev_ioctl+0x39(8600000000, 5a1c, 8044c40, 100003, ffffff0d662ef910, ffffff005ca13e68)
ffffff005ca13d10 spec_ioctl+0x60(ffffff0d5ba69980, 5a1c, 8044c40, 100003, ffffff0d662ef910, ffffff005ca13e68, 0)
ffffff005ca13da0 fop_ioctl+0x55(ffffff0d5ba69980, 5a1c, 8044c40, 100003, ffffff0d662ef910, ffffff005ca13e68, 0)
ffffff005ca13ec0 ioctl+0x9b(4, 5a1c, 8044c40)
ffffff005ca13f10 _sys_sysenter_post_swapgs+0x149()

This happened during a zfs send, but it could happen on other code paths as well. How the EINVAL arose is a separate question, and this issue does not attempt to address it.
What this issue is about is that zio_checksum_verify does not take a harder look at the return value from zio_checksum_error. The info structure (the checksum error details) is not filled in unless the error is ECKSUM. When the error is EINVAL (as in this crash), info contains whatever was previously on the stack, and this triggers a panic when the checksum error is reported.
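
The failure mode described above, and one possible guard against it, can be sketched as a small user-space model. This is not the kernel code and not a proposed patch: every name prefixed with model_ and the MODEL_ECKSUM value are hypothetical stand-ins invented for illustration. The only assumption carried over from the description is that the checker fills in the details structure solely when it reports a checksum mismatch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the checksum-mismatch errno; not the real value. */
#define MODEL_ECKSUM 1000

/* Hypothetical stand-in for the checksum error details ("info"). */
struct model_cksum_info {
    unsigned long expected;
    unsigned long actual;
};

/* Counts how many times the reporting path was entered. */
static int model_reports = 0;

/*
 * Model of the checker: it writes to *info only when it returns
 * MODEL_ECKSUM. On EINVAL the structure is left untouched.
 */
static int
model_checksum_error(unsigned long expected, unsigned long actual,
    int valid_algorithm, struct model_cksum_info *info)
{
    assert(info != NULL);
    if (!valid_algorithm)
        return (EINVAL);        /* info is NOT filled in */
    if (expected != actual) {
        info->expected = expected;
        info->actual = actual;
        return (MODEL_ECKSUM);  /* info is filled in */
    }
    return (0);
}

/* Model of the reporting path: it reads the details structure. */
static void
model_report(const struct model_cksum_info *info)
{
    assert(info->expected != info->actual);
    model_reports++;
}

/*
 * Model of the verifier with a guard: the details are handed to the
 * reporting path only when the checker actually populated them.
 */
static int
model_checksum_verify(unsigned long expected, unsigned long actual,
    int valid_algorithm)
{
    struct model_cksum_info info;   /* uninitialized stack storage */
    int error;

    error = model_checksum_error(expected, actual, valid_algorithm, &info);
    if (error == MODEL_ECKSUM)
        model_report(&info);
    return (error);
}
```

In this model, calling model_checksum_verify(1, 2, 0) returns EINVAL without entering model_report, so the uninitialized structure is never read. Whether the real fix should guard on the error code, zero the structure up front, or both is left open here.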
