Project

General

Profile

Actions

Bug #5496

open

NULL pointer dereference in zfs:zap_leaf_lookup+4c caused by zfs recv of zvol replication stream

Added by Nils Goroll over 6 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Start date:
2014-12-30
Due date:
% Done:

0%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage
Gerrit CR:

Description

This is happening on smartos, but I suspect the issue to be general, so I log the bug with illumos. I apologize in case this was an invalid assumption.

I have a zvol which, when sent with -R and recv'd, will render the receiving pool unusuable in that the kernel panicks upon access to the zvol.
The panick will occur during
  • any import without -N
  • access to the zvol after an import with -N

Attempts to recover the receiving pool with zpool import -T <txg> for a txg before the recv was not possible, it failed with either one or more devices is currently unavailable or resulted in a panic

[root@e0-db-55-20-75-4e ~]# zdb -hhe franktmp

History:
...
2014-12-30.00:00:09 zfs snapshot franktmp/C5005C208641d0p0@zfs-auto-snap:daily-2014-12-30-00:00:00
  history command: 'zfs snapshot franktmp/C5005C208641d0p0@zfs-auto-snap:daily-2014-12-30-00:00:00'
  history who: 51
  history time: 1419897609
  history hostname: 'a0-36-9f-1e-71-40'
unrecognized record:
  dsname: 'franktmp/C5005C208641d0p0@1'
  dsid: 67
  history internal str: ''
  internal_name: 'snapshot'
  history txg: 6536
  history time: 1419924975
  history hostname: 'a0-36-9f-1e-71-40'
unrecognized record:
  ioctl: 'snapshot'
  in_nvl:
      snaps:
          franktmp/C5005C208641d0p0@1
      props:
  history who: 0
  history time: 1419925005
  history hostname: 'a0-36-9f-1e-71-40'
2014-12-30.07:36:45 zfs snapshot franktmp/C5005C208641d0p0@1
  history command: 'zfs snapshot franktmp/C5005C208641d0p0@1'
  history who: 0
  history time: 1419925005
  history hostname: 'a0-36-9f-1e-71-40'
unrecognized record:
  dsname: 'franktmp/d7f3c77a-2827-11e2-a972-436310932bf7'
  dsid: 72
  history internal str: ''
  internal_name: 'receive'
  history txg: 6539
  history time: 1419925038
  history hostname: 'a0-36-9f-1e-71-40'
unrecognized record:
  dsname: 'franktmp/d7f3c77a-2827-11e2-a972-436310932bf7'
  dsid: 72
  history internal str: '$hasrecvd='
  internal_name: 'set'
  history txg: 6540
  history time: 1419925038
  history hostname: 'a0-36-9f-1e-71-40'
unrecognized record:
  dsname: 'franktmp/d7f3c77a-2827-11e2-a972-436310932bf7'
  dsid: 72
  history internal str: 'snap=zfs-auto-snap:frequent-2013-10-29-20:15:00'
  internal_name: 'finish receiving'
  history txg: 6543
  history time: 1419925049
  history hostname: 'a0-36-9f-1e-71-40'
...
zdb -ul /dev/rdsk/c0t5000C50055E57AABd0s0
...
Uberblock[6]
        magic = 0000000000bab10c
        version = 5000
        txg = 5638
        guid_sum = 7104458112854163311
        timestamp = 1419899598 UTC = Tue Dec 30 00:33:18 2014
Uberblock[7]
        magic = 0000000000bab10c
        version = 5000
        txg = 5639
        guid_sum = 7104458112854163311
        timestamp = 1419899602 UTC = Tue Dec 30 00:33:22 2014
Uberblock[8]
        magic = 0000000000bab10c
        version = 5000
        txg = 6536
        guid_sum = 7104458112854163311
        timestamp = 1419924975 UTC = Tue Dec 30 07:36:15 2014
Uberblock[9]
        magic = 0000000000bab10c
        version = 5000
        txg = 6537
        guid_sum = 7104458112854163311
        timestamp = 1419925005 UTC = Tue Dec 30 07:36:45 2014
Uberblock[10]
        magic = 0000000000bab10c
        version = 5000
        txg = 5642
        guid_sum = 7104458112854163311
        timestamp = 1419899613 UTC = Tue Dec 30 00:33:33 2014
...
[root@e0-db-55-20-75-4e ~]# zpool import -N -f -R /mnt -F -T 5639 franktmp
cannot import 'franktmp': one or more devices is currently unavailable
[root@e0-db-55-20-75-4e ~]# zpool import -N -f -R /mnt -T 5639 franktmp
cannot import 'franktmp': one or more devices is currently unavailable
[root@e0-db-55-20-75-4e ~]# zpool import -N -o readonly=on -f -R /mnt -F -T 5639 franktmp
cannot import 'franktmp': one or more devices is currently unavailable
[root@e0-db-55-20-75-4e ~]# zpool import -N -o readonly=on -f -R /mnt -FV -T 5639 franktmp
## PANIC !

Receiving the last snapshot only was not causing any issues.
A zfs send -R output of the problematic dataset is available.
The following information is from several panics:

panic when booting after the initial panic:

Loading kmdb...
SunOS Release 5.11 Version joyent_20130530T224720Z 64-bit
Copyright (c) 2010-2013, Joyent Inc. All rights reserved.

panic[cpu0]/thread=ffffff19a3640860: BAD TRAP: type=e (#pf Page fault)
rp=ffffff00bb374710 addr=20 occurred in module "zfs" due to a NULL pointer
dereference

zpool: #pf Page fault
Bad kernel fault at addr=0x20
pid=101, pc=0xfffffffff7dc9e1c, sp=0xffffff00bb374800, eflags=0x10286
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4:
426f8<osxsav,vmxe,xmme,fxsr,pge,mce,pae,pse,de>
cr2: 20cr3: 17fefa6000cr8: 0

        rdi: ffffff19a8eec030 rsi: 43ef45a000000000 rdx:                0
        rcx:         a9345843  r8:   ce27f55acd0335  r9: ffffff00bb3746b8
        rax:                1 rbx:         a9345848 rbp: ffffff00bb374860
        r10:               30 r11:                0 r12: ffffff19a8eec030
        r13: ffffff19a8ce9800 r14:                1 r15:                1
        fsb:                0 gsb: fffffffffbc30e60  ds:               4b
         es:               4b  fs:                0  gs:              1c3
        trp:                e err:                0 rip: fffffffff7dc9e1c
         cs:               30 rfl:            10286 rsp: ffffff00bb374800
         ss:               38

Warning - stack not written to the dump buffer
ffffff00bb3745f0 unix:die+df ()
ffffff00bb374700 unix:trap+db3 ()
ffffff00bb374710 unix:cmntrap+e6 ()
ffffff00bb374860 zfs:zap_leaf_lookup+4c ()
ffffff00bb374910 zfs:fzap_lookup+a9 ()
ffffff00bb3749b0 zfs:zap_lookup_norm+111 ()
ffffff00bb374a10 zfs:zap_lookup+46 ()
ffffff00bb374a90 zfs:zvol_get_stats+3e ()
ffffff00bb374b00 zfs:zfs_ioc_objset_stats_impl+9b ()
ffffff00bb374b60 zfs:zfs_ioc_objset_stats+a7 ()
ffffff00bb374bd0 zfs:zfs_ioc_dataset_list_next+140 ()
ffffff00bb374c80 zfs:zfsdev_ioctl+4a7 ()
ffffff00bb374cc0 genunix:cdev_ioctl+39 ()
ffffff00bb374d10 specfs:spec_ioctl+60 ()
ffffff00bb374da0 genunix:fop_ioctl+55 ()
ffffff00bb374ec0 genunix:ioctl+9b ()
ffffff00bb374f10 unix:brand_sys_sysenter+1c9 ()

output of send/recv operation:

[root@a0-36-9f-1e-71-40 ~]# zfs send -Rv
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:monthly-2013-11-01-00:00:00
| zfs recv -Fd franktmp
send from @ to
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:frequent-2013-10-29-20:15:00
estimated size is 794M
send from @zfs-auto-snap:frequent-2013-10-29-20:15:00 to
zones/d7f3c77a-2827-11e2-a972-436310932bf7@dataset estimated size is 465M
send from @dataset to
zones/d7f3c77a-2827-11e2-a972-436310932bf7@4cd06216-fc85-4170-97ec-c307490c68fc-disk0
estimated size is 0
send from @4cd06216-fc85-4170-97ec-c307490c68fc-disk0 to
zones/d7f3c77a-2827-11e2-a972-436310932bf7@2b166b22-f295-47e7-a66e-6497c98dd6b8-disk0
estimated size is 0
send from @2b166b22-f295-47e7-a66e-6497c98dd6b8-disk0 to
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:frequent-2013-10-29-20:30:00
estimated size is 0
send from @zfs-auto-snap:frequent-2013-10-29-20:30:00 to
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:hourly-2013-10-29-21:00:00
estimated size is 0
send from @zfs-auto-snap:hourly-2013-10-29-21:00:00 to
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:daily-2013-10-30-00:00:00
estimated size is 0
send from @zfs-auto-snap:daily-2013-10-30-00:00:00 to
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:weekly-2013-11-01-00:00:00
estimated size is 0
send from @zfs-auto-snap:weekly-2013-11-01-00:00:00 to
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:monthly-2013-11-01-00:00:00
estimated size is 0
total estimated size is 1.23G
TIME        SENT   SNAPSHOT
07:37:19   56.2M
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:frequent-2013-10-29-20:15:00
07:37:20    139M
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:frequent-2013-10-29-20:15:00
07:37:21    235M
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:frequent-2013-10-29-20:15:00
07:37:22    338M
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:frequent-2013-10-29-20:15:00
07:37:23    412M
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:frequent-2013-10-29-20:15:00
07:37:24    511M
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:frequent-2013-10-29-20:15:00
07:37:25    592M
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:frequent-2013-10-29-20:15:00
07:37:26    687M
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:frequent-2013-10-29-20:15:00
07:37:27    769M
zones/d7f3c77a-2827-11e2-a972-436310932bf7@zfs-auto-snap:frequent-2013-10-29-20:15:00
TIME        SENT   SNAPSHOT
07:37:29   13.8K   zones/d7f3c77a-2827-11e2-a972-436310932bf7@dataset
## PANIC

stack of the panic

[1]> ::stack
kmdb_enter+0xb()
debug_enter+0x5d(fffffffffb931308)
panicsys+0x5cc(fffffffffb92fa00, ffffff00b8fef570, fffffffffbc7a5e0, 1)
vpanic+0x15c()
param_preset()
die+0xdf(e, ffffff00b8fef780, 20, 1)
trap+0xdb3(ffffff00b8fef780, 20, 1)
0xfffffffffb8001d6()
zfs`zap_leaf_lookup+0x4c(ffffff1f1d7599e0, ffffff1d47ecc080, ffffff00b8fef8f0)
zfs`fzap_lookup+0xa9(ffffff1d47ecc080, 8, 1, ffffff00b8fefad8, 0, 0)
zfs`zap_lookup_norm+0x111(ffffff19c22316c0, 2, fffffffff7e0dd50, 8, 1,
ffffff00b8fefad8)
zfs`zap_lookup+0x46(ffffff19c22316c0, 2, fffffffff7e0dd50, 8, 1,
ffffff00b8fefad8)
zfs`zvol_get_stats+0x3e(ffffff19c22316c0, ffffff1d2add5170)
zfs`zfs_ioc_objset_stats_impl+0x9b(ffffff1ee1640000, ffffff19c22316c0, 0)
zfs`zfs_ioc_objset_stats+0xa7(ffffff1ee1640000)
zfs`zfsdev_ioctl+0x4a7(5a00000000, 5a12, 8042240, 100003, ffffff19e38f9290,
ffffff00b8fefe68)
cdev_ioctl+0x39(5a00000000, 5a12, 8042240, 100003, ffffff19e38f9290,
ffffff00b8fefe68)
specfs`spec_ioctl+0x60(ffffff19a3480680, 5a12, 8042240, 100003, ffffff19e38f9290
, ffffff00b8fefe68)
fop_ioctl+0x55(ffffff19a3480680, 5a12, 8042240, 100003, ffffff19e38f9290,
ffffff00b8fefe68)
ioctl+0x9b(3, 5a12, 8042240)
_sys_sysenter_post_swapgs+0x149()

No data to display

Actions

Also available in: Atom PDF