Bug #348

closed

ZFS should handle DKIOCGMEDIAINFOEXT failure

Added by Piotr Jasiukajtis over 12 years ago. Updated about 6 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: zfs - Zettabyte File System
Start date: 2010-10-15
Due date: 2013-08-01
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:
External Bug:

Description

This is also known as 6972328
(Installation of snv_139+ on HP BL685c G5 fails due to panic during auto install process)

Introduced in b139 (revision 12208) and fixed in b148 (not publicly available) as 6967658 (sd_send_scsi_READ_CAPACITY_16() needs to handle SBC-2 and SBC-3 response formats).

The issue is caused by ZFS using DKIOCGMEDIAINFOEXT.
A workaround is to use DKIOCGMEDIAINFO instead.

Working workaround:

$ cd usr/src/uts/common/fs/zfs/
$ diff -u vdev_disk.c_orig vdev_disk.c
--- vdev_disk.c_orig    2010-10-06 14:39:22.267857597 +0200
+++ vdev_disk.c 2010-10-06 14:52:55.078122278 +0200
@@ -107,7 +107,7 @@
 {
        spa_t *spa = vd->vdev_spa;
        vdev_disk_t *dvd;
-       struct dk_minfo_ext dkmext;
+       struct dk_minfo dkm;
        int error;
        dev_t dev;
        int otyp;
@@ -287,11 +287,11 @@
         * Determine the device's minimum transfer size.
         * If the ioctl isn't supported, assume DEV_BSIZE.
         */
-       if (ldi_ioctl(dvd->vd_lh, DKIOCGMEDIAINFOEXT, (intptr_t)&dkmext,
+       if (ldi_ioctl(dvd->vd_lh, DKIOCGMEDIAINFO, (intptr_t)&dkm,
            FKIOCTL, kcred, NULL) != 0)
-               dkmext.dki_pbsize = DEV_BSIZE;
+               dkm.dki_lbsize = DEV_BSIZE;

-       *ashift = highbit(MAX(dkmext.dki_pbsize, SPA_MINBLOCKSIZE)) - 1;
+       *ashift = highbit(MAX(dkm.dki_lbsize, SPA_MINBLOCKSIZE)) - 1;

        /*
         * Clear the nowritecache bit, so that on a vdev_reopen() we will

I have tested this on a full debug build of illumos-gate revision 13197.
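
The last hunk of the diff changes which block size feeds the ashift computation: the logical size (dki_lbsize) instead of the physical size (dki_pbsize). The arithmetic itself can be sketched in userland as follows. This is only an illustration: highbit_sketch() is a stand-in for the kernel's highbit(), SPA_MINBLOCKSIZE is assumed to be 512, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define	SPA_MINBLOCKSIZE	512u	/* assumption: 1 << 9 */
#define	MAX(a, b)		((a) > (b) ? (a) : (b))

/*
 * Stand-in for the kernel's highbit(): 1-based index of the highest
 * set bit, or 0 when no bit is set.
 */
static int
highbit_sketch(uint64_t v)
{
	int h = 0;

	while (v != 0) {
		h++;
		v >>= 1;
	}
	return (h);
}

/* ashift is log2 of the block size, clamped to SPA_MINBLOCKSIZE. */
static int
ashift_for_blocksize(uint32_t bsize)
{
	return (highbit_sketch(MAX(bsize, SPA_MINBLOCKSIZE)) - 1);
}

/* Usage example: common block sizes and the ashift they produce. */
void
ashift_examples(void)
{
	assert(ashift_for_blocksize(512) == 9);
	assert(ashift_for_blocksize(4096) == 12);
	/* A size below the minimum is clamped to 512. */
	assert(ashift_for_blocksize(0) == 9);
}
```

One consequence of the workaround: a disk that reports 512-byte logical and 4096-byte physical sectors would get ashift 9 from dki_lbsize, where dki_pbsize would have given 12.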


Files

illumos_348_cpqary_workaround.diff (1.27 KB), Piotr Jasiukajtis, 2013-07-01 04:45 PM

Related issues

Related to illumos gate - Bug #2154: Panic DL460 G1 zpool create -f test c2t0d0s0 (Feedback, 2012-02-20)
Related to OpenIndiana Distribution - Bug #3016: Kernel Panic with text install with HP Smart Array (New, OI illumos, 2012-07-22)
Related to illumos gate - Bug #2087: sd needs to detect and handle SBC-3 correctly (Feedback, Albert Lee, 2012-02-08)
Has duplicate illumos gate - Bug #1129: Try to install OI 151 on HP 360 (Rejected, Garrett D'Amore, 2011-06-20)

Actions #1

Updated by Piotr Jasiukajtis over 12 years ago

I'm re-posting the hg diff. It is the same as before, but this time with proper text formatting.

$ hg status -v
M usr/src/uts/common/fs/zfs/vdev_disk.c

$ hg diff
diff -r 4f23f0abcff2 usr/src/uts/common/fs/zfs/vdev_disk.c
--- a/usr/src/uts/common/fs/zfs/vdev_disk.c     Thu Dec 09 17:47:03 2010 -0800
+++ b/usr/src/uts/common/fs/zfs/vdev_disk.c     Wed Dec 15 11:55:58 2010 +0100
@@ -107,7 +107,7 @@
 {
        spa_t *spa = vd->vdev_spa;
        vdev_disk_t *dvd;
-       struct dk_minfo_ext dkmext;
+       struct dk_minfo dkm;
        int error;
        dev_t dev;
        int otyp;
@@ -287,11 +287,11 @@
         * Determine the device's minimum transfer size.
         * If the ioctl isn't supported, assume DEV_BSIZE.
         */
-       if (ldi_ioctl(dvd->vd_lh, DKIOCGMEDIAINFOEXT, (intptr_t)&dkmext,
+       if (ldi_ioctl(dvd->vd_lh, DKIOCGMEDIAINFO, (intptr_t)&dkm,
            FKIOCTL, kcred, NULL) != 0)
-               dkmext.dki_pbsize = DEV_BSIZE;
+               dkm.dki_lbsize = DEV_BSIZE;

-       *ashift = highbit(MAX(dkmext.dki_pbsize, SPA_MINBLOCKSIZE)) - 1;
+       *ashift = highbit(MAX(dkm.dki_lbsize, SPA_MINBLOCKSIZE)) - 1;

        /*
         * Clear the nowritecache bit, so that on a vdev_reopen() we will

$ hg heads
changeset:   13256:4f23f0abcff2
tag:         tip
user:        Chris Love <cjlove@san.rr.com>
date:        Thu Dec 09 17:47:03 2010 -0800
summary:     436 webrev URLs need updated

$ hg id -n
13256+
Actions #2

Updated by Albert Lee over 12 years ago

  • Assignee changed from Piotr Jasiukajtis to Albert Lee

I'm planning to fix the underlying problem.

Actions #3

Updated by Piotr Jasiukajtis over 12 years ago

  • Assignee changed from Albert Lee to Piotr Jasiukajtis

Following an offline conversation, I'm also working on the fix :)

Actions #4

Updated by Ken Mays almost 12 years ago

  • Difficulty set to Medium
  • Tags set to needs-triage

RFE: Patch requires Illumos dev review and approval with current CPQary3 2.3 driver approved by HP engineering.

Actions #5

Updated by Bryan Leaman almost 12 years ago

I'm facing the same issue on an HP DL360 G4p. However, the latest dev-il build of oi_151a (illumos changeset 170f0c3a9064) works fine when I use a more recent driver downloaded from HP.

I'm still doing load testing, but versions 2.4.4 and 2.4.6 of cpqary3 seem to work OK with the internal Smart Array 6i controller. If I revert to the OI-bundled 2.2.0.a driver, the machine panics in cpqary3_retrieve() during boot.

If this is indeed a driver issue then we should probably re-file this bug under OpenIndiana and maybe the team can package a newer driver for the 151 release. Otherwise some HP systems will not be able to install the OS.

Actions #6

Updated by Albert Lee over 11 years ago

cpqary3 is actually part of ON's closed bits. Even if the root cause of the panic is fixed, it may make sense to drop it entirely since it's outdated and there is an alternative source (unless someone can convince HP to provide a new driver under a licence that allows unlimited redistribution).

Actions #7

Updated by Piotr Jasiukajtis over 11 years ago

  • Assignee deleted (Piotr Jasiukajtis)
Actions #8

Updated by Albert Lee over 11 years ago

  • Category changed from kernel to driver - device drivers
  • Assignee set to Albert Lee
  • % Done changed from 10 to 50
Actions #9

Updated by Piotr Jasiukajtis almost 10 years ago

I'm uploading an updated workaround for the current gate revision (14068:2547a41b1162).

Actions #10

Updated by Dan McDonald almost 10 years ago

While bringing up a certain blkdev driver, I encountered problems with this as well (mostly pertaining to bad EXPANDSZ, see #3878). I have proposed a fallback to DKIOCGMEDIAINFO for ZFS. I need to run it past all of the affected parties, however. This fallback code may completely solve this bug.

Actions #11

Updated by Dan McDonald almost 10 years ago

  • Subject changed from b139+ is not working with cpqary3 (6972328) to ZFS should handle DKIOCGMEDIAEXT failure
  • Category changed from driver - device drivers to zfs - Zettabyte File System

I'm hijacking this bug, because the fix is to have ZFS be more robust in the face of DKIOCGMEDIAINFOEXT failures.
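
A minimal userland sketch of that kind of fallback chain, not the committed code: try the extended query first, fall back to the plain one, and only then assume DEV_BSIZE. The enum, the struct, the function-pointer type, and the mock drivers below are all hypothetical stand-ins for ldi_ioctl(), DKIOCGMEDIAINFOEXT, DKIOCGMEDIAINFO, and the dk_minfo structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define	DEV_BSIZE	512u	/* assumption: default device block size */

/* Hypothetical stand-ins for DKIOCGMEDIAINFOEXT and DKIOCGMEDIAINFO. */
enum media_cmd { CMD_MEDIAINFOEXT, CMD_MEDIAINFO };

struct media_info {
	uint32_t lbsize;	/* logical block size */
	uint32_t pbsize;	/* physical block size, extended query only */
};

/* Hypothetical stand-in for ldi_ioctl(): 0 on success, errno on failure. */
typedef int (*media_ioctl_fn)(enum media_cmd, struct media_info *);

/* Return the minimum transfer size, degrading gracefully on failure. */
static uint32_t
probe_min_xfer_size(media_ioctl_fn do_ioctl)
{
	struct media_info mi = { 0, 0 };

	assert(do_ioctl != NULL);
	if (do_ioctl(CMD_MEDIAINFOEXT, &mi) == 0)
		return (mi.pbsize);	/* best case: physical size known */
	if (do_ioctl(CMD_MEDIAINFO, &mi) == 0)
		return (mi.lbsize);	/* fall back to the logical size */
	return (DEV_BSIZE);		/* neither query is supported */
}

/* Mock driver that supports both queries. */
static int
drv_ext_ok(enum media_cmd cmd, struct media_info *out)
{
	out->lbsize = 512;
	if (cmd == CMD_MEDIAINFOEXT)
		out->pbsize = 4096;
	return (0);
}

/* Mock driver that fails the extended query only. */
static int
drv_legacy_only(enum media_cmd cmd, struct media_info *out)
{
	if (cmd == CMD_MEDIAINFOEXT)
		return (ENOTTY);
	out->lbsize = 2048;
	return (0);
}

/* Mock driver that supports neither query. */
static int
drv_none(enum media_cmd cmd, struct media_info *out)
{
	(void) cmd;
	(void) out;
	return (ENOTTY);
}

/* Usage example: one call per mock driver. */
void
probe_examples(void)
{
	assert(probe_min_xfer_size(drv_ext_ok) == 4096);
	assert(probe_min_xfer_size(drv_legacy_only) == 2048);
	assert(probe_min_xfer_size(drv_none) == DEV_BSIZE);
}
```

The point of the ordering is that a driver which mishandles the extended query no longer forces the caller to give up on the device's reported block size entirely.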

Actions #12

Updated by Dan McDonald almost 10 years ago

  • Assignee changed from Albert Lee to Dan McDonald
Actions #13

Updated by Dan McDonald almost 10 years ago

  • Subject changed from ZFS should handle DKIOCGMEDIAEXT failure to ZFS should handle DKIOCGMEDIAINFOEXT failure
Actions #14

Updated by Dan McDonald almost 10 years ago

  • Status changed from New to Pending RTI
  • % Done changed from 50 to 100
Actions #15

Updated by Dan McDonald almost 10 years ago

commit a5b577712a34346841d970e0827b4920ace408af
Author: Dan McDonald
Date: Wed Jul 10 10:58:11 2013 -0400

348 ZFS should handle DKIOCGMEDIAINFOEXT failure
Reviewed by: Garrett D'Amore
Reviewed by: George Wilson
Approved by: Garrett D'Amore
Actions #16

Updated by Dan McDonald almost 10 years ago

  • Due date set to 2013-08-01
Actions #17

Updated by Albert Lee almost 10 years ago

  • Status changed from Pending RTI to In Progress
Actions #18

Updated by Dan McDonald over 9 years ago

  • Status changed from In Progress to Pending RTI
Actions #19

Updated by Dan McDonald over 9 years ago

  • Status changed from Pending RTI to In Progress
Actions #20

Updated by Dan McDonald over 9 years ago

  • Status changed from In Progress to Pending RTI
Actions #21

Updated by Yuri Pankov about 6 years ago

  • Blocked by deleted (Bug #2087: sd needs to detect and handle SBC-3 correctly)
Actions #22

Updated by Yuri Pankov about 6 years ago

  • Related to Bug #2087: sd needs to detect and handle SBC-3 correctly added
Actions #23

Updated by Yuri Pankov about 6 years ago

  • Status changed from Pending RTI to Resolved