Bug #1970

Boot fails from an rpool on Hitachi 4k sector drives

Added by Alex Viskovatoff almost 9 years ago. Updated over 8 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
-
Start date:
2012-01-08
Due date:
% Done:
0%

Estimated time:
Difficulty:
Hard
Tags:
needs-triage
Gerrit CR:

Description

The drives are two Hitachi Deskstar 7K1000.D drives in a mirrored pool. They are novel in that they have a single platter, but I do not see why that should make a difference.

The reason I do not think that this is the same problem as bug #1772 (OI cannot import a data zpool created by S11X) is that installation of OI fails on these drives. The drive is recognized by the installer and can be selected, but creation of a pool fails. However, I was able to create a pool and write to it using zpool(1) and zfs(1) while running OI from the live CD.

The way I got OI BEs on these drives was that I had a mirrored pool of Seagate drives with both OI and S11E BEs on it. When one of the drives failed, I replaced it with a Hitachi drive and allowed a resilver to proceed under S11. (Before doing this, I upgraded from S11E to S11.) Then I replaced the other Seagate drive in the pool with an identical Hitachi drive. Needless to say, I can mount the OI BEs under S11. I am at zpool version 28.

This is the error message produced when trying to boot OI:

NOTICE: zfs_parse_bootfs: error 22
Cannot mount root on rpool/588 fstype zfs

panic[cpu0]/thread=fffffffffbc2f260: vfs_mountroot: cannot mount root

Warning - stack not written to the dump buffer
fffffffffbc718e0 genunix:vfs_mountroot+33e ()
fffffffffbc71910 genunix:main+136 ()
fffffffffbc71920 unix:_locore_start+90 ()

panic: entering debugger (no dump device, continue to reboot)

By using kmdb, I was able to determine that what is failing is vdev_open. The call path is zfs_parse_bootfs (zfs_vfsops.c) -> dsl_dsobj_to_dsname (dsl_dataset.c) -> spa_open (spa.c) -> spa_open_common -> spa_load_best -> spa_load -> spa_load_impl -> vdev_open (vdev.c).
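Error 22 is EINVAL. A plausible reading, consistent with the theory in comment #1 below and with the duplicate #2671, is that vdev_open computes the alignment the device now requires and rejects the vdev when that is stricter than what the pool label records. The following is only a minimal user-level sketch of that idea, not the illumos source; check_ashift is a hypothetical stand-in for the check inside vdev_open.

#include <stdio.h>
#include <errno.h>

/*
 * Hypothetical stand-in for the alignment sanity check in vdev_open().
 * ashift is log2 of the smallest block the label or device supports.
 */
static int
check_ashift(int label_ashift, int device_ashift)
{
    /*
     * If the device now requires stricter alignment than the label
     * records (label says 9 = 512B, disk reports 12 = 4KB), refuse
     * to open the vdev; this surfaces as EINVAL, i.e. error 22.
     */
    if (device_ashift > label_ashift)
        return (EINVAL);
    return (0);
}

int
main(void)
{
    int err = check_ashift(9, 12);  /* 512B label on a 4KB-sector disk */
    printf("check_ashift(9, 12) = %d (%s)\n",
        err, err == EINVAL ? "EINVAL" : "ok");
    return (0);
}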


Related issues

Is duplicate of illumos gate - Bug #2671: zpool import should not fail if vdev ashift has increased (Resolved, George Wilson, 2012-05-02)

#1

Updated by Rich Lowe almost 9 years ago

One theory here is that moving to 4K drives via resilver meant that we wrote the data with the old ashift but onto the 4K drives, and now fail the ashift sanity check in vdev_open.

(if I recall correctly, this theory was Trisk's)

#2

Updated by Rich Lowe almost 9 years ago

Information from IRC suggests that zdb of the pool on a Solaris system (where it can be read) shows the vdevs with an ashift of 9. We would, IIRC, expect it to be 12, which may bolster Albert's theory.
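For reference, ashift is the base-2 logarithm of the minimum block size, so ashift 9 corresponds to 512-byte blocks and ashift 12 to 4096-byte (4K) blocks:

#include <stdio.h>

int
main(void)
{
    printf("ashift  9 -> %d-byte blocks\n", 1 << 9);   /* 512  */
    printf("ashift 12 -> %d-byte blocks\n", 1 << 12);  /* 4096 */
    return (0);
}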

#3

Updated by Alex Viskovatoff almost 9 years ago

zdb produces the following:

MOS Configuration:
        version: 28
        name: 'rpool'
        state: 0
        txg: 10500658
        pool_guid: 3784953886730671621
        timestamp: 1327544225
        hostid: 941991334
        hostname: ''
        vdev_children: 1
        vdev_tree:
            type: 'root'
            id: 0
            guid: 3784953886730671621
            children[0]:
                type: 'mirror'
                id: 0
                guid: 16204652709271235991
                whole_disk: 0
                metaslab_array: 23
                metaslab_shift: 32
                ashift: 9
                asize: 1000123400192
                is_log: 0
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 2038299349745552891
                    path: '/dev/dsk/c3t0d0s0'
                    devid: 'id1,sd@AHitachi_HDS721010DLE630=______MSE5215V047Z2E/a'
                    phys_path: '/pci@0,0/pci1043,8239@5/disk@0,0:a'
                    whole_disk: 0
                    DTL: 241
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 4085559267791052200
                    path: '/dev/dsk/c4t0d0s0'
                    devid: 'id1,sd@AHitachi_HDS721010DLE630=______MSE5215V04H7EE/a'
                    phys_path: '/pci@0,0/pci1043,8239@5,1/disk@0,0:a'
                    whole_disk: 0
                    DTL: 491

#4

Updated by Albert Lee almost 9 years ago

Because ashift is only used in the DMU allocation path, it's probably sane to permit a smaller ashift in the label (we should still be able to read back the unaligned blocks) and use ours anyway. Whether to write the corrected ashift back to the label is a different question.
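A minimal sketch of the relaxed behavior being suggested, reusing the hypothetical names from the sketch under the description (this is not the actual change, which landed as #2671): tolerate a label ashift smaller than the device's and carry the larger value in core rather than failing with EINVAL.

#include <stdio.h>

/*
 * Hypothetical relaxed replacement for check_ashift() above: accept a
 * smaller ashift in the label and use the device's larger value anyway.
 */
static int
open_ashift(int label_ashift, int device_ashift)
{
    return (device_ashift > label_ashift ? device_ashift : label_ashift);
}

int
main(void)
{
    /* 512B label on a 4KB-sector disk: open with ashift 12 in core. */
    printf("effective ashift = %d\n", open_ashift(9, 12));
    return (0);
}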

#5

Updated by Albert Lee over 8 years ago

  • Status changed from New to Closed

Duplicate of #2671.
