Bug #9880

Bug #8115: parallel zfs mount

Race in ZFS parallel mount

Added by Andy Fiddaman 12 months ago. Updated 11 months ago.

Status: Closed
Priority: High
Assignee:
Category: zfs - Zettabyte File System
Start date: 2018-10-10
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:

Description

There is a race condition in the ZFS parallel mount code which shows up if you have zoned datasets with the same mountpoint as those in the global zone. The code makes the incorrect assumption that mount points are globally unique. This results in mount failures during boot or pool import.

Basically, the thread that is responsible for mounting a non-global zone (NGZ) /a can be tasked with mounting the global zone (GZ) /a/b, and it can do that before the GZ /a thread has completed.

Take the following set of filesystems which have been sorted by mountpoint by the existing code - mountpoints followed by the dataset in brackets. The letters at the left are my annotations.

With this set of filesystems, `zfs_foreach_mountpoint()` will create tasks a-h, task g will create A-D, and D will create s-z. The problem is that, for example, t can run before B, even though t (data/sendmail/clientmqueue) has to be mounted on top of B (data/sendmail).

a   / (rpool/ROOT/r151024l)
b   / (rpool/ROOT/r151028.pre2)
c   / (rpool/ROOT/r151026.l1tf)
d   /data (data/zone/build/export)
e   /data (data/zone/reci/export)
f   /data (data)
g   /data (data/zone/ns1/export)
 A  /data/sendmail (data/zone/build/export/sendmail)
 B  /data/sendmail (data/sendmail)
 C  /data/sendmail (data/zone/ns1/export/sendmail)
 D  /data/sendmail (data/zone/reci/export/sendmail)
  s /data/sendmail/clientmqueue (data/zone/reci/export/sendmail/clientmqueue)
  t /data/sendmail/clientmqueue (data/sendmail/clientmqueue)
  u /data/sendmail/clientmqueue (data/zone/ns1/export/sendmail/clientmqueue)
  v /data/sendmail/clientmqueue (data/zone/build/export/sendmail/clientmqueue)
  w /data/sendmail/mqueue (data/zone/ns1/export/sendmail/mqueue)
  x /data/sendmail/mqueue (data/zone/build/export/sendmail/mqueue)
  y /data/sendmail/mqueue (data/sendmail/mqueue)
  z /data/sendmail/mqueue (data/zone/reci/export/sendmail/mqueue)
h   /home (data/home)
 Z  /home/af (data/home/af)
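To illustrate why t ends up under D rather than B, here is a minimal sketch of a containment check that looks only at path strings. The helper name `path_contains` is hypothetical and is not taken from the libzfs source; it only shows what goes wrong when mountpoints are treated as globally unique.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when `child` lies beneath `parent`,
 * judged purely by comparing '/'-separated strings. It works equally on
 * mountpoints and on dataset names. The root path "/" is not
 * special-cased in this sketch.
 */
static bool
path_contains(const char *parent, const char *child)
{
	size_t len = strlen(parent);

	/* `child` must start with `parent` and continue with a '/'. */
	return (strncmp(parent, child, len) == 0 && child[len] == '/');
}
```

Applied to the list above, the mountpoint of t (/data/sendmail/clientmqueue) is contained in the mountpoint of D (/data/sendmail), so a mountpoint-only check accepts D as the parent of t. The dataset names disagree: data/sendmail/clientmqueue is not beneath data/zone/reci/export/sendmail, it is beneath data/sendmail, which is B.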

The fix I've gone for at the moment is to change the sort so that filesystems with the `zoned` attribute are sorted to the bottom. In the global zone, that results in the expected sorted list of filesystems and has the additional benefit that we can stop creating tasks once we see a zoned filesystem in the list. In a non-global zone, only the delegated filesystems are seen so the list is just traversed as normal.
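The sort change described above could look roughly like the following. This is a sketch under stated assumptions, not the committed code: `fs_entry_t`, `mount_cmp` and `sort_for_mount` are hypothetical names, and the real code operates on libzfs dataset handles rather than a plain struct.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record standing in for a dataset handle. */
typedef struct fs_entry {
	const char *mountpoint;
	const char *dataset;
	bool zoned;		/* value of the `zoned` property */
} fs_entry_t;

/* Zoned filesystems sort after all others; ties break on mountpoint. */
static int
mount_cmp(const void *a, const void *b)
{
	const fs_entry_t *fa = a;
	const fs_entry_t *fb = b;

	if (fa->zoned != fb->zoned)
		return (fa->zoned ? 1 : -1);
	return (strcmp(fa->mountpoint, fb->mountpoint));
}

/*
 * Sort the list and return how many leading entries should be given
 * mount tasks. In the global zone that is everything before the first
 * zoned filesystem; in a non-global zone the whole list is used.
 */
static size_t
sort_for_mount(fs_entry_t *fs, size_t n, bool global_zone)
{
	size_t i;

	qsort(fs, n, sizeof (fs_entry_t), mount_cmp);
	if (!global_zone)
		return (n);
	for (i = 0; i < n; i++) {
		if (fs[i].zoned)
			break;
	}
	return (i);
}
```

Assuming the datasets under data/zone/ in the list above are the zoned ones, sorting this way in the global zone leaves data, data/sendmail and their children as a contiguous prefix with unique mountpoints, and task creation can stop at the first zoned entry.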

History

#2

Updated by Electric Monk 11 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit bc4c0ff1343a311cc24933908ac6c4455af09031

commit  bc4c0ff1343a311cc24933908ac6c4455af09031
Author: Andy Fiddaman <omnios@citrus-it.co.uk>
Date:   2018-10-20T21:48:39.000Z

    9880 Race in ZFS parallel mount
    Reviewed by: Jason King <jason.king@joyent.com>
    Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
    Approved by: Joshua M. Clulow <josh@sysmgr.org>
