Bug #9880

Bug #8115: parallel zfs mount

Race in ZFS parallel mount

Added by Andy Fiddaman about 3 years ago. Updated about 3 years ago.

Status:
Closed
Priority:
High
Assignee:
Category:
zfs - Zettabyte File System
Start date:
2018-10-10
Due date:
% Done:
100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

There is a race condition in the ZFS parallel mount code which shows up if you have zoned datasets with the same mountpoint as those in the global zone. The code makes the incorrect assumption that mount points are globally unique. This results in mount failures during boot or pool import.

Basically, the thread that is responsible for trying to mount an NGZ /a can be tasked with mounting the GZ /a/b, and it can do so before the GZ /a thread has completed.

Take the following set of filesystems, which have been sorted by mountpoint by the existing code - mountpoint followed by the dataset in brackets. The letters at the left are my annotations.

With this set of filesystems, `zfs_foreach_mountpoint()` will create tasks a-h, task g will create A-D, and D will create s-z. The problem is that, for example, t can run before B.

a   / (rpool/ROOT/r151024l)
b   / (rpool/ROOT/r151028.pre2)
c   / (rpool/ROOT/r151026.l1tf)
d   /data (data/zone/build/export)
e   /data (data/zone/reci/export)
f   /data (data)
g   /data (data/zone/ns1/export)
 A  /data/sendmail (data/zone/build/export/sendmail)
 B  /data/sendmail (data/sendmail)
 C  /data/sendmail (data/zone/ns1/export/sendmail)
 D  /data/sendmail (data/zone/reci/export/sendmail)
  s /data/sendmail/clientmqueue (data/zone/reci/export/sendmail/clientmqueue)
  t /data/sendmail/clientmqueue (data/sendmail/clientmqueue)
  u /data/sendmail/clientmqueue (data/zone/ns1/export/sendmail/clientmqueue)
  v /data/sendmail/clientmqueue (data/zone/build/export/sendmail/clientmqueue)
  w /data/sendmail/mqueue (data/zone/ns1/export/sendmail/mqueue)
  x /data/sendmail/mqueue (data/zone/build/export/sendmail/mqueue)
  y /data/sendmail/mqueue (data/sendmail/mqueue)
  z /data/sendmail/mqueue (data/zone/reci/export/sendmail/mqueue)
h   /home (data/home)
 Z  /home/af (data/home/af)

The fix I've gone for at the moment is to change the sort so that filesystems with the `zoned` attribute are sorted to the bottom. In the global zone, that results in the expected sorted list of filesystems and has the additional benefit that we can stop creating tasks once we see a zoned filesystem in the list. In a non-global zone, only the delegated filesystems are seen so the list is just traversed as normal.
