Bug #6011

potential ZFS deadlock when ZFS writes take a page fault

Added by Rich Lowe over 5 years ago. Updated over 5 years ago.

Status: New
Priority: Normal
Assignee: -
Category: zfs - Zettabyte File System
Start date: 2015-06-16
Due date:
% Done: 0%
Estimated time:
Difficulty: Hard
Tags: needs-triage
Gerrit CR:

Description

An OmniOS user has a system which appeared to wedge solidly when attempting to use procfs.

Investigation finds that one of their applications is in munmap(), holding its address space (as) lock for write and waiting for a ZFS txg to open for I/O. Approximately 60 other threads are waiting on this lock, primarily as readers, including others doing ZFS I/O via zfs_read/zfs_write.

The txg quiesce thread is waiting on tc_cv, and would be woken by the zfs_write I/O threads.

This is a classic deadlock: the thread holding the as lock for write is waiting on txg open (tc_cv), and the threads that would signal the cv are waiting on the as lock.
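
To make the cycle explicit, here is a minimal sketch that models the situation described above as a wait-for graph and searches it for a cycle. It is only an illustration: the node names are hypothetical labels rather than kernel identifiers, and the edge from the munmap() thread to the quiesce thread assumes that the txg it needs cannot open until the quiesce thread makes progress.

```python
# Wait-for graph: an edge A -> B means "A cannot make progress until B does".
# Node names are illustrative labels, not kernel identifiers.
WAITS_FOR = {
    # Holds the as lock for write, waiting for a txg to open (assumed to
    # depend on the quiesce thread making progress).
    "munmap_thread": ["txg_quiesce_thread"],
    # Waiting on tc_cv, which the zfs_write I/O threads would signal.
    "txg_quiesce_thread": ["zfs_write_threads"],
    # Took a page fault and are waiting on the as lock.
    "zfs_write_threads": ["munmap_thread"],
}


def find_cycle(graph):
    """Return one cycle as a list of nodes (start repeated at the end), or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # Back edge to a node on the current path: that is a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for start in graph:
        if color.get(start, WHITE) == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    print(" -> ".join(find_cycle(WAITS_FOR)))
```

Removing any one edge (for example, if the zfs_write threads never had to take the as lock while holding the txg open) leaves the graph acyclic, which is the property a fix would need to establish.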

Triggering this appears to require memory pressure, so that the right combination of page faults occurs. It does not seem to occur absent memory pressure.

A dump may be available from the user (esproul) upon request, but cannot be shared by me.

#1

Updated by Rich Lowe over 5 years ago

This is OmniOS 151006, revision b281e50 in the https://github.com/omniti-labs/illumos-omnios repo, so somewhat old, but I see no indication of later fixes to this issue.

Robert pointed out #4161, which sounds similar but I think is different.
