ztest should use watchpoints to protect frozen arc bufs
There is a relatively frequent failure in ztest where the arc detects that a buffer was modified while frozen and fails an assert. Unfortunately, by the time the core dump happens, the culprit (the guy who modified the buffer contents) is long gone.
This feature would add watchpoints to all frozen arc buffers so that we can detect the invalid modification as it is happening.
Updated by Eric Schrock over 9 years ago
- Status changed from In Progress to Resolved
user: Matthew Ahrens <email@example.com>
date: Thu Aug 30 05:13:49 2012 -0700
3112 ztest does not honor ZFS_DEBUG
3113 ztest should use watchpoints to protect frozen arc bufs
3114 some leaked nvlists in zfsdev_ioctl
3115 poll(2) returns prematurely in presence of spurious wakeups
Reviewed by: Adam Leventhal <firstname.lastname@example.org>
Reviewed by: Matt Amdur <Matt.Amdur@delphix.com>
Reviewed by: George Wilson <email@example.com>
Reviewed by: Christopher Siden <firstname.lastname@example.org>
Approved by: Eric Schrock <email@example.com>
Updated by Brian Behlendorf about 9 years ago
Was the root cause of the 'buffer modified while frozen!' error ever identified? I see that the arc watchpoint patch was merged, but I don't see a patch which addressed the original issue. Did I miss it?
I ask because after porting the nop-write changes to ZoL I'm easily able to reproduce the issue. However, prior to the nop-write change I don't recall ever hitting this failure. It looks like a long-standing issue, but I can't move forward with the nop-write changes until I can resolve the ztest failures. I'm hoping I'm just missing the fix in your tree.
Updated by George Wilson almost 9 years ago
The 'buffer modified while frozen' issue still exists. I have been trying to isolate this but have not had much time recently to debug it further. The watchpoint was added to assist in debugging this problem, but with it enabled I'm unable to hit the modified-while-frozen issue.
Updated by Brian Behlendorf almost 9 years ago
Ok, that's good to know. After adapting the debug patch for Linux to use mprotect(2) I'm still able to fairly easily reproduce the issue. I'll spend some time chasing it down next week. Interestingly, I'm able to reproduce the issue without triggering the debug code.
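For anyone following along, here is a minimal sketch of the mprotect(2) idea mentioned above. It is not the actual debug patch: the helper names (buf_alloc, buf_freeze, buf_thaw, write_faults) are hypothetical, the buffer is assumed to be page-aligned and a whole number of pages long, and a write to a read-only page is assumed to raise SIGSEGV as it does on Linux.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* All names below are illustrative only; none come from the ZFS source. */

static sigjmp_buf fault_jmp;

/* Demo handler: jump back to the caller instead of dumping core. */
static void on_segv(int sig)
{
	(void) sig;
	siglongjmp(fault_jmp, 1);
}

/* mprotect(2) works on whole pages, so take the buffer from mmap(2). */
static void *buf_alloc(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return (p == MAP_FAILED ? NULL : p);
}

/* "Freeze": make the pages read-only so any store faults immediately. */
static int buf_freeze(void *buf, size_t len)
{
	return (mprotect(buf, len, PROT_READ));
}

/* "Thaw": restore write access. */
static int buf_thaw(void *buf, size_t len)
{
	return (mprotect(buf, len, PROT_READ | PROT_WRITE));
}

/* Attempt a one-byte store; return 1 if it faulted, 0 if it succeeded. */
static int write_faults(volatile char *buf)
{
	struct sigaction sa, old;
	volatile int faulted = 0;

	memset(&sa, 0, sizeof (sa));
	sa.sa_handler = on_segv;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, &old);

	if (sigsetjmp(fault_jmp, 1) == 0)
		buf[0] = 'x';		/* faults here if the buffer is frozen */
	else
		faulted = 1;		/* arrived via siglongjmp from on_segv */

	sigaction(SIGSEGV, &old, NULL);
	return (faulted);
}
```

The signal handler is only there so the effect can be observed without killing the process. For real debugging one would presumably leave SIGSEGV at its default action, so that the core dump is taken at the offending store with the culprit's stack still intact.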