deadlock on ZFS during concurrent rename and mkdir
wget https://www.netbsd.org/~riastradh/tmp/dirconc.c
cc -pthread -o dirconc dirconc.c
mkdir foo
dirconc ~/foo > /dev/null
This hangs a 2-core VM for me.
The system is
SunOS omniosce 5.11 omnios-r151034-0d278a0cc5 i86pc i386 i86pc
so not the freshest possible, but I can't update right now. However, I do suspect it is readily reproducible by other people.
It does not run into anything on tmpfs, presumably thanks to the mount point-wide rename lock employed there.
As a data point, it does NOT reproduce on FreeBSD HEAD (running OpenZFS). I have not tried older versions.
Updated by Dan McDonald about 2 years ago
I've tried this on a non-debug OmniOS bloody (omnios-master-583b18de89), both in /tmp (without redirecting stdout to /dev/null) and in my ZFS home directory (with stdout redirected to /dev/null). The ZFS run DOES seem to deadlock/hang on cv_wait, i.e. it leaves an unkillable process.
A coredump induced by `reboot -d` from this is available here: https://kebe.com/~danmcd/webrevs/13243/vmdump.1
I re-reproduced it on a DEBUG kernel and it misbehaved the same way. Kernel dump is https://kebe.com/~danmcd/webrevs/13243/vmdump.2
I noticed that the bug's filer has made changes in OpenZFS specific to the FreeBSD vnop code. Our fix will likely be somewhere similar.