dnode_next_offset can detect fictional holes
dnode_next_offset is used in a variety of places to iterate over the holes or allocated blocks in a dnode. It operates under the premise that it can iterate over the blockpointers of a dnode in open context while holding only the dn_struct_rwlock as reader. Unfortunately, this premise does not hold.
When we create the zio for a dbuf, we pass in the actual block pointer in the indirect block above that dbuf. When we later zero the bp in zio_write_compress, we are directly modifying the bp. The state of the bp is now inconsistent from the perspective of dnode_next_offset: the bp will appear to be a hole until zio_dva_allocate finally finishes filling it in. In the meantime, dnode_next_offset can detect a hole in the dnode when none exists.
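The interleaving described above can be replayed deterministically in a small user-space model. This is only a sketch: `toy_bp_t`, `toy_next_hole`, `toy_write_compress`, `toy_dva_allocate`, and `toy_replay` are hypothetical names invented for illustration, and the "dva == 0 means hole" rule is a simplification, not the real `blkptr_t` layout or hole test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	TOY_NBPS	4

/* Hypothetical stand-in for a block pointer; not the real blkptr_t. */
typedef struct toy_bp {
	uint64_t dva;	/* 0 means "hole" in this toy model */
	uint64_t birth;
} toy_bp_t;

/* Models the scan: index of the first hole, or n if there is none. */
static size_t
toy_next_hole(const toy_bp_t *bps, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (bps[i].dva == 0)
			return (i);
	}
	return (n);
}

/* Models the zeroing step: the live bp is cleared in place. */
static void
toy_write_compress(toy_bp_t *bp)
{
	memset(bp, 0, sizeof (*bp));
}

/* Models allocation finishing: the bp becomes valid again. */
static void
toy_dva_allocate(toy_bp_t *bp, uint64_t dva, uint64_t txg)
{
	assert(dva != 0);
	bp->dva = dva;
	bp->birth = txg;
}

/*
 * Rewrite block 2 of a fully allocated indirect block and scan for a
 * hole either between the two write stages or after both have run.
 */
size_t
toy_replay(bool scan_mid_write)
{
	toy_bp_t indirect[TOY_NBPS] = {
		{ 100, 1 }, { 200, 1 }, { 300, 1 }, { 400, 1 }
	};
	size_t seen = TOY_NBPS;

	toy_write_compress(&indirect[2]);
	if (scan_mid_write)
		seen = toy_next_hole(indirect, TOY_NBPS);
	toy_dva_allocate(&indirect[2], 500, 2);
	if (!scan_mid_write)
		seen = toy_next_hole(indirect, TOY_NBPS);
	return (seen);
}
```

In this model `toy_replay(true)` returns 2, a hole that never existed in the file, while `toy_replay(false)` returns `TOY_NBPS`, meaning no hole was found.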
I was able to experimentally demonstrate this behavior with the following setup:
1. Create a file with 1 million dbufs.
2. Create a thread that randomly dirties L2 blocks by writing to the first L0 block under them.
3. Observe dnode_next_offset, waiting for it to skip over a hole in the middle of a file.
4. Call dnode_next_offset in a loop until it skips over such a non-existent hole.
The fix is to make it valid to iterate over the indirect blocks of a dnode while holding only the dn_struct_rwlock. To do this, we pass the zio a copy of the BP, and update the actual BP in dbuf_write_ready while holding the lock.
Updated by Electric Monk about 6 years ago
- Status changed from New to Closed
- % Done changed from 0 to 100
commit 11ceac77ea8034bf2fe9bdd6d314f5d1e5ceeba3
Author: Alex Reece <firstname.lastname@example.org>
Date: 2016-05-12T17:07:29.000Z

6844 dnode_next_offset can detect fictional holes
Reviewed by: Matthew Ahrens <email@example.com>
Reviewed by: George Wilson <firstname.lastname@example.org>
Reviewed by: Boris Protopopov <email@example.com>
Approved by: Dan McDonald <firstname.lastname@example.org>