Bug #6844
closeddnode_next_offset can detect fictional holes
100%
Description
dnode_next_offset is used in a variety of places to iterate over the holes or allocated blocks in a dnode. It operates under the premise that it can iterate over the blockpointers of a dnode in open context while holding only the dn_struct_rwlock as reader. Unfortunately, this premise does not hold.
When we create the zio for a dbuf, we pass in the actual block pointer in the indirect block above that dbuf. When we later zero the bp in zio_write_compress, we are directly modifying the bp. The state of the bp is now inconsistent from the perspective of dnode_next_offset: the bp will appear to be a hole until zio_dva_allocate finally finishes filling it in. In the meantime, dnode_next_offset can detect a hole in the dnode when none exists.
I was able to experimentally demonstrate this behavior with the following setup:
1. Create a file with 1 million dbufs.
2. Create a thread that randomly dirties L2 blocks by writing to the first L0 block under them.
3. Observe dnode_next_offset, waiting for it to skip over a hole in the middle of a file.
4. Do dnode_next_offset in a loop until we skip over such a non-existent hole.
The fix is to ensure that it is valid to iterate over the indirect blocks in a dnode while holding the dn_struct_rwlock by passing the zio a copy of the BP and updating the actual BP in dbuf_write_ready while holding the lock.