Panic on zfs receive of a recursive deduplicated stream
After importing changeset 13973:4972ab336f54 (3464 zfs synctask code needs restructuring) into FreeBSD (CURRENT), we have received user reports of kernel panics while receiving recursive deduplicated streams. An investigation by Andriy Gapon (avg@FreeBSD.org) revealed that the call in dmu_send.c from restore_byte_byref() to dmu_objset_from_ds() happens without a dataset lock held.
This originates in add_ds_to_guidmap(), where the dataset is released after placing a long hold on it. This does not match the in-code documentation in dsl_pool.c ("DSL Pool Configuration Lock") regarding the use of long holds on datasets.
The panic is easy to reproduce: simply receive a recursive deduplicated stream (zfs send -R -D | zfs receive -d).
One possible solution might be to release the dataset hold in free_guid_map_onexit() instead of add_ds_to_guidmap(); see the attached patch.
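The idea behind the proposed fix can be sketched with a toy model. This is not the ZFS code and not the attached patch: every name prefixed with toy_ is hypothetical, and the hold counter only stands in for the real dataset hold. The sketch assumes only what the report describes, namely that the hold taken when a dataset is added to the GUID map is kept for as long as the map entry exists and is dropped by the exit-time cleanup.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for a dataset; NOT the real dsl_dataset_t. */
typedef struct toy_dataset {
	uint64_t guid;
	int holds;		/* number of outstanding holds */
} toy_dataset_t;

/* Toy GUID map entry; it owns one hold on ds while it exists. */
typedef struct toy_guid_entry {
	uint64_t guid;
	toy_dataset_t *ds;
	struct toy_guid_entry *next;
} toy_guid_entry_t;

void
toy_hold(toy_dataset_t *ds)
{
	ds->holds++;
}

void
toy_rele(toy_dataset_t *ds)
{
	assert(ds->holds > 0);
	ds->holds--;
}

/*
 * Take a hold and keep it: the entry stores the held dataset,
 * so the hold is NOT released before returning.
 */
int
toy_add_ds_to_guidmap(toy_guid_entry_t **map, toy_dataset_t *ds)
{
	toy_guid_entry_t *e = malloc(sizeof (*e));

	if (e == NULL)
		return (-1);
	toy_hold(ds);
	e->guid = ds->guid;
	e->ds = ds;
	e->next = *map;
	*map = e;
	return (0);
}

/* Look up a dataset by GUID; any entry found is still held. */
toy_dataset_t *
toy_lookup(toy_guid_entry_t *map, uint64_t guid)
{
	for (; map != NULL; map = map->next) {
		if (map->guid == guid)
			return (map->ds);
	}
	return (NULL);
}

/* Exit-time cleanup: this is where each hold is finally dropped. */
void
toy_free_guid_map_onexit(toy_guid_entry_t **map)
{
	while (*map != NULL) {
		toy_guid_entry_t *e = *map;

		*map = e->next;
		toy_rele(e->ds);
		free(e);
	}
}
```

In this model a caller that looks up a dataset by GUID always gets one that is still held, because the hold's lifetime matches the map entry's lifetime instead of ending inside the add function.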
Updated by Christopher Siden about 8 years ago
- Status changed from In Progress to Closed
- % Done changed from 0 to 100
commit de8d9cf
Author: Matthew Ahrens <email@example.com>
Date: Wed Apr 10 14:54:56 2013

    3645 dmu_send_impl: possibilty of pool hold leak
    3692 Panic on zfs receive of a recursive deduplicated stream
    Reviewed by: Adam Leventhal <firstname.lastname@example.org>
    Reviewed by: Christopher Siden <email@example.com>
    Reviewed by: Dan McDonald <firstname.lastname@example.org>
    Approved by: Richard Lowe <email@example.com>