Bug #7199

dsl_dataset_rollback_sync may try to free already free blocks

Added by Andriy Gapon over 3 years ago. Updated about 3 years ago.

Status: Closed
Priority: High
Assignee: -
Category: zfs - Zettabyte File System
Start date: 2016-07-20
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

dsl_dataset_rollback_sync may try to free already freed blocks when it calls dsl_destroy_head_sync_impl to destroy a temporary clone.
This happens if the snapshot that we are rolling back to, and from which the clone is created, has some ZIL records.


Related issues

Related to illumos gate - Feature #7197: ztest should test rollback (New, 2016-07-20)
Related to illumos gate - Bug #7200: no blocks must be born in a txg after a snapshot is created (Closed, 2016-07-20)

History

#1

Updated by Andriy Gapon over 3 years ago

#2

Updated by Andriy Gapon over 3 years ago

I based this bug report on the following panic that I observed on FreeBSD.

panic: freeing free block; rs=0xfffff800288dba40
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper+0x27
kdb_backtrace+0x39
vpanic+0x11f
range_tree_verify+0x5a
metaslab_check_free+0xc5
zio_free+0x26
zio_free_zil+0x71
zil_free_log_block+0x25
zil_parse+0x159
zil_destroy_sync+0x55
dsl_destroy_head_sync_impl+0x50b
dsl_dataset_rollback_sync+0x155
dsl_sync_task_sync+0x11b
dsl_pool_sync+0x45f
spa_sync+0x56e
txg_sync_thread+0x303
fork_exit+0xad

#3

Updated by Andriy Gapon about 3 years ago

It seems that this problem occurs only in a very low-probability scenario where two (or more) rollback sync-tasks for the same dataset are executed in the same txg sync pass.

As an aside, it would be nice if the kernel-side code were aware of the desired target snapshot and refused to perform the rollback if that snapshot doesn't exist or is not the latest. At present, the check is done in userland, but that is insufficient for potentially concurrent rollback requests.
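To make the aside concrete, here is a minimal sketch of what such a kernel-side check could look like. It is only an illustration: `toy_dataset`, `toy_rollback_check` and the choice of `ESRCH` as the error code are assumptions made for this example and are not taken from the actual ZFS code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a dataset; not the real dsl_dataset_t. */
struct toy_dataset {
	const char *latest_snap;	/* name of the most recent snapshot, or NULL */
};

/*
 * Hypothetical check phase of a rollback sync task.
 * tosnap is the snapshot the caller expects to roll back to; NULL means
 * "whatever is latest", which is the behaviour without the extra check.
 * The error code is an arbitrary choice for this sketch.
 */
static int
toy_rollback_check(const struct toy_dataset *ds, const char *tosnap)
{
	assert(ds != NULL);

	/* There is no snapshot at all, so there is nothing to roll back to. */
	if (ds->latest_snap == NULL)
		return (ESRCH);

	/* The requested snapshot is not, or is no longer, the latest one. */
	if (tosnap != NULL && strcmp(ds->latest_snap, tosnap) != 0)
		return (ESRCH);

	return (0);
}
```

Because the check would run in the same context as the rollback itself, a second request that names a snapshot which is no longer the latest would be rejected instead of racing with the first one.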

So, my theory is this. The first rollback sync-task replaces the dataset with a clone created from the previous snapshot. As a result, the dataset points to the snapshot's physical (on-disk) objset. A new objset_t is created in memory and its os_zil_header is zeroed out, but os_phys->os_zil_header is left intact (obviously). When the second rollback is done, the dataset created by the previous rollback gets destroyed after the clone-swap operation. The clone-swap operation evicts the objsets of both datasets. So, when dsl_destroy_head_sync_impl calls dmu_objset_from_ds, the latter creates a new objset instance. As a result, its os_zil_header and os_zil are based on os_phys->os_zil_header, which could be non-zero. Subsequently, zil_destroy_sync acts on a stale ZIL chain that happened to be stored in the snapshot's on-disk objset.
In other words, because both rollbacks are performed back-to-back, dmu_objset_sync is never called after os_zil_header of the intermediate dataset is zeroed out in memory. Thus, the snapshot's objset is never "forked", so the dataset keeps pointing to it instead of to a new copy. And the second rollback tries to walk the snapshot's log chain instead of seeing an empty ZIL header.

Does this sound plausible?
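The sequence in the theory above can be modeled with a small toy program. This is only a sketch of the reasoning, not ZFS code: every `toy_*` type and function is a hypothetical stand-in, and the on-disk objset is reduced to a single integer that is non-zero when a log chain is recorded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy on-disk objset; initially shared with the snapshot. */
struct toy_phys {
	uint64_t zil_head;	/* non-zero: a ZIL chain is recorded here */
};

/* Toy in-memory objset instance. */
struct toy_objset {
	struct toy_phys *phys;	/* on-disk objset this instance was opened from */
	uint64_t zil_head;	/* in-memory copy of the ZIL header */
};

/* Opening an objset seeds the in-memory header from the on-disk one. */
static struct toy_objset
toy_objset_open(struct toy_phys *phys)
{
	assert(phys != NULL);
	struct toy_objset os = { .phys = phys, .zil_head = phys->zil_head };
	return (os);
}

/* Rollback zeroes the header in memory only; the on-disk copy is untouched. */
static void
toy_rollback_reset_zil(struct toy_objset *os)
{
	os->zil_head = 0;
}

/* Syncing "forks" the objset: the in-memory state goes to a private copy. */
static void
toy_objset_sync(struct toy_objset *os, struct toy_phys *fresh)
{
	fresh->zil_head = os->zil_head;
	os->phys = fresh;
}

/*
 * Returns true if the destroy step of a second rollback would walk a
 * (stale) log chain. synced_between selects whether the objset was
 * synced between the two rollbacks.
 */
static bool
toy_second_rollback_sees_stale_zil(bool synced_between)
{
	struct toy_phys snap_phys = { .zil_head = 0x1234 };	/* snapshot has ZIL records */
	struct toy_phys fresh = { .zil_head = 0 };

	/* First rollback: the dataset now points at the snapshot's objset. */
	struct toy_objset os = toy_objset_open(&snap_phys);
	toy_rollback_reset_zil(&os);
	if (synced_between)
		toy_objset_sync(&os, &fresh);

	/* Second rollback: the in-memory objset is evicted and then reopened. */
	struct toy_objset reopened = toy_objset_open(os.phys);
	return (reopened.zil_head != 0);
}
```

In this model, `toy_second_rollback_sees_stale_zil(false)` returns true, which corresponds to the back-to-back case where the destroy step walks the snapshot's log chain. With a sync in between it returns false, because the reopened objset comes from the forked copy with an empty header.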

#4

Updated by Andriy Gapon about 3 years ago

  • Related to Bug #7200: no blocks must be born in a txg after a snapshot is created added

#5

Updated by Electric Monk about 3 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit bfaed0b91e57062c38bc16b4f89db3c8f0052a9b

commit  bfaed0b91e57062c38bc16b4f89db3c8f0052a9b
Author: Andriy Gapon <andriy.gapon@clusterhq.com>
Date:   2016-11-22T00:10:08.000Z

    7199 dsl_dataset_rollback_sync may try to free already free blocks
    7200 no blocks must be born in a txg after a snaphot is created
    Reviewed by: Matthew Ahrens <mahrens@delphix.com>
    Reviewed by: Brad Lewis <brad.lewis@delphix.com>
    Approved by: Gordon Ross <gordon.w.ross@gmail.com>
