Bug #1313

Integer overflow in txg_delay()

Added by Martin Matuška about 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: High
Category: zfs - Zettabyte File System
Start date: 2011-08-01
Due date:
% Done: 0%
Estimated time:
Difficulty: Bite-size
Tags: needs-triage

Description

The function txg_delay() is used to delay txg (transaction group) threads in ZFS.
The timeout value for this function is calculated using:

int timeout = ddi_get_lbolt() + ticks;

Later, the actual wait is performed:

        while (ddi_get_lbolt() < timeout &&
            tx->tx_syncing_txg < txg-1 && !txg_stalled(dp))
                (void) cv_timedwait(&tx->tx_quiesce_more_cv, &tx->tx_sync_lock,
                    timeout - ddi_get_lbolt());

The ddi_get_lbolt() function returns the current uptime in clock ticks; its return type is clock_t.
On 64-bit architectures, clock_t is int64_t.

Because "timeout" is declared as a 32-bit int, it overflows once the tick count no longer fits in 31 bits. How soon this happens depends on the tick frequency (e.g. at 1000 ticks per second it overflows after 2^31 ticks, about 24.855 days of uptime). From then on the expression "ddi_get_lbolt() < timeout" is always false, so txg threads are no longer delayed at all. This leads to a slowdown in ZFS writes.
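The arithmetic above can be checked with a small user-space model. This is only a sketch: the helper names (days_until_int32_overflow, narrow_timeout, would_wait) are hypothetical and not part of ZFS, and the truncation is written out explicitly to mimic what a 32-bit int does on common two's-complement compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Days of uptime before a signed 32-bit tick value overflows at `hz` ticks/s. */
static double
days_until_int32_overflow(int64_t hz)
{
    assert(hz > 0);
    return (((double)INT32_MAX + 1.0) / (double)hz / 86400.0);
}

/*
 * Model of `int timeout = ddi_get_lbolt() + ticks;` with a 64-bit lbolt.
 * The final conversion to int32_t is implementation-defined in C for
 * out-of-range values; mainstream compilers wrap modulo 2^32.
 */
static int32_t
narrow_timeout(int64_t lbolt, int64_t ticks)
{
    return ((int32_t)(uint32_t)(uint64_t)(lbolt + ticks));
}

/* Model of the loop predicate `ddi_get_lbolt() < timeout`. */
static int
would_wait(int64_t lbolt, int64_t ticks)
{
    return (lbolt < (int64_t)narrow_timeout(lbolt, ticks));
}
```

With these helpers, would_wait(1000, 10) is true shortly after boot, while would_wait(INT32_MAX, 10) is false because the truncated timeout has wrapped to a negative value.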

The attached patch declares timeout as clock_t to match the return type of ddi_get_lbolt().
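The effect of widening the variable can be illustrated in user space. This is a sketch, not the attached patch itself: model_clock_t is a hypothetical stand-in for the kernel's 64-bit clock_t, and the helper functions are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for clock_t on a 64-bit kernel. */
typedef int64_t model_clock_t;

/* Model of `clock_t timeout = ddi_get_lbolt() + ticks;` (no truncation). */
static model_clock_t
wide_timeout(model_clock_t lbolt, int ticks)
{
    assert(ticks >= 0);
    return (lbolt + ticks);
}

/* Model of the loop predicate `ddi_get_lbolt() < timeout`. */
static int
still_waiting(model_clock_t now, model_clock_t timeout)
{
    return (now < timeout);
}

/* Model of the `timeout - ddi_get_lbolt()` argument passed to the wait call. */
static model_clock_t
remaining_ticks(model_clock_t now, model_clock_t timeout)
{
    return (timeout - now);
}
```

Here the predicate stays true past the 2^31-tick mark, for example still_waiting(INT32_MAX, wide_timeout(INT32_MAX, 10)) holds, and it becomes false only once the modelled clock reaches the timeout.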


Files

txg.c.patch (489 Bytes) - txg_delay() overflow bugfix - Martin Matuška, 2011-08-01 02:37 PM

History

#1

Updated by Garrett D'Amore about 8 years ago

Martin, do you want to integrate this, or should I?

- Garrett

#2

Updated by Gordon Ross almost 8 years ago

  • Status changed from New to Resolved
  • Assignee set to Martin Matuška

changeset:   13487:78d9278724d7
user:        Martin Matuska <mm@FreeBSD.org>
date:        Tue Oct 18 18:08:05 2011 -0700
description:
    1313 Integer overflow in txg_delay()
    Reviewed by: Matthew Ahrens <matt@delphix.com>
    Reviewed by: Dan McDonald <danmcd@nexenta.com>
    Approved by: Eric Schrock <Eric.Schrock@delphix.com>
