Project

General

Profile

Actions

Bug #6161

closed

zero-sized kmem_alloc() in zfs`spa_load_l2cache

Added by Alexander Kolbasov almost 6 years ago. Updated about 7 hours ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
zfs - Zettabyte File System
Start date:
2015-08-24
Due date:
% Done:

100%

Estimated time:
Difficulty:
Bite-size
Tags:
Gerrit CR:

Description

The illumos-based system crashed with

attempted to kmem_alloc() size of 0
ffffff007aee8a00 vpanic()
ffffff007aee8a40 0xfffffffffbdb1bfe()
ffffff007aee8ad0 spa_load_l2cache+0x92(ffffff17c21c3000)
ffffff007aee8b60 spa_vdev_remove+0x25d(ffffff17c21c3000, 6e9a6f1dbde5bc2d, 0)
ffffff007aee8bb0 zfs_ioc_vdev_remove+0x39(ffffff18b7e11000)
ffffff007aee8c60 zfsdev_ioctl+0x4ff(a500000000, 5a0c, febcca68, 100003, ffffff1229561108, ffffff007aee8e48)
ffffff007aee8ca0 cdev_ioctl+0x39(a500000000, 5a0c, febcca68, 100003, ffffff1229561108, ffffff007aee8e48)
ffffff007aee8cf0 spec_ioctl+0x60(ffffff1225c2df00, 5a0c, febcca68, 100003, ffffff1229561108, ffffff007aee8e48, 0)
ffffff007aee8d80 fop_ioctl+0x55(ffffff1225c2df00, 5a0c, febcca68, 100003, ffffff1229561108, ffffff007aee8e48, 0)
ffffff007aee8ea0 ioctl+0x9b(9, 5a0c, febcca68)
ffffff007aee8f00 _sys_sysenter_post_swapgs+0x241()

Related issues

Has duplicate illumos gate - Bug #12054: spa_load_l2cache() can zero-length allocateDuplicate

Actions
Has duplicate illumos gate - Bug #13933: zero-sized kmem_alloc() in zfs`spa_load_l2cacheDuplicate

Actions
Actions #1

Updated by Alexander Kolbasov almost 6 years ago

The problem occurs here:

static void
spa_load_l2cache(spa_t *spa)
{
    nvlist_t **l2cache;
    uint_t nl2cache;
    int i, j, oldnvdevs;
    uint64_t guid;
    vdev_t *vd, **oldvdevs, **newvdevs;
    spa_aux_vdev_t *sav = &spa->spa_l2cache;

    ASSERT(spa_config_held(spa, SCL_ALL, RW_WRITER) == SCL_ALL);

    if (sav->sav_config != NULL) {
        VERIFY(nvlist_lookup_nvlist_array(sav->sav_config,
            ZPOOL_CONFIG_L2CACHE, &l2cache, &nl2cache) == 0);
        newvdevs = kmem_alloc(nl2cache * sizeof (void *), KM_SLEEP); // <-- Here
    } else 
             ,,,
    }

The l2cache config lists an empty array:

0xffffff171be2f730::nvlist -v

name='l2cache' type=nvpair list_array items=0

So the length is zero.

Actions #2

Updated by Alexander Kolbasov almost 6 years ago

See https://smartos.org/bugview/OS-4539 for a discussion about similar issue.

Actions #3

Updated by Alexander Kolbasov almost 6 years ago

Suggested fix:

diff --git a/usr/src/uts/common/fs/zfs/spa.c b/usr/src/uts/common/fs/zfs/spa.c
index 38b1610..b637ad8 100644
--- a/usr/src/uts/common/fs/zfs/spa.c
+++ b/usr/src/uts/common/fs/zfs/spa.c
@@ -1468,10 +1468,11 @@ static void
 spa_load_l2cache(spa_t *spa)
 {
        nvlist_t **l2cache;
-       uint_t nl2cache;
+       uint_t nl2cache = 0;
        int i, j, oldnvdevs;
        uint64_t guid;
-       vdev_t *vd, **oldvdevs, **newvdevs;
+       vdev_t *vd, **oldvdevs;
+       vdev_t **newvdevs = NULL;
        spa_aux_vdev_t *sav = &spa->spa_l2cache;

        ASSERT(spa_config_held(spa, SCL_ALL, RW_WRITER) == SCL_ALL);
@@ -1479,10 +1480,10 @@ spa_load_l2cache(spa_t *spa)
        if (sav->sav_config != NULL) {
                VERIFY(nvlist_lookup_nvlist_array(sav->sav_config,
                    ZPOOL_CONFIG_L2CACHE, &l2cache, &nl2cache) == 0);
-               newvdevs = kmem_alloc(nl2cache * sizeof (void *), KM_SLEEP);
-       } else {
-               nl2cache = 0;
-               newvdevs = NULL;
+               if (nl2cache > 0) {
+                       newvdevs = kmem_alloc(nl2cache * sizeof (void *),
+                                               KM_SLEEP);
+               }
        }

        oldvdevs = sav->sav_vdevs;
Actions #4

Updated by Alexander Kolbasov almost 6 years ago

The fix wasn't complete, we also need

diff --git a/usr/src/uts/common/fs/zfs/spa.c b/usr/src/uts/common/fs/zfs/spa.c
index b637ad8..2dfccc9 100644
--- a/usr/src/uts/common/fs/zfs/spa.c
+++ b/usr/src/uts/common/fs/zfs/spa.c
@@ -1582,12 +1582,14 @@ spa_load_l2cache(spa_t *spa)
        VERIFY(nvlist_remove(sav->sav_config, ZPOOL_CONFIG_L2CACHE,
            DATA_TYPE_NVLIST_ARRAY) == 0);

-       l2cache = kmem_alloc(sav->sav_count * sizeof (void *), KM_SLEEP);
-       for (i = 0; i < sav->sav_count; i++)
-               l2cache[i] = vdev_config_generate(spa,
-                   sav->sav_vdevs[i], B_TRUE, VDEV_CONFIG_L2CACHE);
-       VERIFY(nvlist_add_nvlist_array(sav->sav_config,
-           ZPOOL_CONFIG_L2CACHE, l2cache, sav->sav_count) == 0);
+       if (sav->sav_count != 0) {
+               l2cache = kmem_alloc(sav->sav_count * sizeof (void *), KM_SLEEP);
+               for (i = 0; i < sav->sav_count; i++)
+                       l2cache[i] = vdev_config_generate(spa,
+                           sav->sav_vdevs[i], B_TRUE, VDEV_CONFIG_L2CACHE);
+               VERIFY(nvlist_add_nvlist_array(sav->sav_config,
+                   ZPOOL_CONFIG_L2CACHE, l2cache, sav->sav_count) == 0);
+       }
 out:
        for (i = 0; i < sav->sav_count; i++)
                nvlist_free(l2cache[i]);
Actions #5

Updated by Andy Fiddaman 19 days ago

  • Has duplicate Bug #12054: spa_load_l2cache() can zero-length allocate added
Actions #6

Updated by Andy Fiddaman 19 days ago

  • Has duplicate Bug #13933: zero-sized kmem_alloc() in zfs`spa_load_l2cache added
Actions #7

Updated by Andy Fiddaman 19 days ago

  • Subject changed from spa_load_l2cache() crashes trying to perform zero-sized allocation to zero-sized kmem_alloc() in zfs`spa_load_l2cache
  • Status changed from New to In Progress
  • Assignee set to Andy Fiddaman
  • Difficulty changed from Medium to Bite-size
  • Tags deleted (needs-triage)
Actions #8

Updated by Andy Fiddaman 19 days ago

Adding new notes from duplicate issue:

OmniOS r151039 Version omnios-upstream_merge-2021070801-adc6d6fc9b 64-bit
Copyright (c) 2012-2017 OmniTI Computer Consulting, Inc.
Copyright (c) 2017-2021 OmniOS Community Edition (OmniOSce) Association.
DEBUG enabled
WARNING: kmem_alloc(): sleeping allocation with size of 0; see kmem_zerosized_ls
WARNING: kmem_alloc(): sleeping allocation with size of 0; see kmem_zerosized_ls
> ::kmalog zerosized

T-0.000000000  addr=0  0
         kmem_log_event+0x9b
         kmem_alloc+0x17d
         spa_load_l2cache+0x328
         spa_ld_open_aux_vdevs+0x158
         spa_load_impl+0x17e
         spa_load+0x5d
         spa_load_best+0x6c
         spa_open_common+0x118
         spa_open+0x15
         dsl_pool_hold+0x3f
         dsl_dsobj_to_dsname+0x35
         zfs_parse_bootfs+0xb4
         zfs_mountroot+0xf6
         fsop_mountroot+0x13
         rootconf+0x170

T-0.001949228  addr=0  0
         kmem_log_event+0x9b
         kmem_alloc+0x17d
         spa_load_l2cache+0x97
         spa_ld_open_aux_vdevs+0x158
         spa_load_impl+0x17e
         spa_load+0x5d
         spa_load_best+0x6c
         spa_open_common+0x118
         spa_open+0x15
         dsl_pool_hold+0x3f
         dsl_dsobj_to_dsname+0x35
         zfs_parse_bootfs+0xb4
         zfs_mountroot+0xf6
         fsop_mountroot+0x13
         rootconf+0x170

Slightly different stack to when this bug was first created (at the point that kmem_alloc(0) was fatal), but the same fix is required.

Actions #9

Updated by Electric Monk 19 days ago

  • Gerrit CR set to 1592
Actions #10

Updated by Andy Fiddaman about 11 hours ago

Tested the change in the associated Gerrit review by running the ZFS test suite iteratively with DEBUG bits until no more tests passed. I was left with 13 failures, the same failures that are present when running bits from before the change.

For reference:

Test: /opt/zfs-tests/tests/functional/acl/nontrivial/zfs_acl_chmod_inherit_003_pos (run as root) [00:58] [FAIL]
Test: /opt/zfs-tests/tests/functional/bootfs/bootfs_006_pos (run as root) [00:03] [FAIL]
Test: /opt/zfs-tests/tests/functional/cli_root/zfs_clone/zfs_clone_010_pos (run as root) [00:07] [FAIL]
Test: /opt/zfs-tests/tests/functional/cli_root/zfs_property/zfs_written_property_001_pos (run as root) [00:11] [FAIL]
Test: /opt/zfs-tests/tests/functional/cli_root/zpool_expand/zpool_expand_001_pos (run as root) [00:25] [FAIL]
Test: /opt/zfs-tests/tests/functional/cli_root/zpool_initialize/zpool_initialize_verify_initialized (run as root) [00:05] [FAIL]
Test: /opt/zfs-tests/tests/functional/cli_root/zpool_scrub/zpool_scrub_multiple_copies (run as root) [00:05] [FAIL]
Test: /opt/zfs-tests/tests/functional/l2arc/persist_l2arc_008_pos (run as root) [00:57] [FAIL]
Test: /opt/zfs-tests/tests/functional/no_space/enospc_002_pos (run as root) [00:49] [FAIL]
Test: /opt/zfs-tests/tests/functional/refreserv/refreserv_004_pos (run as root) [00:02] [FAIL]
Test: /opt/zfs-tests/tests/functional/removal/removal_with_add (run as root) [00:15] [FAIL]
Test: /opt/zfs-tests/tests/functional/rsend/rsend_008_pos (run as root) [00:01] [FAIL]
Test: /opt/zfs-tests/tests/functional/vdev_zaps/vdev_zaps_007_pos (run as root) [00:07] [FAIL]
Actions #11

Updated by Electric Monk about 7 hours ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit f3a2bc1eccb884e62be4c0b42935466b79b1342d

commit  f3a2bc1eccb884e62be4c0b42935466b79b1342d
Author: Andy Fiddaman <omnios@citrus-it.co.uk>
Date:   2021-07-27T15:15:07.000Z

    6161 zero-sized kmem_alloc() in zfs`spa_load_l2cache
    Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
    Reviewed by: Jason King <jason.brian.king@gmail.com>
    Approved by: Dan McDonald <danmcd@joyent.com>

Actions

Also available in: Atom PDF