Project

General

Profile

Actions

Bug #6161

closed

zero-sized kmem_alloc() in zfs`spa_load_l2cache

Added by Alexander Kolbasov about 6 years ago. Updated about 2 months ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
zfs - Zettabyte File System
Start date:
2015-08-24
Due date:
% Done:

100%

Estimated time:
Difficulty:
Bite-size
Tags:
Gerrit CR:

Description

The illumos-based system crashed with

attempted to kmem_alloc() size of 0
ffffff007aee8a00 vpanic()
ffffff007aee8a40 0xfffffffffbdb1bfe()
ffffff007aee8ad0 spa_load_l2cache+0x92(ffffff17c21c3000)
ffffff007aee8b60 spa_vdev_remove+0x25d(ffffff17c21c3000, 6e9a6f1dbde5bc2d, 0)
ffffff007aee8bb0 zfs_ioc_vdev_remove+0x39(ffffff18b7e11000)
ffffff007aee8c60 zfsdev_ioctl+0x4ff(a500000000, 5a0c, febcca68, 100003, ffffff1229561108, ffffff007aee8e48)
ffffff007aee8ca0 cdev_ioctl+0x39(a500000000, 5a0c, febcca68, 100003, ffffff1229561108, ffffff007aee8e48)
ffffff007aee8cf0 spec_ioctl+0x60(ffffff1225c2df00, 5a0c, febcca68, 100003, ffffff1229561108, ffffff007aee8e48, 0)
ffffff007aee8d80 fop_ioctl+0x55(ffffff1225c2df00, 5a0c, febcca68, 100003, ffffff1229561108, ffffff007aee8e48, 0)
ffffff007aee8ea0 ioctl+0x9b(9, 5a0c, febcca68)
ffffff007aee8f00 _sys_sysenter_post_swapgs+0x241()

Related issues

Has duplicate illumos gate - Bug #12054: spa_load_l2cache() can zero-length allocateDuplicate

Actions
Has duplicate illumos gate - Bug #13933: zero-sized kmem_alloc() in zfs`spa_load_l2cacheDuplicate

Actions
Actions #1

Updated by Alexander Kolbasov about 6 years ago

The problem occurs here:

static void
spa_load_l2cache(spa_t *spa)
{
    nvlist_t **l2cache;
    uint_t nl2cache;
    int i, j, oldnvdevs;
    uint64_t guid;
    vdev_t *vd, **oldvdevs, **newvdevs;
    spa_aux_vdev_t *sav = &spa->spa_l2cache;

    ASSERT(spa_config_held(spa, SCL_ALL, RW_WRITER) == SCL_ALL);

    if (sav->sav_config != NULL) {
        VERIFY(nvlist_lookup_nvlist_array(sav->sav_config,
            ZPOOL_CONFIG_L2CACHE, &l2cache, &nl2cache) == 0);
        newvdevs = kmem_alloc(nl2cache * sizeof (void *), KM_SLEEP); // <-- Here
    } else 
             ,,,
    }

The l2cache config lists an empty array:

0xffffff171be2f730::nvlist -v

name='l2cache' type=nvpair list_array items=0

So the length is zero.

Actions #2

Updated by Alexander Kolbasov about 6 years ago

See https://smartos.org/bugview/OS-4539 for a discussion about similar issue.

Actions #3

Updated by Alexander Kolbasov about 6 years ago

Suggested fix:

diff --git a/usr/src/uts/common/fs/zfs/spa.c b/usr/src/uts/common/fs/zfs/spa.c
index 38b1610..b637ad8 100644
--- a/usr/src/uts/common/fs/zfs/spa.c
+++ b/usr/src/uts/common/fs/zfs/spa.c
@@ -1468,10 +1468,11 @@ static void
 spa_load_l2cache(spa_t *spa)
 {
        nvlist_t **l2cache;
-       uint_t nl2cache;
+       uint_t nl2cache = 0;
        int i, j, oldnvdevs;
        uint64_t guid;
-       vdev_t *vd, **oldvdevs, **newvdevs;
+       vdev_t *vd, **oldvdevs;
+       vdev_t **newvdevs = NULL;
        spa_aux_vdev_t *sav = &spa->spa_l2cache;

        ASSERT(spa_config_held(spa, SCL_ALL, RW_WRITER) == SCL_ALL);
@@ -1479,10 +1480,10 @@ spa_load_l2cache(spa_t *spa)
        if (sav->sav_config != NULL) {
                VERIFY(nvlist_lookup_nvlist_array(sav->sav_config,
                    ZPOOL_CONFIG_L2CACHE, &l2cache, &nl2cache) == 0);
-               newvdevs = kmem_alloc(nl2cache * sizeof (void *), KM_SLEEP);
-       } else {
-               nl2cache = 0;
-               newvdevs = NULL;
+               if (nl2cache > 0) {
+                       newvdevs = kmem_alloc(nl2cache * sizeof (void *),
+                                               KM_SLEEP);
+               }
        }

        oldvdevs = sav->sav_vdevs;
Actions #4

Updated by Alexander Kolbasov about 6 years ago

The fix wasn't complete, we also need

diff --git a/usr/src/uts/common/fs/zfs/spa.c b/usr/src/uts/common/fs/zfs/spa.c
index b637ad8..2dfccc9 100644
--- a/usr/src/uts/common/fs/zfs/spa.c
+++ b/usr/src/uts/common/fs/zfs/spa.c
@@ -1582,12 +1582,14 @@ spa_load_l2cache(spa_t *spa)
        VERIFY(nvlist_remove(sav->sav_config, ZPOOL_CONFIG_L2CACHE,
            DATA_TYPE_NVLIST_ARRAY) == 0);

-       l2cache = kmem_alloc(sav->sav_count * sizeof (void *), KM_SLEEP);
-       for (i = 0; i < sav->sav_count; i++)
-               l2cache[i] = vdev_config_generate(spa,
-                   sav->sav_vdevs[i], B_TRUE, VDEV_CONFIG_L2CACHE);
-       VERIFY(nvlist_add_nvlist_array(sav->sav_config,
-           ZPOOL_CONFIG_L2CACHE, l2cache, sav->sav_count) == 0);
+       if (sav->sav_count != 0) {
+               l2cache = kmem_alloc(sav->sav_count * sizeof (void *), KM_SLEEP);
+               for (i = 0; i < sav->sav_count; i++)
+                       l2cache[i] = vdev_config_generate(spa,
+                           sav->sav_vdevs[i], B_TRUE, VDEV_CONFIG_L2CACHE);
+               VERIFY(nvlist_add_nvlist_array(sav->sav_config,
+                   ZPOOL_CONFIG_L2CACHE, l2cache, sav->sav_count) == 0);
+       }
 out:
        for (i = 0; i < sav->sav_count; i++)
                nvlist_free(l2cache[i]);
Actions #5

Updated by Andy Fiddaman 3 months ago

  • Has duplicate Bug #12054: spa_load_l2cache() can zero-length allocate added
Actions #6

Updated by Andy Fiddaman 3 months ago

  • Has duplicate Bug #13933: zero-sized kmem_alloc() in zfs`spa_load_l2cache added
Actions #7

Updated by Andy Fiddaman 3 months ago

  • Subject changed from spa_load_l2cache() crashes trying to perform zero-sized allocation to zero-sized kmem_alloc() in zfs`spa_load_l2cache
  • Status changed from New to In Progress
  • Assignee set to Andy Fiddaman
  • Difficulty changed from Medium to Bite-size
  • Tags deleted (needs-triage)
Actions #8

Updated by Andy Fiddaman 3 months ago

Adding new notes from duplicate issue:

OmniOS r151039 Version omnios-upstream_merge-2021070801-adc6d6fc9b 64-bit
Copyright (c) 2012-2017 OmniTI Computer Consulting, Inc.
Copyright (c) 2017-2021 OmniOS Community Edition (OmniOSce) Association.
DEBUG enabled
WARNING: kmem_alloc(): sleeping allocation with size of 0; see kmem_zerosized_ls
WARNING: kmem_alloc(): sleeping allocation with size of 0; see kmem_zerosized_ls
> ::kmalog zerosized

T-0.000000000  addr=0  0
         kmem_log_event+0x9b
         kmem_alloc+0x17d
         spa_load_l2cache+0x328
         spa_ld_open_aux_vdevs+0x158
         spa_load_impl+0x17e
         spa_load+0x5d
         spa_load_best+0x6c
         spa_open_common+0x118
         spa_open+0x15
         dsl_pool_hold+0x3f
         dsl_dsobj_to_dsname+0x35
         zfs_parse_bootfs+0xb4
         zfs_mountroot+0xf6
         fsop_mountroot+0x13
         rootconf+0x170

T-0.001949228  addr=0  0
         kmem_log_event+0x9b
         kmem_alloc+0x17d
         spa_load_l2cache+0x97
         spa_ld_open_aux_vdevs+0x158
         spa_load_impl+0x17e
         spa_load+0x5d
         spa_load_best+0x6c
         spa_open_common+0x118
         spa_open+0x15
         dsl_pool_hold+0x3f
         dsl_dsobj_to_dsname+0x35
         zfs_parse_bootfs+0xb4
         zfs_mountroot+0xf6
         fsop_mountroot+0x13
         rootconf+0x170

Slightly different stack to when this bug was first created (at the point that kmem_alloc(0) was fatal), but the same fix is required.

Actions #9

Updated by Electric Monk 3 months ago

  • Gerrit CR set to 1592
Actions #10

Updated by Andy Fiddaman about 2 months ago

Tested the change in the associated Gerrit review by running the ZFS test suite iteratively with DEBUG bits until no more tests passed. I was left with 13 failures, the same failures that are present when running bits from before the change.

For reference:

Test: /opt/zfs-tests/tests/functional/acl/nontrivial/zfs_acl_chmod_inherit_003_pos (run as root) [00:58] [FAIL]
Test: /opt/zfs-tests/tests/functional/bootfs/bootfs_006_pos (run as root) [00:03] [FAIL]
Test: /opt/zfs-tests/tests/functional/cli_root/zfs_clone/zfs_clone_010_pos (run as root) [00:07] [FAIL]
Test: /opt/zfs-tests/tests/functional/cli_root/zfs_property/zfs_written_property_001_pos (run as root) [00:11] [FAIL]
Test: /opt/zfs-tests/tests/functional/cli_root/zpool_expand/zpool_expand_001_pos (run as root) [00:25] [FAIL]
Test: /opt/zfs-tests/tests/functional/cli_root/zpool_initialize/zpool_initialize_verify_initialized (run as root) [00:05] [FAIL]
Test: /opt/zfs-tests/tests/functional/cli_root/zpool_scrub/zpool_scrub_multiple_copies (run as root) [00:05] [FAIL]
Test: /opt/zfs-tests/tests/functional/l2arc/persist_l2arc_008_pos (run as root) [00:57] [FAIL]
Test: /opt/zfs-tests/tests/functional/no_space/enospc_002_pos (run as root) [00:49] [FAIL]
Test: /opt/zfs-tests/tests/functional/refreserv/refreserv_004_pos (run as root) [00:02] [FAIL]
Test: /opt/zfs-tests/tests/functional/removal/removal_with_add (run as root) [00:15] [FAIL]
Test: /opt/zfs-tests/tests/functional/rsend/rsend_008_pos (run as root) [00:01] [FAIL]
Test: /opt/zfs-tests/tests/functional/vdev_zaps/vdev_zaps_007_pos (run as root) [00:07] [FAIL]
Actions #11

Updated by Electric Monk about 2 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

git commit f3a2bc1eccb884e62be4c0b42935466b79b1342d

commit  f3a2bc1eccb884e62be4c0b42935466b79b1342d
Author: Andy Fiddaman <omnios@citrus-it.co.uk>
Date:   2021-07-27T15:15:07.000Z

    6161 zero-sized kmem_alloc() in zfs`spa_load_l2cache
    Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
    Reviewed by: Jason King <jason.brian.king@gmail.com>
    Approved by: Dan McDonald <danmcd@joyent.com>

Actions

Also available in: Atom PDF