Bug #7185

IP DCEs leak from halted non-global zones

Added by Dan McDonald about 4 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: networking
Start date: 2016-07-14
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:

Description

Steps to reproduce:

1.) Boot a zone.
2.) Establish a connection over localhost (my first leak had 127.0.0.1, but this one seemed to be over v6).
3.) While connection to itself is established, halt the zone.
4.) "reboot -d" and the subsequent dump will have a leak of at least one DCE.

Stacks look like this:

dce_cache leak: 1 buffer, 152 bytes
            ADDR          BUFADDR        TIMESTAMP           THREAD
                            CACHE          LASTLOG         CONTENTS
ffffff025a8da898 ffffff025a83ebb8       49f5eeb4e8 ffffff027d6b1440
                 ffffff025a8ae348 ffffff024bdab2c0                0
                 kmem_cache_alloc_debug+0x2e0
                 kmem_cache_alloc+0x320
                 dce_lookup_and_add_v4+0xe9
                 ip_set_destination_v4+0x392
                 ip_attr_connect+0x109
                 conn_connect+0x122
                 tcp_set_destination+0x70
                 tcp_connect_ipv4+0x11f
                 tcp_do_connect+0x505
                 tcp_connect+0xc9
                 so_connect+0xfe
                 socket_connect+0x3c
                 connect+0xb1

dce_cache leak: 1 buffer, 152 bytes
            ADDR          BUFADDR        TIMESTAMP           THREAD
                            CACHE          LASTLOG         CONTENTS
ffffff025a8da7c0 ffffff025a83ec68       42da3c1fb9 ffffff027cc6c3c0
                 ffffff025a8ae348 ffffff024e7d0080                0
                 kmem_cache_alloc_debug+0x2e0
                 kmem_cache_alloc+0x320
                 dce_lookup_and_add_v6+0x171
                 ip_set_destination_v6+0x549
                 ip_attr_connect+0x172
                 conn_connect+0x122
                 tcp_set_destination+0x70
                 tcp_connect_ipv6+0x1a9
                 tcp_do_connect+0x542
                 tcp_connect+0xc9
                 so_connect+0xfe
                 socket_connect+0x3c
                 connect+0xb1

It's not clear yet if CLOSED connections also leak, but I think this is likely a bug in the netstack-teardown code that forgets to clean up ip_xmit_attr DCE holds.

History

#1

Updated by Dan McDonald about 4 years ago

This also happens with off-link connections initiated by the zone. Here's a leaked IPv4 DCE:

dce_u = {
    dceu_v6addr = a08:3e5:0:0::
    dceu_v4addr = 10.8.3.229
}

#2

Updated by Dan McDonald about 4 years ago

One possible fix is to have dce_stack_destroy() check for stragglers hanging off of it. It's defensive programming, and does not account for the CAUSE of the leak.


void
dce_stack_destroy(ip_stack_t *ipst)
{
	int i;
	for (i = 0; i < ipst->ips_dce_hashsize; i++) {
		/* XXX KEBE SAYS clean v4 & v6 hash buckets here. */
		rw_destroy(&ipst->ips_dce_hash_v4[i].dcb_lock);
		rw_destroy(&ipst->ips_dce_hash_v6[i].dcb_lock);
	}
	kmem_free(ipst->ips_dce_hash_v4,
	    ipst->ips_dce_hashsize * sizeof (dcb_t));
	ipst->ips_dce_hash_v4 = NULL;
	kmem_free(ipst->ips_dce_hash_v6,
	    ipst->ips_dce_hashsize * sizeof (dcb_t));
	ipst->ips_dce_hash_v6 = NULL;
	ipst->ips_dce_hashsize = 0;

	ASSERT(ipst->ips_dce_default->dce_refcnt == 1);
	kmem_cache_free(dce_cache, ipst->ips_dce_default);
	ipst->ips_dce_default = NULL;
}
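
To make the "check for stragglers" idea concrete, here is a minimal userland sketch of draining hash buckets before their storage is freed. It is not the kernel code and not necessarily what the eventual fix does: toy_dce_t, toy_bucket_t, toy_bucket_add() and toy_buckets_drain() are hypothetical stand-ins invented for illustration, locking is omitted, and the sketch assumes that by teardown time the bucket holds the only remaining reference to each entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userland stand-ins; NOT the kernel's dce_t / dcb_t. */
typedef struct toy_dce {
	struct toy_dce *next;
	unsigned int refcnt;	/* 1 == held only by the bucket */
} toy_dce_t;

typedef struct toy_bucket {
	toy_dce_t *head;
	unsigned int count;
} toy_bucket_t;

/* Link a new entry at the head of a bucket; the bucket owns one hold. */
static toy_dce_t *
toy_bucket_add(toy_bucket_t *b)
{
	toy_dce_t *d = calloc(1, sizeof (*d));

	if (d == NULL)
		return (NULL);
	d->refcnt = 1;
	d->next = b->head;
	b->head = d;
	b->count++;
	return (d);
}

/*
 * Unlink and free every entry still hanging off the buckets.
 * Returns how many stragglers were found.
 */
static size_t
toy_buckets_drain(toy_bucket_t *buckets, size_t nbuckets)
{
	size_t freed = 0;

	for (size_t i = 0; i < nbuckets; i++) {
		toy_dce_t *d;

		while ((d = buckets[i].head) != NULL) {
			buckets[i].head = d->next;
			buckets[i].count--;
			/* Assumption: only the bucket's hold remains. */
			assert(d->refcnt == 1);
			free(d);
			freed++;
		}
	}
	return (freed);
}
```

In this model, a teardown routine would call toy_buckets_drain() before freeing the bucket array, so that nothing is left linked when the array goes away. As noted above, this only mops up after the leak; it says nothing about why the entries were still linked.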

#3

Updated by Dan McDonald about 4 years ago

This bug predates #7061 and #7062, because OmniOS r151018 can reproduce this bug, and '018 does not have those fixes. This eliminates those two recent fixes as a cause of this bug.

#4

Updated by Dan McDonald over 3 years ago

The DCE netstack teardown code assumes (incorrectly) that all DCEs have been unlinked from the hash buckets by interface (ill_t to be precise) teardowns. The dce_cleanup() function is only instantiated for IPv6 DCEs, not IPv4 ones. (In fact, the leaks will not show any IPv6 DCEs.)

Per the code in ip_dce.c, DCEs only get freed when a netstack responds to memory pressure, or when it detects that a hash bucket is too deep, thanks to checks-and-sets in the dce_lookup_and_add_v4() and dce_lookup_and_add_v6() functions.
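
The "bucket too deep" trigger can be illustrated with a small userland model. This is a sketch under stated assumptions, not the ip_dce.c logic: toy_ent_t, toy_bkt_t, toy_bkt_reclaim(), toy_bkt_add() and the TOY_MAX_DEPTH threshold are all hypothetical, and the policy of dropping every entry that nobody else holds is a simplification chosen for the example. The point it shows is that reclamation only happens as a side effect of adding entries, so a stack that is torn down without hitting the threshold never frees anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define	TOY_MAX_DEPTH	4	/* hypothetical threshold, not a kernel tunable */

/* Hypothetical userland stand-ins; NOT the kernel's dce_t / dcb_t. */
typedef struct toy_ent {
	struct toy_ent *next;
	unsigned int refcnt;	/* 1 == only the bucket holds it */
} toy_ent_t;

typedef struct toy_bkt {
	toy_ent_t *head;
	unsigned int depth;
} toy_bkt_t;

/* Free entries nobody else holds; return the number reclaimed. */
static unsigned int
toy_bkt_reclaim(toy_bkt_t *b)
{
	unsigned int n = 0;
	toy_ent_t **pp = &b->head;

	while (*pp != NULL) {
		toy_ent_t *e = *pp;

		if (e->refcnt == 1) {
			/* Unlink in place; pp now points at the successor. */
			*pp = e->next;
			assert(b->depth > 0);
			b->depth--;
			free(e);
			n++;
		} else {
			pp = &e->next;
		}
	}
	return (n);
}

/* Add an entry, reclaiming first if the bucket is already too deep. */
static toy_ent_t *
toy_bkt_add(toy_bkt_t *b)
{
	toy_ent_t *e;

	if (b->depth >= TOY_MAX_DEPTH)
		(void) toy_bkt_reclaim(b);

	e = calloc(1, sizeof (*e));
	if (e == NULL)
		return (NULL);
	e->refcnt = 1;
	e->next = b->head;
	b->head = e;
	b->depth++;
	return (e);
}
```

With a threshold of 4, a bucket holding one externally referenced entry and three idle ones sheds the idle entries on the fifth add and keeps the referenced one.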

#5

Updated by Electric Monk over 3 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

commit  4510c7eb59fd7173e2f25391b94e238c416d4f2e
Author: Dan McDonald <danmcd@omniti.com>
Date:   2017-04-10T17:43:31.000Z

    7185 IP DCEs leak from halted non-global zones
    Reviewed by: Jason King <jason.brian.king@gmail.com>
    Reviewed by: Garrett D'Amore <garrett@damore.org>
    Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
    Approved by: Gordon Ross <gordon.w.ross@gmail.com>
