Project

General

Profile

Bug #8653

Use after free in UDP socket close.

Added by Andy Fiddaman over 2 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
networking
Start date:
2017-09-12
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage
Gerrit CR:

Description

I have a couple of servers that run as outbound DNS resolvers for my network and run unbound (https://www.unbound.net/) in a native zone under OmniOS r151022.
Unbound is linked against `libev` and that is configured to use event ports.

The servers are crashing at least a week (sometimes once a day) and the stack trace always looks the same:

@> $c
mutex_enter+0xb()
udp_do_close+0x57(ffffff32cf4f1600)
udp_close+0x28(ffffff32cf4f1600, 83, ffffff076c8bc2a0)
so_close+0x8c(ffffff0cb24c62f8, 83, ffffff076c8bc2a0)
socket_close_internal+0x58(ffffff0cb24c62f8, 83, ffffff076c8bc2a0)
socket_vop_close+0x128(ffffff19b3128000, 83, 1, 0, ffffff076c8bc2a0, 0)
fop_close+0x61(ffffff19b3128000, 83, 1, 0, ffffff076c8bc2a0, 0)
closef+0x97(ffffff07916b2068)
closeandsetf+0x3e6(25, 0)
close+0x13(25)
sys_syscall+0x296()

ffffff32cf4f1600::whatis

ffffff32cf4f1600 is ffffff32cf450000+a1600, freed from the heap vmem arena:
ADDR TYPE START END SIZE
THREAD TIMESTAMP
ffffff16368fdcb0 FREE ffffff32cf450000 ffffff32cf4f5000 675840@

I'm running a debug kernel so tracing the kmem allocations I can see the initial socket being created:

ADDR BUFADDR TIMESTAMP THREAD
CACHE LASTLOG CONTENTS
ffffff06ea6c19c0 ffffff0cb24c62f8 63a0b17e1604a ffffff073e7c43e0
ffffff07202f2448 ffffff06ea65ea80 ffffff06fa89f800
kmem_cache_alloc_debug+0x2e0
kmem_cache_alloc+0xdd
socket_sonode_create+0x72
socket_create+0xd1
so_socket+0x228

and then the same thread creates and destroys the UDP conn_t

ffffff06ea6c1a80 ffffff32cf4f1600 63a0b17e1b908 ffffff073e7c43e0
ffffff071bd08448 0 0
kmem_slab_alloc_impl+0x2f4
kmem_slab_alloc+0x65
kmem_cache_alloc+0x283
ipcl_conn_create+0x53
udp_do_open+0xe2
udp_create+0x6f
socket_init_common+0x1c0
so_init+0x25
socket_create+0xeb
so_socket+0x228
ffffff06ea6d0000 ffffff32cf4f1600 63a0b44b5adc1 ffffff073e7c43e0
ffffff071bd08448 ffffff06ea6c1a80 ffffff06fa8c5430
kmem_slab_free+0x3a9
kmem_slab_free_constructed+0x8a
kmem_cache_free+0xe0
ipcl_conn_destroy+0x4d6
ipcl_hash_remove+0x104
ip_quiesce_conn+0xd7
udp_do_close+0x57
udp_close+0x28
so_close+0x8c
socket_close_internal+0x58
socket_vop_close+0x128
fop_close+0x61
closef+0x97
closeandsetf+0x3e6
close+0x13

And then the crash occurs - same thread, just a little later on.

@panic[cpu7]/thread=ffffff073e7c43e0:
ffffff0021398c10 mutex_enter+0xb()
ffffff0021398c40 udp_do_close+0x57(ffffff32cf4f1600)
ffffff0021398c80 udp_close+0x28(ffffff32cf4f1600, 83, ffffff076c8bc2a0)
ffffff0021398ce0 so_close+0x8c(ffffff0cb24c62f8, 83, ffffff076c8bc2a0)
ffffff0021398d30 socket_close_internal+0x58(ffffff0cb24c62f8, 83,
ffffff076c8bc2a0)
ffffff0021398da0 socket_vop_close+0x128(ffffff19b3128000, 83, 1, 0,
ffffff076c8bc2a0, 0)
ffffff0021398e20 fop_close+0x61(ffffff19b3128000, 83, 1, 0, ffffff076c8bc2a0, 0
)
ffffff0021398e60 closef+0x97(ffffff07916b2068)
ffffff0021398ed0 closeandsetf+0x3e6(25, 0)
ffffff0021398ef0 close+0x13(25)
ffffff0021398f00 sys_syscall+0x296()

CPU ADDRESS TIMESTAMP TYPE VC HANDLER PC
7 ffffff07224d5028 e7e29312bc4e trap e #pf mutex_enter+b@

So far, the socket has always been an IPv6 one

> ffffff0cb24c62f8::print -t struct sonode so_family
short so_family = 0x1a

I'm running some dtrace scripts to try and get more information when this next occurs and will post here with anything new.

History

#1

Updated by Dan McDonald over 2 years ago

Found an IPv6-socket-specific free-without-reference case, and managed to make the kernel panic using mdb to simulate a race condition. This is a bug in a codepath which isn't hit all that often due to the raciness of the backup code check in ip_fanout_udp_multi_v4().

ip_fanout_udp_multi_v4() has TWO conn-fanout scans. The first is for AF_INET sockets and AF_INET6 sockets with v4-mapped addresses in their joined multicast groups. This is the common-case. A second conn-fanout scan is for AF_INET6 sockets that still may match, but didn't in the first scan. A look at the locking that occurs in the function shows a window for a sneaky process to bind to the multicast group after the first scan but before the second.

Here's an annotated and edited snippet from the second scan path in ip_fanout_udp_multi_v4():


5373    IN6_IPADDR_TO_V4MAPPED(ipha->ipha_src, &v6faddr);
5374    connfp = &ipst->ips_ipcl_udp_fanout[IPCL_UDP_HASH(lport, ipst)];
5375    mutex_enter(&connfp->connf_lock);
5376    connp = connfp->connf_head;
5377    /*
5378     * IPv4 multicast packet being delivered to an AF_INET6
5379     * in6addr_any endpoint.
5380     * Need to check conn_wantpacket(). Note that we use conn_wantpacket()
5381     * and not conn_wantpacket_v6() since any multicast membership is
5382     * for an IPv4-mapped multicast address.
5383     */
5384    while (connp != NULL) {
5385        if (IPCL_UDP_MATCH_V6(connp, lport, ipv6_all_zeros,
5386            fport, v6faddr) &&
5387            conn_wantpacket(connp, ira, ipha) &&
5388            (!(ira->ira_flags & IRAF_SYSTEM_LABELED) ||
5389            tsol_receive_local(mp, &laddr, IPV4_VERSION, ira, connp)))
5390            break;
5391        connp = connp->conn_next;
5392    }
5393
5394    if (connp == NULL) {
5395        /*
5396         * No one bound to this port.  Is
5397         * there a client that wants all
5398         * unclaimed datagrams?
5399         */
5400        mutex_exit(&connfp->connf_lock);
            /* ERROR CASE EDITED... */
5416        return;
5417    }
5418    ASSERT(IPCL_IS_NONSTR(connp) || connp->conn_rq != NULL);
5419    /* XXX KEBE ASKS, WHY NO REFHOLD?!?!?!? */
5420    /*
5421     * If SO_REUSEADDR has been set on the first we send the
5422     * packet to all clients that have joined the group and
5423     * match the port.
5424     */
5425    if (connp->conn_reuseaddr) {
5426        conn_t        *first_connp = connp;
5427        conn_t        *next_connp;
5428        mblk_t        *mp1;
5429
5430        CONN_INC_REF(connp);   /* XXX KEBE SAYS OH IT'S HERE! BUT WHAT IF NO reuseaddr?!? */
5431        connp = connp->conn_next;
5432        for (;;) {
                /* check for other sockets edited... *? 
5467        }
5468    }
5469
5470    /* Last one.  Send it upstream. */
5471    mutex_exit(&connfp->connf_lock);
5472    IP_STAT(ipst, ip_udp_fanmb);
5473    ip_fanout_udp_conn(connp, mp, ipha, NULL, ira);
        /* XXX KEBE SAYS OH HELL NO! RELEASE WITHOUT FIRST HOLDING if no reuseaddr! */
5474    CONN_DEC_REF(connp);
#2

Updated by Dan McDonald over 2 years ago

Moving the CONN_INC_REF on line 5430 to right before line 5418 will prevent release-without-hold problems on this code path.

I tested this by forcing the earlier conn_t scan to fail, causing this secondary scan to invoke. Once the application with the open socket quit the DEBUG kernel panicked immediately with reference-count assertion failure.

It's theoretically possible to force a rapidly open/close/open loop around a multicast socket, combined with lots of packets, to slip a socket open in between the two scans naturally, but I've had little luck so far with it.

#3

Updated by Dan McDonald over 2 years ago

  • Assignee set to Dan McDonald
#4

Updated by Dan McDonald over 2 years ago

This patch has been running on Andy's servers for 23 days now. Before this patch, the servers would crash at least once a week per his initial report.

#5

Updated by Electric Monk over 2 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit da61ddb06d5ac64f86b2dfb126769267c1ebcd67

commit  da61ddb06d5ac64f86b2dfb126769267c1ebcd67
Author: Dan McDonald <danmcd@joyent.com>
Date:   2018-01-09T01:40:03.000Z

    8653 Use after free in UDP socket close.
    Reviewed by: Andy Fiddaman <omnios@citrus-it.net>
    Reviewed by: Richard Lowe <richlowe@richlowe.net>
    Approved by: Gordon Ross <gordon.ross@nexenta.com>

Also available in: Atom PDF