Bug #8276

closed

rpcbind leaks memory due to libumem per thread caching.

Added by Youzhong Yang over 6 years ago. Updated over 6 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Start date: 2017-05-26
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:
External Bug:

Description

Here is how to reproduce it:

1. cd /var/tmp ; wget http://www.cs.duke.edu/ari/fstress/download/fstress-export.tgz ; tar xf fstress-export.tgz ; cd /var/tmp/fstress
2. export FSTRESS_HOME=/var/tmp/fstress
3. Replace src-testprogs/readdir-test.c with the following code:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <rpc/rpc.h>

#include "porting.h" 
#include "nfs_constants.h" 
#include "dns.h" 
#include "msg.h" 
#include "rpc.h" 
#include "nfs.h" 

int
main(int argc, char *argv[])
{
        char *hostname;
        int loops = 2000000000;
        struct in_addr addr;
        int sock, socktype = SOCK_STREAM;
        int i;
        int cnt_local = 0;      /* connections made in the current burst */
        int cnt_burst = 32;     /* connections per burst before sleeping */

        if (argc < 2) {
                fprintf(stderr, "usage: readdir-test <hostname> [loops] [cnt_burst]\n");
                return -1;
        }

        /* Read argv[1] only after the argument count has been checked. */
        hostname = argv[1];
        if (argc >= 3) loops = atoi(argv[2]);
        if (argc >= 4) cnt_burst = atoi(argv[3]);

        printf("loops = %d\n", loops);
        printf("bursts = %d\n", cnt_burst);

        if (dns_name2addr(hostname, &addr) != 0) {
                fprintf(stderr, "dns_name2addr(\"%s\", ...) failed\n",
                        hostname);
                return -1;
        }

        /*
         * Create an RPC client for the NFS program and close it again,
         * over and over. Every iteration sends work to rpcbind.
         */
        for (i = 0; i < loops; i++) {
                if ((sock = rpc_client(addr, NFS_PROG, NFS_VER3, socktype, 0)) < 0) {
                        fprintf(stderr, "rpc_client failed\n");
                        return -1;
                }

                close(sock);
                cnt_local++;
                /* Pause for one second after each burst. */
                if (cnt_local >= cnt_burst) {
                        cnt_local = 0;
                        sleep(1);
                }
        }
        return 0;
}

4. gmake
5. obj-SunOS-i86pc/readdir-test localhost

Run the following command to monitor the rpcbind memory usage:

# while true; do mdb -e ::umastat -p $(pgrep -f rpcbind) | grep 'umem_alloc_16 '; sleep 2; done

If more than one rpcbind process is running, get the PID of the main one with 'svcs -pv rpc/bind'.

You will see that 'bufs in use' keeps climbing.

The leak can be stopped by disabling libumem per-thread caching. Edit the startup script, /lib/svc/method/rpc-bind, and change the startup line to the following:

env UMEM_OPTIONS="perthread_cache=0" /usr/sbin/rpcbind > /dev/msglog 2>&1

Actions #1

Updated by Youzhong Yang over 6 years ago

Figured out why it leaks:

              libc.so.1`thr_keycreate
              libc.so.1`pthread_key_create_once_np+0x47
              libnsl.so.1`thr_get_storage+0x21
              libnsl.so.1`__t_errno+0x3b
              libnsl.so.1`__rpc_get_ltaddr+0xca
              libnsl.so.1`set_src_addr+0x24
              libnsl.so.1`svc_dg_reply+0xee
              libnsl.so.1`svc_sendreply+0x44
              rpcbind`rpcb_service_4+0x2ed
              libnsl.so.1`_svc_prog_dispatch+0x153
              libnsl.so.1`_svc_run_mt+0x593
              libc.so.1`_thrp_setup+0x88
              libc.so.1`_lwp_start
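
The stack above shows thr_get_storage() creating a thread-specific-data key on first use. A minimal sketch of that general pattern follows, using the portable pthread_once() and pthread_key_create() calls rather than pthread_key_create_once_np(). All names in it (demo_key, demo_destructor, worker, run_workers) are hypothetical and it is not the libnsl code. Each short-lived thread stores a heap buffer under the key, and the destructor frees that buffer when the thread exits.

```c
#include <pthread.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not the libnsl code. */
static pthread_key_t demo_key;
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int destructor_calls;	/* protected by count_lock */

/* Runs at thread exit for every thread that stored a non-NULL value. */
static void
demo_destructor(void *value)
{
	pthread_mutex_lock(&count_lock);
	destructor_calls++;
	pthread_mutex_unlock(&count_lock);
	free(value);
}

/* Create the key exactly once, with demo_destructor attached. */
static void
demo_key_init(void)
{
	(void) pthread_key_create(&demo_key, demo_destructor);
}

static void *
worker(void *arg)
{
	void *buf;

	(void) arg;
	(void) pthread_once(&demo_once, demo_key_init);
	buf = calloc(1, 16);	/* small per-thread buffer */
	if (buf != NULL && pthread_setspecific(demo_key, buf) != 0)
		free(buf);	/* could not store it, so release it here */
	return (NULL);		/* thread exit invokes demo_destructor(buf) */
}

/* Start and join nthreads short-lived threads; return destructor count. */
int
run_workers(int nthreads)
{
	pthread_t tid;
	int i, calls;

	pthread_mutex_lock(&count_lock);
	destructor_calls = 0;
	pthread_mutex_unlock(&count_lock);

	for (i = 0; i < nthreads; i++) {
		if (pthread_create(&tid, NULL, worker, NULL) != 0)
			break;
		(void) pthread_join(tid, NULL);
	}

	pthread_mutex_lock(&count_lock);
	calls = destructor_calls;
	pthread_mutex_unlock(&count_lock);
	return (calls);
}
```

Calling run_workers(8) from a main() should report eight destructor calls, provided every allocation and thread creation succeeds. The point is that each free() happens during thread exit, not in the thread's own code.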

rpcbind stores tsd (thread-specific data) whose destructor is set to the free() function. However, _thrp_exit() releases tmem and tsd in the following order:
    tmem_exit();        /* deallocate tmem allocations */
    tsd_exit();        /* deallocate thread-specific data */
    tls_exit();        /* deallocate thread-local storage */

This order causes the leaks: the tsd destructor calls free() after tmem_exit() has already run. The fix is simple: switch the order of tmem_exit() and tsd_exit().
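
To illustrate why the order matters, here is a toy model, not the libc or libumem code. It assumes that free() under per-thread caching parks the buffer on a per-thread list, and that the tmem teardown step is what hands that list back to the heap. Every name in it (toy_thread, toy_free, toy_tmem_exit, toy_tsd_exit, toy_thread_exit) is hypothetical.

```c
#include <stdlib.h>

/* Toy model only; all names are hypothetical, not libc internals. */
struct toy_thread {
	void *cache[4];		/* per-thread list of freed buffers */
	int ncached;
	void *tsd;		/* one thread-specific-data value */
};

static int heap_live;		/* buffers not yet returned to the heap */

static void *
toy_alloc(void)
{
	heap_live++;
	return (malloc(16));
}

/* Model of free() with per-thread caching: park the buffer locally. */
static void
toy_free(struct toy_thread *t, void *p)
{
	t->cache[t->ncached++] = p;
}

/* Model of the tmem teardown: return cached buffers to the heap. */
static void
toy_tmem_exit(struct toy_thread *t)
{
	while (t->ncached > 0) {
		free(t->cache[--t->ncached]);
		heap_live--;
	}
}

/* Model of the tsd teardown: run the destructor (free) on the value. */
static void
toy_tsd_exit(struct toy_thread *t)
{
	if (t->tsd != NULL) {
		toy_free(t, t->tsd);
		t->tsd = NULL;
	}
}

/* Simulate one thread exit; return how many buffers were stranded. */
int
toy_thread_exit(int tmem_first)
{
	struct toy_thread t = { { NULL }, 0, NULL };
	int before = heap_live;
	int stranded;

	t.tsd = toy_alloc();
	if (tmem_first) {
		toy_tmem_exit(&t);	/* cache is still empty here */
		toy_tsd_exit(&t);	/* buffer lands in a dead cache */
	} else {
		toy_tsd_exit(&t);	/* buffer lands in the cache */
		toy_tmem_exit(&t);	/* cache is drained afterwards */
	}
	stranded = heap_live - before;

	toy_tmem_exit(&t);	/* tidy up so the model itself leaks nothing */
	return (stranded);
}
```

In this model, toy_thread_exit(1) strands one buffer per thread exit, while toy_thread_exit(0) strands none. Under the stated assumptions, that matches the steadily growing 'bufs in use' count and the effect of swapping the two calls.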

I tested the fix, so far so good. Will prepare a review request.

Actions #3

Updated by Electric Monk over 6 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit 7c4ab494ff60bbbcc0889e71388ae63e903bbf57

commit  7c4ab494ff60bbbcc0889e71388ae63e903bbf57
Author: Youzhong Yang <yyang@mathworks.com>
Date:   2017-06-07T20:00:54.000Z

    8276 rpcbind leaks memory due to libumem per thread caching.
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Gordon Ross <gordon.w.ross@gmail.com>
    Approved by: Richard Lowe <richlowe@richlowe.net>
