Bug #5533
openLOADPRIVDATA macro could return NULL on memory stress
0%
Description
During memory stress condition,
cron service is dumping core multiple times.
mdb ./cron.webmail_mtc.20110.1413125400.core Loading modules: [ libc.so.1 ld.so.1 ] > $C fb5ac7d8 libc_hwcap1.so.1`ucred_size+0x1b(8d0bad0, f6173000, fb5ac7f8, f6088733, 8d0bad0, f5fd2000) fb5ac7e8 libc_hwcap1.so.1`_ucred_alloc+0x19(8d0bad0, f5fd2000, fb5ac828, f5fae7ba, ffffffff, f5f90018) fb5ac7f8 libc_hwcap1.so.1`ucred_get+0x19(ffffffff, f5f90018, f60d8df5, f5fae79f) fb5ac828 libbsm.so.1`adt_init+0x8e(8d0bad0, 1, 8ce83b8, f5facfd5) fb5ac868 libbsm.so.1`adt_start_session+0x62(fb5ac8b8, 0, 1, f5d21a18) fb5b2178 pam_unix_cred.so.1`pam_sm_setcred+0x211(8d0a828, 1, 0, 0) fb5b21e8 libpam.so.1`run_stack+0x5e5(8d0a828, 0, 1, 1, 10, 2) fb5b2218 libpam.so.1`pam_setcred+0x31(8d0a828, 1, 806b430, 806e04c) fb5b2238 set_user_cred+0xa1(8ce6098, 0, fb5b4338, 8056fda) fb5b4338 ex+0x3d6(8cea720, fb5b4380, fb5b4428, 8053ab7) fb5b4428 main+0x3a6(1, fb5b4460, fb5b4468, fb5b441c) fb5b4454 _start+0x7d(1, fb5b4524, 0, fb5b4533, fb5b454a, fb5b4551) > ucred_size+0x1b::dis libc_hwcap1.so.1`ucred_size: pushl %ebp libc_hwcap1.so.1`ucred_size+1: movl %esp,%ebp libc_hwcap1.so.1`ucred_size+3: pushl %ebx libc_hwcap1.so.1`ucred_size+4: subl $0x4,%esp libc_hwcap1.so.1`ucred_size+7: andl $0xfffffff0,%esp libc_hwcap1.so.1`ucred_size+0xa:call +0x0 <libc_hwcap1.so.1`ucred_size+0xf> libc_hwcap1.so.1`ucred_size+0xf:popl %ebx libc_hwcap1.so.1`ucred_size+0x10: addl $0xea386,%ebx libc_hwcap1.so.1`ucred_size+0x16: call -0xa2c8 <libc_hwcap1.so.1`__priv_getdata> libc_hwcap1.so.1`ucred_size+0x1b: movl 0xc(%eax),%eax libc_hwcap1.so.1`ucred_size+0x1e: leal -0x4(%ebp),%esp libc_hwcap1.so.1`ucred_size+0x21: popl %ebx libc_hwcap1.so.1`ucred_size+0x22: leave libc_hwcap1.so.1`ucred_size+0x23: ret 0xf6088c8f: nop 0xf6088c90: nop 0xf6088c91: nop 0xf6088c92: nop 0xf6088c93: nop 0xf6088c94: nop > $r %cs = 0x0043 %eax = 0x00000000 %ds = 0x004b %ebx = 0xf6173000 %ss = 0x004b %ecx = 0xf6174ce0 libc_hwcap1.so.1`pd_lock %es = 0x004b %edx = 0x00000000 %fs = 0x0000 %esi = 0x08d0bad0 %gs = 0x01c3 %edi = 0x00000001 %eip = 0xf6088c86 libc_hwcap1.so.1`ucred_size+0x1b %ebp = 0xfb5ac7d8 %kesp = 0x00000000 %eflags = 0x00010286 id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0 status=<of,df,IF,tf,SF,zf,af,PF,cf> %esp = 0xfb5ac7d0 %trapno = 0xe %err = 0x4
Problem piece of code is here (file /illumos-gate/usr/src/lib/libc/port/gen/ucred.c):
369size_t 370ucred_size(void) 371{ 372 priv_data_t *d; 373 374 LOADPRIVDATA(d); 375 376 return (d->pd_ucredsize); 377} 378
We should add NULL test for priv_data_t pointer after line 376 like that:
376 return ((d != NULL) ? d->pd_ucredsize : 0);
Then ucred_alloc() will fail as expected (due to malloc(0)).
Another place in the same file:
213const priv_set_t * 214ucred_getprivset(const ucred_t *uc, priv_ptype_t set) 215{ 216 /* LINTED: alignment */ 217 const prpriv_t *pr = UCPRIV(uc); 218 int pset = priv_getsetbyname(set); 219 priv_data_t *d; 220 221 if (pr == NULL || pset == -1) { 222 errno = EINVAL; 223 return (NULL); 224 } 225 226 LOADPRIVDATA(d); 227 228 return ((const priv_set_t *) 229 &pr->pr_sets[d->pd_pinfo->priv_setsize * pset]); 230}
It seems that all places with LOADPRIVDATA macro should be fixed too.
Files
Updated by Serghei Samsi over 8 years ago
- Assignee changed from Serghei Samsi to Marcel Telka
Updated by Marcel Telka over 8 years ago
- Category set to lib - userland libraries
- Status changed from New to In Progress
Updated by Marcel Telka over 8 years ago
The similar issue is with GETPRIVDATA macro too.
Updated by Marcel Telka over 8 years ago
Usually, the LOADPRIVDATA/GETPRIVDATA macros are used directly or indirectly in various API functions. There are two sorts of LOADPRIVDATA/GETPRIVDATA users:
I) those who could fail, because they have defined the failure modes of operation, for example getppriv(2)/setppriv(2)
To fix these, we will just fail in a case either the LOADPRIVDATA or GETPRIVDATA fails.
II) those who cannot fail, for example priv_union(3c)
In this case the LOADPRIVDATA/GETPRIVDATA calls cannot fail. The LOADPRIVDATA/GETPRIVDATA implementation is simple: in a case the privdata global variable is not set, the implementation attempts to initialize it. Once the privdata is set for the first time, it cannot change, so all subsequent LOADPRIVDATA/GETPRIVDATA calls will just succeed.
To make sure the LOADPRIVDATA/GETPRIVDATA calls won't fail for these cases we will just preload the privdata global variable during the application start in libc_init().
Updated by Marcel Telka over 8 years ago
Testing of the fix
First, I checked that the fix works as expected and the privdata global
variable is always set during the application start. To confirm this I used
the following script named privtest.mdb:
#!/bin/sh cat <<EOF | mdb $1 ::status ::bp main :r privdata::print EOF
Here is an example of results on a system without the fix:
$ ./privtest.mdb /bin/ls debugging executable file (32-bit) file: /usr/bin/ls status: idle mdb: stop at main mdb: target stopped at: main: leal 0x4(%esp),%ecx 0 $
Here is the output on a system with the fix:
$ ./privtest.mdb /bin/ls debugging executable file (32-bit) file: /usr/bin/ls status: idle mdb: stop at main mdb: target stopped at: main: leal 0x4(%esp),%ecx 0xfee00108 $
Next, I ran the onu-ed system for a while to make sure that everything is
working stable and there are no random core files generated due to the code
added to libc_init().
Finally, I tried to exhaust the memory to see how an application start would
fail. I ran the following 'executor' application:
#include <unistd.h> int main(void) { sleep(10); execlp("/bin/bash", "bash", (char *)0); return 0; }
During the sleep(10) period I started this 'allocator':
#include <stdlib.h> #include <unistd.h> #include <stdio.h> int main(void) { int s = 1024*1024; for (;;) { void *p = malloc(s); if (p == NULL && s == 1024*1024) { s = 1024; continue; } if (p == NULL) { printf("malloc failed\\\\n"); sleep(10); } } return 0; }
And this happened:
# ./executor & [1] 103755 # ./allocator *** libc thread failure: cannot preload the priv data malloc failed ^C [1]+ Abort (core dumped) ./executor # echo ::status | mdb core debugging core file of bash (32-bit) from openindiana file: /usr/bin/bash initial argv: bash threading model: native threads status: process terminated by SIGABRT (Abort), pid=103755 uid=0 code=-1 panic message: *** libc thread failure: cannot preload the priv data #
So it worked as expected.
Updated by Marcel Telka over 8 years ago
- Status changed from In Progress to Pending RTI
Updated by Gordon Ross over 8 years ago
- Status changed from Pending RTI to In Progress
Kicked back (from "pending rti" to "in progress") for questions and possibly more work.
I have a general design question about this fix.
Would it be feasible to change the implementation so that LOADPRIVDATA will never fail?
For example, if we know the maximum size of the data that would be loaded by this, perhaps we could initialize it to point to storage that's reserved at compile time? (i.e. BSS space, or initialized data if we know what the content should be at compile time).
That's a solution I've seen elsewhere for problems of this sort, i.e. when you can't load locale data, you're left with the compiled-in C locale defaults. I'm thinking of something like that.
Could something like that work here?
Updated by Marcel Telka over 7 years ago
- File 0001-5533-LOADPRIVDATA-macro-could-return-NULL-on-memory-.patch 0001-5533-LOADPRIVDATA-macro-could-return-NULL-on-memory-.patch added
- Assignee deleted (
Marcel Telka)
For reference I attached my version of the fix.