Bug #5123

IP DCE does not scale - part deux

Added by Jason Matthews about 4 years ago. Updated 9 months ago.

Status: Closed
Start date: 2014-08-26
Priority: High
Due date:
Assignee: Robert Mustacchi
% Done: 100%
Category: networking
Target version: -
Difficulty: Medium
Tags: needs-triage

Description

My web proxy tier, running 151a9, falls over dead in less than one day, with the DCE cache presumably consuming all available memory. Performance degrades beyond what is acceptable after several hours.

This seems related to https://www.illumos.org/issues/3925

I generated two flame graphs. For the first, I isolated one instance of snmpd on CPU 1 using processor sets and generated a flame graph against that CPU. snmpd is a good canary in the coal mine, as this problem drives it into a CPU spike quite easily.

For the second graph, I disabled CPUs 2-11, put a small amount of HTTP traffic on it, and then collected the flame graph data using the process documented at http://smartos.org/2012/02/28/using-flamegraph-to-solve-ip-scaling-issue-dce/

Below are some kernel stats.

(global zone)
root@www007:/home/jason/FlameGraph# netstat -nd |wc -l
24
root@www007:/home/jason/FlameGraph# zlogin web001.apsalar.com 'netstat -nd |wc -l'
6801990
root@www007:/home/jason/FlameGraph# zlogin web007.apsalar.com 'netstat -nd |wc -l'
6784496

root@www007:/home/jason/FlameGraph# echo ::kmastat |mdb -k |grep dce
dce_cache 152 13580871 13580892 2139512832B 16923847 0

The dce entries are not expiring.
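
To put the `::kmastat` line above into numbers, here is a minimal Python sketch that parses it. The column meanings (buffer size, buffers in use, total buffers, memory in use, successful and failed allocations) are an assumption based on the usual `::kmastat` header, which is not shown in this report; `KmastatRow` and `parse_kmastat_row` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class KmastatRow(NamedTuple):
    # Column names are assumed from the usual ::kmastat header.
    name: str
    buf_size: int
    buf_in_use: int
    buf_total: int
    memory_in_use: int
    alloc_succeed: int
    alloc_fail: int


def parse_kmastat_row(line: str) -> KmastatRow:
    """Parse one whitespace-separated ::kmastat row into typed fields."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 columns, got {len(fields)}: {line!r}")
    name, size, in_use, total, memory, succeed, fail = fields
    return KmastatRow(
        name,
        int(size),
        int(in_use),
        int(total),
        int(memory.rstrip("B")),  # the report shows a trailing "B" for bytes
        int(succeed),
        int(fail),
    )


# The dce_cache row exactly as pasted in the report.
row = parse_kmastat_row("dce_cache 152 13580871 13580892 2139512832B 16923847 0")

# Bytes held by live entries, and how many buffers are currently free.
payload_bytes = row.buf_in_use * row.buf_size
free_buffers = row.buf_total - row.buf_in_use

print(f"{row.name}: {row.buf_in_use} entries in use, {free_buffers} free")
print(f"payload {payload_bytes} B, cache memory {row.memory_in_use} B")
```

Under that reading of the columns, nearly every allocated buffer is in use and the cache holds roughly 2 GB, which is consistent with entries not being reclaimed.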

root@www007:/home/jason/FlameGraph# for i in web001.apsalar.com web007.apsalar.com; do zlogin $i 'hostname; netstat -nd | sort -nr -k 3 |head'; done

web001
10.0.36.26 0 79438
10.0.36.24 0 79438
205.158.23.124 0 79437
10.0.36.38 0 79437
10.0.36.29 0 79436
10.0.36.35 0 79434
10.0.36.60 0 79432
10.0.36.55 0 79432
10.0.36.32 0 79431
10.0.36.1 0 79431

web007
99.61.176.193 0 79454
99.34.136.8 0 79454
99.234.9.11 0 79454
99.229.198.42 0 79454
99.227.215.32 0 79454
99.171.141.167 0 79454
99.157.98.187 0 79454
99.12.124.250 0 79454
99.1.40.70 0 79454
98.26.22.38 0 79454

Notably, the workaround documented in #3925, which I use on 151a1, does not seem to work on 151a9. I have to run the job every other minute on 151a1 to keep it from falling over. Running it every minute on 151a9 doesn't prevent catastrophic failure.

kern.snmpd.svg - flamegraph of snmpd (6.53 KB) Jason Matthews, 2014-08-26 07:08 PM

kernel.loaded.svg - flamegraph of cpu 1 (79.8 KB) Jason Matthews, 2014-08-26 07:08 PM


Related issues

Duplicated by illumos gate - Bug #8923: tcpListenDrop counter continusly increases when we put our production webserver on OI Closed 2017-12-14

History

#1 Updated by Jason Matthews about 4 years ago

I forgot to add this bit about memory consumption rate...

root@www007:~# for i in `seq 1 10000`; do d=`date`; k=`echo ::kmastat |mdb -k |grep -i dce`; echo "$d :: $k"; sleep 900; done
August 24, 2014 01:10:10 PM PDT :: dce_cache 152 101 24908 3923968B 1671560 0
August 24, 2014 01:25:11 PM PDT :: dce_cache 152 7558 42874 6754304B 2161783 0
August 24, 2014 01:40:12 PM PDT :: dce_cache 152 7214 49036 7725056B 2806584 0
August 24, 2014 01:55:13 PM PDT :: dce_cache 152 101241 101244 15949824B 3444216 0
August 24, 2014 02:10:14 PM PDT :: dce_cache 152 444774 444782 70070272B 3787750 0
August 24, 2014 02:25:15 PM PDT :: dce_cache 152 744666 744666 117313536B 4087642 0
August 24, 2014 02:40:16 PM PDT :: dce_cache 152 1022769 1022788 161128448B 4365745 0
August 24, 2014 02:55:17 PM PDT :: dce_cache 152 1286295 1286298 202641408B 4629271 0
August 24, 2014 03:10:18 PM PDT :: dce_cache 152 1540981 1540994 242765824B 4883957 0
August 24, 2014 03:25:19 PM PDT :: dce_cache 152 1779385 1779388 280322048B 5122361 0
August 24, 2014 03:40:20 PM PDT :: dce_cache 152 2010767 2010788 316776448B 5353743 0
August 24, 2014 03:55:21 PM PDT :: dce_cache 152 2239344 2239354 352784384B 5582320 0
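
The samples above can be reduced to a growth rate. The following Python sketch is one way to do that, using the 01:55 PM and 03:55 PM samples as endpoints of the steady-growth period. It assumes the third and fifth columns after the cache name are "buffers in use" and "memory in use" (the usual `::kmastat` layout, not stated in this report), that month names parse in an English locale, and that the timezone token can be dropped because both samples share it. `parse_sample` and `growth_rate` are hypothetical helper names.

```python
from datetime import datetime

# First and last samples of the steady-growth period, copied from the log.
FIRST = "August 24, 2014 01:55:13 PM PDT :: dce_cache 152 101241 101244 15949824B 3444216 0"
LAST = "August 24, 2014 03:55:21 PM PDT :: dce_cache 152 2239344 2239354 352784384B 5582320 0"


def parse_sample(line):
    """Return (timestamp, buffers in use, memory in use in bytes)."""
    stamp, _, row = line.partition(" :: ")
    # Drop the trailing timezone token; both samples use the same zone.
    stamp = stamp.rsplit(" ", 1)[0]
    when = datetime.strptime(stamp, "%B %d, %Y %I:%M:%S %p")
    fields = row.split()
    return when, int(fields[2]), int(fields[4].rstrip("B"))


def growth_rate(first, last):
    """Return (elapsed seconds, entries per second, bytes per second)."""
    seconds = (last[0] - first[0]).total_seconds()
    entries_per_s = (last[1] - first[1]) / seconds
    bytes_per_s = (last[2] - first[2]) / seconds
    return seconds, entries_per_s, bytes_per_s


seconds, entries_per_s, bytes_per_s = growth_rate(parse_sample(FIRST), parse_sample(LAST))
print(f"{seconds:.0f} s elapsed")
print(f"{entries_per_s:.1f} entries/s, {bytes_per_s * 3600 / 2**20:.1f} MiB/hour")
```

On these two samples the cache grows by roughly 300 entries per second, or about 160 MiB per hour, with no sign of leveling off.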

#2 Updated by anil choudhary 10 months ago

We are also facing the same issue in the latest release:
kstat -p unix:0:dce_cache:buf*
unix:0:dce_cache:buf_avail 21
unix:0:dce_cache:buf_constructed 0
unix:0:dce_cache:buf_inuse 18683449
unix:0:dce_cache:buf_max 18683470
unix:0:dce_cache:buf_size 152
unix:0:dce_cache:buf_total 18683470
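
As a quick sanity check on the `kstat -p` output above, this Python sketch parses the pasted lines and compares the counters. It assumes each line is a `module:instance:name:statistic` key followed by an integer value, as shown here; `parse_kstat` is a hypothetical helper name.

```python
KSTAT_OUTPUT = """\
unix:0:dce_cache:buf_avail 21
unix:0:dce_cache:buf_constructed 0
unix:0:dce_cache:buf_inuse 18683449
unix:0:dce_cache:buf_max 18683470
unix:0:dce_cache:buf_size 152
unix:0:dce_cache:buf_total 18683470
"""


def parse_kstat(text):
    """Map the last component of each kstat key to its integer value."""
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, value = line.split()
        stats[key.rsplit(":", 1)[1]] = int(value)
    return stats


stats = parse_kstat(KSTAT_OUTPUT)

# Bytes held by live DCE entries.
in_use_bytes = stats["buf_inuse"] * stats["buf_size"]

print(f"in use: {stats['buf_inuse']} of {stats['buf_total']} buffers")
print(f"live entries hold {in_use_bytes} bytes")
```

Here `buf_inuse` plus `buf_avail` equals `buf_total`, and `buf_total` equals `buf_max`, so the cache is sitting at its high-water mark with almost nothing free.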

#3 Updated by anil choudhary 10 months ago

pkg info entire
Name: entire
Summary: incorporation to lock all system packages to same build (empty
package)
Description: incorporation to lock all system packages to same build (empty
package)
State: Installed
Publisher: openindiana.org
Version: 0.5.11
Branch: 2017.0.0.0
Packaging Date: March 6, 2017 at 02:50:37 PM
Size: 0.00 B
FMRI: pkg://openindiana.org/:20170306T145037Z

#4 Updated by Avnindra Singh 9 months ago

To solve issue 'DCE cache clean worker thread waiting forever':

::stacks -c tcp_ixa_cleanup_getmblk| ::findstack -v
stack pointer for thread fffffe4228d75c40: fffffe4228d759b0
[ fffffe4228d759b0 _resume_from_idle+0x112() ]
fffffe4228d759e0 swtch+0x141()
fffffe4228d75a20 cv_wait+0x70(fffffea3900acd6a, fffffea3900acd60)
fffffe4228d75a80 tcp_ixa_cleanup_getmblk+0x93(fffffea408a6e080)
fffffe4228d75ad0 conn_ixa_cleanup+0x8d(fffffea408a6e080, 0)
fffffe4228d75b40 ipcl_walk+0xc3(fffffffff7c03140, 0, fffffea35ec5b000)
fffffe4228d75b80 ip_dce_reclaim_stack+0x91(fffffea35ec5b000)
fffffe4228d75bc0 ip_dce_reclaim+0x5c()
fffffe4228d75c20 dce_reclaim_worker+0xf0(0)
fffffe4228d75c30 thread_start+8()

Applying the following code change, found in the Joyent repository, fixes it.

--- usr/src/uts/common/inet/ip/ip_attr.c 2018-01-07 23:58:59.896034000 -0800
+++ ../joyent/usr/src/uts/common/inet/ip/ip_attr.c 2018-01-05 00:08:25.960813000 -0800
@@ -909,6 +909,11 @@
 	 */
 	if (ixa->ixa_free_flags & IXA_FREE_CRED)
 		crhold(ixa->ixa_cred);
+
+	/*
+	 * There is no cleanup in progress on this new copy.
+	 */
+	ixa->ixa_tcpcleanup = IXATC_IDLE;
 }

I wonder why this wasn't chosen to be merged into illumos-gate source?

Thanks.
Avnindra

#5 Updated by Dan McDonald 9 months ago

"I wonder why this wasn't chosen to be merged into illumos-gate source?"

Good question. The patch you describe is this one:

OS-1082 dce_reclaim_thread stops making forward progress

Link to change is here:

https://github.com/joyent/illumos-joyent/commit/41f820513968e4706fe65181d7525ca35d10d2bd

and I'm kinda surprised it didn't make it up with the original upstream of #3925.

Since I'm new-ish at Joyent, I'll consult with folks who were here back then, but at first glance, perhaps this bugfix should just be the upstream of OS-1082.

#6 Updated by Dan McDonald 9 months ago

FURTHERMORE, apparently OmniOS had OS-1082 pulled in as well.

#7 Updated by Dan McDonald 9 months ago

This just got overlooked for upstreaming. If this indeed solves the issue(s) on this bug, I will upstream https://smartos.org/bugview/OS-1082 as the fix for this bug. Please confirm this @Avnindra?

#8 Updated by Avnindra Singh 9 months ago

Dan McDonald wrote:

This just got overlooked for upstreaming. If this indeed solves the issue(s) on this bug, I will upstream https://smartos.org/bugview/OS-1082 as the fix for this bug. Please confirm this @Avnindra?

Dan, yes it does. With this change, we've been running our webserver in the same environment for more than a week now, and the reclaim thread is working as desired.

#9 Updated by Electric Monk 9 months ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit e179dd622b2c34e74c6ad1f4ce0fdd9493d91332

commit  e179dd622b2c34e74c6ad1f4ce0fdd9493d91332
Author: Keith M Wesolowski <keith.wesolowski@joyent.com>
Date:   2018-01-20T17:48:25.000Z

    5123 IP DCE does not scale - part deux
    Reviewed by: Dan McDonald <danmcd@joyent.com>
    Reviewed by: Avnindra Singh <Avnindra.Singh@exponential.com>
    Reviewed by: Sanjay Pokhriyal <Sanjay.Pokhriyal@exponential.com>
    Approved by: Gordon Ross <gordon.ross@nexenta.com>

#10 Updated by Marcel Telka 8 months ago

  • Duplicated by Bug #8923: tcpListenDrop counter continusly increases when we put our production webserver on OI added
