Bug #4715

Minor fixes in ILB code (usr/src/uts/common/inet/ilb)

Added by Serghei Samsi over 5 years ago. Updated over 5 years ago.

Status: Feedback
Priority: Normal
Assignee:
Category: networking
Start date: 2014-03-29
Due date:
% Done: 20%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

The ILB code omits cleanup calls in several places (mutex_destroy(), list_destroy(), etc.).
One example is ilb_sticky_hash_fini().
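
The init/fini pairing at issue can be illustrated with a minimal userland sketch. It uses pthread mutexes as a stand-in for kernel mutexes, and the names demo_bucket_t, demo_hash_init() and demo_hash_fini() are hypothetical; this is not the ILB code, only the pattern of destroying every lock that init created:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical hash bucket: one lock per bucket, as in many kernel hashes. */
typedef struct demo_bucket {
	pthread_mutex_t db_lock;
} demo_bucket_t;

/*
 * Tear down the first n buckets: destroy each lock before freeing the
 * memory that holds it. Returns how many locks were destroyed.
 */
size_t
demo_hash_fini(demo_bucket_t *hash, size_t n)
{
	size_t i, destroyed = 0;

	assert(hash != NULL);
	for (i = 0; i < n; i++) {
		if (pthread_mutex_destroy(&hash[i].db_lock) == 0)
			destroyed++;
	}
	free(hash);
	return (destroyed);
}

/* Allocate n buckets and initialize every lock; undo on partial failure. */
demo_bucket_t *
demo_hash_init(size_t n)
{
	size_t i;
	demo_bucket_t *hash = calloc(n, sizeof (*hash));

	if (hash == NULL)
		return (NULL);
	for (i = 0; i < n; i++) {
		if (pthread_mutex_init(&hash[i].db_lock, NULL) != 0) {
			/* Only the first i locks exist; destroy just those. */
			(void) demo_hash_fini(hash, i);
			return (NULL);
		}
	}
	return (hash);
}
```

A fini routine that only frees the array would skip the destroy loop, which is the kind of omission this bug describes.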

History

#1

Updated by Serghei Samsi over 5 years ago

Another example is the missing mutex_exit() for nsh_lock before the return at line 244 (file usr/src/uts/common/inet/ilb/ilb_nat.c):

240 if ((tmp->nse_port_arena = vmem_create(arena_name,
241 (void *)NAT_PORT_START, NAT_PORT_SIZE, 1, NULL, NULL, NULL, 1,
242 VM_SLEEP | VMC_IDENTIFIER)) == NULL) {
243 kmem_free(tmp, sizeof (*tmp));
244 return (NULL);
245 }
246
247 list_insert_tail(head, tmp);
248 mutex_exit(&ilbs->ilbs_nat_src[idx].nsh_lock);

This could lead to a hang in the IP module, because nsh_lock stays held after the early return.
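
A minimal userland sketch of the error-path fix, assuming the lock is held when the allocation fails. It uses pthread mutexes instead of kernel mutexes; demo_bucket_t, demo_entry_t, demo_bucket_add() and the fail_arena flag (which simulates vmem_create() returning NULL) are hypothetical names for illustration only:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for one NAT source hash bucket. */
typedef struct demo_bucket {
	pthread_mutex_t db_lock;
	int db_nentries;
} demo_bucket_t;

/* Hypothetical stand-in for a NAT source entry with its port arena. */
typedef struct demo_entry {
	void *de_arena;
} demo_entry_t;

/*
 * Add an entry under the bucket lock. Every return path, including the
 * failure paths, releases the lock before returning.
 */
demo_entry_t *
demo_bucket_add(demo_bucket_t *b, int fail_arena)
{
	demo_entry_t *tmp;

	assert(b != NULL);
	pthread_mutex_lock(&b->db_lock);

	tmp = malloc(sizeof (*tmp));
	if (tmp == NULL) {
		pthread_mutex_unlock(&b->db_lock);
		return (NULL);
	}

	/* Simulated arena creation; fail_arena forces the error path. */
	tmp->de_arena = fail_arena ? NULL : malloc(1);
	if (tmp->de_arena == NULL) {
		free(tmp);
		/* This is the release that the report says is missing. */
		pthread_mutex_unlock(&b->db_lock);
		return (NULL);
	}

	b->db_nentries++;
	pthread_mutex_unlock(&b->db_lock);
	return (tmp);
}
```

After a failed call the lock can be acquired again immediately; without the unlock on the error path, the next caller would block forever.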

#2

Updated by Serghei Samsi over 5 years ago

  • % Done changed from 0 to 20

Please review:

http://mtc.md/sscdvp/webrevs/issue_4715/

Regards,
Serghei Samsi

#3

Updated by Serghei Samsi over 5 years ago

  • Status changed from New to Feedback

#4

Updated by Serghei Samsi over 5 years ago

While testing the patch, two panics were detected on shutdown of an ILB-enabled zone:

Apr 15 10:13:18 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cb9fe0 unix:real_mode_stop_cpu_stage2_end+a6e3 ()
Apr 15 10:13:18 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cba0f0 unix:trap+13c0 ()
Apr 15 10:13:18 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cba100 unix:cmntrap+1ca ()
Apr 15 10:13:18 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cba2a0 unix:mutex_enter+b ()
Apr 15 10:13:18 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cba3e0 ip:ilb_check_conn+c1 ()
Apr 15 10:13:18 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cba520 ip:ilb_check+11f ()
Apr 15 10:13:18 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cba600 ip:ilb_check_v4+fc ()
Apr 15 10:13:19 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cba720 ip:ill_input_short_v4+6c3 ()
Apr 15 10:13:19 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cba980 ip:ip_input_common_v4+3ba ()
Apr 15 10:13:19 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cba9c0 ip:ip_input+2b ()
Apr 15 10:13:19 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cbaad0 dls:i_dls_link_rx+20d ()
Apr 15 10:13:19 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cbab20 mac:mac_rx_deliver+37 ()
Apr 15 10:13:19 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cbaba0 mac:mac_rx_soft_ring_drain+1a9 ()
Apr 15 10:13:19 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cbac20 mac:mac_soft_ring_worker+219 ()
Apr 15 10:13:19 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006cbac30 unix:thread_start+8 ()

Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88820 unix:real_mode_stop_cpu_stage2_end+a6e3 ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88930 unix:trap+bfb ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88940 unix:cmntrap+1ca ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88a90 unix:mutex_enter+b ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88ae0 ip:ilb_rule_del_common+73 ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88b20 ip:ilb_stack_shutdown+9e ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88b90 genunix:netstack_apply_shutdown+e1 ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88bd0 genunix:apply_all_modules_reverse+49 ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88c10 genunix:netstack_zone_shutdown+133 ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88c90 genunix:zsd_apply_shutdown+165 ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88ce0 genunix:zsd_apply_all_keys+5f ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88d30 genunix:zone_zsd_callbacks+ff ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88d70 genunix:zone_shutdown+106 ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88eb0 genunix:zone+217 ()
Apr 15 22:24:16 illumos-dev0 genunix: [ID 655072 kern.notice] ffffff0006c88f00 unix:brand_sys_syscall32+272 ()

Both panics are caused by the resource release that the patch introduced.
The old code never released those resources, so it did not panic.
I have solved this by splitting each of the following old _fini routines into separate _shutdown and _fini routines:
ilb_conn_hash_fini
ilb_sticky_hash_fini
ilb_nat_src_fini

New behavior:
_shutdown routines are called in _shutdown callback of ilb's netstack.
_fini routines are called in _destroy callback of ilb's netstack.
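
The two-phase teardown can be sketched in userland as follows. This is an illustration under assumptions, not the patch itself: it assumes the shutdown phase releases table contents while keeping the lock valid, and the destroy phase destroys the lock last. pthread mutexes stand in for kernel mutexes, and all demo_* names are hypothetical:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define	DEMO_TABLE_SIZE	8

/* Hypothetical per-stack state: a lock and a table it protects. */
typedef struct demo_stack {
	pthread_mutex_t ds_lock;
	int *ds_table;		/* NULL once the stack is shut down */
} demo_stack_t;

demo_stack_t *
demo_stack_create(void)
{
	demo_stack_t *ds = calloc(1, sizeof (*ds));

	if (ds == NULL)
		return (NULL);
	ds->ds_table = calloc(DEMO_TABLE_SIZE, sizeof (int));
	if (ds->ds_table == NULL ||
	    pthread_mutex_init(&ds->ds_lock, NULL) != 0) {
		free(ds->ds_table);
		free(ds);
		return (NULL);
	}
	return (ds);
}

/* Data-path lookup: still safe between shutdown and destroy. */
int
demo_stack_lookup(demo_stack_t *ds, int idx)
{
	int ret = -1;

	assert(idx >= 0 && idx < DEMO_TABLE_SIZE);
	pthread_mutex_lock(&ds->ds_lock);
	if (ds->ds_table != NULL)
		ret = ds->ds_table[idx];
	pthread_mutex_unlock(&ds->ds_lock);
	return (ret);
}

/* Phase 1 (shutdown): release contents, keep the lock usable. */
void
demo_stack_shutdown(demo_stack_t *ds)
{
	pthread_mutex_lock(&ds->ds_lock);
	free(ds->ds_table);
	ds->ds_table = NULL;
	pthread_mutex_unlock(&ds->ds_lock);
}

/* Phase 2 (destroy): no users remain, so the lock itself can go. */
void
demo_stack_destroy(demo_stack_t *ds)
{
	demo_stack_shutdown(ds);	/* harmless if already shut down */
	pthread_mutex_destroy(&ds->ds_lock);
	free(ds);
}
```

With this split, a late caller that arrives after shutdown still finds a valid lock and gets a clean failure (-1 here) rather than touching a destroyed mutex.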

Please review updated webrev:
http://mtc.md/sscdvp/webrevs/issue_4715/

Regards,
Serghei Samsi
