Bug #4735

ILB unconditionally drops all ICMP/ICMPv6 traffic destined for the VIP address

Added by Serghei Samsi over 5 years ago. Updated over 5 years ago.

Status:
In Progress
Priority:
Normal
Assignee:
Category:
networking
Start date:
2014-04-09
Due date:
% Done:
30%
Estimated time:
Difficulty:
Medium
Tags:
needs-triage

Description

One drawback is that ilb_probe stops working for health checks configured with ICMP ping.
The ping and traceroute -I commands also fail when sent from the VIP address.

However, line 1534 of usr/src/uts/common/inet/ilb/ilb.c shows that this is the intended behavior:

* For other ICMP messages, drop them.

ILB handles only TCP/UDP traffic, and it can also respond to ICMP echo requests from ILB clients. It should drop only the ICMP traffic that belongs to one of the ILB rules; the rest should simply be passed to IP input, which decides what to do with it.
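The proposed behavior can be sketched as a small decision function. This is only an illustration, not code from ilb.c: the names icmp_verdict and classify_icmp and the matches_ilb_rule flag are hypothetical, and the sketch assumes the caller has already determined whether the ICMP message relates to an ILB rule.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical verdicts; these names do not come from ilb.c. */
enum icmp_verdict {
    ICMP_PASS_TO_IP,   /* not ILB's business: hand to IP input */
    ICMP_ECHO_REPLY,   /* echo request from an ILB client: ILB answers */
    ICMP_DROP          /* other ICMP that belongs to an ILB rule */
};

#define ICMPV4_ECHO_REQUEST 8   /* ICMPv4 echo request type */

/*
 * Decide what to do with an ICMPv4 message addressed to the VIP.
 * matches_ilb_rule is an assumed input: the caller is expected to
 * have checked whether the message relates to a configured rule.
 */
static enum icmp_verdict
classify_icmp(uint8_t icmp_type, bool matches_ilb_rule)
{
    if (!matches_ilb_rule)
        return ICMP_PASS_TO_IP;
    if (icmp_type == ICMPV4_ECHO_REQUEST)
        return ICMP_ECHO_REPLY;
    return ICMP_DROP;
}
```

Under this sketch, an echo reply coming back to the VIP for an ilb_probe ping matches no rule and is passed to IP input instead of being dropped.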

History

#1

Updated by Serghei Samsi over 5 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 20

Please review

http://mtc.md/sscdvp/webrevs/issue_4735/

Regards,
Serghei Samsi

#2

Updated by Serghei Samsi over 5 years ago

Some testing results are available:

With the patch, ilb_probe with ICMP health checks now works, as do the ping and traceroute -I commands:

root@ilb-test:~# ilbadm show-hc-result
RULENAME     HCNAME   SERVERID         STATUS    FAIL  LAST      NEXT      RTT
pkgrepor10   defhc    _pkgrepoing0.0   alive     0     23:00:52  23:01:33  1218
pkgrepor10   defhc    _pkgrepoing0.1   alive     0     23:01:24  23:02:02  2085
pkgrepor11   defhc    _pkgrepoing0.0   alive     0     23:01:21  23:01:48  2193
pkgrepor11   defhc    _pkgrepoing0.1   alive     0     23:01:15  23:01:48  2415
pkgrepor15   defhc    _pkgrepoing1.0   unreach   13    23:01:11  23:01:40  0
pkgrepor15   defhc    _pkgrepoing1.1   unreach   16    23:01:30  23:01:49  0
root@ilb-test:~# traceroute -I -n 172.20.4.46
traceroute to 172.20.4.46 (172.20.4.46), 30 hops max, 40 byte packets
1 172.17.10.6 1.539 ms 1.544 ms 1.975 ms
2 172.17.10.254 1.012 ms 0.983 ms 1.049 ms
3 172.20.4.46 1.312 ms 0.883 ms 0.964 ms

Test IP configuration (note: the VIP is the only IP address in the test zone, so it is the only valid source address for ICMP message exchange):
root@ilb-test:~# ipadm show-addr
ADDROBJ      TYPE      STATE   ADDR
lo0/v4       static    ok      127.0.0.1/8
uplink2/_a   from-gz   ok      172.17.11.120/23
lo0/v6       static    ok      ::1/128
root@ilb-test:~# netstat -rn

Routing Table: IPv4
Destination          Gateway              Flags Ref   Use        Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              172.17.10.6          UGZ   2     333        uplink2
127.0.0.1            127.0.0.1            UH    2     0          lo0
172.17.10.0          172.17.11.120        U     6     182        uplink2
172.20.4.18          172.17.10.254        UHD   2     50

Routing Table: IPv6
Destination/Mask            Gateway                     Flags Ref Use     If
--------------------------- --------------------------- ----- --- ------- -----
::1                         ::1                         UH    2   132     lo0

Testing ILB configuration:
create-healthcheck -h hc-test=ping,hc-timeout=5,hc-count=3,hc-interval=30 defhc
create-servergroup pkgrepoing0
add-server -s server=172.17.11.210:80 pkgrepoing0
add-server -s server=172.20.4.18:80 pkgrepoing0
create-rule -e -p -i vip=172.17.11.120,port=80,protocol=tcp -m lbalg=hash-ip-port,type=NAT,proxy-src=172.17.11.120-172.17.11.120,pmask=/32 -h hc-name=defhc,hc-port=ANY -t conn-drain=70,nat-timeout=70,persist-timeout=70 -o servergroup=pkgrepoing0 pkgrepor10
create-rule -e -p -i vip=172.17.11.120,port=8080,protocol=tcp -m lbalg=hash-ip-port,type=NAT,proxy-src=172.17.11.120-172.17.11.120,pmask=/32 -h hc-name=defhc,hc-port=ANY -t conn-drain=70,nat-timeout=70,persist-timeout=70 -o servergroup=pkgrepoing0 pkgrepor11
create-servergroup pkgrepoing1
add-server -s server=[2a02:a30:1:1::1]:8080 pkgrepoing1
add-server -s server=[2a02:a30:1:1::2]:8080 pkgrepoing1
create-rule -e -p -i vip=2a02:a30:1:1::219,port=80,protocol=tcp -m lbalg=hash-ip-port,type=NAT,proxy-src=2a02:a30:1:1::18-2a02:a30:1:1::18,pmask=/128 -h hc-name=defhc,hc-port=ANY -t conn-drain=70,nat-timeout=70,persist-timeout=70 -o servergroup=pkgrepoing1 pkgrepor15

#3

Updated by Serghei Samsi over 5 years ago

  • % Done changed from 20 to 30

A system panic occurs after the patch is applied:
panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffff0006fb6ee0 addr=f occurred in module "unix" due to a NULL pointer dereference.
It isn't me (you could blame me for my initiative, of course ;)); the panic is caused by ILB's existing ECHO_REPLY functionality. On receiving an ECHO_REQUEST ICMP message, ILB tries to swap the source address with the destination address.
The patch merely activated that code path.
Adding a check on the ipha_t length (in ilb.c) seems to solve the panic.
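The kind of length check meant here can be illustrated with a standalone sketch. It is not the actual change from the webrev: swap_ipv4_addrs and IPV4_MIN_HDR_LEN are hypothetical names, and the sketch works on a raw byte buffer rather than on the kernel's ipha_t and mblk structures. The point is only that the header length is validated before the source and destination addresses are touched.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPV4_MIN_HDR_LEN 20   /* IPv4 header without options */

/*
 * Swap the IPv4 source and destination addresses in place, but only
 * if the buffer really holds a complete IPv4 header. Returns false
 * and leaves the buffer untouched otherwise.
 */
static bool
swap_ipv4_addrs(uint8_t *pkt, size_t len)
{
    uint8_t tmp[4];
    size_t hdr_len;

    if (pkt == NULL || len < IPV4_MIN_HDR_LEN)
        return false;

    /* Low nibble of the first byte is the header length in 32-bit words. */
    hdr_len = (size_t)(pkt[0] & 0x0f) * 4;
    if ((pkt[0] >> 4) != 4 || hdr_len < IPV4_MIN_HDR_LEN || len < hdr_len)
        return false;

    /* Source address is at offset 12, destination at offset 16. */
    memcpy(tmp, pkt + 12, 4);
    memcpy(pkt + 12, pkt + 16, 4);
    memcpy(pkt + 16, tmp, 4);
    return true;
}
```

A caller would drop or pass on the packet when the function returns false, instead of dereferencing a header that is not fully present.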

I will update the webrev ASAP.

#4

Updated by Serghei Samsi over 5 years ago

Please review the updated webrev:

http://mtc.md/sscdvp/webrevs/issue_4735/

Regards,
Serghei Samsi
