Bug #1631


kernel panic in tcp_input_data

Added by Denis Kozadaev almost 11 years ago. Updated almost 11 years ago.


Using oi_151a as a desktop with one non-global zone
Two cores are available


debugging crash dump vmcore.1 (64-bit) from witch
operating system: 5.11 oi_151a (i86pc)
image uuid: bd8a28a2-8634-439a-aa94-d58ea0bf48f6
panic message:
BAD TRAP: type=e (#pf Page fault) rp=ffffff000d6befc0 addr=20 occurred in module "ip" due to a
NULL pointer dereference
dump content: kernel pages only

ffffff000d6befc0::print "struct regs"

r_savfp = 0xffffff000d6bf220
r_savpc = 0xfffffffff7bf02f5
r_rdi = 0
r_rsi = 0xffffff02ccd813a0
r_rdx = 0x1
r_rcx = 0
r_r8 = 0xffffff000d6bf110
r_r9 = 0xffffff000d6bf10c
r_rax = 0
r_rbx = 0xffffff02ca1a9512
r_rbp = 0xffffff000d6bf220
r_r10 = 0
r_r11 = 0x4a
r_r12 = 0xffffff02ccbb3440
r_r13 = 0xffffff02ccd813a0
r_r14 = 0x1
r_r15 = 0xffffff04d5eb9794
__r_fsbase = 0xfffffffffb84c87e
__r_gsbase = 0x45a8930000000000
r_ds = 0
r_es = 0
r_fs = 0
r_gs = 0
r_trapno = 0xe
r_err = 0
r_rip = 0xfffffffff7bf02f5
r_cs = 0x30
r_rfl = 0x10246
r_rsp = 0xffffff000d6bf0b0
r_ss = 0x38


tcp_input_data+0x387a: leaq 0xfffffffffffffeec(%rbp),%r9
tcp_input_data+0x3881: movl %eax,0xfffffffffffffeec(%rbp)
tcp_input_data+0x3887: movq 0xfffffffffffffe98(%rbp),%r8
tcp_input_data+0x388e: movq 0x1a8(%r8),%r10
tcp_input_data+0x3895: movq 0x1b0(%r8),%rdi
tcp_input_data+0x389c: movslq %r14d,%rdx
tcp_input_data+0x389f: leaq 0xfffffffffffffef0(%rbp),%r8
tcp_input_data+0x38a6: movq %r13,%rsi
tcp_input_data+0x38a9: xorl %ecx,%ecx
tcp_input_data+0x38ab: xorl %eax,%eax
tcp_input_data+0x38ad: call *0x20(%r10) <- HERE
tcp_input_data+0x38b1: xorq %r8,%r8
tcp_input_data+0x38b4: cmpq %rax,%r8
tcp_input_data+0x38b7: jge +0x26 <tcp_input_data+0x38df>
tcp_input_data+0x38b9: cmpl $0x0,0xfffffffffffffeec(%rbp)
tcp_input_data+0x38c0: je +0x1eb <tcp_input_data+0x3ab1>
tcp_input_data+0x38c6: movq %r12,%rdi
tcp_input_data+0x38c9: call +0x10e2 <tcp_rwnd_reopen>
tcp_input_data+0x38ce: orl 0xfffffffffffffee4(%rbp),%eax
tcp_input_data+0x38d4: movl %eax,0xfffffffffffffee4(%rbp)
tcp_input_data+0x38da: jmp +0x1d2 <tcp_input_data+0x3ab1>

*panic_thread::findstack -v

stack pointer for thread ffffff000d6bfc40: ffffff000d6bed20
ffffff000d6bee10 panic+0x94()
ffffff000d6beea0 die+0xdd(e, ffffff000d6befc0, 20, 0)
ffffff000d6befb0 trap+0x1799(ffffff000d6befc0, 20, 0)
ffffff000d6befc0 0xfffffffffb8001d6()
ffffff000d6bf220 tcp_input_data+0x38ad(ffffff02ccbb3140, ffffff02ccd813a0, ffffff0295f06e80,
ffffff000d6bf2b0 squeue_enter+0x440(ffffff0295f06e80, ffffff02ccd813a0, ffffff02ccd813a0, 1,
ffffff000d6bf510, 4, 4)
ffffff000d6bf380 ip_fanout_v4+0x48d(ffffff02ccd813a0, ffffff02ca1a94fe, ffffff000d6bf510)
ffffff000d6bf400 ire_recv_local_v4+0x366(ffffff02a35b1df8, ffffff02ccd813a0, ffffff02ca1a94fe
, ffffff000d6bf510)
ffffff000d6bf4e0 ill_input_short_v4+0x6ce(ffffff02ccd813a0, ffffff02ca1a94fe,
ffffff02ca1a950e, ffffff000d6bf510, ffffff000d6bf6a0)
ffffff000d6bf710 ip_input+0x23b(ffffff02a3518ca8, ffffff02a11dc040, ffffff02ccd813a0, 0)
ffffff000d6bf7a0 mac_rx_soft_ring_process+0x17a(ffffff0296d3c2c8, ffffff02984334c0,
ffffff02ccd813a0, ffffff02ccd813a0, 1, 0)
ffffff000d6bf8e0 mac_rx_srs_proto_fanout+0x46f(ffffff0298481d00, ffffff02ccd813a0)
ffffff000d6bf960 mac_rx_srs_drain+0x26e(ffffff0298481d00, 800)
ffffff000d6bf9f0 mac_rx_srs_process+0x180(ffffff029814c008, ffffff0298481d00,
ffffff02ccd813a0, 0)
ffffff000d6bfa40 mac_rx_classify+0x159(ffffff029814c008, 0, ffffff02ccd813a0)
ffffff000d6bfaa0 mac_rx_flow+0x54(ffffff029814c008, 0, ffffff029e394320)
ffffff000d6bfaf0 mac_rx_common+0x1f6(ffffff029814c008, 0, ffffff029e394320)
ffffff000d6bfb40 mac_rx+0xac(ffffff029814c008, 0, ffffff029e394320)
ffffff000d6bfb90 iprb_intr+0x123(ffffff0295b33000, 0)
ffffff000d6bfbe0 av_dispatch_autovect+0x7c(13)
ffffff000d6bfc20 dispatch_hardint+0x33(13, 0)
ffffff000d605a70 switch_sp_and_call+0x13()
ffffff000d605ac0 do_interrupt+0xb8(ffffff000d605ad0, 1)
ffffff000d605ad0 _interrupt+0xba()
ffffff000d605bc0 mach_cpu_idle+6()
ffffff000d605bf0 cpu_idle+0xaf()
ffffff000d605c00 cpu_idle_adaptive+0x19()
ffffff000d605c20 idle+0x114()
ffffff000d605c30 thread_start+8()

source: usr/src/uts/common/inet/tcp/tcp_input.c:4629
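One way to read the register dump and disassembly above: the faulting instruction is `call *0x20(%r10)`, `%r10` was loaded from `0x1a8(%r8)` a few instructions earlier, and the saved registers show `r_r10 = 0`. The CPU therefore tried to fetch a function pointer from address 0x20, which matches `addr=20` in the panic message. Below is a minimal C sketch of that arithmetic, assuming a 64-bit (LP64) build and assuming `%r10` was meant to point at a table of function pointers. `upcall_table_t` and its slot names are hypothetical and do not reproduce the real kernel structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table of function pointers; NOT the real kernel layout. */
typedef struct upcall_table {
	void (*slot0)(void);
	void (*slot1)(void);
	void (*slot2)(void);
	void (*slot3)(void);
	void (*slot4)(void);	/* fifth slot */
} upcall_table_t;

/*
 * Address the CPU reads for "call *disp(%reg)": base register plus
 * displacement. With a NULL base, the address equals the displacement.
 */
static uintptr_t
indirect_call_target_addr(uintptr_t base, size_t disp)
{
	return (base + disp);
}

/* Byte offset of the fifth slot in the hypothetical table. */
static size_t
fifth_slot_offset(void)
{
	return (offsetof(upcall_table_t, slot4));
}
```

With 8-byte pointers, `fifth_slot_offset()` is 0x20, so a call through the fifth slot of a NULL table would fault at address 0x20, as seen here.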

Subtasks: 1 (0 open, 1 closed)

Bug #1775: 1631's fix missed the case of data on the handshake's ACK (Resolved, Dan McDonald, 2011-11-14)

Actions #1

Updated by Dan McDonald almost 11 years ago

Do you have a URL for the cores? It looks like a conn_t may be accessed when it's not supposed to be. (e.g. yet-another conn_t race condition.)

Actions #2

Updated by Denis Kozadaev almost 11 years ago

yes, of course:
they are identical, maybe the traffic is different

Actions #3

Updated by Dan McDonald almost 11 years ago

Sorry for the delay here. One other question. Were you doing anything in particular to cause this panic? Or is it just randomly happening?

Actions #4

Updated by Denis Kozadaev almost 11 years ago

Well, I don't know why it panics.
At this moment I already have 3 dumps, all at the same point.
In the non-global zone there are two services (innd and ircd).
I think that is the cause; some time ago (without the non-global zone)
this computer did not panic at all.
The zone has 3 vnics over 2 real ethernet cards and one
IPv4 tunnel (v6 over v4); the tunnel is plumbed but not used.

Actions #5

Updated by Dan McDonald almost 11 years ago

This is a race between a socket close and data arriving and hitting the closed socket.

The TCP connection is in state TCPS_CLOSED, but somehow tcp_input_data() got reached and managed to run all the way down to line 4629, while a close routine managed to blank-out the conn_upcalls before tcp_input_data() dereferenced them.

Not sure about the fix yet, but at least think I know what's going on. BTW, the packet being demuxed is a SYN/ACK on the second core dump. Some latencies, I guess.
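The ordering described in this comment can be illustrated with a small single-threaded C sketch: a close path clears the upcall table, then the input path runs and must not call through it. Everything here (`conn_sketch_t`, `upcalls_t`, `conn_close`, `conn_input`) is a hypothetical stand-in, not the real `conn_t` or the eventual fix; it only shows why an unchecked dereference faults and what a defensive check might look like.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real conn_t or upcall table. */
typedef long (*recv_upcall_t)(const void *data, size_t len);

typedef struct upcalls {
	recv_upcall_t recv;
} upcalls_t;

typedef enum { ST_CLOSED = 0, ST_ESTABLISHED = 1 } conn_state_t;

typedef struct conn {
	conn_state_t state;
	const upcalls_t *upcalls;	/* cleared by close */
} conn_sketch_t;

/* Example receive upcall: accepts everything and reports the length. */
static long
recv_count(const void *data, size_t len)
{
	(void) data;
	return ((long)len);
}

static const upcalls_t live_upcalls = { recv_count };

/* Close path: mark the connection closed and blank out the upcalls. */
static void
conn_close(conn_sketch_t *c)
{
	c->state = ST_CLOSED;
	c->upcalls = NULL;
}

/*
 * Input path with a defensive check. Returns the upcall's result, or -1
 * when the segment is dropped because the connection was already torn
 * down. Without the check, c->upcalls->recv would dereference NULL.
 */
static long
conn_input(conn_sketch_t *c, const void *data, size_t len)
{
	if (c->state == ST_CLOSED || c->upcalls == NULL)
		return (-1);
	return (c->upcalls->recv(data, len));
}
```

In a real multi-threaded kernel the check and the call would also have to be serialized against the close path; this sketch only demonstrates the ordering problem.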

Actions #6

Updated by Dan McDonald almost 11 years ago

BTW, the socket in question is an identd (port 113) in coredump #2 (vmdump.1). I'll check the first after lunch.

Actions #7

Updated by Denis Kozadaev almost 11 years ago

uptime is 15 minutes,
I have another dump with the same cause :)
But I don't think you need it

Actions #8

Updated by Dan McDonald almost 11 years ago

Both coredumps involve TCP sockets that were most-likely initiated by you to remote "identd" servers. (I'll bite my tongue about the ident protocol...)

Anyway, both TCP states are set to 0, aka TCPS_CLOSED. (Oddly enough both seem to be to the same remote network block...)

Also, both packets are SYN/ACK packets with one byte of data after them. That in and of itself is odd, but shouldn't cause a kernel panic. I still suspect it's a race, but I can try some SYN/ACK + data experiments.

(As a workaround, can you disable whatever is connecting to remote identd?)
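For anyone wanting to spot the kind of packet described in this comment in a capture, here is a rough C sketch that flags a segment as "SYN/ACK carrying data". It assumes the IPv4 and TCP header fields have already been parsed into host byte order; `seg_info_t` and both helper functions are hypothetical names, and the flag values are the standard TCP header bits.

```c
#include <stdint.h>

#define	FLAG_SYN	0x02	/* TCP SYN bit */
#define	FLAG_ACK	0x10	/* TCP ACK bit */

/* Hypothetical pre-parsed header fields, in host byte order. */
typedef struct seg_info {
	uint16_t ip_total_len;	/* IPv4 total length, in bytes */
	uint8_t ip_ihl;		/* IPv4 header length, in 32-bit words */
	uint8_t tcp_doff;	/* TCP data offset, in 32-bit words */
	uint8_t tcp_flags;	/* TCP flag byte */
} seg_info_t;

/* Payload bytes = total length minus both headers; -1 if malformed. */
static int
seg_payload_len(const seg_info_t *s)
{
	int hdrs = (s->ip_ihl + s->tcp_doff) * 4;

	if (s->ip_ihl < 5 || s->tcp_doff < 5 || hdrs > s->ip_total_len)
		return (-1);
	return (s->ip_total_len - hdrs);
}

/* Nonzero when both SYN and ACK are set and at least one data byte follows. */
static int
seg_is_synack_with_data(const seg_info_t *s)
{
	uint8_t want = FLAG_SYN | FLAG_ACK;

	return ((s->tcp_flags & want) == want && seg_payload_len(s) > 0);
}
```

For example, a 45-byte IPv4 packet with a 20-byte IP header and a 24-byte TCP header (SYN and ACK set) has one byte of payload and would be flagged.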

Actions #9

Updated by Denis Kozadaev almost 11 years ago

I can block port 113 in ipf, but ircd uses it.
I hope you will fix the bug in a short time
(ircd is a part of RusNet)

Actions #10

Updated by Dan McDonald almost 11 years ago

You should be able to configure ircd to NOT use the "ident" service, right? That's a lot better than using ipf (or ipsecconf(1M), which would suffice here as well).

Actions #11

Updated by Denis Kozadaev almost 11 years ago

/*
 * define NO_IDENT if you don't want to support ident (RFC1413).
 * it is a VERY bad idea to do so, since this will make it impossible to
 * efficientely track abusers.
maybe you are right
or I can just move it to some other server with other OS
but I would not like to change OI to some other...

Actions #12

Updated by Dan McDonald almost 11 years ago

Sorry for asking so many questions. I am having a hard time seeing how this could've happened in the first place. (And ident is easy to spoof, I don't trust it.)

If you're feeling brave, can you (keeping ident in place), put this in /etc/system and then reboot:

set kmem_flags=0xf

This will turn on kmem debugging, which will help me diagnose the race condition by letting me know who allocated and freed various kernel structures. I will need the coredump after you reboot the system with this set. Once you boot, you can remove the line (and the subsequent reboot will have kmem debugging turned off). A coredump with kmem debugging on is far more valuable than one without.
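As a rough guide to what `kmem_flags=0xf` enables, the C sketch below decodes the mask into its four low bits. The flag names and values are reproduced from memory of the illumos `sys/kmem_impl.h` header, so treat them as an assumption and check your own source tree. `kmem_flags_decode` is a hypothetical helper written for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values; verify against sys/kmem_impl.h in your source tree. */
#define	KMF_AUDIT	0x1	/* record allocation/free transactions */
#define	KMF_DEADBEEF	0x2	/* fill freed buffers with a pattern */
#define	KMF_REDZONE	0x4	/* detect writes past the buffer end */
#define	KMF_CONTENTS	0x8	/* log contents of freed buffers */

typedef struct kmf_name {
	uint32_t bit;
	const char *name;
} kmf_name_t;

static const kmf_name_t kmf_names[] = {
	{ KMF_AUDIT, "KMF_AUDIT" },
	{ KMF_DEADBEEF, "KMF_DEADBEEF" },
	{ KMF_REDZONE, "KMF_REDZONE" },
	{ KMF_CONTENTS, "KMF_CONTENTS" },
};

/*
 * Hypothetical helper: store the names of the flags set in "flags" into
 * out[] (at most "max" entries) and return how many were stored.
 */
static size_t
kmem_flags_decode(uint32_t flags, const char *out[], size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < sizeof (kmf_names) / sizeof (kmf_names[0]); i++) {
		if ((flags & kmf_names[i].bit) != 0 && n < max)
			out[n++] = kmf_names[i].name;
	}
	return (n);
}
```

Under these assumed values, 0xf turns on all four checks. The audit bit is the one that records who allocated and freed each buffer.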


Actions #13

Updated by Denis Kozadaev almost 11 years ago

done, now wait for a new panic
I see you are an expert in the kernel
may I ask you via e-mail about deep core of the kernel? :)
a question about drivers (writing drivers)

Actions #14

Updated by Dan McDonald almost 11 years ago

Thank you. I'm a TCP/IP stack guy first and foremost. Drivers are a little beyond my expertise, but I can try and answer some questions. Your best bet is to join the Illumos developers list or visit #illumos on IRC.

Hopefully it'll core quickly and I can see more about what's broken.

Actions #15

Updated by Denis Kozadaev almost 11 years ago
557842432 Oct 19 05:35 (GMT+4) vmdump.5
and now I have to block port 113 as you said

Actions #16

Updated by Dan McDonald almost 11 years ago

Hopefully you removed the kmem_flags line from /etc/system also.

I have prior commitments this morning, so I won't be able to dive back in until afternoon (US/Eastern).

Actions #17

Updated by Denis Kozadaev almost 11 years ago

I had already commented out the line in /etc/system once the system booted.
It stays there for future use, in case I hit another bug
and need a dump with maximum information.

Actions #18

Updated by Dan McDonald almost 11 years ago

  • Category set to kernel
  • Assignee set to Dan McDonald
  • % Done changed from 0 to 40
  • Difficulty changed from Medium to Hard

Found out what's going on. I'm not 100% sure what the best solution is yet, though. Like I said before, the ident protocol, and people's blind faith in it, is partially the problem. A particular IP range (common to all of your coredumps) is misbehaving (whether on purpose or not is unknown) and tickling this bug.

Solving it will deal with the misbehavior more gracefully. What's hard is the precise solution. Hang in there, I'm on this.

Actions #19

Updated by Dan McDonald almost 11 years ago

  • % Done changed from 40 to 80

I believe I have a fix. I need to check for packet leaks, but I think this fix will completely cure what ails you. Please mail me for a URL to a patched "ip" module to drop into /kernel/drv/amd64/ip.

Actions #20

Updated by Rich Lowe almost 11 years ago

  • Status changed from New to Resolved
  • % Done changed from 80 to 100
  • Tags deleted (needs-triage)

Resolved in r13494 commit:9dc2083cc403

