Bug #3627

ipnet taskq can outlive ipnet_stack_t (or netstack in general) and panic

Added by Rich Lowe over 7 years ago. Updated about 7 years ago.

Status: New
Priority: Normal
Assignee: -
Category: networking
Start date: 2013-03-14
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags: needs-triage
Gerrit CR:

Description

An OI user hit a crash which looks like:

unix:die+dd ()
unix:trap+1799 ()
unix:cmntrap+e6 ()
unix:mutex_enter+b ()
ipnet:ipnet_dispatch+8c ()
genunix:taskq_thread+248 ()
unix:thread_start+8 ()

Ancient memories suggest this happens because an ipnet device (likely observability-related) has been destroyed and removed from the netstack, but a taskq associated with it has survived. These memories may be wildly inaccurate.
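To make the suspected lifetime problem concrete, here is a minimal user-space sketch of one way a teardown path could avoid it: run (or wait for) every queued task before freeing the structure the tasks point at. This is not the illumos code. Every name in it (`demo_ipnet_t`, `demo_queue_t`, `demo_ipnet_destroy`, and so on) is hypothetical, the queue is a plain single-threaded array standing in for a taskq, and whether the real bug is fixable this way is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define	DEMO_QLEN	8

/* Hypothetical stand-in for the per-stack ipnet state. */
typedef struct demo_ipnet {
	int di_delivered;	/* packets handed to observers */
} demo_ipnet_t;

/* Hypothetical stand-in for a taskq: pending tasks and their argument. */
typedef struct demo_queue {
	demo_ipnet_t *dq_arg[DEMO_QLEN];
	size_t dq_count;
} demo_queue_t;

/* Queue one task against ip; returns 0 on success, -1 if the queue is full. */
static int
demo_queue_add(demo_queue_t *q, demo_ipnet_t *ip)
{
	if (q->dq_count == DEMO_QLEN)
		return (-1);
	q->dq_arg[q->dq_count++] = ip;
	return (0);
}

/* Run every pending task; returns how many ran. */
static int
demo_queue_drain(demo_queue_t *q)
{
	int ran = 0;

	while (q->dq_count > 0) {
		demo_ipnet_t *ip = q->dq_arg[--q->dq_count];
		ip->di_delivered++;	/* the task touches the ipnet state */
		ran++;
	}
	return (ran);
}

/*
 * Teardown: drain first, free second. Freeing before draining would leave
 * queued tasks holding a dangling pointer, which is the suspected crash.
 */
static int
demo_ipnet_destroy(demo_queue_t *q, demo_ipnet_t *ip)
{
	int ran = demo_queue_drain(q);

	assert(q->dq_count == 0);	/* nothing can reference ip now */
	free(ip);
	return (ran);
}

/* Queue two tasks, then destroy; returns the tasks run during teardown. */
static int
demo_teardown_scenario(void)
{
	demo_queue_t q = { .dq_count = 0 };
	demo_ipnet_t *ip = calloc(1, sizeof (*ip));

	if (ip == NULL)
		return (-1);
	(void) demo_queue_add(&q, ip);
	(void) demo_queue_add(&q, ip);
	return (demo_ipnet_destroy(&q, ip));
}
```

In this model the two queued tasks run while the structure is still valid, and only then is it freed. A real kernel version would also have to stop new tasks from being queued during teardown, which the sketch does not attempt.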

We've asked the user to gather a crash dump; hopefully a reference will be added to this bug when he does.

#2

Updated by Dan McDonald over 7 years ago

I'll have to look, but does ipnet_dispatch need to take a netstack that's been reference-held? Or does that cause more problems?
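A rough user-space sketch of the reference-hold idea, for discussion only: the hold is taken when the task is queued and dropped when the task finishes, so the structure cannot be freed underneath a pending task. All names here (`demo_stack_t`, `demo_stack_hold`, `demo_stack_rele`, `demo_dispatch`) are hypothetical and are not the ipnet or netstack interfaces. The counter is a plain int because the sketch is single-threaded; kernel code would need atomic operations or a lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted per-netstack structure. */
typedef struct demo_stack {
	int ds_refcnt;		/* outstanding holds, including the creator's */
	int ds_packets;		/* work done by dispatched tasks */
} demo_stack_t;

static int demo_frees;		/* counts frees so lifetime can be observed */

static demo_stack_t *
demo_stack_create(void)
{
	demo_stack_t *ds = calloc(1, sizeof (*ds));

	if (ds != NULL)
		ds->ds_refcnt = 1;	/* the creator's hold */
	return (ds);
}

static void
demo_stack_hold(demo_stack_t *ds)
{
	ds->ds_refcnt++;
}

/* Drop one hold; the last release frees the structure. */
static void
demo_stack_rele(demo_stack_t *ds)
{
	assert(ds->ds_refcnt > 0);
	if (--ds->ds_refcnt == 0) {
		demo_frees++;
		free(ds);
	}
}

/* A queued task: function plus argument, as a task queue entry carries. */
typedef struct demo_task {
	void (*dt_func)(void *);
	void *dt_arg;
} demo_task_t;

/* Task body: use the stack, then drop the hold taken at enqueue time. */
static void
demo_dispatch(void *arg)
{
	demo_stack_t *ds = arg;

	ds->ds_packets++;
	demo_stack_rele(ds);
}

/* Build a task that owns its own hold on ds. */
static demo_task_t
demo_enqueue(demo_stack_t *ds)
{
	demo_task_t t = { demo_dispatch, ds };

	demo_stack_hold(ds);
	return (t);
}

/*
 * The creator lets go while a task is still pending. Returns true if the
 * stack survived until the task ran and was then freed exactly once.
 */
static bool
demo_hold_scenario(void)
{
	demo_stack_t *ds = demo_stack_create();
	demo_task_t t;
	int frees_before;

	if (ds == NULL)
		return (false);
	demo_frees = 0;
	t = demo_enqueue(ds);
	demo_stack_rele(ds);		/* "destroy" with a task pending */
	frees_before = demo_frees;	/* the task's hold keeps ds alive */
	t.dt_func(t.dt_arg);		/* task runs; its rele frees ds */
	return (frees_before == 0 && demo_frees == 1);
}
```

The trade-off is the one the question hints at: a held structure stays allocated for as long as tasks remain queued, so a backlog of pending tasks delays teardown instead of crashing.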

#3

Updated by Dan McDonald about 7 years ago

We've found a related performance problem when ipnet observability is enabled. One possible solution should also eliminate this bug as a side-effect. I need to confirm the solution solves our performance problem first, however.

#4

Updated by Rich Lowe about 7 years ago

Is this the perf problem ira hit where we consume a basically infinite amount of memory?

A correct fix for that should also fix this, yes ('cos the whole taskq can go away, I think).

#5

Updated by Dan McDonald about 7 years ago

Rich Lowe wrote:

> Is this the perf problem ira hit where we consume a basically infinite amount of memory?

I'd need a link to a mail, but it's highly likely yes.

> A correct fix for that should also fix this, yes ('cos the whole taskq can go away, I think).

Alas, my worker-thread attempt failed to keep up, it seems. Stay tuned.
