Project

General

Profile

Bug #1501

taskq_create_proc ... TQ_DYNAMIC puts tasks in p0

Added by Gordon Ross about 8 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
-
Start date:
2011-09-13
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage

Description

Perhaps debatable whether this is a bug or feature request,
but I was surprised about this:

When taskq_create_proc() is called with a non-global zone's curzone->zone_zsched, and TQ_DYNAMIC, then the threads servicing that taskq are in the global zone, not the specified zone.

This causes problems in some code I'm trying to make zone-aware, that wants to ask "what zone am I" from these threads.

History

#1

Updated by Gordon Ross about 8 years ago

  • Subject changed from taskq_create_proc ... TQ_DYNAMIC puts tasks p0 to taskq_create_proc ... TQ_DYNAMIC puts tasks in p0
#2

Updated by Electric Monk over 4 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit a32725da2e029a7b2cff859759c591f9a11fe521

commit  a32725da2e029a7b2cff859759c591f9a11fe521
Author: Gordon Ross <gwr@nexenta.com>
Date:   2015-04-12T17:10:05.000Z

    1501 taskq_create_proc ... TQ_DYNAMIC puts tasks in p0
    Reviewed by: Garrett D'Amore <garrett@damore.org>
    Reviewed by: Dan McDonald <danmcd@omniti.com>
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Approved by: Garrett D'Amore <garrett@damore.org>

#3

Updated by Gordon Ross over 4 years ago

  • Status changed from Closed to In Progress
  • Assignee set to Gordon Ross

Unfortunately, my first attempt at fixing this had problems, so we backed it out.
I don't know all the details of why it happened yet, but the first fix can apparently
cause a hang when a non-global zone does a door_upcall.

#4

Updated by Gordon Ross over 4 years ago

My first attempt to fix this caused a problem when a thread from a
dynamic taskq attempted a door_upcall (which smbsrv uses).

My change to taskq_bucket_extend() naively used the same
logic as taskq_thread_create(). Unfortunately the way that
taskq_bucket_extend creates the thread in TS_STOP state
and later starts it ends up leaving the TP_HOLDLWP flag set
in t_proc_flag when we create and start a new LWP.

Later, we end up in an endless loop with a stack like this:

  (typicall some interrupt)
  ffffff0006436140 0xf()
  ffffff00064362a0 stop+0x26()
  ffffff0006436350 issig_forreal+0x38c()
  ffffff0006436370 issig+0x3d(0)
  ffffff0006436420 door_upcall+0x4bf(ffffff0174dd0200, ffffff00064364b0, 0, ffffffffffffffff, 0)
  ffffff00064364a0 door_ki_upcall_limited+0x67(ffffff0160bc7be8, ffffff00064364b0, 0, ffffffffffffffff, 0)
  ffffff0006436510 smb_kdoor_upcall_private+0x65(ffffff0176ed7580, ffffff0006436530)
  ffffff00064365f0 smb_kdoor_send+0x73(ffffff0176ed7580, ffffff0006436610)
  ffffff0006436710 smb_kdoor_upcall+0x1c7(ffffff0176ed7580, 2, ffffff0006436780, fffffffff81c4060, ffffff0174228958, fffffffff81c43e0)
  ffffff0006436750 smb_get_token+0x56(ffffff016185f5a8, ffffff0006436780)
  ffffff0006436870 smb_authenticate_core+0x18e(ffffff016460ecc8, ffffff016a54df28)
  ffffff00064368c0 smb_authenticate+0x44(ffffff016460ecc8, ffffff016a54df28)
  ffffff0006436940 smb_com_session_setup_andx+0x2e(ffffff016460ecc8)
  ffffff0006436a30 smb_dispatch_request+0x64a(ffffff016460ecc8)
  ffffff0006436a70 smb_session_worker+0xd8(ffffff016460ecc8)
  ffffff0006436b20 taskq_d_thread+0x123(ffffff016a2f5f58)
  ffffff0006436b30 thread_start+8()

Rich Lowe helped me figure out that having the
TP_HOLDLWP flag set when in door_upcall
makes us loop in issig_forreal() calling stop()
in an endless loop.

#5

Updated by Gordon Ross over 4 years ago

New, version of this out for review. See: https://www.illumos.org/rb/r/24

It turns out that there's no need to create an LWP in taskq_bucket_extend() which is how most threads are created for dynamic taskqs. In fact, the only place we really need an LWP (as opposed to a simple kthread) is when we're creating a taskq with the TASKQ_DUTY_CYCLE flag. That's not supported in combination with DYNAMIC so we don't need to make taskq_bucket_extend deal with LWPs at all. All I really need to do here was make sure the proc passed to taskq_create_common (and stored in tq_proc) is used when we're creating threads.

This version of the fix works with the "smb server in a zone" code (coming to illumos soon).

#6

Updated by Electric Monk over 4 years ago

  • Status changed from In Progress to Closed

git commit e548823371e6dfbf1717dabbe24870fec98c6051

commit  e548823371e6dfbf1717dabbe24870fec98c6051
Author: Gordon Ross <gwr@nexenta.com>
Date:   2015-04-28T00:49:40.000Z

    1501 taskq_create_proc ... TQ_DYNAMIC puts tasks in p0 (take 2)
    Reviewed by: Garrett D'Amore <garrett@damore.org>
    Reviewed by: Dan McDonald <danmcd@omniti.com>
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
    Approved by: Albert Lee <trisk@omniti.com>

#7

Updated by Electric Monk over 2 years ago

git commit c883468ae561a0b211f3633eb2a24aa4306661f2

commit  c883468ae561a0b211f3633eb2a24aa4306661f2
Author: Gordon Ross <gwr@nexenta.com>
Date:   2015-04-21T18:59:04.000Z

    backout: 1501 taskq_create_proc ... TQ_DYNAMIC puts tasks in p0 (needs work)

Also available in: Atom PDF