Project

General

Profile

Bug #8675

proto_promiscoff_req() doesn't always exit the MAC perimeter

Added by Robert Mustacchi over 3 years ago. Updated over 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
networking
Start date:
2017-09-22
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

The dld_promiscoff_req() function in dld has a long history, going back to illumos#1918, and a follow-on fix illumos#4182. 1918 introduced a move of the MAC perimeter entry to much earlier in the function. 4182 plugged a common-case hole of forgetting to exit the perimeter, but there are several other holes, hence this bug.

Here are some additional notes from when I was looking at this yesterday that help point the way to the issue Dan mentioned:

We have one thread in vnd:

> ::stacks -m vnd
THREAD           STATE    SOBJ                COUNT
ffffff1d170f0040 SLEEP    CV                      1
                 swtch+0x18a
                 cv_wait+0x89
                 vnd_s_close+0x93
                 qdetach+0xa8
                 strclose+0x334
                 device_close+0xb9
                 spec_close+0x1d9
                 fop_close+0x61
                 ldi_close+0xcd
                 vnd_ioctl_attach+0x447
                 vnd_ioctl+0x2a9
                 cdev_ioctl+0x39
                 spec_ioctl+0x60
                 fop_ioctl+0x55
                 ioctl+0x9b
                 _sys_sysenter_post_swapgs+0x23e
> ffffff1d170f0040::findstack -v
stack pointer for thread ffffff1d170f0040: ffffff008cfd2750
[ ffffff008cfd2750 _resume_from_idle+0x112() ]
  ffffff008cfd2780 swtch+0x18a()
  ffffff008cfd27b0 cv_wait+0x89(ffffff1e986f4b84, ffffff1e986f4b78)
  ffffff008cfd2810 vnd_s_close+0x93(ffffff1edcfb92b0, 40000003, ffffff1ece195110)
  ffffff008cfd2890 qdetach+0xa8(ffffff1edcfb92b0, 1, 40000003, ffffff1ece195110, 0)
  ffffff008cfd2930 strclose+0x334(ffffff1f00b76600, 40000003, ffffff1ece195110)
  ffffff008cfd2980 device_close+0xb9(ffffff1eaf1dd600, 40000003, ffffff1ece195110)
  ffffff008cfd2a10 spec_close+0x1d9(ffffff1eaf1dd600, 40000003, 1, 0, ffffff1ece195110, 0)
  ffffff008cfd2a90 fop_close+0x61(ffffff1eaf1dd600, 40000003, 1, 0, ffffff1ece195110, 0)
  ffffff008cfd2ae0 ldi_close+0xcd(ffffff1ee8a6eb50, 3, ffffff1ece195110)
  ffffff008cfd2be0 vnd_ioctl_attach+0x447(ffffff1ea2a88a28, 8047b58, ffffff1ece195110, 100003)
  ffffff008cfd2c60 vnd_ioctl+0x2a9(11b0000003b, 766e6401, 8047b58, 100003, ffffff1ece195110, ffffff008cfd2e48)
  ffffff008cfd2ca0 cdev_ioctl+0x39(11b0000003b, 766e6401, 8047b58, 100003, ffffff1ece195110, ffffff008cfd2e48)
  ffffff008cfd2cf0 spec_ioctl+0x60(ffffff1ece2d2900, 766e6401, 8047b58, 100003, ffffff1ece195110, ffffff008cfd2e48, 0)
  ffffff008cfd2d80 fop_ioctl+0x55(ffffff1ece2d2900, 766e6401, 8047b58, 100003, ffffff1ece195110, ffffff008cfd2e48, 0)
  ffffff008cfd2ea0 ioctl+0x9b(3, 766e6401, 8047b58)
  ffffff008cfd2f00 _sys_sysenter_post_swapgs+0x23e()

That thread is in the process of shutting down and making a series of
STREAMS messages. If we look at its state:

> ffffff1edcfb92b0::print queue_t q_ptr | ::print vnd_str_t vns_state
vns_state = 0t14 (VNS_S_SAP_PROMISCOFF_SENT)

We see that it's issued a promisc off request. So the interesting
question is where is that and why haven't we heard back. If we look at
the dld module, we'll see where our promisc off request is:

> ::stacks -m dld
THREAD           STATE    SOBJ                COUNT
ffffff008b1d6c40 SLEEP    CV                      1
                 swtch+0x18a
                 cv_wait+0x89
                 dld_taskq_dispatch+0x126
                 thread_start+8

ffffff1d6712e060 SLEEP    CV                      1
                 swtch+0x18a
                 cv_wait+0x89
                 i_mac_perim_enter+0x6b
                 mac_perim_enter_by_mh+0x23
                 mac_perim_enter_by_macname+0x33
                 drv_ioc_prop_common+0x193
                 drv_ioc_getprop+0x26
                 drv_ioctl+0x1e4
                 cdev_ioctl+0x39
                 spec_ioctl+0x60
                 fop_ioctl+0x55
                 ioctl+0x9b
                 _sys_sysenter_post_swapgs+0x23e

ffffff008b15ec40 SLEEP    CV                      1
                 swtch+0x18a
                 cv_wait+0x89
                 i_mac_perim_enter+0x6b
                 mac_perim_enter_by_mh+0x23
                 proto_promiscoff_req+0x97
                 dld_proto+0x245
                 dld_wput_nondata_task+0xb3
                 taskq_d_thread+0xf3
                 thread_start+8
> ffffff008b15ec40::findstack -v
stack pointer for thread ffffff008b15ec40: ffffff008b15e9f0
[ ffffff008b15e9f0 _resume_from_idle+0x112() ]
  ffffff008b15ea20 swtch+0x18a()
  ffffff008b15ea50 cv_wait+0x89(ffffff1eff8e2ab4, ffffff1eff8e2aa0)
  ffffff008b15ea90 i_mac_perim_enter+0x6b(ffffff1f08d059c8)
  ffffff008b15eac0 mac_perim_enter_by_mh+0x23(ffffff1f08d059c8, ffffff008b15eae8)
  ffffff008b15eb30 proto_promiscoff_req+0x97(ffffff1eafe29538, ffffff1d38371e00)
  ffffff008b15eb50 dld_proto+0x245(ffffff1eafe29538, ffffff1d38371e00)
  ffffff008b15eb90 dld_wput_nondata_task+0xb3(ffffff1eafe29538)
  ffffff008b15ec20 taskq_d_thread+0xf3(ffffff1d0ec3f140)
  ffffff008b15ec30 thread_start+8()
> ffffff1eafe29538::print dld_str_t ds_mch | ::print mac_client_impl_t mci_name
mci_name = [ "vndtest1" ]

OK, so we're blocked on the MAC perimeter. Who has it? Also of note, by
looking at the vn_lower_mh, we can see that this is in fact an etherstub
that we're created over (which makes sense because this came from the
vnd test suite).

> ffffff1f08d059c8::print mac_impl_t mi_driver | ::print vnic_t vn_mc_handles[0] | ::print mac_client_impl_t mci_mip | ::print mac_impl_t mi_driver
 | ::print vnic_t vn_lower_mh
vn_lower_mh = 0
> ffffff1f08d059c8::print mac_impl_t mi_driver | ::print vnic_t vn_mc_handles[0] | ::print mac_client_impl_t mci_mip | ::print mac_impl_t mi_perim_
owner
mi_perim_owner = 0xffffff008b12ec40
> 0xffffff008b12ec40::findstack -v
mdb: couldn't read kthread_t from ffffff008b12ec40

So, that's odd. Who's that kthread_t?!

> 0xffffff008b12ec40::whatis
ffffff008b12ec40 is ffffff008b129000+5c40, allocated from the segkp vmem arena:
            ADDR TYPE            START              END             SIZE
                                THREAD        TIMESTAMP                 
ffffff1d0fe149b0 ALLC ffffff008b129000 ffffff008b12f000            24576
                      ffffff008dc9fc40      11a1e89f0ed
                 vmem_hash_insert+0xae
                 vmem_seg_alloc+0x23e
                 vmem_xalloc+0x749
                 vmem_alloc+0x145
                 segkp_get_internal+0x159
                 segkp_cache_get+0x103
                 thread_create+0x5ac
                 mt_config_children+0x1e8
                 unconfig_grand_children+0x45
                 devi_unconfig_common+0xb1
                 mt_config_thread+0x259
                 thread_start+8

Well, that's not a kthread_t!

--

DTrace script to catch this in the wild:

#!/usr/sbin/dtrace -s

proto_promiscoff_req:entry
{
        self->trace = 1;
        self->perim = 0;
}

mac_perim_enter_by_mh:entry
/self->trace == 1/
{
        self->perim = 1;
        self->mphstack = arg1;
}

mac_perim_enter_by_mh:return
/self->trace == 1/
{
        self->mph = *((void **)self->mphstack);
}

mac_perim_exit:entry
/self->trace == 1/
{
        self->perim = 0;
}

proto_promiscoff_req:return
/self->trace == 1 && self->perim == 0/
{
        self->trace = 0;
}

proto_promiscoff_req:return
/self->trace == 1 && self->perim == 1/
{
        printf("proto_promisc_off() returned w/o releasing mac_perim!\n");
        printf("mph pointer is 0x%p (mac_impl_t *)", self->mph);
        self->trace = 0;
}
#1

Updated by Electric Monk over 3 years ago

  • Status changed from New to Closed

git commit cb6094b49723102fbd2e4f7c29be6a0a9dcabdfc

commit  cb6094b49723102fbd2e4f7c29be6a0a9dcabdfc
Author: Dan McDonald <danmcd@joyent.com>
Date:   2017-10-10T17:21:44.000Z

    8675 proto_promiscoff_req() doesn't always exit the MAC perimeter
    Reviewed by: Ryan Zezeski <rpz@joyent.com>
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
    Reviewed by: Igor Kozhukhov <igor@dilos.org>
    Reviewed by: Garrett D'Amore <garrett@damore.org>
    Reviewed by: Andy Stormont <astormont@racktopsystems.com>
    Approved by: Richard Lowe <richlowe@richlowe.net>

Also available in: Atom PDF