Bug #13218 (closed)
"Stack smashing detected" panic when creating vnic over aggr with 4 mlxcx links
% Done: 100%
Difficulty: Medium
Description
Created an aggr, similar to:
dladm create-aggr -t -l mlxcx0 -l mlxcx1 -l mlxcx2 -l mlxcx3 aggr0
and then a vnic
dladm create-vnic -t -l aggr0 vnic0
and the system panicked with the stack:
> $C
ffffd002e3d25210 vpanic()
ffffd002e3d25220 0xfffffffffb85c822()
ffffd002e3d25490 0xfffffffffb9cb0b6()
ffffd002e3d25540 mac_fanout_setup+0x65(ffffd095a3cef390, ffffd095841650c8, ffffd095841650d4, fffffffffb9d9c00, ffffd095a3cef390, 0, 0)
ffffd002e3d255c0 mac_srs_group_setup+0xf7(ffffd095a3cef390, ffffd095841650c8, 1)
ffffd002e3d256a0 mac_datapath_setup+0x70d(ffffd095a3cef390, ffffd095841650c8, 1)
ffffd002e3d25750 mac_client_datapath_setup+0x298(ffffd095a3cef390, 0, ffffd002e3d25b74, ffffd0953ee26000, 0, ffffd09484a4f8a8)
ffffd002e3d25800 i_mac_unicast_add+0x58e(ffffd095a3cef390, ffffd002e3d25b74, 0, ffffd09a1f574d80, 0, ffffd002e3d258b0)
ffffd002e3d25880 mac_unicast_add+0x6e(ffffd095a3cef390, ffffd002e3d25b74, 0, ffffd09a1f574d80, 0, ffffd002e3d258b0)
ffffd002e3d25920 vnic_unicast_add+0x1ff(ffffd09a1f574c60, 3, ffffd002e3d25b68, 3, ffffd002e3d25b64, ffffd002e3d25b74, 4300000000, ffffd002e3d25b70, ffffd00200000000, 0)
ffffd002e3d25b00 vnic_dev_create+0x3d9(b, a, ffffd002e3d25b6c, ffffd002e3d25b64, ffffd002e3d25b74, ffffd002e3d25b68, ffffffff00000003, 0, ffffd09a00000000, 0, ffffd09595d76044, ffffd09300000000, ffffd002e3d25b70, ffffd0952a46c1d0)
ffffd002e3d25be0 vnic_ioc_create+0xda(ffffd09595d76000, 8041550, 100003, ffffd0952a46c1d0, ffffd002e3d25dd8)
ffffd002e3d25c80 drv_ioctl+0x1ef(1200000000, 1710001, 8041550, 100003, ffffd0952a46c1d0, ffffd002e3d25dd8)
ffffd002e3d25cc0 cdev_ioctl+0x2b(1200000000, 1710001, 8041550, 100003, ffffd0952a46c1d0, ffffd002e3d25dd8)
ffffd002e3d25d10 spec_ioctl+0x45(ffffd0947d365880, 1710001, 8041550, 100003, ffffd0952a46c1d0, ffffd002e3d25dd8, 0)
ffffd002e3d25da0 fop_ioctl+0x5b(ffffd0947d365880, 1710001, 8041550, 100003, ffffd0952a46c1d0, ffffd002e3d25dd8, 0)
ffffd002e3d25ec0 ioctl+0x153(3, 1710001, 8041550)
ffffd002e3d25f10 _sys_sysenter_post_swapgs+0x14f()
Recreated the panic with dtrace -m mac running. From the dtrace buffer in the dump:
> ffffd09519a960c0::dtrace -c f
CPU     ID FUNCTION:NAME
 15  42345 mac_start_ring:return
 15  42147 mac_hwring_start:return
 15  42345 mac_start_ring:return
.
.
 15  41142 mac_flow_cpu_init:entry
 15  41136 mac_compute_soft_ring_count:entry
 15  41888 mac_client_stat_get:entry
 15  41070 mac_client_ifspeed:entry
 15  42322 mac_stat_get:entry
 15  42323 mac_stat_get:return
 15  41071 mac_client_ifspeed:return
 15  41889 mac_client_stat_get:return
 15  41137 mac_compute_soft_ring_count:return
 15  41130 mac_next_bind_cpu:entry
 15  41131 mac_next_bind_cpu:return
.
.
 15  41130 mac_next_bind_cpu:entry
 15  41131 mac_next_bind_cpu:return
 15  41138 mac_tx_cpu_init:entry
 15  41139 mac_tx_cpu_init:return
Looks like there was no return from mac_flow_cpu_init(). A code snippet:
static void
mac_flow_cpu_init(flow_entry_t *flent, cpupart_t *cpupart)
{
        mac_soft_ring_set_t *rx_srs;
        processorid_t cpuid;
        int i, j, k, srs_cnt, nscpus, maxcpus, soft_ring_cnt = 0;
        mac_cpus_t *srs_cpu;
        mac_resource_props_t *emrp = &flent->fe_effective_props;
        uint32_t cpus[MRP_NCPUS];
        .
        .
        .
        nscpus = 0;
        for (srs_cnt = 0; srs_cnt < flent->fe_rx_srs_cnt; srs_cnt++) {
                rx_srs = flent->fe_rx_srs[srs_cnt];
                srs_cpu = &rx_srs->srs_cpu;
                for (j = 0; j < srs_cpu->mc_ncpus; j++) {
                        cpus[nscpus++] = srs_cpu->mc_cpus[j];
                }
        }
cpus[] is an array on the stack, and the code has no guard to keep the copy loop from overflowing it. MRP_NCPUS is defined as 128.
A look at the flow_entry_t:
> ffffd095841650c8::print -at flow_entry_t fe_rx_srs fe_rx_srs_cnt
ffffd0958416b698 void *[128] fe_rx_srs = [ 0xffffd0979de95cc0, 0xffffd0979de95000,
    0xffffd09a1f247340, 0xffffd09a1f246680, 0xffffd09a1f2459c0, 0xffffd09a1f244d00,
    0xffffd09a1f244040, 0xffffd09a1f243300, 0xffffd09a1f242640, 0xffffd09a1f241980,
    0xffffd09a1f240cc0, 0xffffd09a1f240000, 0xffffd09a1f23f340, 0xffffd09a1f23e680,
    0xffffd09a1f23d9c0, 0xffffd09a1f23cd00, 0xffffd09a1f23c040, 0xffffd09a1f23b300,
    0xffffd09a1f23a640, 0xffffd09a1f239980, 0xffffd09a1f238cc0, 0xffffd09a1f238000,
    0xffffd09a1f237340, 0xffffd09a1f236680, 0xffffd09a1f2359c0, 0xffffd09a1f234d00,
    0xffffd09a1f234040, 0xffffd09a1f233300, 0xffffd09a1f232640, 0xffffd09a1f231980,
    0xffffd09a1f230cc0, 0xffffd09a1f230000, ... ]
ffffd0958416ba98 int fe_rx_srs_cnt = 0x41
fe_rx_srs_cnt is 0x41 (65), and if we look at mc_ncpus for each mc_cpus[] array we see:
> ffffd0958416b698::array "void *" 0x41|::print -t 'void *'|::print -t mac_soft_ring_set_t srs_cpu.mc_ncpus
uint32_t srs_cpu.mc_ncpus = 0x2
uint32_t srs_cpu.mc_ncpus = 0x2
uint32_t srs_cpu.mc_ncpus = 0x2
uint32_t srs_cpu.mc_ncpus = 0x2
.
.
.
uint32_t srs_cpu.mc_ncpus = 0x2
uint32_t srs_cpu.mc_ncpus = 0x2
mc_ncpus is 0x2 for each of the 0x41 array entries. This means nscpus from the code snippet will end up as 0x41 * 0x2 = 0x82 = 130, which is greater than MRP_NCPUS (128), hence the array overflow and stack corruption.
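The arithmetic above can be checked with a small standalone sketch. This is not the illumos code and not the change that was committed for this bug: sketch_srs_cpus_t, sketch_needed() and sketch_collect() are hypothetical names made up for illustration, and only the MRP_NCPUS value of 128 and the 0x41 x 0x2 figures come from this report. It shows how many entries the unguarded loop tries to write, and what a capacity check on the same copy loop might look like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MRP_NCPUS 128   /* value stated in this report */

/* Hypothetical stand-in for the per-SRS CPU list (not the illumos struct). */
typedef struct sketch_srs_cpus {
        uint32_t mc_ncpus;
        uint32_t mc_cpus[MRP_NCPUS];
} sketch_srs_cpus_t;

/* Number of entries the unguarded copy loop would try to write. */
static size_t
sketch_needed(const sketch_srs_cpus_t *srs, size_t srs_cnt)
{
        size_t total = 0;

        for (size_t i = 0; i < srs_cnt; i++)
                total += srs[i].mc_ncpus;
        return (total);
}

/*
 * Same shape as the copy loop in the snippet, but it stops once cap
 * entries have been stored. Returns the number of entries written.
 */
static size_t
sketch_collect(const sketch_srs_cpus_t *srs, size_t srs_cnt,
    uint32_t *cpus, size_t cap)
{
        size_t n = 0;

        assert(cpus != NULL || cap == 0);
        for (size_t i = 0; i < srs_cnt; i++) {
                for (uint32_t j = 0; j < srs[i].mc_ncpus; j++) {
                        if (n == cap)
                                return (n);     /* destination is full */
                        cpus[n++] = srs[i].mc_cpus[j];
                }
        }
        return (n);
}
```

With 0x41 entries of 2 CPUs each, sketch_needed() reports 130 while sketch_collect() stops at 128. Silently truncating is only one possible policy; whether the kernel should truncate, de-duplicate, or size the array differently is a design decision this sketch does not settle.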
Related issues
Updated by Paul Winder almost 3 years ago
- Related to Bug #13222: Increase maximum number of fanout CPUs from 128 to 256 added
Updated by Electric Monk almost 3 years ago
- Status changed from In Progress to Closed
- % Done changed from 0 to 100
git commit 3714f7be8e09c39a0ea7ce7ef44cb495ce250913
commit 3714f7be8e09c39a0ea7ce7ef44cb495ce250913
Author: Paul Winder <paul@winder.uk.net>
Date: 2020-12-16T14:23:43.000Z

    13218 "Stack smashing detected" panic when creating vnic over aggr with 4 mlxcx links
    13222 Increase maximum number of fanout CPUs from 128 to 256
    Reviewed by: Robert Mustacchi <rm@fingolfin.org>
    Reviewed by: Garrett D'Amore <garrett@damore.org>
    Approved by: Dan McDonald <danmcd@joyent.com>