Bug #6273

closed

obytes64 drops despite vnic not being destroyed

Added by Robert Mustacchi almost 7 years ago. Updated almost 7 years ago.

Status: Closed
Priority: Normal
Category: networking
Start date: 2015-09-30
Due date:
% Done: 100%
Estimated time:
Difficulty: Hard
Tags:
Gerrit CR:
External Bug:

Description

We encountered several cases where the obytes64 value for a vnic would drop to zero while the vnic was still active. The various kstat identifiers also didn't change, which for a counter is a serious bug.

I've installed a D script to run overnight to try to confirm the theory that I originally established. This script should at least help us narrow in on whether or not the theory is viable:

[root@RM08219 (us-east-2) ~]# cat /var/tmp/rm/OS-2481.d
fbt::vnic_dev_create:entry
{
        self->t = 1;
}

fbt::mac_tx_srs_stat_recreate:entry
/self->t != 1/
{
        trace(timestamp);
        trace(walltimestamp);
        printf("\n%x %Y.%09d\n", timestamp, walltimestamp, walltimestamp % 1000000000);
        stack();
}

fbt::vnic_dev_create:return
/self->t/
{
        self->t = 0;
}

Past me had a theory that this was due to a stat being clobbered, and in fact the output of the above OS-2481.d was valuable and still sitting on the shrimp. Here are a few lines from it; the stack is what's important:

CPU     ID                    FUNCTION:NAME
  5  37137   mac_tx_srs_stat_recreate:entry  3633581624715199 1397097051623999321
ce8b92600e7bf 2014 Apr 10 02:30:51.623999321

              mac`i_mac_group_add_ring+0x291
              mac`mac_group_mov_ring+0x5b
              mac`mac_release_tx_group+0xfb
              mac`mac_datapath_teardown+0x3ca
              mac`mac_client_datapath_teardown+0x66
              mac`mac_unicast_remove+0x27a
              vnic`vnic_dev_delete+0x14b
              vnic`vnic_ioc_delete+0x28
              dld`drv_ioctl+0x1e4
              genunix`cdev_ioctl+0x39
              specfs`spec_ioctl+0x60
              genunix`fop_ioctl+0x55
              genunix`ioctl+0x9b
              unix`sys_syscall32+0xff

  5  37137   mac_tx_srs_stat_recreate:entry  3633581625230065 1397097051624999498
ce8b92608c2f1 2014 Apr 10 02:30:51.624999498

              mac`i_mac_group_add_ring+0x291
              mac`mac_group_mov_ring+0x5b
              mac`mac_release_tx_group+0xfb
              mac`mac_datapath_teardown+0x3ca
              mac`mac_client_datapath_teardown+0x66
              mac`mac_unicast_remove+0x27a
              vnic`vnic_dev_delete+0x14b
              vnic`vnic_ioc_delete+0x28
              dld`drv_ioctl+0x1e4
              genunix`cdev_ioctl+0x39
              specfs`spec_ioctl+0x60
              genunix`fop_ioctl+0x55
              genunix`ioctl+0x9b
              unix`sys_syscall32+0xff

There are various tx groups that exist over a physical device, based upon the way mac has decided to hand out transmit rings and their availability. In many places where we have ixgbe on the scene, it ends up working out that we assign on the order of 8-16 things their own group with one ring and then leave everyone else using the default ring. The following bit of mdb from emy10 shows the rough clustering of soft ring sets (basically, mac clients) to these tx groups:

> ::walk mac_client_impl_cache | ::printf "%p\n" mac_client_impl_t mci_flent->fe_tx_ring_group ! sort | uniq -c
   4 0
   1 ffffff0d2c4c7b80
   1 ffffff0d2c4c7c10
   1 ffffff0d2c4c7ca0
   1 ffffff0d2c4c7d30
   1 ffffff0d2c4c7dc0
   1 ffffff0d2c4c7e50
   1 ffffff0d2c4c7ee0
  59 ffffff0d2c4c7f70
   1 ffffff0d34b3c980
   1 ffffff0d34b3ca10
   1 ffffff0d34b3caa0
   1 ffffff0d34b3cb30
   1 ffffff0d34b3cbc0
   1 ffffff0d34b3cc50
   1 ffffff0d34b3cce0
  52 ffffff0d34b3cd70
   1 ffffff0d39505010
   1 ffffff0d47e2ddc0

For the groups that show a count of one above, that single mac client is the group's only client. This matters because when a non-default group has its last client removed, we take a different state transition through mac_datapath_teardown and end up releasing the transmit group entirely.

When the transmit group is removed, its ring is restored to the default group. However, that triggers us to go through and recreate the srs tx stat. Now, the link kstat for a vnic goes through the following path to get the number of bytes that have been transmitted, which is the obytes member, i.e. MAC_STAT_OBYTES:

dls_devnet_stat_update()
 +-> dls_stat_update()
   +-> mac_stat_get()
     +-> driver specific stat, vnic_m_stat()
       +-> mac_client_stat_get()

The mac_client_stat_get() function just reads from the srs tx stat, which happens to be what we just clobbered for everything in the default group. It's pretty easy to see this by running the following commands before and after halting a zone that has one of the singleton groups identified above, saving the output to .pre files before the halt and to .post files after it:

# kstat -n mac_misc_stat -s obytes -p | sort > misc.pre
# kstat -m link -s obytes -p | sort > link.pre

Having done this, let's look at a few of the stats. Note that we added to the defunct tx stats, but the overall link stat was truncated:

pre:
  z103_net1:0:mac_misc_stat:obytes        0                              
post:
  z103_net1:0:mac_misc_stat:obytes        261775867                      

And the corresponding change in the link stat:

pre:
  link:0:z103_net1:obytes 261773435                                      
post:
  link:0:z103_net1:obytes 2852                                           
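
Comparing snapshots by eye works for one link, but not for a whole machine. The following is a minimal sketch of a check that flags every counter that went backwards between two `kstat -p` snapshots. The function name counters_decreased and its output layout are illustrative only, and it assumes the first file is not empty.

```shell
# Print every kstat key whose value is lower in the "post" snapshot than in
# the "pre" snapshot. Both files hold `kstat -p` output: "key value" per line.
# Note: awk compares the values as doubles, so values above 2^53 lose precision.
counters_decreased() {
    pre=$1
    post=$2
    awk '
        NR == FNR { before[$1] = $2; next }   # first file: remember each value
        ($1 in before) && ($2 + 0 < before[$1] + 0) {
            printf "%s %s -> %s\n", $1, before[$1], $2
        }
    ' "$pre" "$post"
}
```

Run as `counters_decreased link.pre link.post`, the z103_net1 values shown above would produce `link:0:z103_net1:obytes 261773435 -> 2852`.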

The solution here is that when someone is asking for the internal mac client stat, we need to consult anything that's in the defunct group when answering it. This allows mac to still break them apart into different categories, but allows consumers to obtain the logical number of bytes actually used by the GLDv3 device despite any changes to ring assignments.
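
The real change belongs inside mac, but the arithmetic it needs can be illustrated from userland. The sketch below sums the live link counter and the defunct counter for each link. It is not the kernel fix: the function name logical_obytes is made up, and it assumes, as the output above suggests, that mac_misc_stat:obytes holds the bytes accounted to the defunct tx stats and that the kstat module name there matches the link name.

```shell
# For each link, add the defunct tx bytes to the live link counter.
#   misc file lines look like: z103_net1:0:mac_misc_stat:obytes 261775867
#   link file lines look like: link:0:z103_net1:obytes 2852
# Assumption: field 1 of the misc key and field 3 of the link key both name
# the same vnic.
logical_obytes() {
    misc=$1
    link=$2
    awk '
        NR == FNR { split($1, k, ":"); defunct[k[1]] = $2; next }
        {
            split($1, k, ":")
            name = k[3]
            # A link with no defunct entry simply adds zero.
            printf "%s %.0f\n", name, $2 + defunct[name]
        }
    ' "$misc" "$link"
}
```

With the post-halt values above, `logical_obytes misc.post link.post` would report `z103_net1 261778719`, that is 2852 + 261775867, instead of the truncated 2852.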
