performance counter support for AMD CPU families > 0x11
The cpc backend for AMD CPUs currently supports only CPU families up to and including 0x11. It can easily be extended to families 0x12 and 0x14, since their counters and supported events are very similar to those of family 0x10.
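As a rough illustration of that extension, the family check could map the new families onto a table derived from the family 0x10 events. This is only a sketch: the enum, its values and the function name are hypothetical and are not taken from the existing backend, and the lower bound of the supported families is not checked because it is not stated here.

```c
/* Hypothetical event-table selectors; not names from the existing backend. */
typedef enum {
	EVTAB_UNSUPPORTED = 0,	/* not handled by this backend */
	EVTAB_EXISTING,		/* families up to and including 0x11 */
	EVTAB_F10_DERIVED	/* families 0x12 and 0x14, close to 0x10 */
} evtab_t;

/*
 * Pick an event table for an AMD CPU family. Family 0x15 is deliberately
 * left unsupported here, on the assumption that it gets its own backend.
 */
evtab_t
amd_family_evtab(unsigned int family)
{
	if (family <= 0x11)
		return (EVTAB_EXISTING);
	if (family == 0x12 || family == 0x14)
		return (EVTAB_F10_DERIVED);
	return (EVTAB_UNSUPPORTED);
}
```

With this shape, adding families 0x12 and 0x14 would mostly be a matter of supplying the derived event table, while everything else in the backend stays as it is.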
AMD family 0x15 is different enough to justify creating a new cpc backend. It supports 6 core and 4 northbridge counters, but unlike earlier CPU families, not all events are supported on all counters. While the 6 core counters are similar to the counters on earlier families, the 4 northbridge counters are shared by all cores in a chip and can't be used for sampling on individual CPUs or threads. The northbridge counters also differ from the core counters in interrupt and attribute support. This doesn't fit well with the existing cpc framework, but it could be made to work with some limitations.
I see the following problems with supporting the northbridge (NB) counters:
- They should not be used by different cores on the same chip at the same time. This means that only one instance of cpustat(1M) using these counters should be run per chip at a time, and the results may be unexpected as events from all cores on the chip are counted. Similar limitations apply to cputrack(1M), with additional complications arising from thread placement issues. NB counters could be used by several threads on the same chip if they are not running at the same time, but I don't think it's feasible to implement support for that in the dispatcher. Running a thread using NB counters on a chip where the NB counters are already in use would silently fail to activate the NB counters for that thread, so blocking concurrent use of them even across chips might be the way to go.
- They support fewer attributes than the core counters, but attribute support is currently not specified per-counter. It would be possible to extend the cpc interfaces to support that.
- They interrupt all cores at the same time when they overflow. This could be handled by remembering on which CPU they were enabled, and making the interrupt handler ignore the counters on all other CPUs. This could be difficult as there is no way to find out which counter caused the overflow interrupt. A simpler approach would be no interrupt support at all for now, which would be another counter-specific difference that needs to be supported by the cpc interface.
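To make the three points above more concrete, here is a minimal sketch of per-counter capabilities plus per-chip ownership of the NB counters. Everything in it is an assumption for illustration: the counter numbering (0-5 core, 6-9 NB), the placeholder attribute masks, and all type and function names are hypothetical and are not part of the existing cpc interfaces. It is also not thread-safe; real code would need locking around the owner field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed numbering: counters 0-5 are core counters, 6-9 are NB counters. */
#define	NCORE_CTRS	6
#define	NNB_CTRS	4
#define	NB_NO_OWNER	(-1)

/* Placeholder attribute masks; the real attribute sets are not given here. */
#define	HYP_CORE_ATTRS	0xffu
#define	HYP_NB_ATTRS	0x0fu

/* Hypothetical per-counter capability record. */
typedef struct ctr_caps {
	bool		cc_shared;	/* shared by all cores on the chip */
	uint32_t	cc_attrs;	/* attributes this counter accepts */
} ctr_caps_t;

/* Hypothetical per-chip state: which CPU enabled the NB counters. */
typedef struct chip_nb_state {
	int	nb_owner;		/* CPU id, or NB_NO_OWNER */
} chip_nb_state_t;

bool
ctr_is_nb(int picnum)
{
	return (picnum >= NCORE_CTRS && picnum < NCORE_CTRS + NNB_CTRS);
}

ctr_caps_t
ctr_get_caps(int picnum)
{
	ctr_caps_t caps;

	caps.cc_shared = ctr_is_nb(picnum);
	caps.cc_attrs = caps.cc_shared ? HYP_NB_ATTRS : HYP_CORE_ATTRS;
	return (caps);
}

/* Claim the chip's NB counters for cpuid; fail if another CPU holds them. */
bool
nb_try_acquire(chip_nb_state_t *nb, int cpuid)
{
	if (nb->nb_owner != NB_NO_OWNER && nb->nb_owner != cpuid)
		return (false);
	nb->nb_owner = cpuid;
	return (true);
}

/* Give up the NB counters, but only if cpuid is the current owner. */
void
nb_release(chip_nb_state_t *nb, int cpuid)
{
	if (nb->nb_owner == cpuid)
		nb->nb_owner = NB_NO_OWNER;
}

/*
 * Should the overflow handler running on cpuid look at counter picnum?
 * Core counters are always local; NB counters only on the owning CPU.
 */
bool
overflow_should_inspect(const chip_nb_state_t *nb, int picnum, int cpuid)
{
	if (!ctr_is_nb(picnum))
		return (true);
	return (nb->nb_owner == cpuid);
}
```

An explicit failure from something like nb_try_acquire() would replace the silent failure described in the first point. Under the simpler no-interrupt approach, the ownership check in the overflow path would go away and the capability record would instead need a flag saying that the counter cannot generate overflow interrupts.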
To avoid all this, the support for NB counters could also be omitted completely for now. After all, CPC stands for CPU performance counters. A new framework for device performance counters could be introduced, or the existing framework could be extended to allow more than one device backend to be active at the same time. But I think this is way out of scope for this bug.