Feature #12452
closed
Want support for AMD Zen 2 CPC Events
100%
Description
We should update the AMD PMC data files to add support for Zen 2, based on the Rome and Matisse processors. As with the Zen 1 change in #10896, I have constructed the data files myself from the publicly available references.
Related issues
Updated by Robert Mustacchi over 2 years ago
- Related to Feature #10896: Want support for AMD Zen CPC events added
Updated by Robert Mustacchi over 2 years ago
I tested this in a few different ways that I've included below. First, without the change present, I ran cpustat -h and verified that the system didn't think there were any CPU performance counters. Then, on a system with the change, cpustat -h reports:
rm@elbereth ~ $ ssh rm@beowulf
The Illumos Project SunOS 5.11 /ws/rm/zen2-pmc Apr. 02, 2020
SunOS Internal Development: rm 2020-Apr-02 [zen2-pmc]
You have new mail.
rm@beowulf:~$ cpustat -h
cpustat: setrlimit failed - Not owner
Usage:
    cpustat -c spec [-c spec]... [-p period] [-T u|d]
        [-sntD] [interval [count]]

    -c spec    specify processor events to be monitored
    -n         suppress titles
    -p period  cycle through event list periodically
    -s         run user soaker thread for system-only events
    -t         include tsc register
    -T d|u     Display a timestamp in date (d) or unix time_t (u)
    -D         enable debug mode
    -h         print extended usage information

    Use cputrack(1) to monitor per-process statistics.

    CPU performance counter interface: AMD Family 17h

    event specification syntax:
    [picn=]<eventn>[,attr[n][=<val>]][,[picn=]<eventn>[,attr[n][=<val>]],...]

    Generic Events:

    event[0-5]: PAPI_br_cn PAPI_br_ins PAPI_tot_cyc PAPI_tot_ins PAPI_tlb_dm
        PAPI_tlb_im PAPI_tot_cyc

    See generic_events(3CPC) for descriptions of these events

    Platform Specific Events:

    event[0-5]: FpRetSseAvxOps FpRetSseAvxOps.MacFLOPs FpRetSseAvxOps.DivFLOPs
        FpRetSseAvxOps.MultFLOPs FpRetSseAvxOps.AddSubFLOPs
        FpRetiredSerOps FpRetiredSerOps.SseBotRet FpRetiredSerOps.SseCtrlRet
        FpRetiredSerOps.X87BotRet FpRetiredSerOps.X87CtrlRet
        FpDispFaults FpDispFaults.YmmSpillFault FpDispFaults.YmmFillFault
        FpDispFaults.XmmFillFault FpDispFaults.x87FillFault
        LsBadStatus2 LsBadStatus2.StliOther LsLocks LsRetClClush LsRetCpuid
        LsDispatch LsSmiRx LsIntTaken LsRdTsc LsSTLF
        LsStCommitCancel2 LsStCommitCancel2.StCommitCancelWcbFull LsDcAccesses
        LsMabAlloc LsMabAlloc.DcPrefetcher LsMabAlloc.Stores LsMabAlloc.Loads
        LsRefillsFromSys LsRefillsFromSys.LS_MABRESP_RMT_DRAM
        LsRefillsFromSys.LS_MABRESP_RMT_CACHE LsRefillsFromSys.LS_MABRESP_LCL_DRAM
        LsRefillsFromSys.LS_MABRESP_LCL_CACHE LsRefillsFromSys.MABRESP_LCL_L2
        LsL1DTlbMiss LsL1DTlbMiss.TlbReload1GL2Miss LsL1DTlbMiss.TlbReload2ML2Miss
        LsL1DTlbMiss.TlbReloadCoalescedPageMiss LsL1DTlbMiss.TlbReload4KL2Miss
        LsL1DTlbMiss.TlbReload1GL2Hit LsL1DTlbMiss.TlbReload2ML2Hit
        LsL1DTlbMiss.TlbReloadCoalescedPageHit LsL1DTlbMiss.TlbReload4KL2Hit
        LsMisalAccesses LsPrefInstrDisp LsPrefInstrDisp.PrefetchNTA
        LsPrefInstrDisp.PrefetchW LsPrefInstrDisp.Prefetch
        LsInefSwPref LsInefSwPref.MabMchCnt LsInefSwPref.DataPipeSwPfDcHit
        LsSwPfDcFills LsSwPfDcFills.LS_MABRESP_RMT_DRAM
        LsSwPfDcFills.LS_MABRESP_RMT_CACHE LsSwPfDcFills.LS_MABRESP_LCL_DRAM
        LsSwPfDcFills.LS_MABRESP_LCL_CACHE LsSwPfDcFills.MABRESP_LCL_L2
        LsHwPfDcFills LsHwPfDcFills.LS_MABRESP_RMT_DRAM
        LsHwPfDcFills.LS_MABRESP_RMT_CACHE LsHwPfDcFills.LS_MABRESP_LCL_DRAM
        LsHwPfDcFills.LS_MABRESP_LCL_CACHE LsHwPfDcFills.MABRESP_LCL_L2
        LsNotHaltedCyc LsTlbFlush IcCacheFillL2 IcCacheFillSys
        BpL1TlbMissL2TlbHit BpL1TlbMissL2TlbMiss BpL1TlbMissL2TlbMiss.IF1G
        BpL1TlbMissL2TlbMiss.IF2M BpL1TlbMissL2TlbMiss.IF4K
        BpL1BTBCorrect BpL2BTBCorrect BpDynIndPred BpDeReDirect
        BpL1TlbFetchHit BpL1TlbFetchHit.IF1G BpL1TlbFetchHit.IF2M
        BpL1TlbFetchHit.IF4K DeDisUopQueueEmptyDi0
        DeDisUopsFromDecoder DeDisUopsFromDecoder.OpCacheDispatched
        DeDisUopsFromDecoder.DecoderDispatched
        DeDisDispatchTokenStalls1 DeDisDispatchTokenStalls1.FPMiscRsrcStall
        DeDisDispatchTokenStalls1.FPSchRsrcStall
        DeDisDispatchTokenStalls1.FpRegFileRsrcStall
        DeDisDispatchTokenStalls1.TakenBrnchBufferRsrc
        DeDisDispatchTokenStalls1.IntSchedulerMiscRsrcStall
        DeDisDispatchTokenStalls1.StoreQueueRsrcStall
        DeDisDispatchTokenStalls1.LoadQueueRsrcStall
        DeDisDispatchTokenStalls1.IntPhyRegFileRsrcStall
        DeDisDispatchTokenStalls0 DeDisDispatchTokenStalls0.ScAguDispatchStall
        DeDisDispatchTokenStalls0.RetireTokenStall
        DeDisDispatchTokenStalls0.AGSQTokenStall
        DeDisDispatchTokenStalls0.ALUTokenStall
        DeDisDispatchTokenStalls0.ALSQ3_0_TokenStall
        DeDisDispatchTokenStalls0.ALSQ2RsrcStall
        DeDisDispatchTokenStalls0.ALSQ1RsrcStall
        ExRetInstr ExRetCops ExRetBrn ExRetBrnMisp ExRetBrnTkn ExRetBrnTknMisp
        ExRetBrnFar ExRetNearRet ExRetNearRetMispred ExRetBrnIndMisp
        ExRetMmxFpInstr ExRetMmxFpInstr.SseInstr
        ExRetMmxFpInstr.MmxInstr ExRetMmxFpInstr.X87Instr
        ExRetCond ExDivBusy ExDivCount
        ExTaggedIbsOps ExTaggedIbsOps.IbsCountRollover
        ExTaggedIbsOps.IbsTaggedOpsRet ExTaggedIbsOps.IbsTaggedOps
        ExRetFusBrnchInst
        L2RequestG1 L2RequestG1.RdBlkL L2RequestG1.RdBlkX L2RequestG1.LsRdBlkC_S
        L2RequestG1.CacheableIcRead L2RequestG1.ChangeToX
        L2RequestG1.PrefetchL2Cmd L2RequestG1.L2HwPf L2RequestG1.Group2
        L2RequestG2 L2RequestG2.Group1 L2RequestG2.LsRdSized
        L2RequestG2.LsRdSizedNC L2RequestG2.IcRdSized L2RequestG2.IcRdSizedNC
        L2RequestG2.SmcInval L2RequestG2.BusLocksOriginator
        L2RequestG2.BusLocksResponses
        L2CacheReqStat L2CacheReqStat.LsRdBlkCS L2CacheReqStat.LsRdBlkLHitX
        L2CacheReqStat.LsRdBlkLHitS L2CacheReqStat.LsRdBlkX
        L2CacheReqStat.LsRdBlkC L2CacheReqStat.IcFillHitX
        L2CacheReqStat.IcFillHitS L2CacheReqStat.IcFillMiss
        L2PfHitL2 L2PfMissL2HitL2 L2PfMissL2L3

    attributes: edge pc inv cmask umask nouser sys

    See "Preliminary Processor Programming Reference (PPR) for AMD Family 17h Model 31h, Revision B0 Processors" (AMD publication 55803), "Processor Programming Reference (PPR) for AMD Family 17h Model 71h, Revision B0 Processors" (AMD publication 56176), and amd_f17h_zen2_events(3CPC)
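The platform-specific listing above is easier to review when grouped by base event. The following is a minimal sketch, not part of the change itself: it assumes that a name of the form Base.Suffix denotes a sub-event of Base, which is inferred from the naming pattern in the output rather than from amd_f17h_zen2_events(3CPC). The helper name group_events and the short sample string are hypothetical; in practice the full event list from cpustat -h would be pasted in.

```python
def group_events(names):
    """Group event names by the part before the first '.'.

    Assumption: 'Base.Suffix' is a sub-event of 'Base'. This is read off
    the naming pattern in the cpustat -h listing, not from documentation.
    """
    groups = {}
    for name in names:
        base, _, sub = name.partition(".")
        groups.setdefault(base, [])  # keep first-seen order of base events
        if sub:
            groups[base].append(sub)
    return groups


# A small excerpt of the listing above, used only as sample input.
listing = (
    "LsMabAlloc LsMabAlloc.DcPrefetcher LsMabAlloc.Stores LsMabAlloc.Loads "
    "LsNotHaltedCyc LsTlbFlush ExRetInstr"
)

groups = group_events(listing.split())
for base, subs in groups.items():
    print(base, subs)
```

Grouping this way makes it simpler to compare the listing against the event tables in the two PPRs, one base event at a time.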
This matches what we expect. I then tested one of the PAPI counters: total cycles (PAPI_tot_cyc). To do this, I bound a shell to CPU 0 and started cpustat as shown below. A few seconds in, I launched a bash while :; do :; done loop and watched the per-second cycle count rise to the turbo boost limit of the processor (3.2 GHz):
rm@beowulf:~$ pfexec cpustat -c PAPI_tot_cyc 1 30 | grep ' 0 '
 1.000 0 tick 10413
 2.000 0 tick 73507
 3.000 0 tick 36979
 4.000 0 tick 140720
 5.000 0 tick 125885
 6.000 0 tick 106126
 7.000 0 tick 125494
 8.000 0 tick 140529
 9.000 0 tick 1636613247
10.000 0 tick 3177561332
11.000 0 tick 3177848817
12.000 0 tick 3178011789
13.000 0 tick 3178239888
14.000 0 tick 3178218914
15.000 0 tick 3178334000
16.000 0 tick 3177945700
17.000 0 tick 3178337902
18.000 0 tick 3178005176
19.000 0 tick 3178241498
20.000 0 tick 3178326925
21.000 0 tick 3178410671
22.000 0 tick 1479389657
23.000 0 tick 16441
24.000 0 tick 13102
25.000 0 tick 17766
26.000 0 tick 17350
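To make the comparison with the 3.2 GHz figure explicit, the per-interval cycle counts can be converted to an effective clock rate. The sketch below is illustrative only: the function name ticks_to_ghz is hypothetical, and it assumes the four-column layout seen in the transcript (time, CPU, the literal word "tick", count) and the 1-second sampling interval given on the cpustat command line. The sample lines are copied from the output above.

```python
def ticks_to_ghz(sample_lines, cpu=0, interval_s=1.0):
    """Return (time, GHz) pairs for one CPU from cpustat-style lines.

    Assumes each data line is: <time> <cpu> tick <count>, as in the
    transcript, and that <count> covers one interval of interval_s seconds.
    """
    rates = []
    for line in sample_lines:
        fields = line.split()
        # Skip headers and anything that is not a four-field "tick" line.
        if len(fields) != 4 or fields[2] != "tick":
            continue
        if int(fields[1]) != cpu:
            continue
        cycles = int(fields[3])
        rates.append((float(fields[0]), cycles / interval_s / 1e9))
    return rates


# A few samples copied from the transcript: idle, ramp-up, busy, idle again.
samples = """\
8.000 0 tick 140529
9.000 0 tick 1636613247
10.000 0 tick 3177561332
21.000 0 tick 3178410671
22.000 0 tick 1479389657
23.000 0 tick 16441
""".splitlines()

rates = ticks_to_ghz(samples)
for t, ghz in rates:
    print(f"{t:6.3f}s  {ghz:.3f} GHz")
```

The busy-loop samples work out to just under 3.18 GHz, which is consistent with the stated 3.2 GHz turbo limit; the partial values at 9 s and 22 s are the intervals in which the loop started and stopped.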
Updated by Electric Monk about 2 years ago
- Status changed from New to Closed
git commit 31aa620247ae407b2bee2dccd71693d1938f54d6
commit 31aa620247ae407b2bee2dccd71693d1938f54d6
Author: Robert Mustacchi <rm@fingolfin.org>
Date: 2020-04-08T00:58:56.000Z

    12452 Want support for AMD Zen 2 CPC Events
    Reviewed by: Patrick Mooney <pmooney@pfmooney.com>
    Approved by: Gordon Ross <gordon.w.ross@gmail.com>