Feature #12452

Want support for AMD Zen 2 CPC Events

Added by Robert Mustacchi 8 months ago. Updated 8 months ago.

Status: Closed
Priority: Normal
Category: kernel
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

We should update the AMD PMC data files to add support for Zen 2, based on the Rome and Matisse processors. As with the Zen 1 change in #10896, I constructed the data files myself from the publicly available references.


Related issues

Related to illumos gate - Feature #10896: Want support for AMD Zen CPC events (Closed, Robert Mustacchi)

#1

Updated by Robert Mustacchi 8 months ago

  • Related to Feature #10896: Want support for AMD Zen CPC events added
#2

Updated by Robert Mustacchi 8 months ago

I tested this in a few different ways, described below. First, on a system without the change, I ran cpustat -h and verified that the system did not report any CPU performance counters. Then, on a system with the change, cpustat -h reports:

rm@elbereth ~ $ ssh rm@beowulf
The Illumos Project     SunOS 5.11      /ws/rm/zen2-pmc Apr. 02, 2020
SunOS Internal Development: rm 2020-Apr-02 [zen2-pmc]
You have new mail.
rm@beowulf:~$ cpustat -h
cpustat: setrlimit failed - Not owner
Usage:
        cpustat -c spec [-c spec]... [-p period] [-T u|d]
                [-sntD] [interval [count]]

        -c spec   specify processor events to be monitored
        -n        suppress titles
        -p period cycle through event list periodically
        -s        run user soaker thread for system-only events
        -t        include tsc register
        -T d|u    Display a timestamp in date (d) or unix time_t (u)
        -D        enable debug mode
        -h        print extended usage information

        Use cputrack(1) to monitor per-process statistics.

        CPU performance counter interface: AMD Family 17h

        event specification syntax:
        [picn=]<eventn>[,attr[n][=<val>]][,[picn=]<eventn>[,attr[n][=<val>]],...]

        Generic Events:

        event[0-5]: PAPI_br_cn PAPI_br_ins PAPI_tot_cyc PAPI_tot_ins 
                 PAPI_tlb_dm PAPI_tlb_im PAPI_tot_cyc 

        See generic_events(3CPC) for descriptions of these events

        Platform Specific Events:

        event[0-5]: FpRetSseAvxOps FpRetSseAvxOps.MacFLOPs 
                 FpRetSseAvxOps.DivFLOPs FpRetSseAvxOps.MultFLOPs 
                 FpRetSseAvxOps.AddSubFLOPs FpRetiredSerOps 
                 FpRetiredSerOps.SseBotRet FpRetiredSerOps.SseCtrlRet 
                 FpRetiredSerOps.X87BotRet FpRetiredSerOps.X87CtrlRet 
                 FpDispFaults FpDispFaults.YmmSpillFault 
                 FpDispFaults.YmmFillFault FpDispFaults.XmmFillFault 
                 FpDispFaults.x87FillFault LsBadStatus2 
                 LsBadStatus2.StliOther LsLocks LsRetClClush LsRetCpuid 
                 LsDispatch LsSmiRx LsIntTaken LsRdTsc LsSTLF 
                 LsStCommitCancel2 LsStCommitCancel2.StCommitCancelWcbFull 
                 LsDcAccesses LsMabAlloc LsMabAlloc.DcPrefetcher 
                 LsMabAlloc.Stores LsMabAlloc.Loads LsRefillsFromSys 
                 LsRefillsFromSys.LS_MABRESP_RMT_DRAM 
                 LsRefillsFromSys.LS_MABRESP_RMT_CACHE 
                 LsRefillsFromSys.LS_MABRESP_LCL_DRAM 
                 LsRefillsFromSys.LS_MABRESP_LCL_CACHE 
                 LsRefillsFromSys.MABRESP_LCL_L2 LsL1DTlbMiss 
                 LsL1DTlbMiss.TlbReload1GL2Miss 
                 LsL1DTlbMiss.TlbReload2ML2Miss 
                 LsL1DTlbMiss.TlbReloadCoalescedPageMiss 
                 LsL1DTlbMiss.TlbReload4KL2Miss 
                 LsL1DTlbMiss.TlbReload1GL2Hit 
                 LsL1DTlbMiss.TlbReload2ML2Hit 
                 LsL1DTlbMiss.TlbReloadCoalescedPageHit 
                 LsL1DTlbMiss.TlbReload4KL2Hit LsMisalAccesses 
                 LsPrefInstrDisp LsPrefInstrDisp.PrefetchNTA 
                 LsPrefInstrDisp.PrefetchW LsPrefInstrDisp.Prefetch 
                 LsInefSwPref LsInefSwPref.MabMchCnt 
                 LsInefSwPref.DataPipeSwPfDcHit LsSwPfDcFills 
                 LsSwPfDcFills.LS_MABRESP_RMT_DRAM 
                 LsSwPfDcFills.LS_MABRESP_RMT_CACHE 
                 LsSwPfDcFills.LS_MABRESP_LCL_DRAM 
                 LsSwPfDcFills.LS_MABRESP_LCL_CACHE 
                 LsSwPfDcFills.MABRESP_LCL_L2 LsHwPfDcFills 
                 LsHwPfDcFills.LS_MABRESP_RMT_DRAM 
                 LsHwPfDcFills.LS_MABRESP_RMT_CACHE 
                 LsHwPfDcFills.LS_MABRESP_LCL_DRAM 
                 LsHwPfDcFills.LS_MABRESP_LCL_CACHE 
                 LsHwPfDcFills.MABRESP_LCL_L2 LsNotHaltedCyc LsTlbFlush 
                 IcCacheFillL2 IcCacheFillSys BpL1TlbMissL2TlbHit 
                 BpL1TlbMissL2TlbMiss BpL1TlbMissL2TlbMiss.IF1G 
                 BpL1TlbMissL2TlbMiss.IF2M BpL1TlbMissL2TlbMiss.IF4K 
                 BpL1BTBCorrect BpL2BTBCorrect BpDynIndPred BpDeReDirect 
                 BpL1TlbFetchHit BpL1TlbFetchHit.IF1G BpL1TlbFetchHit.IF2M 
                 BpL1TlbFetchHit.IF4K DeDisUopQueueEmptyDi0 
                 DeDisUopsFromDecoder 
                 DeDisUopsFromDecoder.OpCacheDispatched 
                 DeDisUopsFromDecoder.DecoderDispatched 
                 DeDisDispatchTokenStalls1 
                 DeDisDispatchTokenStalls1.FPMiscRsrcStall 
                 DeDisDispatchTokenStalls1.FPSchRsrcStall 
                 DeDisDispatchTokenStalls1.FpRegFileRsrcStall 
                 DeDisDispatchTokenStalls1.TakenBrnchBufferRsrc 
                 DeDisDispatchTokenStalls1.IntSchedulerMiscRsrcStall 
                 DeDisDispatchTokenStalls1.StoreQueueRsrcStall 
                 DeDisDispatchTokenStalls1.LoadQueueRsrcStall 
                 DeDisDispatchTokenStalls1.IntPhyRegFileRsrcStall 
                 DeDisDispatchTokenStalls0 
                 DeDisDispatchTokenStalls0.ScAguDispatchStall 
                 DeDisDispatchTokenStalls0.RetireTokenStall 
                 DeDisDispatchTokenStalls0.AGSQTokenStall 
                 DeDisDispatchTokenStalls0.ALUTokenStall 
                 DeDisDispatchTokenStalls0.ALSQ3_0_TokenStall 
                 DeDisDispatchTokenStalls0.ALSQ2RsrcStall 
                 DeDisDispatchTokenStalls0.ALSQ1RsrcStall ExRetInstr 
                 ExRetCops ExRetBrn ExRetBrnMisp ExRetBrnTkn 
                 ExRetBrnTknMisp ExRetBrnFar ExRetNearRet 
                 ExRetNearRetMispred ExRetBrnIndMisp ExRetMmxFpInstr 
                 ExRetMmxFpInstr.SseInstr ExRetMmxFpInstr.MmxInstr 
                 ExRetMmxFpInstr.X87Instr ExRetCond ExDivBusy ExDivCount 
                 ExTaggedIbsOps ExTaggedIbsOps.IbsCountRollover 
                 ExTaggedIbsOps.IbsTaggedOpsRet ExTaggedIbsOps.IbsTaggedOps 
                 ExRetFusBrnchInst L2RequestG1 L2RequestG1.RdBlkL 
                 L2RequestG1.RdBlkX L2RequestG1.LsRdBlkC_S 
                 L2RequestG1.CacheableIcRead L2RequestG1.ChangeToX 
                 L2RequestG1.PrefetchL2Cmd L2RequestG1.L2HwPf 
                 L2RequestG1.Group2 L2RequestG2 L2RequestG2.Group1 
                 L2RequestG2.LsRdSized L2RequestG2.LsRdSizedNC 
                 L2RequestG2.IcRdSized L2RequestG2.IcRdSizedNC 
                 L2RequestG2.SmcInval L2RequestG2.BusLocksOriginator 
                 L2RequestG2.BusLocksResponses L2CacheReqStat 
                 L2CacheReqStat.LsRdBlkCS L2CacheReqStat.LsRdBlkLHitX 
                 L2CacheReqStat.LsRdBlkLHitS L2CacheReqStat.LsRdBlkX 
                 L2CacheReqStat.LsRdBlkC L2CacheReqStat.IcFillHitX 
                 L2CacheReqStat.IcFillHitS L2CacheReqStat.IcFillMiss 
                 L2PfHitL2 L2PfMissL2HitL2 L2PfMissL2L3 

        attributes: edge pc inv cmask umask nouser sys 

        See "Preliminary Processor Programming Reference (PPR) for AMD 
        Family 17h Model 31h, Revision B0 Processors" (AMD publication 
        55803), "Processor Programming Reference (PPR) for AMD Family 17h 
        Model 71h, Revision B0 Processors" (AMD publication 56176), and 
        amd_f17h_zen2_events(3CPC) 

This matches what we expect. I then tested one of the PAPI counters: total cycles (PAPI_tot_cyc). To do this, I bound a shell to CPU 0 and started cpustat as shown below. A few seconds in, I launched a bash while :; do :; done loop and watched the cycle count rise to the turbo boost limit of the processor (3.2 GHz):

rm@beowulf:~$ pfexec cpustat -c PAPI_tot_cyc 1 30 | grep ' 0 '
  1.000   0  tick     10413 
  2.000   0  tick     73507 
  3.000   0  tick     36979 
  4.000   0  tick    140720 
  5.000   0  tick    125885 
  6.000   0  tick    106126 
  7.000   0  tick    125494 
  8.000   0  tick    140529 
  9.000   0  tick 1636613247 
 10.000   0  tick 3177561332 
 11.000   0  tick 3177848817 
 12.000   0  tick 3178011789 
 13.000   0  tick 3178239888 
 14.000   0  tick 3178218914 
 15.000   0  tick 3178334000 
 16.000   0  tick 3177945700 
 17.000   0  tick 3178337902 
 18.000   0  tick 3178005176 
 19.000   0  tick 3178241498 
 20.000   0  tick 3178326925 
 21.000   0  tick 3178410671 
 22.000   0  tick 1479389657 
 23.000   0  tick     16441 
 24.000   0  tick     13102 
 25.000   0  tick     17766 
 26.000   0  tick     17350 
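
To make the comparison with the 3.2 GHz turbo limit explicit, here is a minimal sketch that converts the per-interval PAPI_tot_cyc counts into an effective clock rate. It assumes the four-column row layout shown above (time, cpu, event, count) and the 1-second interval passed to cpustat. The helper names parse_cpustat and effective_ghz are hypothetical and are not part of cpustat or the change itself; the sample rows are copied from the output above.

```python
# Sketch only: turns cpustat "tick" rows into an effective clock rate.
# Assumes rows look like "  10.000   0  tick 3177561332" as in the run above.

# A few rows copied from the cpustat output in this comment.
SAMPLE = """\
  8.000   0  tick    140529
  9.000   0  tick 1636613247
 10.000   0  tick 3177561332
 21.000   0  tick 3178410671
 22.000   0  tick 1479389657
"""


def parse_cpustat(text):
    """Yield (time_s, cpu, count) for each 'tick' row; skip anything else."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[2] == "tick":
            yield float(fields[0]), int(fields[1]), int(fields[3])


def effective_ghz(count, interval_s=1.0):
    """Cycles counted in one interval, expressed in GHz.

    interval_s defaults to 1.0 because the run above used a 1-second interval.
    """
    return count / interval_s / 1e9


rows = list(parse_cpustat(SAMPLE))
for time_s, cpu, count in rows:
    print(f"{time_s:7.3f}s cpu{cpu}: {effective_ghz(count):.3f} GHz")
```

The fully loaded intervals come out just under 3.2 GHz, while the partial values at 9 and 22 seconds correspond to the loop starting and stopping partway through an interval.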
#3

Updated by Electric Monk 8 months ago

  • Status changed from New to Closed

commit  31aa620247ae407b2bee2dccd71693d1938f54d6
Author: Robert Mustacchi <rm@fingolfin.org>
Date:   2020-04-08T00:58:56.000Z

    12452 Want support for AMD Zen 2 CPC Events
    Reviewed by: Patrick Mooney <pmooney@pfmooney.com>
    Approved by: Gordon Ross <gordon.w.ross@gmail.com>
