PerfMonSvc
is a service providing access to the custom build tools for Athena, it uses various Gaudi/Athena hooks to collect resource usage information and provides information on various levels such as;
The PerfMonMTSvc
is an MT-save version of the PerfMonSvc
allowing to collection of information for AthenaMT. It was started as part of the Google Summer of Code 2019 and was successfully integrated into Athena.
After migration to the CA
, the PerfMonSvc
has been retired.
To make daily monitoring easy to read the so-called PerfMonMT
domains are introduced.
During job configuration, the user can assign (a group of) algorithms to specific domains that are then used in the SPOT
daily test when making resource usage plots. Since the SPOT
components plots use only tenish
maximum expensive algorithms per plot, the increase of resource usage in the specific domain helps shifters to identify further where the problem appears.
PerfMonMT
For the transform jobs, the perfmon
options are already integrated and can be easily switched on with the --perfmonmt= (options: 'none', 'fastmonmt' [default], 'fullmonmt')
option, for example:
Reco_tf.py --AMI q445 --perfmonmt fastmont
.
For the user-defined job option you can turn on the PerfMonMT
by adding the following code to your CA configuration:
# Set the necessary flags
flags.PerfMon.doFastMonMT = True
flags.PerfMon.OutputJSON = 'perfmonmt_test.json'
# Merge the relevant service configuration
from PerfMonComps.PerfMonCompsConfig import PerfMonMTSvcCfg
acc.merge(PerfMonMTSvcCfg(flags))
The example above turned on the default fast monitoring options turning on the printout of the following non-detailed information:
The example of the log is below:
PerfMonMTSvc INFO =======================================================================================
PerfMonMTSvc INFO Event Level Monitoring
PerfMonMTSvc INFO (Only the first 10 and the last measurements are explicitly printed)
PerfMonMTSvc INFO =======================================================================================
PerfMonMTSvc INFO Event CPU [s] Wall [s] Vmem [kB] Rss [kB] Pss [kB] Swap [kB]
PerfMonMTSvc INFO ---------------------------------------------------------------------------------------
PerfMonMTSvc INFO 1 274.73 283.32 5852376 4736336 4733328 0
PerfMonMTSvc INFO 2 423.40 350.92 8640184 7461984 7458990 0
PerfMonMTSvc INFO 25 655.36 380.95 10915780 9767124 9764130 0
PerfMonMTSvc INFO 45 903.57 411.86 11189092 10069572 10066578 0
PerfMonMTSvc INFO 68 1160.43 443.85 11237524 10109504 10106510 0
PerfMonMTSvc INFO 92 1401.89 473.95 11243396 10121236 10118229 0
PerfMonMTSvc INFO 111 1651.85 506.07 11257936 10158256 10155259 0
PerfMonMTSvc INFO 133 1915.59 539.51 11275056 10175780 10172782 0
PerfMonMTSvc INFO 153 2157.11 569.74 11431616 10327076 10324078 0
PerfMonMTSvc INFO 177 2423.99 602.96 11447456 10333556 10330558 0
...
PerfMonMTSvc INFO 982 12893.39 1927.34 12208048 11083644 11080707 0
INFO =======================================================================================
PerfMonMTSvc INFO Snapshots Summary
PerfMonMTSvc INFO =======================================================================================
PerfMonMTSvc INFO Step dCPU [s] dWall [s] <CPU> dVmem [kB] dRss [kB] dPss [kB] dSwap [kB]
PerfMonMTSvc INFO ---------------------------------------------------------------------------------------
PerfMonMTSvc INFO Configure 119.33 123.739 0.96 1910384 1418476 1415640 0
PerfMonMTSvc INFO Initialize 134.65 136.647 0.99 2485500 1916532 1916368 0
PerfMonMTSvc INFO FirstEvent 163.66 83.894 1.95 3907400 3751616 3751632 0
PerfMonMTSvc INFO Execute 12775.7 1620.08 7.89 3567864 3648092 3648147 0
PerfMonMTSvc INFO Finalize 92.69 91.05 1.02 259976 -4910788 -4910442 0
PerfMonMTSvc INFO ***************************************************************************************
PerfMonMTSvc INFO Number of events processed: 1000
PerfMonMTSvc INFO CPU usage per event [ms]: 12939
PerfMonMTSvc INFO Events per second: 0.587
PerfMonMTSvc INFO CPU utilization efficiency [%]: 99
INFO***************************************************************************************
PerfMonMTSvc INFO Max Vmem: 11.64 GB
PerfMonMTSvc INFO Max Rss: 10.59 GB
PerfMonMTSvc INFO Max Pss: 10.59 GB
PerfMonMTSvc INFO Max Swap: 0.00 KB
INFO ***************************************************************************************
PerfMonMTSvc INFO Leak estimate per event Vmem: 1.08 MB
PerfMonMTSvc INFO Leak estimate per event Pss: 1.05 MB
PerfMonMTSvc INFO >> Estimated using the last 37 measurements from the Event Level Monitoring
PerfMonMTSvc INFO >> Events prior to the first 300 are omitted
...
INFO =======================================================================================
PerfMonMTSvc INFO System Information
INFO =======================================================================================
PerfMonMTSvc INFO CPU Model: AMD EPYC 7302 16-Core Processor 512 KB
PerfMonMTSvc INFO Number of Available Cores: 32
PerfMonMTSvc INFO Total Memory: 251.67 GB
INFO =======================================================================================
PerfMonMTSvc INFO Environment Information
INFO =======================================================================================
PerfMonMTSvc INFO Malloc Library: libtcmalloc_minimal.so
PerfMonMTSvc INFO Math Library: libimf.so
The more detailed information is available in the Fullmont
. In addition to the fastmont
information it contains the component-level metrics:
Initialize
, FirstEvent
, Execute
, Finalize
, Callbacks
, preLoadProxy
PerfMonMTSvc INFO PerfMonMTSvc Report
PerfMonMTSvc INFO
PerfMonMTSvc INFO Component Level Monitoring
PerfMonMTSvc INFO =======================================================================================
PerfMonMTSvc INFO Step Count CPU Time [ms] Vmem [kB] Malloc [kB] Component
PerfMonMTSvc INFO ---------------------------------------------------------------------------------------
PerfMonMTSvc INFO Initialize 1 44376.84 0 153893 AthMonSeq_TrigJetMonitorAlgorithm
PerfMonMTSvc INFO Initialize 1 9740.13 39200 86544 AthMonSeq_TrigEgammaAthMonitorCfg
PerfMonMTSvc INFO Initialize 1 8956.97 17408 17920 AthMonSeq_JetMonitoring
...
INFO=======================================================================================
PerfMonMTSvc INFO FirstEvent 1 37515.43 0 0 CondInputLoader
PfMonMTSvc INFO FirstEvent 1 16442.71 0 0 NswCalibDbAlg
PerfMonMTSvc INFO FirstEvent 1 8017.96 0 0 TrigDeserialiser
PerfMonMTSvc INFO FirstEvent 1 4463.59 0 0 MuonDetectorCondAlg
INFO =======================================================================================
PerfMonMTSvc INFO Execute 999 1136392.37 0 0 InDetSiSpTrackFinder
PerfMonMTSvc INFO Execute 999 1034468.73 0 0 InDetSiSpTrackFinderR3LargeD0
PerfMonMTSvc INFO Execute 999 958990.57 0 0 InDetAmbiguitySolver
...
INFO=======================================================================================
PerfMonMTSvc INFO Finalize 1 53148.94 0 -125681 ToolSvc
PerfMonMTSvc INFO Finalize 1 23012.33 0 -3176 HLTDecodingSeq
PerfMonMTSvc INFO Finalize 1 23012.14 0 -3176 TrigDeserialiser
...
INFO=======================================================================================
PerfMonMTSvc INFO preLoadProxy 1 15081.71 0 0 loadCachesOverhead:COOLONL_TRIGGER/CONDBR2
PerfMonMTSvc INFO preLoadProxy 1 12527.71 0 0 DetCondKeyTrans[/LAR/Align]
PerfMonMTSvc INFO preLoadProxy 1 12527.69 0 0 UpdateAddr::/LAR/Align
...
INFO=======================================================================================
PerfMonMTSvc INFO Callback 1 57.50 0 0 TRT_DetectorTool[0x3a859da8]+e9
PerfMonMTSvc INFO Callback 1 42.41 0 0 SCT_DetectorTool[0x40abd5b0]+e9
PerfMonMTSvc INFO Callback 1 36.51 0 0 PixelDetectorTool[0x40abe670]+e9
...
Note that component measurements are collected without locks (no synchronization) and that metrics are aggregated across all calls. That means that vmem/malloc
metrics in the event loop are only available in serial/--threads=1
mode.
perfmon
outputAs default the information is collected to the perfmonmt_jobname.json.tar.gz
file which can be analyzed with scripts provided by PerfMonAna:
performmt-plotter.py
: Creates event/component level plotsperfmonmt-printer.py
: Prints untrimmed/ordered usage statisticsperfmonmt-refit.py
: Re-fits memory usage in different event slicesHere is an example of snapshot metrics from the performmt-plotter.py
:
The new algorithm can be added to the PerfMon
domain in following example:
# Muon
acc.flagPerfmonDomain('Muon')
if flags.Detector.EnableMuon:
from MuonConfig.MuonReconstructionConfig import MuonReconstructionCfg
acc.merge(MuonReconstructionCfg(flags))
log.info("---------- Configured muon tracking")
# EGamma
acc.flagPerfmonDomain('EGamma')
if flags.Reco.EnableEgamma:
from egammaConfig.egammaSteeringConfig import EGammaSteeringCfg
acc.merge(EGammaSteeringCfg(flags))
log.info("---------- Configured e/gamma")
# Caching of CaloExtension for downstream
# Combined Performance algorithms.
acc.flagPerfmonDomain('CaloExtension')
if flags.Reco.EnableCaloExtension:
from TrackToCalo.CaloExtensionBuilderAlgCfg import (
CaloExtensionBuilderCfg)
acc.merge(CaloExtensionBuilderCfg(flags))
log.info("---------- Configured track calorimeter extension builder")
# Muon Combined
acc.flagPerfmonDomain('CombinedMuon')
if flags.Reco.EnableCombinedMuon:
from MuonCombinedConfig.MuonCombinedReconstructionConfig import (
MuonCombinedReconstructionCfg)
acc.merge(MuonCombinedReconstructionCfg(flags))
log.info("---------- Configured combined muon reconstruction")
It is listed in the log:
Py:ComponentAccumulator INFO :: This CA contains the following PerfMon domains ::
Py:ComponentAccumulator INFO :: There are a total of 626 registered algorithms in 22 domains ::