Differences

This shows you the differences between two versions of the page.

doku:papi_ir [2016/06/23 13:23] – [Interfacing with PAPI : Low level interface] ir
doku:papi_ir [2016/07/06 12:10] – [Practical tips:] ir
Line 27: Line 27:
  
 ==== Interfacing with PAPI : Low level interface ====
-In general, some code section to be analyzed with PAPI needs to be wrapped into a sequence of standard PAPI calls.
-Here code examples for
-  * [[doku:papi_ll_c|C users]] and
-  * [[doku:papi_ll_Fortran|Fortran users]] can be found.
+In general, some code section to be analyzed with PAPI needs to be wrapped into a sequence of standard PAPI calls, e.g., like in the following examples for
+  * [[doku:papi_ll_c|C]] or
+  * [[doku:papi_ll_Fortran|Fortran]].
 As stated in the comment of the C code above, it is best to analyze one particular event at a time. This advice is given because the CPU has limitations in combining arbitrary counters at a time.
 ==== Interfacing with PAPI : High level interface ====
-The high-level API combines the counters for a specified list of PAPI preset events, only. The set of implemented high level functions is quite limited and can be found in the section ''The High Level API'' next to the last paragraph of ''papi.h''. It can also be used in conjunction with the low-level API. An example of the usage of the high level API can be found here [[doku:papi_hl_c|for users]]
+The high level API combines the counters for a specified list of PAPI preset events, only. The set of implemented high level functions is quite limited and can be found in the section ''The High Level API'' next to the last paragraph of ''papi.h''.
+The high level API can also be used in conjunction with the low level API.
+
+Example code:
+  * [[doku:papi_hl_c|C]]
+  * [[doku:papi_hl_fortran|Fortran]]
 ==== Practical tips: ====
   * A quick overview of supported events and corresponding PAPI variables for a particular type of CPU is obtained from executing command ''papi_avail''
Line 40: Line 45:
   * Useful notes on Intel's CPI metric: [[https://software.intel.com/en-us/node/544403]]
   * Occasionally, it is useful to PAPI-analyze an application within two steps: in the first step, selected characteristic events of the outermost code region are collected. In the second step, a set of subroutines/functions consuming major fractions of the execution time are analyzed with respect to the same events. In other words, we first obtain overall counts for ''main()'' of some application, e.g. ''PAPI_TOT_CYC'', ''PAPI_FP_OPS'', ''PAPI_L1_DCM'' and ''PAPI_L2_DCM''. Subsequently, the analogous set of event counters is measured for suspicious subroutines or functions. Also relative fractions of these event counters could be useful. Based on these measures the subroutines' or functions' performance characteristics can be compared to that of the initial evaluation of the complete code. In this way, those parts matching the overall performance (w.r.t. cache misses, flops etc.) best can be identified.
 +  * 8-)
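
The "relative fractions" mentioned in the two-step tip can be computed once both sets of counter values are at hand. The following is only a minimal sketch in plain C and does not call PAPI itself: the struct and function names (''counts_t'', ''shares_t'', ''ratio'', ''region_shares'', ''print_shares'') are hypothetical helpers introduced here for illustration, and the counter values are assumed to have been collected beforehand, one event at a time, for ''main()'' and for the routine of interest.

```c
#include <assert.h>
#include <stdio.h>

/* Counter values of one measured region. The fields mirror the PAPI
   preset events named in the text; the struct itself is a hypothetical
   helper, not part of PAPI. */
typedef struct {
    long long tot_cyc;  /* PAPI_TOT_CYC */
    long long fp_ops;   /* PAPI_FP_OPS  */
    long long l1_dcm;   /* PAPI_L1_DCM  */
    long long l2_dcm;   /* PAPI_L2_DCM  */
} counts_t;

/* Share of each overall count that falls into one region (0.0 .. 1.0). */
typedef struct {
    double cyc;
    double fp;
    double l1;
    double l2;
} shares_t;

/* part / total as a double; returns 0.0 when total is 0 to avoid a
   division by zero for events that never occurred. */
double ratio(long long part, long long total)
{
    assert(part >= 0 && total >= 0);
    return total > 0 ? (double)part / (double)total : 0.0;
}

/* Relative fractions of a region's counters with respect to the
   counters obtained for the complete code (e.g. main()). */
shares_t region_shares(const counts_t *region, const counts_t *whole)
{
    shares_t s;
    s.cyc = ratio(region->tot_cyc, whole->tot_cyc);
    s.fp  = ratio(region->fp_ops,  whole->fp_ops);
    s.l1  = ratio(region->l1_dcm,  whole->l1_dcm);
    s.l2  = ratio(region->l2_dcm,  whole->l2_dcm);
    return s;
}

/* Print the shares of one region as percentages on a single line. */
void print_shares(const char *name, const shares_t *s)
{
    printf("%-12s cycles %5.1f%%  flops %5.1f%%  L1 misses %5.1f%%  L2 misses %5.1f%%\n",
           name, 100.0 * s->cyc, 100.0 * s->fp, 100.0 * s->l1, 100.0 * s->l2);
}
```

A caller would fill one ''counts_t'' with the values measured around ''main()'' and one per suspicious routine, then pass both to ''region_shares''. A routine whose share of cache misses is much larger than its share of cycles stands out from the overall profile, whereas roughly equal shares suggest that it behaves like the code as a whole.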
            
  
  
  
  • doku/papi_ir.txt
  • Last modified: 2016/06/23 13:23
  • by ir