PAPI is an event-based profiling library that reads the hardware performance counters of the CPU and can thus provide useful information about critical events, e.g. cache misses, the number of floating point operations or the number of cycles.

The user has to modify the source code and insert PAPI calls (see below). Building and running the program is then as simple as

 
 module purge
 module load papi/5.4.3
 gcc my_program.c -lpapi    ( gfortran my_program.f -lpapi )
 ./a.out
 
 

In general, the code section to be analyzed with PAPI needs to be wrapped in a sequence of standard PAPI calls, e.g.

 #include <stdio.h>    // printf
 #include <stdlib.h>   // exit
 #include "papi.h"
 // PAPI variables
 // best is to analyze one particular event at a time
 int eventset;
 long long value, time0, time1, cyc0, cyc1;
     
 // PAPI Initialization
 eventset = PAPI_NULL;
 if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) {
    printf("PAPI init error !\n");
    exit(993);
 }
 // PAPI Event Set Creation
 if (PAPI_create_eventset(&eventset) != PAPI_OK) {
    printf("PAPI event set creation error !\n");
    exit(994);
 }
 // PAPI Specify a Particular Target Event to Analyze
 //   PAPI_TOT_CYC         Total cycles executed
 //   PAPI_FP_OPS          Floating point operations executed
 //   PAPI_L1_DCM          Level 1 data cache misses
 //   PAPI_L2_DCM          Level 2 data cache misses
 //   for other events see /opt/sw/x86_64/glibc-2.12/ivybridge-ep/papi/5.4.3/gnu-4.4.7/include/papiStdEventDefs.h
 //
 if (PAPI_add_event(eventset, PAPI_FP_OPS) != PAPI_OK) {
    printf("PAPI event set adding error !\n");
    exit(995);
 }
 // PAPI Time Estimators Initialization
 time0 = PAPI_get_real_usec();
 cyc0 = PAPI_get_real_cyc();
 // PAPI Counting Start
 if (PAPI_start(eventset) != PAPI_OK) {
    printf("PAPI start error !\n");
    exit(996);
 }
 
 //*** Here follows the original code section to be analyzed ***
 
 // PAPI Counting Stop
 if (PAPI_stop(eventset, &value) != PAPI_OK) {
    printf("PAPI stop error !\n");
    exit(997);
 }
 // PAPI Time Estimators Stop
 time1 = PAPI_get_real_usec();
 cyc1 = PAPI_get_real_cyc();
 // PAPI Results
 printf("PAPI event count %lld\n", value);
 printf("PAPI time passed in usec %lld\n", time1 - time0);
 printf("PAPI cycles passed %lld\n", cyc1 - cyc0);
 // PAPI Free Event Set
 if (PAPI_cleanup_eventset(eventset) != PAPI_OK) {
    printf("PAPI event set cleanup error !\n");
    exit(998);
 }
 if (PAPI_destroy_eventset(&eventset) != PAPI_OK) {
    printf("PAPI event set destruction error !\n");
    exit(999);
 }
 // PAPI Finalize
 PAPI_shutdown();
 
 
  • A quick overview of the events supported on a particular type of CPU, together with the corresponding PAPI event names, is obtained by running the command papi_avail.
  • Measuring PAPI_TOT_CYC as an event can give a result that differs significantly from the one obtained by calling PAPI_get_real_cyc(). PAPI_get_real_cyc() reports elapsed wall-clock cycles, whereas the PAPI_TOT_CYC event counts only the cycles within the counting domain of the event set (user mode by default). The difference is particularly large when PAPI is used on very small code sections that are called very often (e.g. hotspot functions/routines identified by time-based profiling). Although off in absolute terms, the event-based PAPI_TOT_CYC remains a valid reference time for relative comparisons.
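
The three numbers printed by the listing above (event count, elapsed microseconds, elapsed cycles) are often easier to interpret as rates. The following is a minimal sketch of such a conversion, not part of PAPI itself: the function names mega_events_per_sec, events_per_cycle and print_derived are hypothetical helpers introduced here for illustration only, and the results are only as accurate as the counter values fed into them.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not a PAPI function): millions of events per second.
   Events divided by microseconds is numerically equal to 10^6 events/s. */
double mega_events_per_sec(long long events, long long usec)
{
    assert(usec > 0);      /* no meaningful rate for an empty interval */
    return (double)events / (double)usec;
}

/* Hypothetical helper (not a PAPI function): average events per cycle. */
double events_per_cycle(long long events, long long cycles)
{
    assert(cycles > 0);
    return (double)events / (double)cycles;
}

/* Print derived figures from one event count and the two time estimators. */
void print_derived(long long value, long long usec, long long cycles)
{
    printf("PAPI rate in 10^6 events/s %.3f\n", mega_events_per_sec(value, usec));
    printf("PAPI events per cycle %.3f\n", events_per_cycle(value, cycles));
    /* cycles per microsecond is numerically equal to MHz */
    printf("PAPI average clock in MHz %.3f\n", (double)cycles / (double)usec);
}
```

In the listing above, a call such as print_derived(value, time1 - time0, cyc1 - cyc0) placed after the existing printf lines would print these figures. With PAPI_FP_OPS as the event, the first figure corresponds to a rate in MFLOP/s, assuming the counter reflects the floating point work of the measured section.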

Two modes of operation can be distinguished:

A.) papi enclosing long lasting code sections:

B.) papi around hotspot functions/routines:

  • Last modified: 2016/06/15 11:13
  • by sh