
map and ddt are ARM's (formerly Allinea's) advanced tools for performance analysis and debugging, respectively. Licenses for up to 512 parallel tasks are available. Note also that perf-report, a related lightweight profiling tool, has been integrated into forge in more recent releases.

Profiling can be split into two steps. The first step is to create a *.map file from within a regular job script submitted to SLURM. In the second step, this *.map file is analyzed within an interactive session on the login node. Suppose an application has been prepared for profiling, for instance via mpicc -g -O3 ./my_prog.c; a corresponding profile can then be requested with the following submit script,

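 #!/bin/bash
 #SBATCH -J map
 # 4 nodes x 16 tasks per node = the 64 tasks requested by srun below
 #SBATCH -N 4
 #SBATCH -L allinea@vsc
 #SBATCH --ntasks-per-node 16
 #SBATCH --ntasks-per-core  1
 module purge
 module load  intel/18  intel-mpi/2018 allinea/20.1_FORGE
 # run the application under map without the GUI; the profile is written to a *.map file
 map --profile srun --jobid $SLURM_JOB_ID --mpi=pmi2 -n 64 ./a.out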

which generates a *.map file (note that the number of tasks and nodes as well as a date/time stamp are part of the filename) that can then be analyzed via the GUI, i.e.

 ssh -l my_uid -X
 cd wherever/the/map/file/may/be
 module purge
 module load allinea/20.1_FORGE
 map ./
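
When several profiling runs have accumulated in one directory, it can be handy to pick the newest profile without typing the full time-stamped name. The following is only a minimal sketch: the helper name latest_map is hypothetical (it is not part of forge), and it assumes that the file names contain no newlines.

```shell
# latest_map DIR: print the most recently modified *.map file in DIR
# (hypothetical helper for illustration, not shipped with forge)
latest_map() {
    # ls -t sorts by modification time, newest first;
    # errors are silenced so that an empty directory prints nothing
    ls -t "${1:-.}"/*.map 2>/dev/null | head -n 1
}
```

Assuming map accepts the path of a *.map file as its argument, the newest profile could then be opened with map "$(latest_map .)".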

Debugging with ddt is currently limited to the Remote Launch option. It is best to launch ddt sessions on separate compute nodes.

ddt (fully interactive via salloc):

The following steps need to be carried out:

 ssh -l my_uid -X
 my_uid@l33$  cd wherever/my/app/may/be
 my_uid@l33$  salloc -N 4 -L allinea@vsc
 my_uid@l33$  echo $SLURM_JOB_ID    ( just to figure out the current job ID, say it's 8909346 )
 my_uid@l33$  srun --jobid 8909346 -n 4 hostname | tee ./machines.txt ( this is important! It looks like a redundant command, but it actually takes care of many prerequisites that are usually handled by the SLURM prologue of regular submit scripts, one of them being the provisioning of required licenses )
              ... let's assume we got n305-[044,057,073,074] which should now be listed inside file 'machines.txt' 
 my_uid@l33$  rm -rf ~/.allinea/   ( to get rid of obsolete configurations from previous sessions )
 my_uid@l33$  module purge
 my_uid@l33$  module load  intel/18  intel-mpi/2018  allinea/20.1_FORGE   ( or any other MPI suite )
 my_uid@l33$  mpiicc -g -O0 my_app.c
 my_uid@l33$  ddt &     ( gui should open )
              ... select 'Remote Launch - Configure'
              ... click  'Add'   
              ... set my_uid@n305-044 as 'Host Name' or any other node from the above list
              ... set 'Remote Installation Directory' to /opt/sw/x86_64/glibc-2.17/ivybridge-ep/allinea/20.1_FORGE
              ... keep auto-selected defaults for the rest, then check it with 'Test Remote Launch'     ( should be ok )
              ... click OK twice to close the dialogues
              ... click Close to exit from the Configure menu
              ... next really select 'Remote Launch' by clicking the name tag that was auto-assigned above   ( licence label should be ok in the lower left corner and the hostname of the connecting client should appear in the lower right corner )
 ssh -l my_uid   ( a second terminal will be needed to actually start the debug session )   
 my_uid@l34$  ssh n305-044       ( log into that compute node that was selected/prepared above for remote launch )
 my_uid@n305-044$  module purge
 my_uid@n305-044$  module load  intel/18  intel-mpi/2018  allinea/20.1_FORGE
 my_uid@n305-044$  cd wherever/my/app/may/be
 my_uid@n305-044$  srun --jobid 8909346 -n 16 hostname    ( just a dummy check to see whether all is set up and working correctly )
 my_uid@n305-044$  ddt --connect srun --jobid 8909346 --mpi=pmi2 -n 64 ./a.out -arg1 -arg2   ( in the initial ddt-window a dialogue will pop up prompting for a Reverse Connection request; accept it and click Run and the usual debug session will start )
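
The 'Host Name' entry for the Remote Launch dialogue can be read off machines.txt. The following is a minimal sketch, assuming that machines.txt holds one host name per line, as written by the srun ... hostname | tee step above; the function name remote_launch_host is hypothetical and my_uid stands for the actual user name.

```shell
# remote_launch_host FILE USER: print USER@HOST for the first host
# (in sorted order) listed in FILE; hypothetical helper for illustration
remote_launch_host() {
    # sort -u drops duplicate host names, head keeps the first one
    host=$(sort -u "$1" | head -n 1)
    # print nothing (and return non-zero) if the file lists no hosts
    [ -n "$host" ] && printf '%s@%s\n' "$2" "$host"
}
```

With the node list assumed above, remote_launch_host machines.txt my_uid would print my_uid@n305-044; any other node from the list works equally well for Remote Launch.
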
  • Last modified: 2022/11/04 10:14
  • by goldenberg