Differences

This shows you the differences between two versions of the page.

doku:monitoring [2022/06/23 13:16] msiegel
doku:monitoring [2023/03/14 12:56] (current) goldenberg [Live]
Line 4:
 
 There are several ways to monitor the threads CPU load distribution of
-your job, either [[doku:monitoring#live | live]] directly on the compute node, or by modifying
+your job, either [[doku:monitoring#Live]] directly on the compute node, or by modifying
 the [[doku:monitoring#Job Script]], or the [[doku:monitoring#Application Code]].
 
 ==== Live ====
 
-So we assume your program runs, but could it be faster? SLURM gives you
+So we assume your program runs, but could it be faster? [[doku:SLURM]] gives you
 a ''Job ID'', type ''squeue --job myjobid'' to find out on which node your
-job runs; say n372-007. Type ''ssh n372-007'', to connect to the given
+job runs; say n4905-007. Type ''ssh n4905-007'', to connect to the given
 node. Type ''top'' to start a simple task manager:
 
 <code sh>
-[myuser@l32]$ sbatch job.sh
-[myuser@l32]$ squeue -u myuser
+[myuser@l42]$ sbatch job.sh
+[myuser@l42]$ squeue -u myuser
 JOBID    PARTITION  NAME      USER    ST  TIME  NODES  NODELIST(REASON)
-1098917  mem_0096   gmx_mpi   myuser  R   0:02       n372-007
-[myuser@l32]$ ssh n372-007
-[myuser@n372-007]$ top
+1098917  skylake_0096   gmx_mpi   myuser  R   0:02       n4905-007
+[myuser@l42]$ ssh n4905-007
+[myuser@n4905-007]$ top
 </code>
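
The lookup step in the hunk above (find the job's node in the ''squeue'' listing, then ''ssh'' to it) can be scripted. The following is a minimal sketch and not part of the wiki page: the function name ''nodelist_for_job'' is hypothetical, and it assumes the default ''squeue'' column layout shown above, where NODELIST(REASON) is the last column.

```sh
#!/bin/sh
# Hypothetical helper (not from the wiki page): read squeue-style output on
# stdin and print the last column, NODELIST(REASON), of the row whose first
# column equals the given job ID.
nodelist_for_job() {
    awk -v id="$1" '$1 == id { print $NF }'
}

# Intended use on a login node (left commented out because it needs SLURM):
#   node=$(squeue -u "$USER" | nodelist_for_job 1098917)
#   ssh "$node"    # then run top on the compute node
```

For a job that is still pending, the last column holds a reason in parentheses rather than a node name, so the result should be checked before it is passed to ''ssh''.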
  
Line 63:
 application.
 
-In the following screenshot we can see stats for all 32 threads of a compute node running ''VASP'':
+In the following screenshot we can see stats for all 32 threads of a compute node running [[doku:VASP]]:
 
 {{ :doku:top_vasp_2.png }}
Line 82:
 === mpi.h ===
 
-<code C>
+<code cpp>
 #include "mpi.h"
 ...  MPI_Get_processor_name(processor_name, &namelen);
Line 89:
 === sched.h (scheduling parameters) ===
 
-<code C>
+<code c++>
 #include <sched.h>
 ...  CPU_ID = sched_getcpu();
Line 96:
 === hwloc.h (Hardware locality) ===
 
-<code C>
+<code cpp>
 #include <hwloc.h>
 ...
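
The sched.h hunk above calls ''sched_getcpu()'' from inside the application. A minimal sketch of how that call might be wrapped so that each process or thread can log where it is running follows; it is not code from the wiki page. The helper name ''describe_placement'' and its output format are hypothetical, and the sketch assumes Linux with glibc, where ''sched_getcpu()'' is only declared when ''_GNU_SOURCE'' is defined.

```c
#define _GNU_SOURCE   /* assumption: glibc, which needs this for sched_getcpu() */
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the wiki page): writes
 * "host=<hostname> cpu=<id>" into buf and returns the CPU id,
 * or -1 if the hostname or the CPU id could not be determined. */
int describe_placement(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    char host[256];
    if (gethostname(host, sizeof host) != 0)
        return -1;
    host[sizeof host - 1] = '\0';   /* gethostname may not terminate on truncation */

    int cpu = sched_getcpu();       /* CPU the calling thread is on right now */
    if (cpu < 0)
        return -1;

    snprintf(buf, len, "host=%s cpu=%d", host, cpu);
    return cpu;
}
```

Calling the helper from every thread or MPI rank and printing the buffer gives one line per worker. The reported CPU can change between calls unless the thread is pinned.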