====== Monitoring Processes & Threads ======

===== CPU Load =====

There are several ways to monitor the CPU load distribution of your
job's threads: either [[doku:monitoring#Live]] directly on the compute
node, or by modifying the [[doku:monitoring#Job Script]] or the
[[doku:monitoring#Application Code]].
  

==== Live ====

So we assume your program runs, but could it be faster? [[doku:SLURM]] gives you
a ''Job ID''; type ''squeue --job myjobid'' to find out on which node your
job runs, say n372-007. Type ''ssh n372-007'' to connect to that
node, then type ''top'' to start a simple task manager:

<code sh>
[myuser@l32]$ sbatch job.sh
[myuser@l32]$ squeue -u myuser
JOBID    PARTITION  NAME      USER    ST  TIME  NODES  NODELIST(REASON)
1098917  mem_0096   gmx_mpi   myuser  R   0:02  1      n372-007
[myuser@l32]$ ssh n372-007
[myuser@n372-007]$ top
</code>
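
If you want a snapshot instead of the interactive view, for instance to
log the load from within a job script, ''top'' can also run in batch
mode. A sketch, assuming a procps-ng ''top''; adjust the flags if your
''top'' variant differs:

```shell
# -b      batch mode: plain text output, no interactive UI
# -H      show individual threads instead of whole processes
# -n 1    print a single snapshot, then exit
# -u USER only show tasks of the given user
top -bHn1 -u "$USER" | head -n 20
```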

Within ''top'', hit the following keys (case sensitive): ''H t 1''. Now you
should be able to see the load on all the available CPUs, as in this
example:
<code>
top - 16:31:51 up 181 days,  1:04,  3 users,  load average: 1.67, 3.39, 3.61
Threads: 239 total,   2 running, 237 sleeping,   0 stopped,   0 zombie
%Cpu0  :  69.8/29.2   99[|||||||||||||||||||||||||||||||||||||||||||||||| ]
%Cpu1  :  97.0/2.3    99[|||||||||||||||||||||||||||||||||||||||||||||||| ]
%Cpu2  :  98.7/0.7    99[|||||||||||||||||||||||||||||||||||||||||||||||| ]
%Cpu3  :  95.7/4.0   100[|||||||||||||||||||||||||||||||||||||||||||||||| ]
%Cpu4  :  99.0/0.3    99[|||||||||||||||||||||||||||||||||||||||||||||||| ]
%Cpu5  :  98.7/0.3    99[|||||||||||||||||||||||||||||||||||||||||||||||| ]
%Cpu6  :  99.3/0.0    99[|||||||||||||||||||||||||||||||||||||||||||||||||]
%Cpu7  :  99.0/0.0    99[|||||||||||||||||||||||||||||||||||||||||||||||| ]
KiB Mem : 65861076 total, 60442504 free,  1039244 used,  4379328 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 62613824 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
18876 myuser    20   0 9950.2m 303908 156512 S 99.3  0.5   0:11.14 gmx_mpi
18856 myuser    20   0 9950.2m 303908 156512 S 99.0  0.5   0:12.28 gmx_mpi
18870 myuser    20   0 9950.2m 303908 156512 R 99.0  0.5   0:11.20 gmx_mpi
18874 myuser    20   0 9950.2m 303908 156512 S 99.0  0.5   0:11.25 gmx_mpi
18872 myuser    20   0 9950.2m 303908 156512 S 98.7  0.5   0:11.19 gmx_mpi
18873 myuser    20   0 9950.2m 303908 156512 S 98.7  0.5   0:11.15 gmx_mpi
18871 myuser    20   0 9950.2m 303908 156512 S 96.3  0.5   0:11.09 gmx_mpi
18875 myuser    20   0 9950.2m 303908 156512 S 95.7  0.5   0:11.02 gmx_mpi
18810 root      20   0       0      0      0 S  6.6  0.0   0:00.70 nv_queue
...
</code>

In our example all 8 threads are utilised, which is good. The converse
is not necessarily true, however: sometimes even the best case uses only
40% on most CPUs!

The columns ''VIRT'' and ''RES'' indicate the //virtual// and
//resident// memory usage of each process (in kB unless noted
otherwise). The column ''COMMAND'' lists the name of the command or
application.

In the following screenshot we can see stats for all 32 threads of a
compute node running [[doku:VASP]]:

{{ :doku:top_vasp_2.png }}


==== Job Script ====

If you are using ''Intel-MPI'', you might include this option in your batch script:

  I_MPI_DEBUG=4

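As a sketch of where the variable goes, assuming a SLURM batch script
(the job name, resource requests, and ''./my_program'' are
placeholders): at level 4, Intel MPI reports at startup which rank is
pinned to which cores on which node, so you can verify the placement in
the job's output file.

```shell
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --nodes=1

# Verbose Intel MPI startup info; level 4 includes process pinning,
# i.e. which MPI rank runs on which cores of which node.
export I_MPI_DEBUG=4

mpirun ./my_program
```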
==== Application Code ====

If your application code is in ''C'', information about the locality of
processes and threads can be obtained via library functions from one of
the following libraries:

=== mpi.h ===

<code c>
#include "mpi.h"
...
char processor_name[MPI_MAX_PROCESSOR_NAME];
int  namelen;
MPI_Get_processor_name(processor_name, &namelen);
</code>

=== sched.h (scheduling parameters) ===

<code c>
#include <sched.h>
...
CPU_ID = sched_getcpu();
</code>

=== hwloc.h (hardware locality) ===

<code c>
#include <hwloc.h>
...

//  compile: mpiicc -qopenmp -o ompMpiCoreIds ompMpiCoreIds.c -lhwloc
</code>

===== GPU Load =====

We assume your program uses a GPU and runs as expected; could it be
faster? On the same node where your job runs (see the
[[doku:monitoring#CPU Load]] section), perhaps in a new terminal, type
''watch nvidia-smi'' to start a simple task manager for the graphics
card. ''watch'' simply repeats a command every 2 seconds, so it acts as
a live monitor for the GPU. In our example below the GPU utilisation is
around 80% most of the time, which is already very good.

<code>
Every 2.0s: nvidia-smi                                 Wed Jun 22 16:42:52 2022

Wed Jun 22 16:42:52 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:02:00.0 Off |                  N/A |
| 36%   59C    P2   112W / 180W |    161MiB /  8119MiB |     83%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     21045      C   gmx_mpi                          159MiB |
+-----------------------------------------------------------------------------+
</code>
  