====== GPUs available & how to use them ======
===== TOP500 List June 2024 =====
^ Rank ^ Nation ^ Machine ^ Performance ^ Accelerators ^
| 1. | {{.:us.png?0x24}} | Frontier | 1206 PFlop/s | AMD Instinct MI250X |
| 2. | {{.:us.png?0x24}} | Aurora | 1012 PFlop/s | Intel Data Center GPU Max |
| 3. | {{.:us.png?0x24}} | Eagle | 561 PFlop/s | NVIDIA H100 |
| 4. | {{.:jp.png?0x24}} | Fugaku | 442 PFlop/s | |
| 5. | {{.:640px-flag_of_finland.svg.png?nolink&24}} | LUMI | 379 PFlop/s | AMD Instinct MI250X |
| 6. | {{.:ch.png?0x24}} | Alps | 270 PFlop/s | NVIDIA GH200 Superchip |
| 7. | {{.:it.png?0x24}} | Leonardo | 241 PFlop/s | NVIDIA A100 SXM4 |
| 8. | {{.:640px-bandera_de_espana.svg.png?nolink&24}} | MareNostrum 5 ACC | 175 PFlop/s | NVIDIA H100 |
| 9. | {{.:us.png?0x24}} | Summit | 148 PFlop/s | NVIDIA V100 |
| 10. | {{.:us.png?0x24}} | Eos NVIDIA DGX | 121 PFlop/s | NVIDIA H100 |
===== Components on VSC-5 =====
^ Model ^ #Cores ^ Clock Freq (GHz) ^ Memory (GB) ^ Bandwidth (GB/s) ^ TDP (W) ^ FP32/FP64 (GFLOP/s) ^
| 19x GeForce RTX 2080 Ti, nodes n375-[001-019] (only available within special projects) |||||||
| {{:pandoc:introduction-to-vsc:09_special_hardware:rtx-2080.jpg?nolink&200}} | 4352 | 1.35 | 11 | 616 | 250 | 13450/420 |
| 45x2 NVIDIA A40, nodes n306[6,7,8]-[001-019,001-019,001-007] |||||||
| {{ :pandoc:introduction-to-vsc:09_special_hardware:a40.jpg?nolink&200|}} | 10752 | 1.305 | 48 | 696 | 300 | 37400/1169 |
| 62x2 NVIDIA A100-40GB, nodes n307[1-4]-[001-015] |||||||
| {{ :pandoc:introduction-to-vsc:09_special_hardware:a100.jpg?nolink&200|}} | 6912 | 0.765 | 40 | 1555 | 250 | 19500/9700 |
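As a plausibility check, the FP32 peak values in the table follow from CUDA cores × boost clock × 2 (one fused multiply-add, i.e. two floating-point operations, per core per cycle). Note that the boost clocks used below (1.545, 1.740, 1.410 GHz) are assumed from NVIDIA's data sheets; the table lists the lower base clocks:

```shell
# Peak FP32 GFLOP/s ~= CUDA cores * boost clock (GHz) * 2 ops/cycle (FMA).
# Boost clocks are assumed values from NVIDIA data sheets, not from the table.
awk 'BEGIN { printf "RTX 2080 Ti: %.0f GFLOP/s\n", 4352*1.545*2 }'   # ~13448
awk 'BEGIN { printf "A40:         %.0f GFLOP/s\n", 10752*1.740*2 }'  # ~37417
awk 'BEGIN { printf "A100-40GB:   %.0f GFLOP/s\n", 6912*1.410*2 }'   # ~19492
```

The small deviations from the table (13450, 37400, 19500) come from rounding in the published clock and FLOP figures.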
==== Working on GPU nodes Interactively ====
**Interactive mode**
<code>
1. VSC-5 >  salloc -N 1 -p zen2_0256_a40x2 --qos zen2_0256_a40x2 --gres=gpu:2
2. VSC-5 >  squeue -u $USER
3. VSC-5 >  srun -n 1 hostname      # ...while still on the login node!
4. VSC-5 >  ssh n3066-012           # ...or whichever node has been assigned
5. VSC-5 >  module load cuda/9.1.85
            cd ~/examples/09_special_hardware/gpu_gtx1080/matrixMul
            nvcc ./matrixMul.cu
            ./a.out
            cd ~/examples/09_special_hardware/gpu_gtx1080/matrixMulCUBLAS
            nvcc matrixMulCUBLAS.cu -lcublas
            ./a.out
6. VSC-5 >  nvidia-smi
7. VSC-5 >  /opt/sw/x86_64/glibc-2.17/ivybridge-ep/cuda/9.1.85/NVIDIA_CUDA-9.1_Samples/1_Utilities/deviceQuery/deviceQuery
</code>
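The steps above can also be condensed: while a ''salloc'' session is active, ''srun'' executes commands directly on the allocated compute node, so the explicit ''ssh'' step is optional. A minimal sketch (partition and QOS names as above; the assigned node name will differ):

```shell
# Request one A40 node interactively with both GPUs ...
salloc -N 1 -p zen2_0256_a40x2 --qos zen2_0256_a40x2 --gres=gpu:2
# ... then run commands on the allocated node via srun, no ssh needed:
srun -n 1 hostname      # prints the name of the allocated compute node
srun -n 1 nvidia-smi    # lists the GPUs visible inside the allocation
exit                    # release the allocation when done
```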
===== Working on GPU nodes using SLURM =====
**SLURM submission script** ''gpu_test.scrpt''
<code bash>
#!/bin/bash
#
# usage: sbatch ./gpu_test.scrpt
#
#SBATCH -J A40
#SBATCH -N 1              # use -N only if you need both GPUs of a node, otherwise leave this line out
#SBATCH --partition zen2_0256_a40x2
#SBATCH --qos zen2_0256_a40x2
#SBATCH --gres=gpu:2      # or --gres=gpu:1 if you only want to use half a node

module purge
module load cuda/9.1.85

nvidia-smi
/opt/sw/x86_64/glibc-2.17/ivybridge-ep/cuda/9.1.85/NVIDIA_CUDA-9.1_Samples/1_Utilities/deviceQuery/deviceQuery
</code>
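A typical submit-and-inspect cycle for a batch script like the one above might look as follows (the job ID and output file name will differ per submission):

```shell
sbatch ./gpu_test.scrpt    # queue the job; prints "Submitted batch job <jobid>"
squeue -u $USER            # check queue state: PD = pending, R = running
scontrol show job <jobid>  # full job details, including the assigned node
cat slurm-<jobid>.out      # job output lands here by default
```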
===== Real-World Example: AMBER-16 =====
^ Performance ^ Power Efficiency ^
| {{.:amber16.perf.png}}|{{.:amber16.powereff.png}} |