— | pandoc:introduction-to-vsc:09_special_hardware:accelerators [2023/05/17 14:31] – msiegel |
---|
| |
| ====== Available GPUs & how to use them ====== |
| |
| * Article written by Siegfried Höfinger (VSC Team) <html><br></html>(last update 2020-10-04 by sh). |
| |
| ====== TOP500 List June 2020 ====== |
| |
| |
| <HTML> |
| <!--slide 1--> |
| <!--for nations flags see https://www.free-country-flags.com--> |
| </HTML> |
| ^ Rank^Nation ^Machine ^ Performance^Accelerators ^ |
| | 1.|{{.:jp.png?0x24}} |Fugaku | 416 PFLOPs/s| | |
| | 2.|{{.:us.png?0x24}} |Summit | 149 PFLOPs/s|<html><font color="navy"></html>NVIDIA V100<html></font></html> | |
| | 3.|{{.:us.png?0x24}} |Sierra | 95 PFLOPs/s|<html><font color="navy"></html>NVIDIA V100<html></font></html> | |
| | 4.|{{.:cn.png?0x24}} |Sunway TaihuLight | 93 PFLOPs/s| | |
| | 5.|{{.:cn.png?0x24}} |Tianhe-2A | 62 PFLOPs/s| | |
| | 6.|{{.:it.png?0x24}} |HPC5 | 36 PFLOPs/s|<html><font color="navy"></html>NVIDIA V100<html></font></html> | |
| | 7.|{{.:us.png?0x24}} |Selene | 28 PFLOPs/s|<html><font color="navy"></html>NVIDIA A100<html></font></html> | |
| | 8.|{{.:us.png?0x24}} |Frontera | 24 PFLOPs/s|<html><font color="navy"></html>NVIDIA RTX5000/V100<html></font></html> | |
| | 9.|{{.:it.png?0x24}} |Marconi-100 | 22 PFLOPs/s|<html><font color="navy"></html>NVIDIA V100<html></font></html> | |
| | 10.|{{.:ch.png?0x24}} |Piz Daint | 21 PFLOPs/s|<html><font color="navy"></html>NVIDIA P100<html></font></html> | |
| |
| |
| <HTML> |
| <!--slide 2--> |
| </HTML> |
| ====== Components on VSC ====== |
| |
| ===== GPUs on VSC-5 ===== |
| |
| ^Model ^#CUDA cores ^Base clock (GHz)^Memory (GB)^Memory bandwidth (GB/s)^TDP (W)^Peak FP32/FP64 (GFLOPs/s)^ |
| |<html><font color="navy"></html>19x GeForce RTX-2080Ti n375-[001-019]<html></font></html>| | | | | | | |
| |{{:pandoc:introduction-to-vsc:09_special_hardware:rtx-2080.jpg?nolink&200}} |4352|1.35 |11 |616 |250 |13450/420 | |
| |<html><font color="navy"></html>45x2 NVIDIA A40 n306[6,7,8]-[001-019,001-019,001-007]<html></font></html>| | | | | | | |
| |{{ :pandoc:introduction-to-vsc:09_special_hardware:a40.jpg?nolink&200|}} |10752 |1.305 |48 |696 |300 |37400/1169 | |
| |<html><font color="navy"></html>60x2 NVIDIA A100-40GB n307[1-4]-[001-015]<html></font></html>| | | | | | | |
| |{{ :pandoc:introduction-to-vsc:09_special_hardware:a100.jpg?nolink&200|}} |6912 |0.765 |40 |1555 |250 |19500/9700 | |
| |
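The clock frequencies in the table appear to be base clocks, whereas the FP32 peak values correspond to the higher boost clocks. As a rough cross-check, and assuming that each core completes one fused multiply-add (2 FLOPs) per cycle, the clock implied by the FP32 peak and the FP32:FP64 ratio can be recomputed from the table. The sketch below uses only numbers from the table; the helper name ''gpu_ratios'' is hypothetical.

```shell
#!/bin/bash
# Cross-check of the GPU table: peak FP32 = cores x clock x 2
# (assumption: one fused multiply-add = 2 FLOPs per core and cycle).
# gpu_ratios is a hypothetical helper name; all numbers are copied from the table.
gpu_ratios() {
    # arguments: model, number of cores, FP32 peak (GFLOPs/s), FP64 peak (GFLOPs/s)
    LC_ALL=C awk -v model="$1" -v cores="$2" -v fp32="$3" -v fp64="$4" 'BEGIN {
        implied_clock = fp32 / (2 * cores)   # in GHz, because fp32 is given in GFLOPs/s
        printf "%s: implied clock %.2f GHz, FP32:FP64 = %.0f:1\n", model, implied_clock, fp32 / fp64
    }'
}

gpu_ratios "RTX-2080Ti" 4352 13450 420
gpu_ratios "A40" 10752 37400 1169
gpu_ratios "A100-40GB" 6912 19500 9700
```

For the A100 this yields an implied clock of about 1.41 GHz and an FP32:FP64 ratio of roughly 2:1, which is why it is the natural choice for double-precision workloads, while the two other models are far better suited to single-precision codes.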
| |
| |
| <HTML> |
| <!--slide 3--> |
| </HTML> |
| ====== Working on GPU nodes interactively ====== |
| |
| **Interactive mode** |
| |
| <code> |
| 1. VSC-5 > salloc -N 1 -p zen2_0256_a40x2 --qos zen2_0256_a40x2 --gres=gpu:2 |
| |
| 2. VSC-5 > squeue -u $USER |
| |
| 3. VSC-5 > srun -n 1 hostname (...while still on the login node!) |
| |
| 4. VSC-5 > ssh n3066-012 (...or whichever node has been assigned) |
| |
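| 5. VSC-5 > module load cuda/9.1.85 (...or a newer version listed by "module avail cuda"; native code for the A40 needs CUDA 11.1 or later) |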
| cd ~/examples/09_special_hardware/gpu_gtx1080/matrixMul |
| nvcc ./matrixMul.cu |
| ./a.out |
| |
| cd ~/examples/09_special_hardware/gpu_gtx1080/matrixMulCUBLAS |
| nvcc matrixMulCUBLAS.cu -lcublas |
| ./a.out |
| |
| 6. VSC-5 > nvidia-smi |
| |
| 7. VSC-5 > /opt/sw/x86_64/glibc-2.17/ivybridge-ep/cuda/9.1.85/NVIDIA_CUDA-9.1_Samples/1_Utilities/deviceQuery/deviceQuery |
| </code> |
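The plain ''nvidia-smi'' call in step 6 prints a full status table. For a compact per-GPU summary, its CSV query mode can be combined with a small filter. The following is a minimal sketch; the helper name ''list_gpus'' is hypothetical, and the script assumes it is run on the assigned GPU node.

```shell
#!/bin/bash
# Summarise the GPUs visible on the current node.
# list_gpus is a hypothetical helper name; it reads nvidia-smi CSV output on stdin.
list_gpus() {
    # expected input per line: index, name, memory.total (MiB), memory.used (MiB)
    awk -F', *' '{ printf "GPU %s: %s, %s of %s MiB in use\n", $1, $2, $4, $3 }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # noheader/nounits leave bare values, which keeps the filter simple
    nvidia-smi --query-gpu=index,name,memory.total,memory.used \
               --format=csv,noheader,nounits | list_gpus
else
    echo "nvidia-smi not found: run this on a GPU node" >&2
fi
```

On a node of the A40 partition this should print two lines, one per GPU. On the login node it only reports that ''nvidia-smi'' is missing.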
| <HTML> |
| <!--slide 4--> |
| </HTML> |
| ====== Working on GPU nodes using SLURM ====== |
| |
| **SLURM submission** gpu_test.scrpt |
| |
| <code bash> |
| #!/bin/bash |
| # |
| # usage: sbatch ./gpu_test.scrpt |
| # |
| #SBATCH -J A40 |
| #SBATCH -N 1 #use -N only if you use both GPUs of the node, otherwise leave this line out |
| #SBATCH --partition zen2_0256_a40x2 |
| #SBATCH --qos zen2_0256_a40x2 |
| #SBATCH --gres=gpu:2 #or --gres=gpu:1 if you only want to use half a node |
| |
| module purge |
| module load cuda/9.1.85 |
| |
| nvidia-smi |
| /opt/sw/x86_64/glibc-2.17/ivybridge-ep/cuda/9.1.85/NVIDIA_CUDA-9.1_Samples/1_Utilities/deviceQuery/deviceQuery |
| </code> |
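The two comments in gpu_test.scrpt describe two variants of the same job: the whole node with both GPUs, or half a node with a single GPU and no ''-N'' line. As an illustration, the sketch below prints the matching job script for either case. The helper name ''make_gpu_job'' is hypothetical; partition and QOS are taken from the script above.

```shell
#!/bin/bash
# Print a job script for 1 or 2 GPUs of an A40 node, following the comments
# in gpu_test.scrpt. make_gpu_job is a hypothetical helper name.
make_gpu_job() {
    local ngpus="${1:-2}"
    if [ "$ngpus" != 1 ] && [ "$ngpus" != 2 ]; then
        echo "usage: make_gpu_job 1|2" >&2
        return 1
    fi
    echo '#!/bin/bash'
    echo '#SBATCH -J A40'
    # request the whole node only when both GPUs are used
    if [ "$ngpus" = 2 ]; then
        echo '#SBATCH -N 1'
    fi
    echo '#SBATCH --partition zen2_0256_a40x2'
    echo '#SBATCH --qos zen2_0256_a40x2'
    echo "#SBATCH --gres=gpu:${ngpus}"
    echo
    echo 'nvidia-smi'
}

# show the single-GPU (half node) variant
make_gpu_job 1
```

Redirecting the output to a file, for example ''make_gpu_job 1 > gpu_half_node.scrpt'', gives a script that can be submitted with ''sbatch'' in the same way as gpu_test.scrpt.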
| |
| <HTML> |
| <!--slide 5--> |
| </HTML> |
| |
| ====== Real-World Example, AMBER-16 ====== |
| |
| ^ Performance^Power Efficiency ^ |
| | {{.:amber16.perf.png}}|{{.:amber16.powereff.png}} | |
| |