====== GPUs available & how to use them ======

===== TOP500 List June 2020 =====

^ Rank^Nation ^Machine ^ Performance^Accelerators ^
| 1.|{{.:jp.png?0x24}} |Fugaku | 416 PFLOP/s| |
| 2.|{{.:us.png?0x24}} |Summit | 149 PFLOP/s|NVIDIA V100 |
| 3.|{{.:us.png?0x24}} |Sierra | 95 PFLOP/s|NVIDIA V100 |
| 4.|{{.:cn.png?0x24}} |Sunway TaihuLight | 93 PFLOP/s| |
| 5.|{{.:cn.png?0x24}} |Tianhe-2A | 62 PFLOP/s| |
| 6.|{{.:it.png?0x24}} |HPC5 | 36 PFLOP/s|NVIDIA V100 |
| 7.|{{.:us.png?0x24}} |Selene | 28 PFLOP/s|NVIDIA A100 |
| 8.|{{.:us.png?0x24}} |Frontera | 24 PFLOP/s|NVIDIA RTX5000/V100 |
| 9.|{{.:it.png?0x24}} |Marconi-100 | 22 PFLOP/s|NVIDIA V100 |
| 10.|{{.:ch.png?0x24}} |Piz Daint | 21 PFLOP/s|NVIDIA P100 |

===== Components on VSC-5 =====

^Model ^# CUDA cores ^Clock Freq (GHz)^Memory (GB)^Bandwidth (GB/s)^TDP (Watt)^FP32/FP64 (GFLOP/s)^
|19x GeForce RTX-2080Ti, nodes n375-[001-019]| | | | | | |
|{{:pandoc:introduction-to-vsc:09_special_hardware:rtx-2080.jpg?nolink&200}} |4352 |1.35 |11 |616 |250 |13450/420 |
|45x2 NVIDIA A40, nodes n306[6,7,8]-[001-019,001-019,001-007]| | | | | | |
|{{ :pandoc:introduction-to-vsc:09_special_hardware:a40.jpg?nolink&200|}} |10752 |1.305 |48 |696 |300 |37400/1169 |
|60x2 NVIDIA A100-40GB, nodes n307[1-4]-[001-015]| | | | | | |
|{{ :pandoc:introduction-to-vsc:09_special_hardware:a100.jpg?nolink&200|}} |6912 |0.765 |40 |1555 |250 |19500/9700 |

===== Working on GPU nodes interactively =====

**Interactive mode**

<code>
1. VSC-5 >  salloc -N 1 -p zen2_0256_a40x2 --qos zen2_0256_a40x2 --gres=gpu:2

2. VSC-5 >  squeue -u $USER

3. VSC-5 >  srun -n 1 hostname     (...while still on the login node !)

4. VSC-5 >  ssh n3066-012          (...or whatever other node has been assigned)

5. VSC-5 >  module load cuda/9.1.85
            cd ~/examples/09_special_hardware/gpu_gtx1080/matrixMul
            nvcc ./matrixMul.cu
            ./a.out
            cd ~/examples/09_special_hardware/gpu_gtx1080/matrixMulCUBLAS
            nvcc matrixMulCUBLAS.cu -lcublas
            ./a.out

6. VSC-5 >  nvidia-smi

7. VSC-5 >  /opt/sw/x86_64/glibc-2.17/ivybridge-ep/cuda/9.1.85/NVIDIA_CUDA-9.1_Samples/1_Utilities/deviceQuery/deviceQuery
</code>

Self-contained CUDA sketches of the matrixMul, matrixMulCUBLAS and deviceQuery steps are given in the last section of this page.

===== Working on GPU nodes using SLURM =====

**SLURM submission**

gpu_test.scrpt
<code bash>
#!/bin/bash
#
# usage: sbatch ./gpu_test.scrpt
#
#SBATCH -J A40
#SBATCH -N 1                          # use -N only if you use both GPUs of a node, otherwise leave this line out
#SBATCH --partition zen2_0256_a40x2
#SBATCH --qos zen2_0256_a40x2
#SBATCH --gres=gpu:2                  # or --gres=gpu:1 if you only want to use half a node

module purge
module load cuda/9.1.85

nvidia-smi
/opt/sw/x86_64/glibc-2.17/ivybridge-ep/cuda/9.1.85/NVIDIA_CUDA-9.1_Samples/1_Utilities/deviceQuery/deviceQuery
</code>

===== Real-World Example, AMBER-16 =====

^ Performance^Power Efficiency ^
| {{.:amber16.perf.png}}|{{.:amber16.powereff.png}} |
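
===== Minimal CUDA Sketches =====

The matrixMul sample compiled in step 5 above ships with the NVIDIA CUDA samples. If the samples are not at hand, a hand-written naive matrix multiplication in plain CUDA can serve the same purpose; the following is a minimal sketch, not the actual sample code, and the file name mm.cu is arbitrary. Compile with ''nvcc mm.cu'' and run ''./a.out'' on a GPU node.

<code cpp>
// mm.cu - minimal naive matrix multiplication, C = A * B (n x n, row-major)
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void matMul(const float *A, const float *B, float *C, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;   // one thread per C element
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n && col < n) {
        float sum = 0.0f;
        for (int k = 0; k < n; ++k)
            sum += A[row * n + k] * B[k * n + col];
        C[row * n + col] = sum;
    }
}

int main() {
    const int n = 512;
    const size_t bytes = (size_t)n * n * sizeof(float);

    // host buffers, filled with ones so every entry of C must come out as n
    float *hA = (float *)malloc(bytes), *hB = (float *)malloc(bytes), *hC = (float *)malloc(bytes);
    for (int i = 0; i < n * n; ++i) { hA[i] = 1.0f; hB[i] = 1.0f; }

    // device buffers and host-to-device copies
    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes); cudaMalloc(&dB, bytes); cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    // launch one 16x16 thread block per 16x16 tile of C
    dim3 threads(16, 16);
    dim3 blocks((n + threads.x - 1) / threads.x, (n + threads.y - 1) / threads.y);
    matMul<<<blocks, threads>>>(dA, dB, dC, n);

    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);
    printf("C[0] = %.1f (expected %d)\n", hC[0], n);

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return 0;
}
</code>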
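
The matrixMulCUBLAS sample does the same job but hands the multiplication to the cuBLAS library instead of a hand-written kernel, which is usually much faster. A minimal equivalent might look like the sketch below (hypothetical file name mm_cublas.cu); as in step 5, it needs to be linked against cuBLAS: ''nvcc mm_cublas.cu -lcublas''.

<code cpp>
// mm_cublas.cu - minimal cuBLAS SGEMM sketch: C = A * B (n x n)
#include <cstdio>
#include <cstdlib>
#include <cublas_v2.h>
#include <cuda_runtime.h>

int main() {
    const int n = 512;
    const size_t bytes = (size_t)n * n * sizeof(float);

    // all-ones inputs: every entry of C must come out as n
    float *hA = (float *)malloc(bytes), *hB = (float *)malloc(bytes), *hC = (float *)malloc(bytes);
    for (int i = 0; i < n * n; ++i) { hA[i] = 1.0f; hB[i] = 1.0f; }

    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes); cudaMalloc(&dB, bytes); cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);

    // SGEMM computes C = alpha*A*B + beta*C; cuBLAS assumes column-major
    // storage, which makes no difference here since A and B are all ones
    const float alpha = 1.0f, beta = 0.0f;
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                &alpha, dA, n, dB, n, &beta, dC, n);

    cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);
    printf("C[0] = %.1f (expected %d)\n", hC[0], n);

    cublasDestroy(handle);
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return 0;
}
</code>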
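
Finally, the deviceQuery binary run in step 7 and at the end of the batch script simply reports the properties of the GPUs visible to the job; on a node allocated with ''--gres=gpu:1'', typically only one device shows up. The core of such a query is just a few CUDA runtime API calls, roughly as in this sketch (hypothetical file name devquery.cu, compile with ''nvcc devquery.cu'').

<code cpp>
// devquery.cu - minimal device query via the CUDA runtime API
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Found %d CUDA device(s)\n", count);
    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, d);
        printf("Device %d: %s\n", d, prop.name);
        printf("  Compute capability : %d.%d\n", prop.major, prop.minor);
        printf("  Global memory      : %.1f GB\n", prop.totalGlobalMem / 1e9);
        printf("  Multiprocessors    : %d\n", prop.multiProcessorCount);
        printf("  Clock rate         : %.3f GHz\n", prop.clockRate / 1e6);  // clockRate is in kHz
    }
    return 0;
}
</code>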