====== GROMACS ======

Our recommendation: follow these steps, in this order, to get the fastest program.

  - Use the most recent version of GROMACS that we provide, or build your own.
  - Use the newest hardware: the partitions ''gpu_a40dual'' or ''gpu_gtx1080single'' on VSC3 have plenty of nodes available.
  - Read our article on the [[doku:gromacs_multi_gpu|multi GPU setup]] and do some performance analysis.
  - Run on multiple nodes with MPI, each with 1 GPU; a minimal sketch follows this list.
  - Additionally use multiple GPUs per node.
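
A minimal sketch of such a multi-node job (one MPI rank per node, each driving one GPU) is shown below. The partition, the module names and the input file ''topol.tpr'' are only placeholders for illustration, not a recommended configuration; adapt them to what ''sinfo'' and ''module avail'' offer.

<code bash>
#!/bin/bash
#SBATCH --job-name=gmx_multinode
#SBATCH --partition=gpu_gtx1080single   # placeholder: any partition with 1 GPU per node
#SBATCH --nodes=2                       # run on two nodes
#SBATCH --ntasks-per-node=1             # one MPI rank per node
#SBATCH --gres=gpu:1                    # GPU request; may be implicit on some partitions

module purge
module load gcc/7.3 cuda openmpi gromacs   # placeholder module names

# SLURM starts one gmx_mpi rank on each node
srun gmx_mpi mdrun -s topol.tpr
</code>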
  
===== GPU Partition =====
  
First you have to decide on which hardware GROMACS should run; we call this a ''partition'', described in detail at [[doku:slurm | SLURM]]. On any login node, type ''sinfo'' to get a list of the available partitions. The partition has to be set in the batch script, see the example below. Be aware that each partition has different hardware: for example, the partition ''gpu_gtx1080single'' on VSC3 has 1 GPU and a single socket with 4 cores and 2 hyperthreads per core, as listed at [[doku:vsc3gpuqos | GPU Partitions on VSC3]]. Thus here it makes sense to let GROMACS run on 8 threads (''-ntomp 8''), yet it makes little sense to force more threads than that, as this would lead to oversubscription. GROMACS mostly decides on its own how it wants to work, so don't be surprised if it ignores settings like environment variables.
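
For example, on a login node the available partitions and their GPUs can be listed with standard SLURM commands; the output columns below are chosen just for illustration:

<code bash>
# list all partitions with their node count and generic resources (GPUs appear under GRES)
sinfo -o "%P %D %G"

# show the state of the nodes in one specific partition
sinfo --partition=gpu_gtx1080single
</code>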

===== Batch Script =====
  
Write a ''batch script'' (example below) including:
    
  * some SLURM parameters: the ''#SBATCH ...'' part
  * exporting environment variables: e.g. ''export CUDA_VISIBLE_DEVICES=0''
  * cleaning the environment: ''module purge''
  * loading modules: ''module load gcc/7.3 ...''
  * last but not least starting the program in question: ''gmx_mpi ...''
  
<code bash myscript.sh>
#!/bin/bash
#SBATCH --job-name=myname
# ...
</code>
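
A more complete sketch that covers all points from the list above could look like the following. The partition, the module names and the input ''topol.tpr'' are assumptions for illustration, so replace them with values from ''sinfo'' and ''module avail'':

<code bash>
#!/bin/bash
#SBATCH --job-name=myname
#SBATCH --partition=gpu_gtx1080single   # assumed partition, pick one from sinfo
#SBATCH --gres=gpu:1                    # GPU request; may be implicit on some partitions

export CUDA_VISIBLE_DEVICES=0           # make only the first GPU visible

module purge                            # start from a clean environment
module load gcc/7.3 cuda gromacs        # placeholder module names, check module avail

# 8 OpenMP threads match the 4 cores x 2 hyperthreads of gpu_gtx1080single
gmx_mpi mdrun -s topol.tpr -ntomp 8
</code>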
  
Type ''sbatch myscript.sh'' to submit your batch script to [[doku:SLURM]]. You get the job ID, and your job will be scheduled and executed automatically.
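
For example, submitting and checking the status could look like this (the job ID is made up):

<code bash>
$ sbatch myscript.sh
Submitted batch job 1234567

$ squeue -u $USER      # shows whether the job is still pending or already running
</code>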
  
  
===== CPU / GPU Load =====
  
There is a whole page dedicated to [[doku:monitoring]] the CPU and GPU; for GROMACS the relevant sections are [[doku:monitoring#Live]] and [[doku:monitoring#GPU]].
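
As a quick manual check, and if logging in to nodes that run one of your jobs is permitted, the load can also be inspected directly on the node; the node name below is just an example:

<code bash>
squeue -u $USER     # find out on which node the job is running
ssh n372-001        # example node name, use the one reported by squeue
top                 # CPU load per process
nvidia-smi          # GPU utilisation and memory usage
</code>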
  
===== Performance =====

There is a whole article about the [[doku:gromacs_multi_gpu|Performance of GROMACS on multi GPU systems]].

As a short example we ran ''gmx_mpi mdrun -s topol.tpr'' with different options, where ''topol.tpr'' is just some sample topology; we don't actually care about the result. Without any options GROMACS already runs fine (a). Setting the number of tasks (b) is not needed; if set wrong, it can even slow the calculation down significantly ( c ) due to over-provisioning! We would advise to enforce pinning; in our example it does not show any effect though (d), we assume that the tasks are pinned automatically already. The only further improvement we could get was using the ''-update gpu'' option, which puts more load on the GPU (e).
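
Spelled out as command lines, the cases above look roughly like the following; the thread count is only an illustrative value for the ''gpu_gtx1080single'' partition and not necessarily the exact setting used for the table:

<code bash>
# (a) let GROMACS decide on its own
gmx_mpi mdrun -s topol.tpr

# (b)/(c) set thread counts explicitly; too many threads oversubscribe the cores
gmx_mpi mdrun -s topol.tpr -ntomp 8

# (d) enforce pinning of threads to cores
gmx_mpi mdrun -s topol.tpr -pin on

# (e) offload the coordinate update (and constraints) to the GPU as well
gmx_mpi mdrun -s topol.tpr -update gpu
</code>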
  
^ # ^ cmd ^ ns / day ^ cpu load / % ^ gpu load / % ^ notes ^
| e | -update gpu | 170 | 100 | 90 | |
  
==== GROMACS2020 ====

The following environment variables need to be set with Gromacs2020 when using multiple GPUs. It is not necessary to set these variables for Gromacs2021 onwards; they are already included, and setting them explicitly might actually decrease performance again.

<code bash>
export GMX_GPU_PME_PP_COMMS=true
export GMX_GPU_DD_COMMS=true
export GMX_GPU_FORCE_UPDATE_DEFAULT_GPU=true
</code>