====== GROMACS ======

Our recommendation: follow these steps, in this order, to get the fastest program.

  - Use the most recent version of GROMACS that we provide, or build your own.
  - Use the newest hardware: the partitions ''gpu_a40dual'' or ''gpu_gtx1080single'' on VSC3 have plenty of nodes available.
  - Read our article on the [[doku:gromacs_multi_gpu|multi GPU]] setup and do some performance analysis.
  - Run on multiple nodes with MPI, each with 1 GPU; a minimal sketch follows this list.
  - Additionally use multiple GPUs per node.
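
A minimal multi-node sketch, assuming one MPI rank and one GPU per node; the partition, node count and module name are placeholders, adjust them to your setup:

<code bash>
#!/bin/bash
#SBATCH --job-name=gromacs_multinode
#SBATCH --partition=gpu_gtx1080single   # placeholder: pick a GPU partition
#SBATCH --nodes=2                       # one MPI rank per node ...
#SBATCH --ntasks-per-node=1
#SBATCH --gres=gpu:1                    # ... with 1 GPU each

module purge
module load gromacs                     # placeholder: use the module name/version you need

srun gmx_mpi mdrun -s topol.tpr
</code>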
  
===== GPU Partition =====
  
First you have to decide on which hardware GROMACS should run; we call
this a ''partition'', described in detail at [[doku:slurm | SLURM]].
On any login node, type ''sinfo'' to get a list of the available
partitions. The partition has to be set in the batch script, see the
example below. Be aware that each partition has different hardware:
for example the partition ''gpu_gtx1080single'' on VSC3 has 1 GPU and
a single socket with 4 cores and 2 hyperthreads per core, as listed at
[[doku:vsc3gpuqos | GPU Partitions on VSC3]]. Thus here it makes sense
to let GROMACS run on 8 threads (''-ntomp 8''), yet it makes little
sense to force more threads than that, as this would lead to
oversubscribing. GROMACS decides mostly on its own how it wants to
work, so don't be surprised if it ignores settings like environment
variables.
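
As a quick illustration (assuming the partition layout described above), check the partitions and match the thread count to the hardware:

<code bash>
# on a login node: list the available partitions
sinfo

# on gpu_gtx1080single: 4 cores x 2 hyperthreads = 8 threads per node
gmx_mpi mdrun -s topol.tpr -ntomp 8
</code>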

===== Batch Script =====

Write a ''batch script'' (example below) including:
    
  * some SLURM parameters: the ''#SBATCH ...'' part
  * exporting environment variables: e.g. ''export CUDA_VISIBLE_DEVICES=0''
  * cleaning the environment: ''module purge''
  * loading modules: ''module load gcc/7.3 ...''
  * last but not least starting the program in question: ''gmx_mpi ...''
  
<code bash mybatchscript.sh>
#!/bin/bash
#SBATCH --job-name=myname
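# --- the remainder of this script is a sketch only: partition, modules and
# --- versions are placeholders, check sinfo and module avail on your system
#SBATCH --partition=gpu_gtx1080single
#SBATCH --gres=gpu:1

export CUDA_VISIBLE_DEVICES=0

module purge
module load gcc/7.3                     # plus the MPI, CUDA and GROMACS modules you need

gmx_mpi mdrun -s topol.tpr -ntomp 8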
</code>
  
Type ''sbatch mybatchscript.sh'' to submit your batch script to
[[doku:SLURM]]. You get the job ID, and your job will be scheduled and
executed automatically.
  
  
===== CPU / GPU Load =====
  
There is a whole page dedicated to [[doku:monitoring]] the CPU and
GPU; for GROMACS the relevant sections are
[[doku:monitoring#Live]] and [[doku:monitoring#GPU]].
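
A quick way to check the load while a job is running (assuming you can open a shell on the compute node; see the [[doku:monitoring]] page for the supported workflow):

<code bash>
top          # CPU load per process
nvidia-smi   # GPU utilisation and GPU memory
</code>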
  
===== Performance =====

There is a whole article about the [[doku:gromacs_multi_gpu|Performance of GROMACS on multi GPU systems]].

As a short example we ran ''gmx_mpi mdrun -s topol.tpr'' with
different options, where ''topol.tpr'' is just some sample topology;
we don't actually care about the result. Without any options GROMACS
already runs fine (a). Setting the number of tasks (b) is not needed;
if set wrong, it can even slow the calculation down significantly (c)
due to overprovisioning! We would advise enforcing pinning, although
in our example it does not show any effect (d); we assume that the
tasks are pinned automatically already. The only further improvement
we could get was using the ''-update gpu'' option, which puts more
load on the GPU (e). The corresponding commands are sketched below the
table.
  
^ # ^ cmd         ^ ns / day ^ cpu load / % ^ gpu load / % ^ notes                               ^
| e | -update gpu | 170    | 100   | 90        |                                    |
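
The rows in the table correspond to ''mdrun'' invocations along these lines (a sketch; the exact option values used for (b) and (c) are not reproduced here):

<code bash>
gmx_mpi mdrun -s topol.tpr               # (a) defaults
gmx_mpi mdrun -s topol.tpr -pin on       # (d) enforce pinning
gmx_mpi mdrun -s topol.tpr -update gpu   # (e) run the update on the GPU
</code>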
  
==== GROMACS2020 ====

The following environment variables need to be set for GROMACS 2020
when using multiple GPUs. It is not necessary to set them from
GROMACS 2021 onwards; they are already included there, and setting
them explicitly might actually decrease performance again.

<code bash>
export GMX_GPU_PME_PP_COMMS=true
export GMX_GPU_DD_COMMS=true
export GMX_GPU_FORCE_UPDATE_DEFAULT_GPU=true
</code>