GROMACS
GPU Partition
First you have to decide on which hardware GROMACS should run; we call this a partition. On a login node, type sinfo to get a list of the available partitions. Be aware that each setup has different hardware: for example, the partition gpu_gtx1080single on VSC3 has 1 GPU and a single socket with 4 cores and 2 hyperthreads per core, as listed at GPU Partitions on VSC3.
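As a quick sketch of how to inspect the hardware per partition (the format string is standard Slurm, but which columns are most useful is our choice here):

sinfo -o "%P %D %c %G"    # partition, number of nodes, cores per node, GPUs (generic resources)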
The partition has to be set in the batch script; see the example below. On this partition it therefore makes sense to let GROMACS run on 8 threads (-ntomp 8), yet it makes little sense to force more threads than that, as this would lead to oversubscription. GROMACS mostly decides on its own how it wants to work, so don't be surprised if it ignores settings like environment variables.
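For illustration, a run using all 8 hardware threads of this partition looks like this (topol.tpr is just a placeholder topology, as in the performance tests below):

gmx_mpi mdrun -s topol.tpr -ntomp 8    # one thread per hardware thread on gpu_gtx1080single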
Batch Script
In order to be scheduled efficiently with SLURM, one writes a shell script (see the text file myscript.sh below) consisting of:
- some SLURM parameters: the #SBATCH … part
- exporting environment variables: export CUDA_VISIBLE_DEVICES=0
- cleaning the environment: module purge
- loading modules: module load gcc/7.3 …
- last but not least, starting the program in question: gmx_mpi …
myscript.sh:
#!/bin/bash
#SBATCH --job-name=myname
#SBATCH --partition=gpu_gtx1080single
#SBATCH --gres=gpu:1
#SBATCH --nodes=1

unset OMP_NUM_THREADS
export CUDA_VISIBLE_DEVICES=0

module purge
module load gcc/7.3 nvidia/1.0 cuda/10.1.168 cmake/3.15.4 openmpi/4.0.5 python/3.7 gromacs/2021.2_gtx1080

gmx_mpi mdrun -s topol.tpr
Type sbatch myscript.sh to submit such a batch script to SLURM. You get the job ID back, and your job will be scheduled and executed automatically.
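A minimal interaction might look like this (the job ID shown is of course only an example):

sbatch myscript.sh
Submitted batch job 123456
squeue -u $USER    # check the state of your jobs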
CPU / GPU Load
There is a whole page dedicated to monitoring CPU and GPU usage; for GROMACS the relevant sections are Live and GPU.
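As a rough sketch (whether you may ssh to a compute node while your job runs depends on the cluster policy, and the node name here is hypothetical):

squeue -u $USER    # find the node your job is running on
ssh n372-001       # log in to that node (hypothetical node name)
top                # live CPU load per process
nvidia-smi         # live GPU utilisation and memory use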
Performance
As an example we ran gmx_mpi mdrun -s topol.tpr with different options, where topol.tpr is just some sample topology; we don't actually care about the result. Without any options GROMACS already runs fine (a). Setting the number of tasks (b, c) is not needed; if set wrongly it can even slow the calculation down significantly (over-provisioning)! Enforcing pinning does not show any effect either (d); we assume that the tasks are already pinned automatically. The only improvement we got was with the -update gpu option (e), which puts more load on the GPU. This might not work, however, if we use more than one GPU.
# | cmd | ns / day | CPU load / % | GPU load / % | notes
---|---|---|---|---|---
a | – | 160 | 100 | 80 | 
b | -ntomp 8 | 160 | 100 | 80 | 
c | -ntomp 16 | 140 | 40 | 70 | GROMACS warning: over-provisioning
d | -pin on | 160 | 100 | 80 | 
e | -update gpu | 170 | 100 | 90 | 
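The fastest variant (e) thus corresponds to a call like the one below; as noted above, offloading the update step to the GPU may not work when more than one GPU is used:

gmx_mpi mdrun -s topol.tpr -update gpu    # run coordinate updates on the GPU as well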