====== SLURM ======

Contrary to SGE, which was previously employed on VSC-1 and VSC-2, the scheduler on VSC-3, VSC-4, and VSC-5 is [[http://slurm.schedmd.com|SLURM - Simple Linux Utility for Resource Management]].
==== Basic SLURM commands: ====

  * ''sinfo'' gives information about the partitions and the state of the nodes
  * ''sbatch <job_script>'' submits a batch job
  * ''squeue'' shows the queued and running jobs
  * ''scancel <job_id>'' cancels a job
  * ''scontrol'' shows detailed information about jobs and the SLURM configuration
+ | |||
+ | |||
+ | ==== Software Installations and Modules ==== | ||
- | ===== Module environment ===== | + | On VSC-4 and VSC-5, spack is used to install and provide modules, see [[doku: |
In order to set environment variables needed for a specific application, the **module** environment may be used:
When all required/desired modules are loaded, the environment variables for the application are set and it can be called from the job script.
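A typical module session might look as follows (the module names are illustrative placeholders, not necessarily present on the clusters):

```shell
module avail            # list all available modules
module load gcc         # load a module, e.g. a compiler (name is an example)
module list             # show the currently loaded modules
module purge            # unload all modules
```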
- | |||
- | ===== SLURM (Simple Linux Utility for Resource Management) ===== | ||
- | Contrary to the previously on VSC 1 and VSC 2 employed SGE, the scheduler on VSC-3 and VSC-4 is [[http:// | ||
- | |||
- | ==== Basic SLURM commands: ==== | ||
- | * '' | ||
- | * '' | ||
- | * '' | ||
- | * '' | ||
- | * '' | ||
- | |||
==== Node configuration - hyperthreading ====

The compute nodes of VSC-4 are configured with the following parameters in SLURM:
<code>
CoresPerSocket=24
Sockets=2
ThreadsPerCore=2
</code>
And the primary nodes of VSC-5 with:
<code>
CoresPerSocket=64
Sockets=2
ThreadsPerCore=2
</code>
This reflects the fact that <color #cc3300>hyperthreading</color> is activated on all compute nodes and <color #cc3300>two logical cores are available per physical core</color>, i.e., 96 logical cores per VSC-4 node and 256 per VSC-5 node.
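The logical-core counts follow directly from the node parameters above; as a quick sanity check in the shell:

```shell
# logical CPUs per node = Sockets * CoresPerSocket * ThreadsPerCore
sockets=2; threads_per_core=2

cores_per_socket=24    # VSC-4
vsc4=$((sockets * cores_per_socket * threads_per_core))
echo "VSC-4: $vsc4 logical cores per node"    # 96

cores_per_socket=64    # VSC-5
vsc5=$((sockets * cores_per_socket * threads_per_core))
echo "VSC-5: $vsc5 logical cores per node"    # 256
```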
In the batch script hyperthreading is selected by adding the line
<code>
#SBATCH --ntasks-per-core=2
</code>
which allows for 2 tasks per core.
Some codes may experience a performance gain from using all virtual cores, e.g., GROMACS seems to profit. But note that using all virtual cores also leads to more communication and may impact the performance of large MPI jobs.
**NOTE on accounting**: core hours are accounted per physical core, regardless of whether hyperthreading is used.
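As a sketch, the two settings on a single VSC-4 node (48 physical cores) would look like:

```shell
# without hyperthreading: one task per physical core
#SBATCH -N 1
#SBATCH --ntasks-per-node=48
#SBATCH --ntasks-per-core=1

# with hyperthreading: two tasks per physical core
#SBATCH -N 1
#SBATCH --ntasks-per-node=96
#SBATCH --ntasks-per-core=2
```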
==== Node allocation policy ====
On VSC-4 & VSC-5 nodes are allocated exclusively to a single job by default; in addition, there is a set of nodes that accept jobs from multiple users simultaneously (shared nodes).
Depending on the demands of a certain application, a
[[doku:vsc5_queue|partition (grouping hardware according to its type) and
quality of service (QOS; defining the run time etc.)]] can be selected.
Additionally, a default partition and QOS are applied if none are requested explicitly.
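Partition and QOS are requested in the job script; a sketch (the name ''skylake_0096'' is only an example, check the queue pages for valid values):

```shell
#SBATCH --partition=skylake_0096    # example partition name
#SBATCH --qos=skylake_0096          # example QOS name
```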
==== The job submission script====

It is recommended to write the job script using a text editor on the VSC Linux cluster.
Editors in //Windows// may add additional invisible characters to the job file which render it unreadable and thus prevent it from being executed.

Assume a submission script ''check.slrm'':
<code>
#!/bin/bash
#
#SBATCH -J chk
#SBATCH -N 2
#SBATCH --ntasks-per-node=48
#SBATCH --ntasks-per-core=1
#SBATCH --mail-type=BEGIN    # send an email when the job starts
#SBATCH --mail-user=<email@address.at>

# when srun is used, you need to set:
srun -l -N2 -n96 a.out
# or
mpirun -np 96 a.out
</code>
  * **-J** job name
  * **-N** number of nodes requested
  * **-n, --ntasks=<number>** specifies the number of tasks to run
  * **--ntasks-per-node** number of tasks per node
  * **--ntasks-per-core** number of tasks per core
  * **--mail-type** event types for email notification (BEGIN, END, FAIL, REQUEUE, ALL)
  * **--mail-user** sends an email to this address
In order to send the job to specific queues, see [[doku:vsc4_queue|Queue/Partition setup on VSC-4]].
====Job submission====

<code>
[username@l42 ~]$ sbatch check.slrm      # submit the job
[username@l42 ~]$ squeue -u `whoami`     # check the status of your own jobs
[username@l42 ~]$ scancel <job_id>       # cancel a job, where the job id
                                         # is obtained from the previous command
</code>
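When submitting from scripts it can be handy to capture the job id directly; a sketch (requires a SLURM installation, so shown untested):

```shell
jobid=$(sbatch --parsable check.slrm)    # --parsable prints only the job id
squeue -j "$jobid"                       # status of exactly this job
scancel "$jobid"                         # cancel it by id
```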
- | |||
- | |||
- | |||
- | |||
- | |||
- | ====A word on srun and mpirun:==== | ||
- | Currently (27th March 2015), **srun** only works when the application uses **intel mpi** and is compiled with the **intel compiler**. We will provide compatible versions of MVAPICH2 and OpenMPI in the near future. | ||
- | At the moment, it is recommended to use **mpirun** in case of MVAPICH2 and OpenMPI. | ||
- | |||
==== Hybrid MPI/OMP: ====

See also the [[https://slurm.schedmd.com/|SLURM documentation]].
<code>
#!/bin/bash
#
#SBATCH -J chk
#SBATCH -N 4
#SBATCH --ntasks-per-node=48
#SBATCH --ntasks-per-core=1

scontrol show hostnames $SLURM_NODELIST

srun -l -N2 -r0 -n96 job1.scrpt &
srun -l -N2 -r2 -n96 job2.scrpt &
wait
srun -l -N2 -r2 -n96 job3.scrpt &
srun -l -N2 -r0 -n96 job4.scrpt &
wait
</code>
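Here ''-r'' gives the relative node offset within the 4-node allocation, so ''-N2 -r0'' uses nodes 1 and 2 while ''-N2 -r2'' uses nodes 3 and 4. A self-contained sketch with a mock host list (node names are hypothetical):

```shell
# mock output of `scontrol show hostnames $SLURM_NODELIST` for 4 nodes
hosts=$'n0101\nn0102\nn0103\nn0104'

r0=$(echo "$hosts" | sed -n '1,2p')    # the nodes -N2 -r0 would use
r2=$(echo "$hosts" | sed -n '3,4p')    # the nodes -N2 -r2 would use

echo "$r0"
echo "$r2"
```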
<code>
#!/bin/bash
#
#SBATCH -J par                  # job name
#SBATCH -N 2                    # number of nodes=2
#SBATCH --ntasks-per-node=48    # uses all cpus of one node
#SBATCH --ntasks-per-core=1
#SBATCH --threads-per-core=1

rm machines_tmp
tasks_per_node=48               # change number accordingly
nodes=2                         # change number accordingly
# append each hostname tasks_per_node times to the machines file
for ((line=1; line<=nodes; line++)); do
    host=$(scontrol show hostnames $SLURM_NODELIST | sed -n "${line}p")
    for ((i=1; i<=tasks_per_node; i++)); do
        echo "$host" >> machines_tmp
    done
done
</code>
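The machines-file loop can be tried out locally by replacing the ''scontrol'' call with a mock host list (hostnames are hypothetical):

```shell
# mock node list instead of `scontrol show hostnames $SLURM_NODELIST`
hosts="n0101 n0102"
tasks_per_node=4        # 48 on a real VSC-4 node

rm -f machines_tmp
for host in $hosts; do
    # one line per task on this node
    for ((i=1; i<=tasks_per_node; i++)); do
        echo "$host" >> machines_tmp
    done
done

wc -l < machines_tmp    # 8 = nodes * tasks_per_node
```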
  - continue at 2. for further dependent jobs
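The submit-then-resubmit cycle can also be automated with sbatch's ''--dependency'' flag (script names are placeholders; requires SLURM, so shown untested):

```shell
jobid1=$(sbatch --parsable job1.slrm)
# start job2 only after job1 has terminated
jobid2=$(sbatch --parsable --dependency=afterany:$jobid1 job2.slrm)
# continue the chain for further dependent jobs
sbatch --dependency=afterany:$jobid2 job3.slrm
```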
===== Prolog Error Codes =====