doku:slurm — last modified 2024/07/11 08:52 by grokyta
ThreadsPerCore=2
</code>
This reflects the fact that <color #cc3300>hyperthreading</color> is activated on all compute nodes and <color #cc3300>96 cores on VSC-4 and 256 cores on VSC-5</color> may be utilized on each node.
In the batch script hyperthreading is selected by adding the line
<code>
#SBATCH --ntasks-per-core=2
</code>
which allows for 2 tasks per core.
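Putting this together, a minimal sketch of a job script with hyperthreading enabled could look as follows (assuming one VSC-4 node with 48 physical cores, i.e. 96 hardware threads; ''my_program'' is a placeholder for your executable):
<code>
#!/bin/bash
#SBATCH -J ht_job
#SBATCH -N 1
#SBATCH --ntasks-per-node=96   # 48 physical cores x 2 hardware threads
#SBATCH --ntasks-per-core=2    # enable hyperthreading

srun ./my_program
</code>
On VSC-5 the corresponding value would be ''--ntasks-per-node=256'' (128 physical cores x 2).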
==== Node allocation policy ====
On VSC-4 & VSC-5 there is a set of nodes that accept jobs that do not require entire exclusive nodes.
Depending on the demands of a certain application, the
[[doku:vsc5_queue|partition (grouping hardware according to its type) and
quality of service (QOS; defining the run time etc.)]] can be selected.
Additionally, a run time limit for the job can be specified.
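For instance, partition, QOS and a run time limit are requested with the following directives; the partition and QOS names here are placeholders and must be replaced by values valid for your project:
<code>
#SBATCH --partition=<partition>   # placeholder: choose from the partition list
#SBATCH --qos=<qos>               # placeholder: choose a QOS you are entitled to
#SBATCH --time=08:00:00           # run time limit (hh:mm:ss), capped by the QOS
</code>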
==== The job submission script ====
It is recommended to write the job script using a text editor on the VSC //Linux// cluster itself.
Editors in //Windows// may add additional invisible characters to the job file which render it unreadable and, thus, it cannot be executed.
Assume a submission script ''check.slrm'':
<code>
#SBATCH -J chk
#SBATCH -N 2
#SBATCH --ntasks-per-node=48
#SBATCH --ntasks-per-core=1
#SBATCH --mail-type=BEGIN

# when srun is used, you need to set:
srun -l -N2 -n96 a.out
# or
mpirun -np 96 a.out
</code>
  * **-J** job name
  * **-N** number of nodes requested
  * **-n, --ntasks=<number>** number of tasks to run
  * **--ntasks-per-node** number of tasks per node
  * **--mail-user** sends an email to this address
In order to send the job to specific queues, see [[doku:vsc4_queue|Queue/Partition setup on VSC-4]].
====Job submission====
<code>
[username@l42 ~]$ sbatch check.slrm
[username@l42 ~]$ squeue -u `whoami`
[username@l42 ~]$ scancel JOBID
# JOBID is obtained from the previous command
</code>
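A variant of the above, sketched here with SLURM's ''--parsable'' option (which makes ''sbatch'' print only the job id), stores the id for the later ''squeue''/''scancel'' calls:
<code>
jobid=$(sbatch --parsable check.slrm)   # print only the job id
squeue -j "$jobid"                      # status of this job only
scancel "$jobid"                        # premature removal
</code>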
<code>
#SBATCH -J chk
#SBATCH -N 4
#SBATCH --ntasks-per-node=48
#SBATCH --ntasks-per-core=1

scontrol show hostnames $SLURM_NODELIST
srun -l -N2 -r0 -n96 job1.scrpt &
srun -l -N2 -r2 -n96 job2.scrpt &
wait
srun -l -N2 -r2 -n96 job3.scrpt &
srun -l -N2 -r0 -n96 job4.scrpt &
wait
</code>
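The ''&''/''wait'' pattern above is plain shell job control and can be tried without SLURM; in this sketch, ''echo'' lines stand in for the actual ''srun'' steps, and ''wait'' blocks until both background jobs have finished:

```shell
#!/bin/sh
# run two steps concurrently in the background (stand-ins for srun calls)
( sleep 1; echo "job1 done" ) &
( sleep 1; echo "job2 done" ) &
wait                      # block until both background jobs have exited
echo "first pair done"    # only printed after job1 and job2 are done
```

The same pattern repeats for the second pair of jobs, so the two pairs run one after the other while the jobs within each pair run concurrently.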
<code>
#SBATCH -J par                 # job name
#SBATCH -N 2                   # number of nodes=2
#SBATCH --ntasks-per-node=48   # uses all cpus of one node
#SBATCH --ntasks-per-core=1
#SBATCH --threads-per-core=1

rm machines_tmp
tasks_per_node=48   # change number accordingly
nodes=2
for ((line=1; line<
</code>
  - continue at 2. for further dependent jobs
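Dependent jobs as in the steps above can be chained with SLURM's ''--dependency'' option; a minimal sketch, assuming hypothetical script names ''job1.slrm'' and ''job2.slrm'':
<code>
# submit the first job; --parsable makes sbatch print only the job id
jobid=$(sbatch --parsable job1.slrm)
# job2 starts only after job1 has terminated (afterany: regardless of exit state)
sbatch --dependency=afterany:$jobid job2.slrm
</code>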
===== Prolog Error Codes =====