  
<code>
sbatch job.sh

Submitted batch job 5250981
</code>
<code>
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
5250981  mem_0128   h5test   markus  R       0:00      2 n323-[018-019]
</code>
Output files:
  
  
{{..:queueing_basics.png?200}}

==== SLURM: Accounts and Users ====

{{..:slurm_accounts.png}}


==== SLURM: Partition and Quality of Service ====

{{..:partitions.png}}


==== VSC-3 Hardware Types ====
  
^partition    ^  RAM (GB)  ^CPU                          ^  Cores  ^  IB (HCA)  ^  #Nodes  ^
|mem_0064*    |     64     |2x Intel E5-2650 v2 @ 2.60GHz|   2x8   |   2xQDR    |   1849   |
|mem_0128     |    128     |2x Intel E5-2650 v2 @ 2.60GHz|   2x8   |   2xQDR    |   140    |
|mem_0256     |    256     |2x Intel E5-2650 v2 @ 2.60GHz|   2x8   |   2xQDR    |    50    |
|vsc3plus_0064|     64     |2x Intel E5-2660 v2 @ 2.20GHz|  2x10   |   1xFDR    |   816    |
|vsc3plus_0256|    256     |2x Intel E5-2660 v2 @ 2.20GHz|  2x10   |   1xFDR    |    48    |
|binf         | 512 - 1536 |2x Intel E5-2690 v4 @ 2.60GHz|  2x14   |   1xFDR    |    17    |

* default partition, QDR: Intel Truescale Infinipath (40Gbit/s), FDR: Mellanox ConnectX-3 (56Gbit/s)

effective: 10/2018

  * + GPU nodes (see later)
  * specify the partition in the job script:

<code>
#SBATCH -p <partition>
</code>
==== Standard QOS ====

^partition    ^QOS          ^
|mem_0064*    |normal_0064  |
|mem_0128     |normal_0128  |
|mem_0256     |normal_0256  |
|vsc3plus_0064|vsc3plus_0064|
|vsc3plus_0256|vsc3plus_0256|
|binf         |normal_binf  |


  * specify QOS in the job script:

<code>
#SBATCH --qos <QOS>
</code>
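A matching pair taken from the two tables above, e.g. for a job on the 128 GB nodes:

<code>
#SBATCH -p mem_0128
#SBATCH --qos normal_0128
</code>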

----

==== VSC-4 Hardware Types ====

^partition^  RAM (GB)  ^CPU                             ^  Cores  ^  IB (HCA)  ^  #Nodes  ^
|mem_0096*|     96     |2x Intel Platinum 8174 @ 3.10GHz|  2x24   |   1xEDR    |   688    |
|mem_0384 |    384     |2x Intel Platinum 8174 @ 3.10GHz|  2x24   |   1xEDR    |    78    |
|mem_0768 |    768     |2x Intel Platinum 8174 @ 3.10GHz|  2x24   |   1xEDR    |    12    |


* default partition, EDR: Intel Omni-Path (100Gbit/s)

effective: 10/2020

==== Standard QOS ====

^partition^QOS     ^
|mem_0096*|mem_0096|
|mem_0384 |mem_0384|
|mem_0768 |mem_0768|



----

==== VSC Hardware Types ====
  
  * Display information about partitions and their nodes:

<code>
sinfo -o %P
scontrol show partition mem_0064
scontrol show node n301-001
</code>
  
==== QOS-Account/Project assignment ====


{{..:setup.png?200}}

1.+2.:
  
<code>
default_account:              p70824
        account:              p70824

    default_qos:         normal_0064
            qos:          devel_0128
                            goodluck
                      gpu_gtx1080amd
                    gpu_gtx1080multi
                   gpu_gtx1080single
                            gpu_k20m
                             gpu_m60
                                 knl
                         normal_0064
                         normal_0128
                         normal_0256
                         normal_binf
                       vsc3plus_0064
                       vsc3plus_0256
</code>
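The same association data can also be queried from the SLURM database; a minimal sketch, assuming the standard ''%%sacctmgr%%'' client is available on the login nodes:

<code>
sacctmgr show user $USER withassoc format=user,defaultaccount,account,defaultqos,qos%40
</code>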
  
<code>
            qos_name total  used  free     walltime   priority partitions
=========================================================================
         normal_0064  1782  1173   609   3-00:00:00       2000 mem_0064
         normal_0256    15    24    -9   3-00:00:00       2000 mem_0256
         normal_0128    93    51    42   3-00:00:00       2000 mem_0128
          devel_0128    10    20   -10     00:10:00      20000 mem_0128
            goodluck     0     0     0   3-00:00:00       1000 vsc3plus_0256,vsc3plus_0064,amd
                 knl     4     1     3   3-00:00:00       1000 knl
         normal_binf    16     5    11   1-00:00:00       1000 binf
    gpu_gtx1080multi                     3-00:00:00       2000 gpu_gtx1080multi
   gpu_gtx1080single    50    18    32   3-00:00:00       2000 gpu_gtx1080single
            gpu_k20m                     3-00:00:00       2000 gpu_k20m
             gpu_m60                 0   3-00:00:00       2000 gpu_m60
       vsc3plus_0064   800   781    19   3-00:00:00       1000 vsc3plus_0064
       vsc3plus_0256    48    44     4   3-00:00:00       1000 vsc3plus_0256
      gpu_gtx1080amd                     3-00:00:00       2000 gpu_gtx1080amd
</code>
naming convention:
^QOS   ^Partition^
|*_0064|mem_0064 |

  
  
  * must be a shell script (first line!)
  * ‘#SBATCH’ for marking SLURM parameters
  * environment variables are set by SLURM for use within the script (e.g. ''%%SLURM_JOB_NUM_NODES%%'')
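A minimal sketch of such a job script (job name, node count, partition and QOS are placeholders to adapt):

<code>
#!/bin/bash
#SBATCH -J myjob
#SBATCH -N 2
#SBATCH -p mem_0064
#SBATCH --qos normal_0064

echo "running on $SLURM_JOB_NUM_NODES node(s)"
</code>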
  
  
==== Bad job practices ====
  
  * job submissions in a loop (takes a long time):
  
<code>
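# illustrative sketch (assumed): one sbatch call per task, i.e. hundreds of separate submissions
for i in {1..1000}
do
    sbatch job.sh $i
done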
</code>
  
  * loop inside job script (sequential mpirun commands):
  
<code>
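# illustrative sketch (assumed): a single job working through all inputs one after the other
for input in data_*.in
do
    mpirun ./my_mpi_program $input
done
</code>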
  
  
==== Array jobs ====
  
  * submit/run a series of **independent** jobs via a single SLURM script
  * each job in the array gets a unique identifier (''%%SLURM_ARRAY_TASK_ID%%'') that can be used to organize the individual workloads
  * example ([[examples/job_array.sh|job_array.sh]]), 10 jobs, SLURM_ARRAY_TASK_ID=1,2,3…10:

<code>
#SBATCH -J array
#SBATCH -N 1
#SBATCH --array=1-10

echo "Hi, this is array job number"  $SLURM_ARRAY_TASK_ID
sleep $SLURM_ARRAY_TASK_ID
</code>
  * independent jobs: 1, 2, 3 … 10
  
<code>
VSC-4 >  squeue -u $user
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
     406846_[7-10]  mem_0096    array       sh PD       0:00      1 (Resources)
          406846_4  mem_0096    array       sh  R    INVALID      1 n403-062
          406846_5  mem_0096    array       sh  R    INVALID      1 n403-072
          406846_6  mem_0096    array       sh  R    INVALID      1 n404-031
</code>
  
<code>
VSC-4 >  ls slurm-*
slurm-406846_10.out  slurm-406846_3.out  slurm-406846_6.out  slurm-406846_9.out
slurm-406846_1.out   slurm-406846_4.out  slurm-406846_7.out
slurm-406846_2.out   slurm-406846_5.out  slurm-406846_8.out
</code>
  
<code>
VSC-4 >  cat slurm-406846_8.out
Hi, this is array job number  8
</code>
  
  
  * fine-tuning via builtin variables (SLURM_ARRAY_TASK_MIN, SLURM_ARRAY_TASK_MAX, …)

  * example of going in chunks of a certain size, e.g. 5, SLURM_ARRAY_TASK_ID=1,6,11,16:

<code>
#SBATCH --array=1-20:5
</code>

  * example of limiting the number of simultaneously running jobs to 2 (e.g. because of limited licenses):

<code>
#SBATCH --array=1-20:5%2
</code>


==== Single core jobs ====

  * use an entire compute node for several independent jobs
  * example: [[examples/single_node_multiple_jobs.sh|single_node_multiple_jobs.sh]]:
  
<code>
for ((i=1; i<=48; i++))
do
   stress --cpu 1 --timeout $i &
done
wait
</code>
  * ‘&’: sends the process into the background, so the script can continue
  * ‘wait’: waits for all background processes to finish; without it the script, and thus the job, would terminate immediately
  
  
==== Combination of array & single core job ====

  * example: [[examples/combined_array_multiple_jobs.sh|combined_array_multiple_jobs.sh]]:
  
<code>
...
#SBATCH --array=1-144:48

j=$SLURM_ARRAY_TASK_ID
((j+=47))

for ((i=$SLURM_ARRAY_TASK_ID; i<=$j; i++))
do
   stress --cpu 1 --timeout $i &
done
wait

</code>
==== Exercises ====

  * files are located in folder ''%%examples/05_submitting_batch_jobs%%''
  * look into [[examples/job_array.sh|job_array.sh]] and modify it such that the considered range runs from 1 to 20 in steps of 5
  * look into [[examples/single_node_multiple_jobs.sh|single_node_multiple_jobs.sh]] and change it to also go in steps of 5
  * run [[examples/combined_array_multiple_jobs.sh|combined_array_multiple_jobs.sh]] and check whether the output is reasonable
  
==== Job/process setup ====

  * normal jobs:
  
^#SBATCH          ^job environment      ^
|-N               |SLURM_JOB_NUM_NODES  |
|--ntasks-per-core|SLURM_NTASKS_PER_CORE|
|--ntasks-per-node|SLURM_NTASKS_PER_NODE|
|--ntasks, -n     |SLURM_NTASKS         |
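A small sketch of how the requested values reappear inside the job (the numbers are placeholders for a two-node job on the 16-core nodes):

<code>
#SBATCH -N 2
#SBATCH --ntasks-per-node=16
#SBATCH -n 32

echo "$SLURM_JOB_NUM_NODES nodes, $SLURM_NTASKS_PER_NODE tasks per node, $SLURM_NTASKS tasks in total"
</code>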
  
  * emails:
  
<code>
#SBATCH -t, --time=<time>
#SBATCH --time-min=<time>
</code>

time format:
  
  * DD-HH[:MM[:SS]]
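For example, a limit of two and a half days in this format:

<code>
#SBATCH --time=2-12:00:00
</code>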
  
  
  
  * backfilling:
    * specify ‘--time’ or ‘--time-min’, which are estimates of the runtime of your job
    * runtimes shorter than the default (mostly 72h) may enable the scheduler to use idle nodes that are waiting for a larger job
  * get the remaining running time of your job:
  
<code>
squeue -h -j $SLURM_JOBID -o %L
</code>
  
  
==== Licenses ====

{{..:licenses.png}}


<code>
VSC-3 >  slic
</code>
Within the SLURM submit script, add the flags as shown by ‘slic’, e.g. when both Matlab and Mathematica are required:

<code>
#SBATCH -L matlab@vsc,mathematica@vsc
</code>
Intel licenses are needed only when compiling code, not for running the resulting executables.
  
==== Reservation of compute nodes ====

  * core-h accounting is done for the entire period of the reservation
  * contact service@vsc.ac.at
  * reservations are named after the project id
  
  
<code>
VSC-3 >  scontrol show reservations
</code>
  * usage:
  
<code>
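# illustrative sketch (assumed): request the reservation, which is named after the project id
#SBATCH --reservation=p70824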
echo "2+2" | matlab
</code>
==== MPI + pinning ====
  
  * understand what your code is doing and place the processes correctly
  * details for pinning: https://wiki.vsc.ac.at/doku.php?id=doku:vsc3_pinning
  
Example: Two nodes with two MPI processes each:
  
=== srun ===

<code>
#SBATCH -N 2
#SBATCH --tasks-per-node=2

srun --cpu_bind=map_cpu:0,24 ./my_mpi_program
  
</code>

=== mpirun ===

<code>
#SBATCH -N 2
#SBATCH --tasks-per-node=2

export I_MPI_PIN_PROCESSOR_LIST=0,24   # Intel MPI syntax
mpirun ./my_mpi_program
</code>
  
----

  