====== Submitting batch jobs ======

===== SPACK =====

On VSC-4 and VSC-5, spack is used to install and provide software. See [[doku:spack|SPACK - a package manager for HPC systems]].
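
A minimal sketch of finding and loading software via spack (the package name ''openmpi'' is only an example; see the linked page for details):

<code bash>
# list installed variants of a package (package name is an example)
spack find -l openmpi

# load a package into the current shell environment
spack load openmpi
</code>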
  
===== Module environment =====

When all required/intended modules have been loaded, user packages may be compiled as usual.
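
For illustration, a typical sequence might look as follows (the module names are placeholders; check ''module avail'' for what is actually installed):

<code bash>
# list all modules available on the cluster
module avail

# load a compiler and an MPI library (module names are examples only)
module load gcc openmpi

# show what is currently loaded
module list
</code>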
  
===== SLURM (Simple Linux Utility for Resource Management) =====

In contrast to SGE, which was employed on VSC-1 and VSC-2, the scheduler on VSC-3 and VSC-4 is [[http://slurm.schedmd.com|SLURM]].

==== Basic SLURM commands: ====
  * ''[...]$ sinfo'' gives information on which 'queues'='partitions' are available for job submission. Note: what is termed a 'queue' under SGE is called a 'partition' under SLURM.
  * ''[...]$ scontrol'' is used to view the SLURM configuration including: job, job step, node, partition, reservation, and overall system configuration. Without a command entered on the execute line, scontrol operates in an interactive mode and prompts for input. With a command entered on the execute line, scontrol executes that command and terminates.
  * ''[...]$ scontrol show partition'' shows information on available partitions.
  * ''[...]$ squeue'' shows the current list of submitted jobs, their state and resource allocation. [[doku:slurm_job_reason_codes|Here]] is a description of the most important **job reason codes** returned by the squeue command.
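
Taken together, a quick inspection of the system and your own jobs from the login node might look like this (the partition name ''mem_0096'' is only an example):

<code bash>
# list partitions and their current state
sinfo

# show the full configuration of one partition (partition name is an example)
scontrol show partition mem_0096

# list your own jobs with state and resource allocation
squeue -u $USER
</code>
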
==== Node configuration - hyperthreading ====
  
Some codes may experience a performance gain from using all 32 virtual cores; GROMACS, for example, seems to profit. But note that using all virtual cores also leads to more communication and may impact the performance of large MPI jobs.
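
A minimal job script sketch for choosing between the two modes (the job name and program binary are placeholders; ''--hint=nomultithread'' is the standard SLURM option for pinning tasks to physical cores only):

<code bash>
#!/bin/bash
#SBATCH --job-name=ht_example
#SBATCH --nodes=2
# use all 32 virtual cores per node (hyperthreading):
#SBATCH --ntasks-per-node=32
# alternative: restrict the job to the 16 physical cores per node
# by commenting out the line above and activating these instead:
##SBATCH --ntasks-per-node=16
##SBATCH --hint=nomultithread

srun ./my_mpi_program
</code>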
  
**NOTE on accounting**: the project's core-h are always calculated as ''job_walltime * nnodes * 16'' (16 physical cores per node). SLURM's built-in function ''sreport'' yields wrong accounting statistics because (depending on the job script) the multiplier is 32 instead of 16. You may instead use the accounting script introduced in this [[doku:slurm_sacct|section]].
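
For example, a job that runs for 10 hours on 4 nodes is charged ''10 * 4 * 16 = 640'' core-h, regardless of whether 16 or 32 tasks per node were requested.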
  
==== Node allocation policy ====