===== SLURM =====
  
==== Interactive jobs (2) (exercise) ====
  
Alternatively the ''%%salloc%%'' command can be used:
  
<code>
salloc -N 1 -J test -p E5-2690v4 --qos E5-2690v4-batch --mem=10G
</code>
Then find out where your job is running:

<code>
squeue -u <username>
</code>
or

<code>
srun hostname
</code>
and connect to it:

<code>
ssh <node>
</code>

----

===== SLURM =====

==== Interactive jobs (2) (exercise) ====

To get direct interactive access to a compute node, try:

<code>
salloc -N 1 -J test -p E5-2690v4 --qos E5-2690v4-batch --mem=10G srun --pty --preserve-env $SHELL
</code>

  
----
#SBATCH --mem=2G
</code>
The cores and the requested memory are then exclusively assigned to the processes of this job via cgroups. The current policy is that if the memory is not specified, the job cannot be submitted and an error will be displayed.
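
As an illustration, a minimal batch script with an explicit memory request could look like the sketch below (the partition and QOS names are reused from the ''%%salloc%%'' examples above and are assumptions; adapt them and the resource values to your own job):

<code>
#!/bin/bash
#SBATCH -J test
#SBATCH -N 1
#SBATCH -p E5-2690v4
#SBATCH --qos E5-2690v4-batch
#SBATCH --mem=2G

srun hostname
</code>
Submit it with ''%%sbatch <scriptname>%%''.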


----

===== SLURM: memory =====

  * you **have to** specify memory
  * SLURM does not accept your job without a memory specification
  * choose the right amount of memory (see the example after this list):
    * not too little
    * not too much
  * too **little** memory:
    * can lead to very low speed because of swapping
    * can lead to the job crashing (experienced with Abaqus)
  * too **much** memory:
    * does not hurt performance and does not kill your job
    * but it costs you more of your fair share
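
One way to find a reasonable value (a sketch; the exact accounting fields available depend on the site configuration) is to compare the requested and the actually used memory of a finished job with ''%%sacct%%'':

<code>
sacct -j <jobid> --format=JobID,JobName,ReqMem,MaxRSS,Elapsed,State
</code>
''%%MaxRSS%%'' shows the peak memory the job actually used, ''%%ReqMem%%'' what was requested; if they differ a lot, adjust ''%%--mem%%'' for the next run.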


----

===== SLURM: memory =====

==== Why have this annoying feature anyway? ====

  * because of shared usage of nodes
  * if nodes were always allocated **exclusively**, this would not be necessary
  
  
  
----

  