  
==== Node allocation policy ====
On VSC-4 and VSC-5 there is a set of nodes that accepts jobs which do not require entire exclusive nodes (anything from 1 core to less than a full node). These nodes are filled up with jobs from different users until they are full, and they are used automatically for such jobs. All other nodes are assigned completely (and exclusively) to a job whenever the ''-N'' argument is used.
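For illustration, a minimal sketch of the two request types (the job names are placeholders, not taken from this page):

<code>
# variant 1: 4 cores on a shared node (may run other users' jobs too)
#SBATCH -J shared_example
#SBATCH --ntasks=4

# variant 2: one complete node, assigned to this job exclusively
#SBATCH -J exclusive_example
#SBATCH -N 1
</code>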
  
  
  
It is recommended to write the job script using a [[doku:win2vsc&#the_job_filetext_editors_on_the_cluster|text editor]] on the VSC //Linux// cluster or on any Linux/Mac system.
Editors in //Windows// may add invisible characters to the job file which render it unreadable, so that it cannot be executed.
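If a job file was edited under //Windows// anyway, the stray carriage returns can usually be removed on the cluster, for example with ''dos2unix'' (assuming it is installed on the login nodes):

<code>
dos2unix check.slrm   # removes Windows carriage returns (\r) in place
</code>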
  
Assume a submission script ''check.slrm''
</code>
  * **-J**     job name,\\ 
  * **-N**     number of nodes requested\\ 
  * **-n, --ntasks=<number>** specifies the number of tasks to run,
  * **--ntasks-per-node**     number of processes run in parallel on a single node \\ 
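As a short worked example of how these options combine (a sketch assuming 48 cores per node, as in the scripts below): requesting 2 nodes with 48 tasks each starts 96 tasks in total.

<code>
#SBATCH -N 2                   # 2 nodes
#SBATCH --ntasks-per-node=48   # 48 tasks on each node
# total: 2 * 48 = 96 tasks, equivalent to --ntasks=96
</code>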
    
<code>
[username@l42 ~]$ sbatch check.slrm    # to submit the job
[username@l42 ~]$ squeue -u `whoami`   # to check the status of own jobs
[username@l42 ~]$ scancel  JOBID       # for premature removal, where JOBID
                                       # is obtained from the previous command
</code>
  
  
  
  
#SBATCH -J chk
#SBATCH -N 4
#SBATCH --ntasks-per-node=48
#SBATCH --ntasks-per-core=1
  
scontrol show hostnames $SLURM_NODELIST  > ./nodelist
  
srun -l -N2 -r0 -n96 job1.scrpt &
srun -l -N2 -r2 -n96 job2.scrpt &
wait
  
srun -l -N2 -r2 -n96 job3.scrpt &
srun -l -N2 -r0 -n96 job4.scrpt &
wait
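# A note on the srun flags above (a sketch based on standard srun options):
# -l   prefixes each output line with the number of the task that wrote it,
# -N2  runs the job step on 2 of the 4 allocated nodes,
# -r   sets the relative node offset within the allocation, so -r0 places
#      a step on nodes 0-1 and -r2 on nodes 2-3,
# -n96 starts 96 tasks (2 nodes x 48 tasks per node).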
  
#SBATCH -J par                      # job name
#SBATCH -N 2                        # number of nodes=2
#SBATCH --ntasks-per-node=48        # uses all cpus of one node
#SBATCH --ntasks-per-core=1
#SBATCH --threads-per-core=1
rm machines_tmp
  
tasks_per_node=48         # change number accordingly
nodes=2                   # change number accordingly
for ((line=1; line<=nodes; line++))
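# The loop body is elided in this excerpt; a possible sketch, assuming
# ./nodelist holds one hostname per line (created with 'scontrol show
# hostnames' as in the example further up), would append each hostname
# tasks_per_node times to machines_tmp:
#   node=$(sed -n "${line}p" ./nodelist)
#   for ((task=1; task<=tasks_per_node; task++)); do
#     echo $node >> machines_tmp
#   done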
  - continue at 2. for further dependent jobs (see the sketch below)
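A minimal sketch of such a dependency chain (the script names are illustrative):

<code>
# submit the first job and capture its job ID
JOBID=$(sbatch --parsable first.slrm)
# the second job starts only after the first one has terminated
sbatch --dependency=afterany:$JOBID second.slrm
</code>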
  
  
===== Prolog Error Codes =====