====== Queue | Partition setup on VSC-3+ ======
On VSC-3+, you can select the type of hardware and the quality of service (QOS) your jobs run on. Nodes of the same hardware type are grouped into partitions; the QOS defines the maximum run time of a job as well as the number and type of allocatable nodes.
===== Hardware types =====
Three different types of compute nodes are available: standard nodes with 64 GB or 256 GB of memory, GPU nodes, and bioinformatics nodes (very high memory).

On VSC-3+, the hardware is grouped into so-called <html><font color=#cc3300>&#x27A0; partitions</font></html>:

^partition name^ description^
|vsc3plus_0064 | default, nodes with 64 GB of memory |
|vsc3plus_0256 | nodes with 256 GB of memory |
|gpu_xxxx      | GPU nodes, partition depending on GPU type |
|binf          | bioinformatics nodes |
|jupyter       | reserved for the JupyterHub |
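
The partitions and their node counts can also be queried directly from SLURM, e.g. with ''sinfo'' (the format string below is just one possible choice):
<code>
# show each partition with its node count and memory per node (in MB)
sinfo -o "%.20P %.6D %.8m"
</code>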
===== Quality of service (QOS) =====

Access to node partitions is granted by the so-called <html><font color=#cc3300>&#x27A0; quality of service (QOS)</font></html>. The QOSs constrain the number of allocatable nodes and limit the job wall time. The naming scheme of the QOSs is:
<code><project_type>_<memoryConfig></code>
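For an ordinary project on nodes with 64 GB of memory this yields, for example, ''normal_0064''.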

The QOSs that are assigned to a specific user can be viewed with:
<code>
sacctmgr show user `id -u` withassoc format=user,defaultaccount,account,qos%40s,defaultqos%20s
</code>

Have a look at the <html><font color=green>&#x27A0; </font></html> [[doku:vsc3qos|available QOSs on VSC-3]].
==== Run time limits ====

^ QOS ^ hard run time limit ^
| normal_0064 / normal_0128 / normal_0256 | 72h (3 days)   |
| idle_0064 / idle_0128 / idle_0256       | 24h (1 day)    |
| private queues p....._0...              | 240h (10 days) |
| devel queue (up to 10 nodes available)  | 10min          |
The QOS's run time limits can also be requested via the command
<code>sacctmgr show qos format=name%20s,priority,grpnodes,maxwall,description%40s</code>
SLURM allows setting a run time limit //below// the default run time limit of the QOS. After the specified time has elapsed, the job is killed:
<code>#SBATCH --time=<time></code>
Acceptable time formats include "minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds".
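
For example, to limit a job to one and a half days using the "days-hours:minutes" format:
<code>#SBATCH --time=1-12:00</code>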

==== Backfilling ====

Furthermore, it is possible to set a minimum time limit on the job allocation. When jobs with different resource demands are scheduled, it is likely that not all nodes can be filled. Imagine a job that requests more resources than are currently free and thus cannot start; since it has the highest priority, all other jobs would have to wait. **Backfilling** is the process of filling those idle nodes with jobs that fit into the unused time gap. This permits a time-limited job to be **scheduled earlier than jobs with higher priority**. You are highly encouraged to estimate this minimum time where possible, because it also contributes to better cluster usage:
<code>#SBATCH --time-min=<time></code>
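
A job that can make use of anything between one and three days might, for instance, combine both limits (the values are purely illustrative):
<code>
#SBATCH --time=3-00:00:00      # hard limit, job is killed afterwards
#SBATCH --time-min=1-00:00:00  # minimum acceptable allocation, enables backfilling
</code>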

==== sbatch parameters ====
For submitting jobs, three parameters are important:

<code>
#SBATCH --partition=mem_xxxx
#SBATCH --qos=xxxxx_xxxx
#SBATCH --account=xxxxxx
</code>
The core hours will be charged to the specified account. If not specified, the default account (''sacctmgr show user `id -u` withassoc format=defaultaccount'') will be used.

=== ordinary projects ===

For ordinary projects the QOSs are:
^QOS name ^ gives access to partition ^ description ^
|normal_0064 | mem_0064 | default |
|normal_0128 | mem_0128 | |
|normal_0256 | mem_0256 | |
|devel_0128  | mem_0128 | for development purposes only; 10 min & up to 10 nodes |

== examples ==
<code>
#SBATCH --partition=mem_0128
#SBATCH --qos=normal_0128
#SBATCH --account=p7xxxx
</code>
<code>
#SBATCH --partition=mem_0128
#SBATCH --qos=devel_0128
#SBATCH --account=p7xxxx
</code>
  * Note that partition, QOS, and account have to fit together.
  * If the account is not given, the default account (''sacctmgr show user `id -u` withassoc format=defaultaccount'') will be used.
  * If partition and QOS are not given, the default values mem_0064 and normal_0064 are used.
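
Putting these pieces together, a complete job script for an ordinary project could look like the following minimal sketch (job name, node count, time limit, and the program call are placeholders to adapt):
<code>
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --partition=mem_0128
#SBATCH --qos=normal_0128
#SBATCH --account=p7xxxx
#SBATCH --nodes=1
#SBATCH --time=24:00:00

# replace with your actual program call
./my_program
</code>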

=== private nodes projects ===

== example ==

<code>
#SBATCH --partition=mem_xxxx
#SBATCH --qos=p7xxx_xxxx
#SBATCH --account=p7xxxx
</code>