Differences

This shows you the differences between two versions of the page.

--- doku:slurm_multisite [2022/12/22 22:30] – fsattari
+++ doku:slurm_multisite [2022/12/22 23:01] (current) – fsattari
@@ Line 21: / Line 21: @@
 </code>
-==== Node allocation policy ====
-  * The multi-cluster functionality requires the use of the SlurmDBD.
-  * When sbatch, salloc or srun is invoked with a cluster list, Slurm submits the job to the cluster that offers the earliest start time considering its queue of pending and running jobs
-  * BUT Slurm will make no subsequent effort to migrate the job to a different cluster whose resources become available when running jobs finish before their scheduled end times.
-  * Originally, job IDs are not unique across multiple clusters.
-{{:doku:mcslurm.png?700|}}
@@ Line 70: / Line 58: @@
 CLUSTER: vsc4
              JOBID  PARTITION     NAME     USER   ST       TIME  NODES NODELIST(REASON)
-skylake_0 V_0.2_U_     nobody PD       0:00      1 (Resources)
+skylake_0 V_0.2_U_     nobody PD       0:00     1  n4905-025,n4906-020
-skylake_0 V_0.3_U_     nobody PD       0:00      1 (Priority)
+skylake_0 V_0.3_U_     nobody PD       0:00     1  n4905-025,n4906-020
              .
              .
@@ Line 110: / Line 98: @@
   * A cluster can only be part of one federation at a time
   * Embed cluster ID within the originally 32-bit job ID
+{{:doku:orig-federated-jobid.png?600|}}
-{{:doku:slurmfederation.png?700|}}
 <code>
@@ Line 130: / Line 117: @@
 <code>
 [...]# squeue -M vscdev,vscdev2
-CLUSTER: vscdev
+CLUSTER: vsc4
-             JOBID          PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
+             JOBID          PARTITION     NAME     USER   ST    TIME  NODES  NODELIST(REASON)
-         ** 67109080**      test          test     root  R       0:01      2 storage[02-03]
+             67109080       skylake_0 V_0.3_U_     nobody PD    0:05     3   n4905-025,n4906-020
-CLUSTER: vscdev2
-             JOBID          PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
+CLUSTER: vsc5
-         ** 134217981**     test          test     root  R       0:11      1 storage04
+             JOBID          PARTITION     NAME     USER   ST    TIME  NODES  NODELIST(REASON)
+             134217981      zen3_2048              nobody PD    0:25     8   n3511-[011-013,015-020]
 </code>
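The federated job IDs in the squeue output above (67109080 and 134217981) can be split back into a cluster ID and a local job ID. The following is a minimal sketch, assuming the layout described in Slurm's federation documentation: the low 26 bits hold the local job ID and the high 6 bits hold the ID of the origin cluster. That bit layout and the helper names `split_federated_job_id` and `make_federated_job_id` are assumptions for illustration and are not taken from this page.

```python
# Assumed layout (from Slurm's federation documentation, not from this page):
# bits 0-25 hold the local job ID, bits 26-31 hold the origin cluster ID.
LOCAL_ID_BITS = 26
LOCAL_ID_MASK = (1 << LOCAL_ID_BITS) - 1


def split_federated_job_id(job_id: int) -> tuple[int, int]:
    """Return (origin_cluster_id, local_job_id) for a 32-bit federated job ID."""
    if not 0 <= job_id < 2**32:
        raise ValueError("job ID must fit in 32 bits")
    return job_id >> LOCAL_ID_BITS, job_id & LOCAL_ID_MASK


def make_federated_job_id(cluster_id: int, local_job_id: int) -> int:
    """Combine a cluster ID and a local job ID into one 32-bit job ID."""
    if not 0 <= cluster_id < 2 ** (32 - LOCAL_ID_BITS):
        raise ValueError("cluster ID must fit in 6 bits")
    if not 0 <= local_job_id <= LOCAL_ID_MASK:
        raise ValueError("local job ID must fit in 26 bits")
    return (cluster_id << LOCAL_ID_BITS) | local_job_id


# Job IDs taken from the squeue output above.
for job_id in (67109080, 134217981):
    cluster_id, local_job_id = split_federated_job_id(job_id)
    print(f"{job_id}: origin cluster ID {cluster_id}, local job ID {local_job_id}")
```

Under this assumption the decoded cluster ID identifies the cluster where the job originated, which need not be the cluster under which squeue lists it.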
@@ Line 142: / Line 130: @@
 [root@node]# scontrol show fed --sibling job
 Federation: vscdev_fed
-Self:       vscdev2:X.X.X.X:X ID:2 FedState:ACTIVE Features:synced:yes
+Self:       vsc4:X.X.X.X:X ID:2 FedState:ACTIVE Features:synced:yes
-Sibling:    vscdev:X.X.X.X:X  ID:1 FedState:ACTIVE Features:synced:yes PersistConnSend/Recv:Yes/Yes Synced:Yes
+Sibling:    vsc5:X.X.X.X:X ID:1 FedState:ACTIVE Features:synced:yes PersistConnSend/Recv:Yes/Yes Synced:Yes
 </code>
-==== Slurm Federation Workflow ====
-{{:doku:federationworkflow.png?700|}}
 ===== Multi-Cluster vs Federation implementation =====
-{{:doku:multiclustervsfederation.png?500|}}
 In basic terms, multi-cluster provides a single interface for submitting jobs to multiple separate Slurm clusters, and the Slurm database can be either shared or dedicated to each cluster. Federation, by contrast, unifies job and scheduling information across the clusters and requires a single shared Slurm database.
-===== Slurm Burst-Buffer =====
-I/O components are much slower than the compute parts of a supercomputer, so they can become bottlenecks when their bandwidth is saturated.
-Data staging generates a large volume of traffic on the network that connects the compute nodes, because input and output data must be moved between them. Inter-process communication traffic flows over the same network, so the two kinds of traffic can interfere with each other and degrade network performance. For example, bursts of data-staging traffic increase the latency of inter-process communication, and because both kinds of traffic compete for network bandwidth, communication time increases.
-The Burst-Buffer plugin adds a layer between the compute nodes and the parallel file system to improve network performance, I/O, and data staging.