Differences
This shows you the differences between two versions of the page.
doku:slurm_multisite [2022/12/22 22:49] – [Federation Job Submission] fsattari | doku:slurm_multisite [2022/12/22 23:01] – fsattari | ||
---|---|---|---|
Line 110: | Line 110: | ||
* A cluster can only be part of one federation at a time | * A cluster can only be part of one federation at a time | ||
* Embed cluster ID within the originally 32-bit job ID | * Embed cluster ID within the originally 32-bit job ID | ||
- | {{ : | + | {{: |
< | < | ||
Line 131: | Line 131: | ||
CLUSTER: vsc4 | CLUSTER: vsc4 | ||
| | ||
- | '' | + | 67109080 |
CLUSTER: vsc5 | CLUSTER: vsc5 | ||
| | ||
- | //134217981// | + | |
</ | </ | ||
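The job IDs in the listing above (67109080 and 134217981) can be split back into an origin cluster ID and a local job ID. The sketch below assumes the bit layout described in the Slurm federation documentation, where bits 0-25 of the 32-bit job ID hold the local job ID and bits 26-31 hold the origin cluster ID. The function and constant names are hypothetical and chosen only for illustration.

```python
from typing import Tuple

# Assumed layout: the low 26 bits carry the local job ID,
# the bits above them carry the origin cluster ID.
LOCAL_ID_BITS = 26
LOCAL_ID_MASK = (1 << LOCAL_ID_BITS) - 1


def split_federated_job_id(job_id: int) -> Tuple[int, int]:
    """Return (origin_cluster_id, local_job_id) for a federated job ID."""
    cluster_id = job_id >> LOCAL_ID_BITS
    local_id = job_id & LOCAL_ID_MASK
    return cluster_id, local_id


if __name__ == "__main__":
    # The two job IDs that appear in the listing above.
    for job_id in (67109080, 134217981):
        cluster_id, local_id = split_federated_job_id(job_id)
        print(f"{job_id}: origin cluster ID {cluster_id}, local job ID {local_id}")
```

Under that assumption, 67109080 decodes to origin cluster ID 1 with local job ID 216, and 134217981 decodes to origin cluster ID 2 with local job ID 253.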
Line 142: | Line 142: | ||
[root@node]# | [root@node]# | ||
Federation: vscdev_fed | Federation: vscdev_fed | ||
- | Self: vscdev2:X.X.X.X:X ID:2 FedState: | + | Self: vsc4:X.X.X.X:X ID:2 FedState: |
- | Sibling: | + | Sibling: |
</ | </ | ||
- | |||
- | |||
- | ==== Slurm Federation Workflow ==== | ||
- | {{: | ||
===== Multi-Cluster vs Federation implementation ===== | ===== Multi-Cluster vs Federation implementation ===== | ||
- | {{: | ||
At a basic level, multi-cluster provides a single interface for submitting jobs to multiple separate Slurm clusters, and the Slurm database can either be shared or be dedicated to each cluster. Federation, by contrast, unifies the job and scheduling information of the clusters into one, and requires a single shared Slurm database. | At a basic level, multi-cluster provides a single interface for submitting jobs to multiple separate Slurm clusters, and the Slurm database can either be shared or be dedicated to each cluster. Federation, by contrast, unifies the job and scheduling information of the clusters into one, and requires a single shared Slurm database. | ||
- | |||
- | ===== Slurm Burst-Buffer ===== | ||
- | |||
- | I/O components are much slower than the compute parts of a supercomputer, | ||
- | |||
- | Data staging generates a large volume of traffic on the network that connects the compute nodes, because input and output data must be moved between them. Inter-process communication traffic flows over the same network, so the two types of traffic can interfere with each other and degrade network performance. For example, bursts of data-staging traffic increase the latency of inter-process communication. Both types of traffic also compete for network bandwidth, which increases communication time. | ||
- | |||
- | Burst-Buffer plugin adds a layer between the compute nodes and the parallel file system to improve network performance, |