Xpra vs. VNC (1)
On the current cluster (“smmpmech.unileoben.ac.at”):
- X server (VNC) runs on head node
- X clients (=applications, e.g. Fluent) run on compute nodes
- X clients communicate with X server
- over the “physical” network (InfiniBand)
- with the inefficient X protocol
- many clients with one server
Xpra vs. VNC (2)
Problems with the current method:
- X clients (=applications) die when the connection is lost
- therefore the head node cannot be rebooted without killing jobs
- sometimes VNC crashes or gets stuck
- many clients are displayed on one server
- one misbehaving client can block the server
- communication (X server ↔ X client)
- takes a lot of CPU power on head node
- can slow down the application (experienced up to 60% performance loss with Fluent)
- that is why minimizing the Fluent window helps (no graphics updates ⇒ no communication)
Xpra vs. VNC (3)
On the new cluster we have:
- X servers (Xpra) run on compute nodes
- one server per application
- X clients (applications) run on compute nodes
- client and server communicate directly on the same machine
- each client with its own server
- no “physical” network involved
- to see the actual output you must attach to the Xpra server with an Xpra client
- use sbatch+display to submit and display a job
- use display-all to display the graphical output of all your jobs
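A hedged usage sketch; the exact way these site-specific wrapper commands take their arguments is an assumption here, so check the cluster documentation:

  # submit a job script and have its graphical output made available via Xpra
  # (assuming the wrapper accepts a job script just like sbatch does)
  sbatch+display my_fluent_job.sh

  # later, attach to the graphical output of all of your running jobs
  display-all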
Xpra vs. VNC (4)
Problems solved by this method:
- X clients (=applications) no longer die when connection is lost
- login nodes can be rebooted at any time
- simply detach the Xpra connection when you are not watching it, so that it does not slow down the application (e.g. Fluent)
- misbehaving X clients can only block their own server
- communication (X server ↔ X client) stays on the compute node ⇒ fast
- communication Xpra server ↔ Xpra client
- can be detached and reattached at any time (see the example after this list)
- uses efficient Xpra protocol
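As an illustration only, this is roughly what detaching and reattaching look like with the plain xpra client; the user name, node name and display number below are assumptions (on this cluster the display-all wrapper normally takes care of this):

  # attach to the Xpra server of a job running on a (hypothetical) compute node n042, display :100
  xpra attach ssh://myuser@n042/100

  # detach again without stopping the application (closing the xpra window has the same effect)
  xpra detach ssh://myuser@n042/100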
whole nodes vs. partial nodes (1)
When users can allocate only whole nodes:
- everything is easier for the admins
- no jobs are getting in the way of each other on the same node
- memory is given implicitly by number and kind of nodes
- no need for the user to specify memory in job script
- no fragmentation and problems related to fragmentation
- e.g. partial nodes are free but the user needs a whole node
whole nodes vs. partial nodes (2)
But there are also disadvantages with whole nodes:
- single-core jobs are more complicated for the user
- users must manage the execution of many small jobs on one node themselves
- if a cluster consists of relatively few nodes, its utilization will be worse
Therefore we have decided to allow the use of partial nodes.
why must the memory be specified?
- because of shared node usage (also referred to as “single core jobs”)
- to avoid “killing” nodes by using too much memory (a job script sketch follows below)
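A minimal sketch of a single-core job script on a shared node; the partition name, memory value and program name are assumptions and only illustrate where the memory request goes:

  #!/bin/bash
  #SBATCH --job-name=single-core-example
  #SBATCH --partition=compute      # assumed partition name
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=1
  #SBATCH --mem=4G                 # memory request, required because nodes are shared
  #SBATCH --time=01:00:00

  srun ./my_program                # hypothetical program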
how does fair share scheduling work?
- as long as there are free resources (e.g. cores): fair share has no visible effect
- as soon as jobs have to compete for resources:
- history of user / group comes into play
- scheduler prefers jobs of users/groups which have not yet had their fair share
- in the long run this allocates resources according to the predefined percentages (see the sshare example below)
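The current fair-share factors can be inspected with the standard Slurm command sshare; the exact columns depend on the site configuration:

  # show fair-share information for your own associations
  sshare

  # show all users of your account(s) with more detail
  sshare -a -l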
why do we count memory for fair share?
- not only cores but also memory can “block” nodes
- e.g. 1 core and 120 GB of RAM block an entire E5-2690v4 node and are thus equivalent to 28 cores
- therefore memory must also count (see the configuration sketch below)
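As a sketch of how memory can be made to count, Slurm's TRESBillingWeights can weight memory next to cores; the partition name, node list and weights below are assumptions, not the cluster's actual settings:

  # slurm.conf (sketch): bill both CPUs and memory
  # with a weight of 0.25 per GB, using all the memory of a ~128 GB node
  # is billed roughly like using all of its 28 cores
  PartitionName=compute Nodes=n[001-064] TRESBillingWeights="CPU=1.0,Mem=0.25G"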