Xpra vs. VNC (1)
On the current cluster (“smmpmech.unileoben.ac.at”):
X server (VNC) runs on head node
X clients (=applications, e.g. Fluent) run on compute nodes
X clients communicate with X server
over the “physical” network (InfiniBand)
with the inefficient X protocol
many clients with one server
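Concretely, each X client on a compute node is pointed at the head node's VNC X server through the DISPLAY variable, so every drawing command travels over the network. A minimal sketch (the display number ":1" is an assumption, not necessarily the real one):

```python
import os

# Under the current setup the X server lives on the head node, so an
# application started with this environment renders over the network
# (":1" is an assumed VNC display number):
os.environ["DISPLAY"] = "smmpmech.unileoben.ac.at:1"

print(os.environ["DISPLAY"])  # all X calls of this process now go to the head node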
Xpra vs. VNC (2)
Problems with the current method:
X clients (=applications) die when connection is lost
many clients are displayed on one server (one misbehaving client can block all others)
communication (X server ↔ X client)
takes a lot of CPU power on the head node
can slow down the application (we experienced up to 60% performance loss with Fluent)
this is why minimizing the Fluent window helps (no graphics updates ⇒ no communication)
Xpra vs. VNC (3)
On the new cluster we have:
X servers (Xpra) run on compute nodes
X clients (applications) run on compute nodes
client and server communicate directly on the same machine
to see the actual output you must attach to the Xpra server with an Xpra client
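For example, a session might be started and attached to roughly as follows; the two commands run on different machines in practice, and the display number, node name, and application name are assumptions (the xpra subcommands themselves are standard):

```python
import subprocess

DISPLAY = ":100"   # assumed free display number on the compute node
NODE = "node01"    # assumed compute-node hostname

# On the compute node: start an Xpra server and launch the application
# inside it ("fluent" assumed to be on PATH). The application keeps
# running even when nobody is attached.
subprocess.run(["xpra", "start", DISPLAY, "--start-child=fluent"], check=True)

# On the user's workstation: attach an Xpra client over SSH to see the output.
subprocess.run(["xpra", "attach", f"ssh:{NODE}{DISPLAY}"], check=True)
```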
Xpra vs. VNC (4)
Problems solved by this method:
X clients (=applications) no longer die when connection is lost
misbehaving X clients can only block their own server
communication (X server ↔ X client) stays on the compute node ⇒ fast
communication (Xpra server ↔ Xpra client) goes over the network, but uses Xpra's efficient protocol (compressed screen updates) instead of raw X traffic
whole nodes vs. partial nodes (1)
When users can allocate only whole nodes:
everything is easier for the admins
jobs cannot get in each other's way on the same node
memory is given implicitly by the number and kind of nodes
no fragmentation and none of the problems that come with it
whole nodes vs. partial nodes (2)
But there are also disadvantages with whole nodes:
jobs that need only a few cores leave the rest of the node's cores and memory idle
Therefore we have decided to allow the use of partial nodes.
why must the memory be specified?
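Once several jobs can share a node, the scheduler can only avoid oversubscribing RAM if every job declares its memory requirement: a job fits on a node only if both the free cores and the free memory suffice. A minimal sketch of that check (node size and job data are assumed values):

```python
# Assumed node size and jobs already running on it (cores, mem_gb):
NODE_CORES, NODE_MEM_GB = 16, 64
running = [(4, 40), (2, 8)]

def fits(cores: int, mem_gb: int) -> bool:
    """A new job fits only if both free cores AND free memory suffice."""
    used_cores = sum(c for c, _ in running)
    used_mem = sum(m for _, m in running)
    return (used_cores + cores <= NODE_CORES and
            used_mem + mem_gb <= NODE_MEM_GB)

print(fits(4, 8))   # True: 10 cores, 56 GB used afterwards
print(fits(4, 32))  # False: memory would be oversubscribed despite free cores
```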
how does fair share scheduling work?
as long as there are free resources (e.g. cores): no effect can be seen
as soon as jobs have to compete for resources: jobs of users who have consumed less of their predefined share get a higher priority (see the sketch below)
in the long run this allocates resources according to the predefined percentages
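A hedged sketch of that ordering, using the classic fair-share factor F = 2^(-usage/share) (the formula of SLURM's multifactor plugin; whether the new cluster uses exactly this formula, and all shares and usages below, are assumptions):

```python
users = {
    # user: (normalized share, normalized recent usage) -- assumed values
    "alice": (0.50, 0.70),  # used more than her 50% share
    "bob":   (0.50, 0.30),  # used less than his 50% share
}

def fairshare_factor(share: float, usage: float) -> float:
    """1.0 with no usage, 0.5 at exactly the share, toward 0 far above it."""
    return 2 ** (-usage / share)

# Sort waiting jobs by fair-share factor: under-served users go first.
for user, (share, usage) in sorted(users.items(),
                                   key=lambda kv: fairshare_factor(*kv[1]),
                                   reverse=True):
    print(f"{user}: factor {fairshare_factor(share, usage):.2f}")
# bob is scheduled first until the usages match the predefined percentages.
```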
why do we count memory for fair share?
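A job that reserves most of a node's memory blocks that node as effectively as one that reserves all of its cores, so usage should be charged on whichever fraction is larger. A sketch of that idea (the charging rule below is an assumption in the style of dominant-resource fairness, not necessarily the cluster's exact accounting):

```python
# Assumed node size and dominant-resource charging rule:
NODE_CORES, NODE_MEM_GB = 16, 64

def charged_usage(cores: int, mem_gb: int, hours: float) -> float:
    """Charge fair-share usage on the dominant resource fraction."""
    frac = max(cores / NODE_CORES, mem_gb / NODE_MEM_GB)
    return frac * hours

print(charged_usage(1, 48, 10.0))  # 7.5: one core, but 75% of the node's RAM
print(charged_usage(8, 8, 10.0))   # 5.0: half the cores dominate here
```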