====== GPU computing and visualization ======

The following GPU devices are available:

^ Tesla c2050 (fermi) ^^
| Total amount of global memory | 2687 MBytes |
| (14) Multiprocessors, (32) CUDA Cores/MP | 448 CUDA Cores |
| GPU Clock rate | 1147 MHz |
| Maximum number of threads per block | 1024 |
| Device has ECC support | Enabled |

^ Tesla k20m (kepler) ^^
| Total amount of global memory | 4742 MBytes |
| (13) Multiprocessors, (192) CUDA Cores/MP | 2496 CUDA Cores |
| GPU Clock rate | 706 MHz |
| Maximum number of threads per block | 1024 |
| Device has ECC support | Enabled |

^ Tesla m60 (maxwell) ^^
| Total amount of global memory | 8114 MBytes |
| (16) Multiprocessors, (128) CUDA Cores/MP | 2048 CUDA Cores |
| GPU Clock rate | 1.18 GHz |
| Maximum number of threads per block | 1024 |
| Device has ECC support | Disabled |

^ Consumer grade GeForce GTX 1080 (pascal) ^^
| Total amount of global memory | 8113 MBytes |
| (20) Multiprocessors, (128) CUDA Cores/MP | 2560 CUDA Cores |
| GPU Clock rate | 1.73 GHz |
| Maximum number of threads per block | 1024 |
| Device has ECC support | Disabled |

  * One node, n25-009, equipped with two Tesla c2050 (fermi) GPUs. The host system includes two Intel Xeon X5650 @ 2.67GHz CPUs with 6 cores each and 24GB of RAM.
  * Two nodes, n25-[005,...], equipped with Tesla k20m (kepler) GPUs.
  * Ten nodes, n25-[011-020], equipped with GeForce GTX 1080 (pascal) GPUs.
  * Two shared-private nodes, n25-[021-022], equipped with GeForce GTX 1080 (pascal) GPUs.

----
==== Slurm integration ====

There is one partition called ''gpu'' comprising all GPU nodes:

<code>
PARTITION AVAIL  TIMELIMIT  NODES  STATE  NODELIST
gpu          up  ...
gpu          up  ...
</code>
and which needs to be specified via:
<code>#SBATCH --partition=gpu</code>

GPU nodes are selected via the **generic resource (''--gres='')** and **constraint (''-C'', ''--constraint='')** options:
  * c2050 (fermi) GPU node: <code>
#SBATCH --gres=gpu:2
#SBATCH -C c2050
</code>
  * k20m (kepler) GPU nodes: <code>
#SBATCH --gres=gpu:2
#SBATCH -C k20m
</code>
  * m60 (maxwell) GPU nodes: <code>
#SBATCH --gres=gpu:1
#SBATCH -C m60
</code>
  * gtx1080 (pascal) GPU nodes: <code>
#SBATCH --gres=gpu:1
#SBATCH -C gtx1080
</code>
  * at idle times of private-shared gtx1080 (pascal) GPU nodes: <code>
#SBATCH -C gtx1080
#SBATCH --gres=gpu:1
</code>

To use a GPU node for computing purposes the quality of service (QoS) ''gpu_compute'' has to be selected:
<code>#SBATCH --qos=gpu_compute</code>
For visualization the corresponding visualization QoS has to be selected:
<code>#SBATCH --qos=...</code>
When a job is submitted to the ''gpu'' partition, the matching QoS has to be specified as well.

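Putting partition, QoS, constraint, and generic resource together, a complete set of job directives might look like the following sketch. The node type ''k20m'', the GPU count, and the executable name are illustrative; adjust them to the hardware you actually request:

```shell
# Generate a minimal Slurm job header for a GPU job (illustrative values).
cat > gpu_job.sh <<'EOF'
#!/bin/bash
#SBATCH -J gpujob
#SBATCH -N 1
#SBATCH --partition=gpu
#SBATCH --qos=gpu_compute
#SBATCH --gres=gpu:1
#SBATCH -C k20m

./my_gpu_app   # placeholder for your executable
EOF
cat gpu_job.sh
```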
----

===== Visualization =====

To make use of a GPU node for visualization you need to perform the following steps.

  - Set a VNC password; this is needed when connecting to the VNC server. **This has to be done only once**: <code>
mkdir ${HOME}/.vnc
vncpasswd
Password: ******
Warning: password truncated to the length of 8.
Verify: ******
Would you like to enter a view-only password (y/n)? n
</code>
  - Allocate GPU nodes with this script: <code>sviz -a</code>
  - Start the VNC server: <code>sviz -r</code>
  - Follow the instructions on the screen and connect from your local machine with a VNC viewer such as ''vncviewer''.

All options for sviz:
<code>
sviz -h
usage: sviz [-h] [-a] [-r] [options]
Parameters:
 -h print this help
 -a allocate gpu nodes
 -r start vnc server on allocated nodes

options for allocating:
 -t set gpu type; default=gtx1080
 -n set gpu count; default=1

options for vnc server:
 -g set geometry; default=1920x1080
</code>

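For example, to allocate two maxwell GPUs and then start a VNC server at a lower resolution, the calls would look as follows. This is a sketch based on the help text above; ''sviz'' exists only on the cluster, so the snippet falls back to printing the intended calls elsewhere:

```shell
# Compose the calls: two m60 (maxwell) GPUs, then a 1280x1024 VNC desktop.
alloc_cmd="sviz -a -t m60 -n 2"
server_cmd="sviz -r -g 1280x1024"
if command -v sviz >/dev/null 2>&1; then
    $alloc_cmd && $server_cmd
else
    # sviz is not installed locally; show what would be run on the cluster.
    printf '%s\n%s\n' "$alloc_cmd" "$server_cmd"
fi
```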
==== Linux ====

On your local (Linux) workstation you can use any VNC client which supports a gateway parameter (usually there is a ''-via'' option, e.g. in TightVNC's ''vncviewer''):
<code>
vncviewer -via <username>@vsc3.vsc.ac.at <node>:1
Password:
Connected to RFB server, using protocol version 3.8
Enabling TightVNC protocol extensions
Performing standard VNC authentication
Password:
Authentication successful
Desktop name "..."
VNC server default format:
  32 bits per pixel.
  Least significant byte first in each pixel.
  True colour: max red 255 green 255 blue 255, shift red 16 green 8 blue 0
Warning: Cannot convert string "..." to type FontStruct
Using default colormap which is TrueColor.
  32 bits per pixel.
  Least significant byte first in each pixel.
  True colour: max red 255 green 255 blue 255, shift red 16 green 8 blue 0
Tunneling active: preferring tight encoding
</code>

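If your VNC client has no ''-via'' option, the same connection can be built with a manual SSH tunnel. VNC display '':N'' listens on TCP port 5900+N, so display '':1'' maps to port 5901; the node name ''n25-011'' below is only an example of a node reported after allocation:

```shell
# VNC display :N listens on TCP port 5900+N, so display :1 maps to port 5901.
display=1
node=n25-011                      # example node name; use the one you were given
port=$((5900 + display))
# Forward the local port through the login node to the allocated GPU node:
echo "ssh -L ${port}:${node}:${port} <username>@vsc3.vsc.ac.at"
# Then, in a second terminal, point the viewer at the forwarded port:
echo "vncviewer localhost:${port}"
```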
You should now see a desktop like this:
{{:...}}

Windows versions of TightVNC are also available.

==== OS X ====

Under OS X it is suggested to use the VNC client linked at [[http://...]].
This is how you can set up the client connection to the VNC server:
  - Set up the connection: {{:...}}
  - Enter your cluster password: {{:...}}
  - Enter your OTP: {{:...}}
  - Enter your VNC password: {{:...}}

A desktop will be displayed on your screen:
{{:...}}

==== VirtualGL ====

Load the module:

<code>
module load VirtualGL/...
</code>

The following variables need to be set:

<code>
export VGL_DISPLAY=:0
export DISPLAY=:1
</code>

To make use of VirtualGL your application needs to be started with ''vglrun'':
<code>vglrun <application></code>

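The pieces above combine as follows; ''glxgears'' stands in for a real OpenGL application and is only an example, and the snippet skips the actual run where VirtualGL is not installed:

```shell
# VGL_DISPLAY points at the X server that owns the GPU (:0 on the node),
# DISPLAY at the VNC X server the rendered frames are sent to (:1 here).
export VGL_DISPLAY=:0
export DISPLAY=:1
echo "VGL_DISPLAY=${VGL_DISPLAY} DISPLAY=${DISPLAY}"
# Run the application through VirtualGL if it is available:
if command -v vglrun >/dev/null 2>&1; then
    vglrun glxgears
else
    echo "vglrun not available on this machine"
fi
```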
----
===== GPU computing =====

==== CUDA ====
CUDA toolkits are available in versions 5.5, 7.5, 8.0.27 and 8.0.61 (providing e.g. the ''nvcc'' compiler) and are accessible by loading the corresponding cuda module:
<code>module load cuda/5.5</code>
or
<code>module load cuda/7.5</code>
or
<code>module load cuda/8.0.27</code>
or
<code>module load cuda/8.0.61</code>

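As a quick check that a loaded toolkit works, a minimal CUDA program can be compiled with ''nvcc''. The file and kernel names below are arbitrary, and the compile step is skipped where no CUDA toolkit is available:

```shell
# Write a minimal CUDA program: one kernel that increments each array element.
cat > cuda_check.cu <<'EOF'
#include <cstdio>

__global__ void inc(int *v, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] += 1;
}

int main() {
    int h[4] = {0, 1, 2, 3};
    int *d;
    cudaMalloc(&d, sizeof(h));
    cudaMemcpy(d, h, sizeof(h), cudaMemcpyHostToDevice);
    inc<<<1, 4>>>(d, 4);
    cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost);
    printf("%d %d %d %d\n", h[0], h[1], h[2], h[3]);  // expect: 1 2 3 4
    cudaFree(d);
    return 0;
}
EOF
# Compile and run only if nvcc is available (i.e. a cuda module is loaded):
if command -v nvcc >/dev/null 2>&1; then
    nvcc -o cuda_check cuda_check.cu && ./cuda_check
else
    echo "nvcc not found; load a cuda module first"
fi
```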
==== Batch jobs ====

To submit batch jobs, a sample job script is provided:
<code>
#!/bin/bash
#SBATCH -J gpucmp
#SBATCH -N 1
#SBATCH --partition=gpu
#SBATCH --qos=gpu_compute
#SBATCH --time=00:10:00
#SBATCH --gres=gpu:1
#SBATCH -C k20m

./<your_gpu_executable>
</code>
Submit the job:
<code>sbatch <job_script></code>

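After submission you usually want the job id for monitoring; sbatch's ''--parsable'' flag prints just the id. The script name is hypothetical, and the snippet only prints the intended command on machines without Slurm:

```shell
# Submit and capture the job id, then inspect it in the queue.
submit_cmd="sbatch --parsable gpu_job.sh"   # hypothetical script name
if command -v sbatch >/dev/null 2>&1; then
    jobid=$($submit_cmd)
    squeue -j "$jobid"
else
    echo "$submit_cmd"
fi
```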
==== Controlling GPU utilization with nvidia-smi ====

The standard way to check whether an application actually makes use of the GPUs, and to what extent, is to call
<code>
nvidia-smi
</code>
or, better, within a separate terminal
<code>
watch nvidia-smi
</code>
For further details please see the man page of ''nvidia-smi''.

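For scripted monitoring, nvidia-smi's ''--query-gpu'' interface prints selected fields as CSV (combine with ''watch'' for repetition); the chosen fields are just an example:

```shell
# Print GPU utilization and memory use as CSV; fall back to showing the
# command on machines without an NVIDIA driver.
query_cmd="nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv"
if command -v nvidia-smi >/dev/null 2>&1; then
    $query_cmd
else
    echo "$query_cmd"
fi
```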
==== CUDA C References ====
{{:...}}
{{:...}}
{{:...}}

==== CUDA Libraries References ====
{{:...}}
{{:...}}
{{:...}}
{{:...}}

==== Additional Docu ====
More details can be found in the directory ''...''.