doku:gpu [VSC Wiki]

This version (2022/06/20 09:01) was approved by msiegel.

This is an old revision of the document!

Two kinds of GPU nodes are currently available on VSC-1:

Four nodes with two NVIDIA Tesla C2050 GPU cards each. Each Tesla GPU features 448 CUDA cores and 3 GB of GDDR5 memory. The nodes are equipped with two Intel X5650 Westmere 2.67 GHz CPUs and 24 GB RAM.
Two nodes with two NVIDIA Kepler K20m GPU cards each. Each K20m GPU features 2496 CUDA cores and 5 GB of GDDR5 memory. The nodes are equipped with two Intel Xeon E5-2680 2.70 GHz CPUs and 256 GB RAM.

Access to the GPU nodes is managed via two dedicated queues, 'fermi' and 'kepler'. To get access to these queues, please contact system administration and specify the user who needs access to the GPU nodes. The username must belong to an existing VSC user. After the user has been added, the nodes can be accessed interactively via:

#switch user group first, then login
sg fermi             
qrsh -q fermi

or

#switch user group first, then login
sg fermi             
qrsh -q kepler

Alternatively, you can submit a job script with the following parameters in its header:

#$ -q fermi
#$ -P fermi

or

#$ -q kepler
#$ -P fermi

Using the additional parameter '-pe smp 6' is currently not mandatory, but strongly encouraged. Each node has two GPUs and 12 CPU cores, so a job should request half of the cores. This prevents more than two GPU jobs from running on one node at the same time.

Job submission has to be done using the 'qsub.py' wrapper script:

qsub.py job.sh
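
The header parameters and the slot request described above can be combined into one job script. The following is only a minimal sketch: the job name 'gpu_test' and the program './my_gpu_program' are hypothetical placeholders, and '-N' and '-cwd' are standard Grid Engine options that this page does not require.

```shell
#!/bin/bash
# Minimal job script sketch for the GPU nodes.
# Assumptions: "gpu_test" and "./my_gpu_program" are hypothetical names;
# -N (job name) and -cwd (run in the submit directory) are optional.
#$ -N gpu_test
#$ -cwd
#$ -q kepler
#$ -P fermi
# Request 6 of the 12 cores so that at most two such jobs share a node.
#$ -pe smp 6

# Grid Engine sets NSLOTS inside a job; fall back to 6 when run by hand.
slots="${NSLOTS:-6}"
echo "Running on $(hostname) with ${slots} slots"

# Run the program only if it exists, so the sketch is harmless elsewhere.
if [ -x ./my_gpu_program ]; then
    ./my_gpu_program
else
    echo "my_gpu_program not found; nothing to run"
fi
```

Saved as 'job.sh', the script would be submitted with 'qsub.py job.sh' as shown above. For the Tesla nodes, '-q kepler' would be replaced by '-q fermi'.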

The runtime consumed on the GPU nodes will not be deducted from your VSC account.

Available software tools are the nvcc compiler, the CUDA libraries, and the CULA tools. The following variables are set automatically in your environment so that these tools can be used:

CULA_ROOT="/opt/sw/cula"
CULA_INC_PATH="$CULA_ROOT/include"
CULA_BIN_PATH_64="$CULA_ROOT/bin64"
CULA_LIB_PATH_64="$CULA_ROOT/lib64"
PATH=/opt/sw/cuda/bin:$PATH
LD_LIBRARY_PATH=/opt/sw/cuda/lib64:/opt/sw/cuda/computeprof/bin:$CULA_LIB_PATH_64:$LD_LIBRARY_PATH

Example programs can be found in '/opt/sw/cula/examples/'.
Extensive documentation is available in '/opt/sw/cuda/doc/'.
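
Before compiling, it can be useful to confirm that the variables listed above are actually present in the current shell. The following is a small sketch of such a check; the function name 'check_vars' is hypothetical and not part of the VSC setup.

```shell
#!/bin/sh
# Sketch: report which of the given environment variables are set.
# "check_vars" is a hypothetical helper; it stores the number of unset
# variables in the shell variable "missing".
check_vars() {
    missing=0
    for var in "$@"; do
        # Indirect lookup of the variable whose name is stored in $var.
        eval "value=\${$var:-}"
        if [ -n "$value" ]; then
            echo "$var=$value"
        else
            echo "$var is not set"
            missing=$((missing + 1))
        fi
    done
}

check_vars CULA_ROOT CULA_INC_PATH CULA_BIN_PATH_64 CULA_LIB_PATH_64
echo "unset variables: $missing"

# nvcc should be found via PATH on the GPU nodes.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "nvcc not found in PATH"
fi
```

On a GPU node all four variables should be reported with their values. On other machines the script only prints which ones are unset.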

nvidia-smi can be used to monitor GPU usage, e.g.:

[@r18n45 ~]# nvidia-smi -i 1 -q -d UTILIZATION,MEMORY -l

==============NVSMI LOG==============

Timestamp                       : Tue Jun 28 14:09:58 2011

Driver Version                  : 270.40

Attached GPUs                   : 2

GPU 0:84:0
    Memory Usage
        Total                   : 2687 Mb
        Used                    : 101 Mb
        Free                    : 2586 Mb
    Utilization
        Gpu                     : 99 %
        Memory                  : 0 %

For more options, see 'man nvidia-smi'.
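
For scripted monitoring, the individual values can be extracted from a log like the one above with awk. The following sketch works on a copy of that sample output. The patterns are tied to the layout shown here and are an assumption for other driver versions, whose output format may differ.

```shell
#!/bin/sh
# Sample text copied from the nvidia-smi log shown above.
sample_log='    Memory Usage
        Total                   : 2687 Mb
        Used                    : 101 Mb
        Free                    : 2586 Mb
    Utilization
        Gpu                     : 99 %
        Memory                  : 0 %'

# Split each line at the colon and keep only the number after it.
gpu_util=$(printf '%s\n' "$sample_log" |
    awk -F: '/^ *Gpu / { gsub(/[ %]/, "", $2); print $2 }')
mem_used=$(printf '%s\n' "$sample_log" |
    awk -F: '/^ *Used / { gsub(/[^0-9]/, "", $2); print $2 }')

echo "GPU utilization: ${gpu_util} %, memory used: ${mem_used} Mb"
# prints: GPU utilization: 99 %, memory used: 101 Mb
```

On a GPU node, the same awk filters could be applied to the live output by piping 'nvidia-smi -i 1 -q -d UTILIZATION,MEMORY' into them, without the '-l' option so that the command terminates.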

Examples of C and Fortran code can be found here: