  
The following GPU devices are available:
  
  
^  Enterprise-grade Tesla k20m (kepler)  ^^
|  Total amount of global memory|4742 MBytes |
|  (13) Multiprocessors, (192) CUDA Cores/MP|2496 CUDA Cores |
  
  
^  Enterprise-grade Tesla m60 (maxwell)  ^^
|  Total amount of global memory|8114 MBytes |
|  (16) Multiprocessors, (128) CUDA Cores/MP|2048 CUDA Cores |
  
  
^  Consumer-grade GeForce GTX 1080 (pascal)  ^^
|  Total amount of global memory|8113 MBytes |
|  (20) Multiprocessors, (128) CUDA Cores/MP|2560 CUDA Cores |
  
  
  * Two nodes, n25-[005,006], with two Tesla k20m (kepler) GPUs each. Host systems are equipped with two Intel Xeon E5-2680 0 @ 2.70GHz CPUs, each with 8 cores, and 256GB of RAM.
  * One node, n25-007, with two Tesla m60 (maxwell) GPUs. n25-007 is equipped with two Intel Xeon E5-2650 v3 @ 2.30GHz CPUs, each with 10 cores, and 256GB of host RAM.
  * Ten nodes, n25-[011-020], with a single GPU of type GTX-1080 (pascal) each; host systems are single-socket, 4-core Intel Xeon E5-1620 @ 3.5 GHz with 64GB RAM.
  * Two shared-private nodes, n25-[021-022], each equipped with 8 GTX-1080 (pascal) devices, hosted on dual-socket, 4-core Intel Xeon E5-2623 systems @ 2.6 GHz with 128GB RAM.
[[https://github.com/NVIDIA/gdrcopy|gdrdrv]] is loaded by default (also see these [[https://devtalk.nvidia.com/default/topic/919381/gdrcpy-problem/|notes]] regarding GTX 1080 cards); this can be verified as shown below.
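A quick check, run directly on a GPU node:

<code>
# confirm the gdrdrv kernel module is loaded
lsmod | grep gdrdrv
# list the GPUs visible to the driver
nvidia-smi
</code>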
-------
==== Slurm integration ====
  
There are several partitions, each grouping identical compute nodes with the same number of GPUs per node:
  
<code>
gpu_gtx1080single   #  4 cpu cores, one gpu per node
gpu_gtx1080multi    # 16 cpu cores, eight gpus per node
gpu_k20m            # 16 cpu cores, two gpus per node
gpu_m60             # 16 cpu cores, one gpu per node
</code>
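The current state of any of these partitions can be inspected with ''sinfo'', e.g. for the single-GPU pascal nodes:

<code>
[user@l31 ~]$ sinfo -p gpu_gtx1080single
</code>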
  
For each partition, an identically named QOS is defined. Slurm usage is, e.g.:

<code>
#SBATCH -p gpu_gtx1080single
#SBATCH --qos gpu_gtx1080single
</code>
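Putting it together, a minimal batch script could look like the following sketch; the job name and the ''nvidia-smi'' smoke test are placeholders, to be replaced with your own application:

<code>
#!/bin/bash
#SBATCH -J gpu_test             # job name (placeholder)
#SBATCH -N 1                    # request one node
#SBATCH -p gpu_gtx1080single
#SBATCH --qos gpu_gtx1080single

# simple smoke test: show the GPU visible to the job
nvidia-smi
</code>

The script is then submitted with ''sbatch''.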
  
--------------------------------------------------
  
===== Visualization (!!currently not supported!!) =====
  
To make use of a GPU node for visualization, you need to perform the following steps.