  
The following GPU devices are available:
  
^  Enterprise-grade Tesla k20m (kepler)  ^^
|  Total amount of global memory|4742 MBytes |
|  (13) Multiprocessors, (192) CUDA Cores/MP|2496 CUDA Cores |
  
  
^  Enterprise-grade Tesla m60 (maxwell)  ^^
|  Total amount of global memory|8114 MBytes |
|  (16) Multiprocessors, (128) CUDA Cores/MP|2048 CUDA Cores |
  
  
^  Consumer-grade GeForce GTX 1080 (pascal)  ^^
|  Total amount of global memory|8113 MBytes |
|  (20) Multiprocessors, (128) CUDA Cores/MP|2560 CUDA Cores |
  
  
  * Two nodes, n25-[005,006], with two Tesla k20m (kepler) GPUs each. Host systems are equipped with two Intel Xeon E5-2680 0 @ 2.70GHz CPUs (8 cores each) and 256GB of RAM.
  * One node, n25-007, with two Tesla m60 (maxwell) GPUs. It is equipped with two Intel Xeon E5-2650 v3 CPUs @ 2.30GHz (10 cores each) and 256GB of RAM.
  * Ten nodes, n25-[011-020], with one GTX-1080 (pascal) GPU each; host systems are single-socket 4-core Intel Xeon E5-1620 @ 3.5 GHz with 64GB of RAM.
  * Two shared-private nodes, n25-[021-022], each equipped with eight GTX-1080 (pascal) devices hosted on dual-socket 4-core Intel Xeon E5-2623 systems @ 2.6 GHz with 128GB of RAM.
[[https://github.com/NVIDIA/gdrcopy|gdrdrv]] is loaded by default (also see [[https://devtalk.nvidia.com/default/topic/919381/gdrcpy-problem/|notes]] regarding gtx1080 cards).
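
The per-device figures in the tables above correspond to what the CUDA ''deviceQuery'' sample reports. For a quick check of which GPU model, driver version and memory a node actually provides, ''nvidia-smi'' can be run directly on the node (a minimal sketch, assuming a shell on one of the GPU nodes):
<code>
# on a GPU node (e.g. inside an interactive job): show GPU model,
# driver version, memory and current utilization
nvidia-smi

# list the GPU device files present on the node
ls /dev/nvidia*
</code>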

-------
==== Slurm integration ====
  
There are several partitions; each contains identical compute nodes with the same number of GPUs per node:
  
<code>
gpu_gtx1080single   #  4 cpu cores, one gpu per node
gpu_gtx1080multi    # 16 cpu cores, eight gpus per node
gpu_k20m            # 16 cpu cores, two gpus per node
gpu_m60             # 16 cpu cores, one gpu per node
</code>
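
Which partitions are available and which nodes belong to them can be checked with ''sinfo'' on the login node, e.g. (a sketch; the exact output depends on the current cluster configuration and state):
<code>
# list the GPU partitions with their time limits, node counts and node lists
sinfo -o "%P %l %D %N" | grep gpu
</code>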
  
For each partition an identically named QOS is defined. Slurm usage is, e.g.:

<code>
#SBATCH -p gpu_gtx1080single
#SBATCH --qos gpu_gtx1080single
</code>
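
A complete minimal batch script could then look like the following sketch (job name and the application to run are placeholders; depending on the cluster configuration an additional ''--gres'' request may be needed):
<code>
#!/bin/bash
#SBATCH -J gpu_test                # job name (placeholder)
#SBATCH -N 1                       # one node
#SBATCH -p gpu_gtx1080single       # partition from the list above
#SBATCH --qos gpu_gtx1080single    # identically named QOS

# show which GPU the job was assigned
nvidia-smi

# start the actual application (placeholder)
# ./my_gpu_program
</code>
The script is submitted with ''sbatch jobscript.sh''.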
  
--------------------------------------------------
  
===== Visualization (!!currently not supported!!) =====
  
To make use of a gpu node for visualization you need to perform the following steps.