| ====== VSC-3 – supercomputer ====== |
| |
| * Article written by Claudia Blaas-Schenner (VSC Team) <html><br></html>(last update 2019-10-08 by cb). |
| |
| **OUTLINE:** |
| |
| * **VSC – Vienna Scientific Cluster**<html><br></html>$~$ |
| * **Supercomputers for beginners –**<html><br></html>**– introducing VSC-3 to our (new) users** |
| * Supercomputers for beginners – what is a supercomputer ? |
| * VSC-3 – what does it look like ? |
| * VSC-3 – components of a supercomputer |
| * Parallel hardware architectures –<html><br></html>– which parallel programming models can be used ? |
| * VSC-3 compute nodes |
| * VSC-3 node-interconnect |
| * VSC-3 ping-pong – intra-node vs. inter-node |
| |
| |
| |
| |
| ---- |
| |
| ====== VSC – Vienna Scientific Cluster ====== |
| |
| |
| * **The VSC is** a joint high performance computing (HPC) facility of Austrian universities. |
| * **Our mission:** Within the limits of available resources we satisfy the HPC needs of our users. |
| * **VSC is primarily devoted to research.** |
| * **Who can use VSC?** Scientific personnel of the partner universities (see http://vsc.ac.at/access); <html><nobr></html>VSC is also open to users<html></nobr></html> from other academic and research institutions. |
| * **Projects** (test, funded, …): Access to VSC is granted on the basis of **peer-reviewed projects**. |
| * **Project manager** (usually your supervisor): submits the project application and extensions, creates user accounts, … |
| * **Publications**: Please [[http://typo3.vsc.ac.at/access/acknowledgments/|acknowledge VSC]] and [[http://typo3.vsc.ac.at/access/publications-database/|add publications]] <html><font color=#cc3300></html>$~~$➠$~~$<html></font></html> visible on [[http://vsc.ac.at/publications|VSC homepage]] ! |
| |
| |
| ^VSC links: ^Information provided: ^ |
| |<html><font color=#cc3300></html>➠$~~$<html></font></html>**http://vsc.ac.at** |VSC homepage (general info) | |
| |<html><font color=#cc3300></html>➠$~~$<html></font></html>**https://service.vsc.ac.at** |VSC service website (application) | |
| |<html><font color=#cc3300></html>➠$~~$<html></font></html>**https://wiki.vsc.ac.at** |VSC user documentation | |
| |<html><font color=#cc3300></html>➠$~~$<html></font></html>{{:pandoc:introduction-to-vsc:01_supercomputers_for_beginners:vsc3_supercomputer:contact_vsc-red_margin.png?150}} |VSC user support $~$&$~$ contact | |
| |
| |
| * **VSC Training Courses:** <html><br></html><html><font color=#cc3300></html>➠$~~$<html></font></html>**http://vsc.ac.at/training** <html><br></html>**VSC course slides:** <html><br></html><html><font color=#cc3300></html>➠$~~$➠$~~$➠$~~$<html></font></html>**[[https://wiki.vsc.ac.at/doku.php?id=pandoc:introduction-to-vsc:01_supercomputers_for_beginners:00_linux|VSC-Linux]]** <html><br></html><html><font color=#cc3300></html>➠$~~$➠$~~$➠$~~$<html></font></html>**[[https://wiki.vsc.ac.at/doku.php?id=pandoc:introduction-to-vsc:01_supercomputers_for_beginners:00_intro|VSC-Intro]]** |
| |
| |
| |
| |
| |
| |
| ---- |
| |
| ====== Supercomputers for beginners ====== |
| |
| |
| * **What is a supercomputer ?** |
| * A supercomputer is a computer with a high level of computing performance compared to a general-purpose computer. Performance of a supercomputer is measured in floating-point operations per second (FLOPS)… [from Wikipedia] <html><br></html><html><br></html> |
| * **A supercomputer is listed in the [[https://www.top500.org|TOP500]]** |
| |
^System ^ Performance^ TOP500 rank^ GREEN500 rank^ TOP500 #1 (for comparison)^ |
| |VSC-1 (2009) | 35 TFlop/s| 156 (11/2009)| 94 (06/2009)| 1.8 PFlop/s #1 (11/2009)| |
| |VSC-2 (2011) | 135 TFlop/s| 56 (06/2011)| 71 (06/2011)| 8 PFlop/s #1 (06/2011)| |
| |[[https://www.top500.org/system/178471|VSC-3 (2014)]]| 596 TFlop/s| 85 (11/2014)| 86 (11/2014)| 33 PFlop/s #1 (11/2014)| |
| |VSC-3 (………) | 596 TFlop/s| 460 (11/2017)| 175 (11/2017)| 93 PFlop/s #1 (11/2017)| |
| |[[https://www.top500.org/system/179697|VSC-4 (2019)]]| 2.7 PFlop/s| 82 (06/2019)| ——–| 148 PFlop/s #1 (06/2019)| |
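To put the table into perspective, the following minimal sketch expresses each VSC system's performance as a percentage of the #1 system on the same list. The numbers are copied from the table above; the dictionary layout and the function name `share_of_top` are purely illustrative.

```python
# Performance figures copied from the table above, converted to TFlop/s:
# (VSC system, #1 system of the same TOP500 list).
SYSTEMS = {
    "VSC-1 (11/2009)": (35.0, 1_800.0),
    "VSC-2 (06/2011)": (135.0, 8_000.0),
    "VSC-3 (11/2014)": (596.0, 33_000.0),
    "VSC-3 (11/2017)": (596.0, 93_000.0),
    "VSC-4 (06/2019)": (2_700.0, 148_000.0),
}


def share_of_top(system_tflops, top_tflops):
    """Return a system's performance as a percentage of the #1 system."""
    return 100.0 * system_tflops / top_tflops


for name, (own, top) in SYSTEMS.items():
    print(f"{name}: {share_of_top(own, top):.2f} % of the #1 system")
```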
| |
| |
| |
| ---- |
| |
| ====== VSC-3 – what does it look like ? ====== |
| |
| {{:pandoc:introduction-to-vsc:01_supercomputers_for_beginners:vsc3_supercomputer:vsc3.png}} |
| |
| |
| |
| ---- |
| |
| ====== VSC-3 – what does it look like ? – inside ====== |
| |
| {{:pandoc:introduction-to-vsc:01_supercomputers_for_beginners:vsc3_supercomputer:vsc3-inside.png}} |
| |
| |
| |
| ---- |
| |
| ====== VSC-3 – components of a supercomputer ====== |
| |
| {{:pandoc:introduction-to-vsc:01_supercomputers_for_beginners:vsc3_supercomputer:vsc3-schematic.png}} |
| |
| |
| <html><br></html>$~$ |
| |
| * **login nodes** vs. **compute nodes** |
| * **shared** (login, storage) vs. **user exclusive** (compute nodes) |
| |
| |
| |
| ---- |
| |
| ====== Parallel hardware architectures ====== |
| |
| |
| * **how to connect cores (processing units) ?** <html><br></html>{{:pandoc:introduction-to-vsc:01_supercomputers_for_beginners:vsc3_supercomputer:hw-cores_margin.png?150}} |
| |
| {{:pandoc:introduction-to-vsc:01_supercomputers_for_beginners:vsc3_supercomputer:hw-architectures.png}} |
| |
| |
| |
| ---- |
| |
| ====== VSC-3 compute nodes ====== |
| |
| * most nodes are **Intel Xeon Ivy Bridge** (E5-2650 v2 @ 2.60GHz) with **64 GB** of memory, some with 128 / 256 GB, <html><br></html>plus special types of hardware |
| * **1 node** $~$ = $~$ **2 sockets** (CPUs), **8 cores** per socket (P), **2 threads** per core (T1/T2) $~$ + $~$ **2 HCAs** |
| |
| |
| {{:pandoc:introduction-to-vsc:01_supercomputers_for_beginners:vsc3_supercomputer:vsc3-node.png}} |
| |
| |
| |
| |
| * **intra-socket**: 59.7 GB/s (max), **inter-socket** via QPI (QuickPath Interconnect): 32 GB/s (max) |
| * **inter-node** via dual rail Intel QDR-80: 4 GB/s (max) / 3.4 GB/s (eff) per HCA (host channel adapter) |
| * <html><font color=#cc3300></html>Avoiding slow data paths is the key to most performance optimizations! $~~~$ ➠ $~$**Affinity matters!**$~$<html></font></html> |
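The node layout and the bandwidth figures above can be turned into a few lines of arithmetic. This is only a sketch: it counts the processing units of one node and compares the maximum bandwidths of the three data paths. The constant and function names are illustrative and not part of any VSC tool.

```python
# Node layout as stated above: 2 sockets, 8 cores per socket, 2 threads per core.
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 8
THREADS_PER_CORE = 2

# Maximum bandwidths in GB/s as quoted above.
MAX_BANDWIDTH_GBS = {
    "intra-socket": 59.7,
    "inter-socket (QPI)": 32.0,
    "inter-node (per HCA)": 4.0,
}


def processing_units(sockets, cores_per_socket, threads_per_core):
    """Number of hardware threads (processing units) in one node."""
    return sockets * cores_per_socket * threads_per_core


def slowdown(path, reference="intra-socket"):
    """How many times lower the max bandwidth of `path` is than `reference`."""
    return MAX_BANDWIDTH_GBS[reference] / MAX_BANDWIDTH_GBS[path]


print("processing units per node:",
      processing_units(SOCKETS_PER_NODE, CORES_PER_SOCKET, THREADS_PER_CORE))
for path in MAX_BANDWIDTH_GBS:
    print(f"{path}: {slowdown(path):.1f}x lower bandwidth than intra-socket")
```

The ratios show why placement matters: data that has to leave the node through one HCA moves at roughly a fifteenth of the intra-socket maximum.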
| |
| |
| **processing units** (PU#) $~~~$ <html><font color=#cc3300></html> ➠ pinning<html></font></html> <html><br></html>see: [[https://wiki.vsc.ac.at/doku.php?id=pandoc:introduction-to-vsc:05_submitting_batch_jobs:slurm#mpi_ntasks_per_node_pinning|article on SLURM]] and [[https://wiki.vsc.ac.at/doku.php?id=doku:vsc3_pinning|pinning@Wiki]] |
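To illustrate what pinning means at the operating-system level, here is a minimal sketch that restricts the current process to a single processing unit using Python's standard `os.sched_getaffinity` / `os.sched_setaffinity` calls, which are available on Linux only. On the cluster, pinning is normally configured through the batch system and the MPI library as described in the linked articles; the helper name `pin_to_single_cpu` is hypothetical.

```python
import os


def pin_to_single_cpu():
    """Pin the calling process to one of the CPUs it is currently allowed to use.

    Returns the new affinity set, or None on platforms without the
    Linux-only scheduling-affinity API.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)      # pid 0 means "the calling process"
    target = min(allowed)                  # pick the lowest-numbered allowed PU
    os.sched_setaffinity(0, {target})      # restrict the process to that PU
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print("affinity after pinning:", pin_to_single_cpu())
```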
| |
| **memory hierarchy (mem_0064 nodes):** <html><br></html>L1 instruction cache: **32 kB**, private to core <html><br></html>L1 data cache: **32 kB**, private to core <html><br></html>L2 cache: **256 kB**, private to core (unified) <html><br></html>L3 cache: **20 MB**, shared by 8 cores of 1 socket <html><br></html>**memory: 32 GB per socket** |
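As a rough guide to what these cache sizes mean for a program, the sketch below estimates how many double-precision numbers fit into each cache level and the largest square matrix of doubles that would fit. It assumes 1 kB = 1024 bytes and 8 bytes per double, and it ignores everything else competing for cache space, so treat the results as upper bounds.

```python
import math

BYTES_PER_DOUBLE = 8

# Cache sizes from the list above (assumption: 1 kB = 1024 bytes).
CACHE_BYTES = {
    "L1 data": 32 * 1024,
    "L2": 256 * 1024,
    "L3 (shared per socket)": 20 * 1024 * 1024,
}


def doubles_that_fit(n_bytes):
    """Number of 8-byte doubles that fit into n_bytes."""
    return n_bytes // BYTES_PER_DOUBLE


def largest_square_matrix(n_bytes):
    """Largest n such that an n x n matrix of doubles fits into n_bytes."""
    return math.isqrt(doubles_that_fit(n_bytes))


for level, size in CACHE_BYTES.items():
    n = largest_square_matrix(size)
    print(f"{level}: {doubles_that_fit(size)} doubles, up to a {n} x {n} matrix")
```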
| |
| |
| |
| ---- |
| |
| ====== VSC-3 node-interconnect ====== |
| |
| |
| |
| |
| |
| |
| |
| **IB fabric = dual rail Intel QDR-80 = 3-level fat-tree** (BF: 2:1 / 4:1) – schematic figure / numbers only <html><br></html>(BF = blocking factor, the ratio of down-links to up-links; blocking might introduce additional latency) |
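The effect of a blocking factor can be sketched with simple arithmetic. The example below assumes the common reading of a down:up ratio: if all down-links of a switch send traffic upward at the same time, they share the capacity of the fewer up-links. The 4 GB/s link bandwidth is the per-HCA maximum quoted for the compute nodes; the function name and the worst-case model are illustrative assumptions, not measurements of the VSC-3 fabric.

```python
def worst_case_share(link_bandwidth_gbs, down_links, up_links):
    """Per-link share of upward bandwidth (GB/s) when all down-links are busy.

    Assumption: every link has the same bandwidth, and the upward traffic
    of `down_links` links is spread evenly over `up_links` links.
    """
    if down_links <= 0 or up_links <= 0:
        raise ValueError("link counts must be positive")
    return link_bandwidth_gbs * min(1.0, up_links / down_links)


LINK_GBS = 4.0  # max bandwidth per HCA, as quoted for the compute nodes

for down, up in [(1, 1), (2, 1), (4, 1)]:
    share = worst_case_share(LINK_GBS, down, up)
    print(f"BF {down}:{up} -> at most {share:.1f} GB/s per link under full load")
```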
| |
| {{:pandoc:introduction-to-vsc:01_supercomputers_for_beginners:vsc3_supercomputer:vsc3-fabric-3.png}} |
| |
| |
| |
| ---- |
| |
| ====== VSC-3 ping-pong – intra-node vs. inter-node ====== |
| |
| * **1 node** $~$ = $~$ 2 sockets with 8 cores per socket $~$ + $~$ **2 HCAs** |
| * **inter-node** $~$ = $~$ IB fabric = dual rail Intel QDR-80 = 3-level fat-tree (BF: 2:1 / 4:1) |
| * **ping-pong benchmark** $~$ = $~$ module load $~$ intel/16.0.3 $~$ intel-mpi/5.1.3 $~$ | $~$ openmpi/1.10.2 $~$ (1 HCA) |
| |
| |
| <HTML><ul></HTML> |
| <HTML><li></HTML><HTML><p></HTML>**MPI latency & bandwidth (plus typical values for comparison):**<HTML></p></HTML> |
| ^VSC-3: ^ latency [μs] ^ ^ typical values for: ^ latency^ bandwidth^ |
| |<html><font color=#0000ff></html>intra-socket<html></font></html> | <html><font color=#0000ff></html>0.3 μs<html></font></html> | | <html><font color=#696969></html>L1 cache<html></font></html> | <html><font color=#696969></html>1–2 ns<html></font></html>| <html><font color=#696969></html>100 GB/s<html></font></html>| |
| |<html><font color=#6b8e23></html>inter-socket<html></font></html> | <html><font color=#6b8e23></html>0.7 μs<html></font></html> | | <html><font color=#696969></html>L2/L3 c.<html></font></html> | <html><font color=#696969></html>3–10 ns<html></font></html>| <html><font color=#696969></html>50 GB/s<html></font></html>| |
| |<html><font color=#cc3300></html>IB -1- edge<html></font></html> | <html><font color=#cc3300></html>1.4 μs<html></font></html> | | <html><font color=#696969></html>memory<html></font></html> | <html><font color=#696969></html>100 ns<html></font></html>| <html><font color=#696969></html>10 GB/s<html></font></html>| |
| |<html><font color=#ff00ff></html>IB -2- leaf<html></font></html> | <html><font color=#ff00ff></html>1.8 μs<html></font></html> | | <html><font color=#696969></html>HPC networks<html></font></html> | |
| |<html><font color=#ffa500></html>IB -3- spine<html></font></html> | <html><font color=#ffa500></html>2.3 μs<html></font></html> | | <html><font color=#696969></html>(per node / 2 HCAs)<html></font></html> | <html><font color=#696969></html>1–10 μs<html></font></html>| <html><font color=#696969></html>1–8 GB/s<html></font></html>| |
| <HTML></li></HTML><HTML></ul></HTML> |
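How latency and bandwidth together determine the time for one message can be sketched with the simple linear model "time = latency + size / bandwidth". The example uses the IB edge latency from the table (1.4 μs) and the effective inter-node bandwidth per HCA quoted for the compute nodes (3.4 GB/s, assuming 1 GB = 10^9 bytes). The model is a simplification introduced here for illustration and is not the measured ping-pong curve.

```python
def transfer_time(n_bytes, latency_s, bandwidth_bytes_per_s):
    """Time in seconds to send n_bytes: startup latency plus streaming time."""
    return latency_s + n_bytes / bandwidth_bytes_per_s


def effective_bandwidth(n_bytes, latency_s, bandwidth_bytes_per_s):
    """Bandwidth (bytes/s) actually achieved for a message of n_bytes."""
    return n_bytes / transfer_time(n_bytes, latency_s, bandwidth_bytes_per_s)


def half_performance_size(latency_s, bandwidth_bytes_per_s):
    """Message size (bytes) at which half of the asymptotic bandwidth is reached."""
    return latency_s * bandwidth_bytes_per_s


LATENCY_S = 1.4e-6     # IB -1- edge latency from the table above
BANDWIDTH_BPS = 3.4e9  # effective inter-node bandwidth per HCA (assumed 1 GB = 1e9 bytes)

for size in (8, 1024, 64 * 1024, 4 * 1024 * 1024):
    bw = effective_bandwidth(size, LATENCY_S, BANDWIDTH_BPS)
    print(f"{size:>8} bytes: {bw / 1e9:.3f} GB/s")

print(f"half of the peak bandwidth is reached at about "
      f"{half_performance_size(LATENCY_S, BANDWIDTH_BPS):.0f} bytes")
```

In this model small messages are dominated by latency and large messages by bandwidth, which matches the qualitative shape expected of ping-pong bandwidth plots.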
| |
| |
| |
| {{:pandoc:introduction-to-vsc:01_supercomputers_for_beginners:vsc3_supercomputer:ping-pong-bandwidth.png}} |
| |
| |
| |
| {{:pandoc:introduction-to-vsc:01_supercomputers_for_beginners:vsc3_supercomputer:ping-pong-bandwidth-log.png}} |
| |
| |
| |
| ---- |
| |