pandoc:parallel-io:03_storage_infrastructure:storage_infrastructure (current revision: 2020/10/20 09:13, Pandoc Auto-commit by pandoc)
      * 2x Gigabit Ethernet
      * 1x Infiniband QDR
    * Login Nodes, Head Nodes, Hypervisors, etc.
  
====== VSC2 - File Systems ======
    * Scratch (/fhgfs/nodeName)
    * TMP (/tmp)
  * Change Filesystem with 'cd'
    * cd /global # Go into Global
    * cd ~ # Go into Home
  * Use your application's settings to set in/out directories
  * Pipe the output into your home with ./myApp.sh > ~/out.txt 2>&1
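The order of the two redirections matters. A minimal sketch of the difference, using a throwaway stand-in script under /tmp (an illustration, not an actual VSC application):

```shell
# Create a throwaway stand-in for a job script (illustrative only).
printf '#!/bin/sh\necho out\necho err >&2\n' > /tmp/myApp.sh
chmod +x /tmp/myApp.sh

# Redirect stdout to the file first, then duplicate stderr onto it,
# so both streams end up in out.txt.
/tmp/myApp.sh > /tmp/out.txt 2>&1

cat /tmp/out.txt   # contains both "out" and "err"
```

With the reversed order (`2>&1 > ~/out.txt`) stderr would still go to the terminal, because redirections are applied left to right.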
  
====== VSC - Fileserver ======
  
{{.:fileserver.jpg}}
  
====== VSC - Why different Filesystems ======
      * Do some testing
      * Write (small) Log Files
      * etc.
  
====== VSC - Why different Filesystems ======
  
  * Small I/Os
    * Random Access → TMP
    * Others → HOME
  * Large I/Os
    * Random Access → TMP
    * Sequential Access → GLOBAL
  
====== VSC2 - Storage Homes ======
    * 2x Infiniband QDR (Dual-Rail)
    * 2x Gigabit Ethernet
  * Login Nodes, Head Nodes, Hypervisors, Accelerator Nodes, etc.
  
====== VSC3 - File Systems ======
  * 1 Metadata Target per Server
    * 2 SSDs each (RAID-1 / Mirrored)
  * Up to 20,000 MB/s throughput
  * ~600 TB capacity
  
    * RAID (Spectrum Scale RAID)
    * Fast rebuild
    * ...
  
====== VSC3 - EODC Filesystem ======
  
  * VSC3 ↔ EODC
    * 4 IBM ESS Servers (~350 spinning disks each)
    * Running GPFS 4.2.3
====== VSC3 - EODC Filesystem ======
  
  * VSC3 is a "remote cluster" for the EODC GPFS filesystem
    * 2 Management servers
      * Tie-Breaker disk for quorum
====== VSC3 - Bioinformatics ======
  
  * VSC3 got a "bioinformatics" upgrade in late 2016
    * 17 Nodes
      * 2x Intel Xeon E5-2690 v4 @ 2.60 GHz (14 cores each / 28 with hyperthreading)
====== VSC3 - Bioinformatics ======
  
{{.:binf.jpg}}
  
====== VSC3 - BINFL Filesystem ======
====== Storage Performance ======
  
{{.:vsc3_storage_performance.png}}
  
====== Temporary Filesystems ======
      * via a self-written, parallelized rsync script
    * Global
      * Metadata is backed up, but if a total disaster happens this won't help.
    * Although nothing has happened for some time, this is high-performance computing and redundancy is minimal
      * Keep an eye on your data. If it's important, you should back it up yourself.
        * Use rsync
  
====== Addendum: Big Files ======
  
  * What does an administrator mean by 'don't use small files'?
    * It depends
    * On /global, fopen → fclose takes ~100 microseconds
    * On the VSC2 Home filesystems, more than 100 million files are stored.
      * A check of which files have changed takes more than 12 hours, without even reading file contents or copying.
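The two figures come from different filesystems, but combining them loosely gives a plausibility check (back-of-envelope arithmetic only): 100 million files at ~100 microseconds per open/close pair is hours of pure metadata work before a single byte of content is read.

```shell
# Back-of-envelope: 100 million files * ~100 us per open/close pair.
files=100000000
us_per_file=100
total_us=$((files * us_per_file))
total_s=$((total_us / 1000000))
echo "${total_s} seconds (~$((total_s / 3600)) hours)"   # prints "10000 seconds (~2 hours)"
```

That is metadata time alone; a real changed-files scan also stats directories and contends with other users, so the observed >12 hours is consistent.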
    * Storage works well when the blocksize is reasonable for the storage system (on VSC3 a few megabytes are enough)
    * Do not create millions of files
      * If every user had millions of files we'd run into some problems
    * Use 'tar' to archive unneeded files
      * tar -cvpf myArchive.tar myFolderWithFiles/
      * tar -cvjpf myArchive.tar.bz2 myFolderWithFiles/ # Uses bzip2 compression
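A runnable round trip of the tar commands above (the folder and file names under /tmp are illustrative):

```shell
# Make a tiny folder, archive it, list the archive, and restore it.
mkdir -p /tmp/myFolderWithFiles
echo data > /tmp/myFolderWithFiles/a.txt

# -C changes directory first so the archive stores relative paths.
tar -C /tmp -cpf /tmp/myArchive.tar myFolderWithFiles
tar -tf /tmp/myArchive.tar            # list contents without extracting

mkdir -p /tmp/restore
tar -C /tmp/restore -xpf /tmp/myArchive.tar
cat /tmp/restore/myFolderWithFiles/a.txt   # prints "data"
```

One archive of a million small files is a single object for the metadata servers, which is why administrators prefer it over the loose files.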