====== Storage infrastructure ======

  * Article written by Siegfried Reinwald (VSC Team)

====== Storage hardware VSC-3 ======
  * Storage on VSC-3
    * 10 Servers for ''$HOME''
    * 8 Servers for ''$GLOBAL''
    * 16 Servers for ''$BINFL'' / ''$BINFS''
    * ~ 800 spinning disks
    * ~ 100 SSDs
  * ''
  * For different purposes
    * Random I/O
    * Small Files
    * Huge Files / Streaming Data
====== Storage performance ======

{{.:

====== The HOME Filesystem (VSC-3) ======
  * Use for non I/O intensive jobs
  * Basically NFS exports over InfiniBand (no RDMA)
  * Logical volumes of projects are distributed among the servers
  * Each logical volume belongs to one NFS server
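Since each logical volume is served by exactly one NFS server, you can check which server backs your volume from the mount information; a minimal sketch:

<code>
# For an NFS mount, df shows the serving host as server:/export (sketch):
df -h $HOME
</code>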

====== The GLOBAL filesystem ======

  * Can be increased on request (subject to availability)
  * BeeGFS Filesystem
  * Accessible via the ''$GLOBAL'' environment variable
    * ''
    * ''
  * Check quota:
<code>
beegfs-ctl --getquota --cfgFile=/
</code>
<code>
VSC-3 > beegfs-ctl --getquota --cfgFile=/
      user/group     ||           size          ||    chunk files
     name     |  id  ||    used    |    hard    ||  used   |  hard
--------------|------||------------|------------||---------|---------
        p70824| 70824||
</code>
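For reference, a complete invocation could look like the sketch below. The client config path is an assumption (check ''/etc/beegfs/'' on the login nodes for the file that belongs to the ''$GLOBAL'' mount); ''id -g'' prints your primary (project) group id:

<code>
# Sketch only -- the cfgFile path is an assumption; look in /etc/beegfs/
# for the client config that matches the $GLOBAL mount:
beegfs-ctl --getquota --cfgFile=/etc/beegfs/global.d/beegfs-client.conf --gid $(id -g)
</code>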

====== The BINFL filesystem ======
  * Can be increased on request (subject to availability)
  * BeeGFS Filesystem
  * Accessible via ''$BINFL''
    * ''
  * Also available on VSC-4
  * Check quota:
<code>
beegfs-ctl --getquota --cfgFile=/
</code>
<code>
VSC-3 > beegfs-ctl --getquota --cfgFile=/
      user/group     ||           size          ||    chunk files
     name     |  id  ||    used    |    hard    ||  used   |  hard
--------------|------||------------|------------||---------|---------
        p70824| 70824||
</code>

====== The BINFS filesystem ======
  * Can be increased on request (subject to availability)
  * BeeGFS Filesystem
  * Accessible via ''$BINFS''
    * ''
  * Also available on VSC-4
  * Check quota:
<code>
beegfs-ctl --getquota --cfgFile=/
</code>
<code>
VSC-3 > beegfs-ctl --getquota --cfgFile=/
      user/group     ||           size          ||    chunk files
     name     |  id  ||    used    |    hard    ||  used   |  hard
--------------|------||------------|------------||---------|---------
        p70824| 70824||
</code>

====== The TMP filesystem ======
  * Very small files waste main memory (memory-mapped files are aligned to page size)
  * Accessible with the ''$TMPDIR'' environment variable
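A minimal job-script sketch for using the memory-backed TMP area; the application and file names are hypothetical placeholders, and remember that everything written here consumes node memory:

<code>
#!/bin/bash
#SBATCH -J tmp_example
# Stage input into the memory-backed TMP area, work there, copy results back.
# "my_program" and the file names are hypothetical placeholders.
cp $HOME/input.dat $TMPDIR/
cd $TMPDIR
$HOME/my_program input.dat > result.dat
cp result.dat $HOME/
</code>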
+ | |||
+ | ====== Storage hardware VSC-4 ====== | ||
+ | |||
+ | * Storage on VSC-4 | ||
+ | * 1 Server for '' | ||
+ | * 6 Servers for '' | ||
+ | * 720 spinning disks | ||
+ | * 16 NVMEs flash drives | ||
+ | |||
+ | ====== The HOME Filesystem (VSC-4) ====== | ||
+ | |||
+ | * Use for software and job scripts | ||
+ | * Default quota: 100GB | ||
+ | * Accessible with the '' | ||
+ | * / | ||
+ | * Also available on VSC-3 | ||
+ | * / | ||
+ | * Check quota | ||
+ | |||
+ | < | ||
+ | mmlsquota --block-size auto -j home_fs70XXX home | ||
+ | </ | ||
+ | < | ||
+ | VSC-4 > mmlsquota --block-size auto -j home_fs70824 home | ||
+ | Block Limits | ||
+ | Filesystem type | ||
+ | home | ||
+ | |||
+ | </ | ||
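Replace ''70XXX'' with your own project number. If you do not know it offhand, here is a hedged helper, assuming project groups are named ''pNNNNN'' and home filesets ''home_fsNNNNN'' as in the example above:

<code>
# Derive the fileset name from the Unix group (assumption: groups are
# named pNNNNN and home filesets home_fsNNNNN):
proj=$(id -gn)                          # e.g. p70824
mmlsquota --block-size auto -j home_fs${proj#p} home
</code>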

====== The DATA Filesystem ======

  * Use for all kinds of I/O
  * Default quota: 10 TB
  * Extension can be requested
  * Accessible with the ''$DATA'' environment variable
    * /
  * Also available on VSC-3
    * /
  * Check quota:
+ | |||
+ | < | ||
+ | mmlsquota --block-size auto -j data_fs70XXX data | ||
+ | </ | ||
+ | < | ||
+ | VSC-4 > mmlsquota --block-size auto -j data_fs70824 data | ||
+ | Block Limits | ||
+ | Filesystem type | ||
+ | data | ||
+ | |||
+ | </ | ||

====== Backup policy ======

  * Backup of user files is **solely the responsibility of each user**
  * [[https://
  * Backed up filesystems:
    * ''
    * ''
    * ''
  * Backups are performed on a best-effort basis
  * Full backup run: ~3 days
  * Backups are used for **disaster recovery only**
  * The project manager can exclude the $DATA filesystem from backup
    * [[https://

====== Storage exercises ======

In these exercises we try to measure the performance of the different storage targets on VSC-3. For that we will use the “IOR” application (https://github.com/hpc/ior).

“IOR” for these exercises has been built with gcc-4.9 and openmpi-1.10.2, so load these two modules first:

<code>
module purge
module load gcc/4.9 openmpi/1.10.2
</code>

Now copy the storage exercises to your own folder.
<code>
mkdir my_directory_name
cd my_directory_name
cp -r ~training/
</code>

Keep in mind that the results will vary because other users are also working on the storage targets.

====== Exercise 1 - Sequential I/O ======

We will now measure the sequential performance of the different storage targets on VSC-3.

<code>
cd 01_SequentialStorageBenchmark
# Submit the job script
sbatch 01a_one_process_per_target.slrm
# Inspect the corresponding slurm-*.out files
</code>
<code>
# Submit the job script
sbatch 01b_eight_processes_per_target.slrm
# Inspect the corresponding slurm-*.out files
</code>

Take your time and compare the outputs of the two runs. What conclusions can you draw about the storage targets on VSC-3?
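If you want to experiment outside the prepared job scripts, a sequential IOR run can be sketched as follows; the transfer/block sizes, process count, and target path are illustrative assumptions, not the exact settings of the course scripts:

<code>
# Sequential IOR sketch (illustrative values):
#   -w/-r  write the file, then read it back
#   -t 1m  1 MiB transfer size        -b 1g  1 GiB per process
#   -F     one file per process       -o     test file location
mpirun -np 8 ior -w -r -t 1m -b 1g -F -o $GLOBAL/ior_seq_testfile
</code>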

====== Exercise 1 - Sequential I/O performance discussion ======
  * Which storage targets' performance improves with the number of processes? Why?
  * What could you do to further improve the sequential write throughput? What could be a problem with that?
  * Bonus Question: ''

----

====== Exercise 2 - Random I/O ======

We will now measure the random I/O performance of the different storage targets on VSC-3.
<code>
cd 02_RandomioStorageBenchmark
# Submit the job script
sbatch 02a_one_process_per_target.slrm
# Inspect the corresponding slurm-*.out files
</code>
<code>
# Submit the job script
sbatch 02b_eight_processes_per_target.slrm
# Inspect the corresponding slurm-*.out files
</code>

Take your time and compare the outputs of the two runs. Do additional processes speed up the I/O?

Now compare your results to the sequential runs from Exercise 1. What can you conclude about random versus sequential I/O on the VSC-3 storage targets?
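For comparison, a random-access IOR run differs only in a few options; ''-z'' makes IOR use random instead of sequential offsets within the file (sizes and path are again illustrative assumptions, not the course's settings):

<code>
# Random I/O IOR sketch (illustrative values):
#   -z     random instead of sequential offsets within the file
#   -t 4k  small 4 KiB transfers to stress random access
mpirun -np 8 ior -w -r -z -t 4k -b 256m -F -o $GLOBAL/ior_rnd_testfile
</code>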

====== Exercise 2 - Random I/O performance discussion ======

----