This version is outdated by a newer approved version. This version (2022/06/20 09:01) was approved by msiegel.
Storage infrastructure
Storage targets
- Several storage targets are available
  - $HOME
  - $TMPDIR
  - $BINFS
  - $BINFL
  - $DATA
- For different purposes
  - Random I/O
  - Small files
  - Huge files / streaming data
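Which of these variables are defined depends on the system you are logged in to. A minimal sketch for checking this, assuming a shell where `printenv` is available; the function name `show_storage_targets` is made up for illustration:

```shell
#!/usr/bin/env bash
# Print each storage variable named on this page and the path it points to.
# Variables that are not set in the current environment are reported as such.
show_storage_targets() {
    local name value
    for name in HOME TMPDIR BINFS BINFL DATA; do
        # printenv prints nothing and returns non-zero if the variable is not set
        value=$(printenv "$name" || true)
        if [ -n "$value" ]; then
            printf '%-7s %s\n' "$name" "$value"
        else
            printf '%-7s (not set)\n' "$name"
        fi
    done
}

show_storage_targets
```

The output has one line per variable, so it is easy to see at a glance which targets exist on the current cluster.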
The HOME Filesystem (VSC-3)
- Use for jobs that are not I/O intensive
- Basically NFS exports over InfiniBand (no RDMA)
- Logical volumes of projects are distributed among the servers
- Each logical volume belongs to 1 NFS server
- Accessible with the $HOME environment variable
  - /home/lv70XXX/username
The HOME Filesystem (VSC-4)
- Use for software and job scripts
- Default quota: 100GB
- Accessible with the $HOME environment variable (VSC-4)
  - /home/fs70XXX/username
- Also available on VSC-3
  - /gpfs/home/fs70XXX/username
- Check quota
mmlsquota --block-size auto -j home_fs70XXX home
VSC-4 > mmlsquota --block-size auto -j home_fs70824 home
                              Block Limits                      | File Limits
Filesystem type       blocks    quota    limit  in_doubt  grace |  files    quota    limit  in_doubt
home       FILESET     63.7M     100G     100G         0   none |   3822  1000000  1000000         0
The BINFL filesystem
- Specifically designed for Bioinformatics applications
- Use for I/O intensive jobs
- ~ 1 PB Space (default quota is 10GB/project)
- Can be increased on request (subject to availability)
- BeeGFS Filesystem
- Accessible via the $BINFL environment variable
  - $BINFL … /binfl/lv70XXX/username
- Also available on VSC-4
- Check quota
beegfs-ctl --getquota --cfgFile=/etc/beegfs/hdd_storage.d/beegfs-client.conf --gid 70XXX
VSC-3 > beegfs-ctl --getquota --cfgFile=/etc/beegfs/hdd_storage.d/beegfs-client.conf --gid 70824
      user/group     ||           size          ||    chunk files
     name     |  id  ||    used    |    hard    ||  used   |  hard
--------------|------||------------|------------||---------|---------
        p70824| 70824||    5.93 MiB|   10.00 GiB||      574|  1000000
The BINFS filesystem
- Specifically designed for Bioinformatics applications
- Use for very I/O intensive jobs
- ~ 100 TB Space (default quota is 2GB/project)
- Can be increased on request (subject to availability)
- BeeGFS Filesystem
- Accessible via the $BINFS environment variable
  - $BINFS … /binfs/lv70XXX/username
- Also available on VSC-4
- Check quota
beegfs-ctl --getquota --cfgFile=/etc/beegfs/nvme_storage.d/beegfs-client.conf --gid 70XXX
VSC-3 > beegfs-ctl --getquota --cfgFile=/etc/beegfs/nvme_storage.d/beegfs-client.conf --gid 70824
      user/group     ||           size          ||    chunk files
     name     |  id  ||    used    |    hard    ||  used   |  hard
--------------|------||------------|------------||---------|---------
        p70824| 70824||      0 Byte|    2.00 GiB||        0|     2000
The TMP filesystem
- Use for
  - Random I/O
  - Many small files
- Size is up to 50% of main memory
- Data gets deleted after the job
  - Write results to $HOME or $GLOBAL
- Disadvantages
  - Space is consumed from main memory
- Accessible with the $TMPDIR environment variable
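Because data in $TMPDIR is deleted after the job, results have to be copied to permanent storage before the job script ends. A minimal sketch of that pattern, assuming a shell with `mktemp` available; the helper name `run_in_tmpdir` and the result path in the usage comment are hypothetical and not part of the VSC setup:

```shell
#!/usr/bin/env bash
# Sketch: do scratch work in $TMPDIR, then save the results before the job ends.
# Usage: run_in_tmpdir RESULT_DIR COMMAND [ARGS...]
# Example (hypothetical path): run_in_tmpdir "$HOME/results/run1" ./my_program input.dat
run_in_tmpdir() {
    local result_dir="$1" work_dir status=0
    shift
    # Private working directory inside $TMPDIR (falls back to /tmp if unset).
    work_dir=$(mktemp -d "${TMPDIR:-/tmp}/job.XXXXXX") || return 1
    # Run the command with the scratch directory as its working directory.
    ( cd "$work_dir" && "$@" ) || status=$?
    # Copy everything the command produced to permanent storage.
    if mkdir -p "$result_dir" && cp -R "$work_dir"/. "$result_dir"/; then
        rm -rf "$work_dir"
    else
        echo "copy failed, results left in $work_dir" >&2
        return 1
    fi
    # Report the exit status of the command itself.
    return "$status"
}
```

The scratch directory is only removed after the copy has succeeded, so a full or unreachable target does not silently discard the results while the job is still running.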
The DATA Filesystem
- Use for all kinds of I/O
- Default quota: 10TB
  - Extension can be requested
- Accessible with the $DATA environment variable (VSC-4)
  - /data/fs70XXX/username
- Also available on VSC-3
  - /gpfs/data/fs70XXX/username
- Check quota
mmlsquota --block-size auto -j data_fs70XXX data
VSC-4 > mmlsquota --block-size auto -j data_fs70824 data
                              Block Limits                      | File Limits
Filesystem type       blocks    quota    limit  in_doubt  grace |  files    quota    limit  in_doubt
data       FILESET         0   9.766T   9.766T         0   none |     14  1000000  1000000         0
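The quota commands on this page follow two patterns: `mmlsquota` with a fileset named after the filesystem and project for $HOME and $DATA, and `beegfs-ctl` with the project's group ID for $BINFL and $BINFS. As a sketch, a hypothetical helper `quota_command` (not an existing VSC tool) can assemble the command line from the filesystem name and the project number. It only prints the command and does not run it:

```shell
#!/usr/bin/env bash
# Print the quota command for one filesystem and project, following the
# command patterns shown on this page. Nothing is executed here.
# Usage: quota_command FILESYSTEM PROJECT_ID   (FILESYSTEM: home, data, binfl or binfs)
quota_command() {
    local fs="$1" project="$2"
    case "$fs" in
        home|data)
            # GPFS fileset quota: the fileset is named <filesystem>_fs<project>
            echo "mmlsquota --block-size auto -j ${fs}_fs${project} ${fs}"
            ;;
        binfl)
            # BeeGFS quota on the HDD storage, looked up by group ID
            echo "beegfs-ctl --getquota --cfgFile=/etc/beegfs/hdd_storage.d/beegfs-client.conf --gid ${project}"
            ;;
        binfs)
            # BeeGFS quota on the NVMe storage, looked up by group ID
            echo "beegfs-ctl --getquota --cfgFile=/etc/beegfs/nvme_storage.d/beegfs-client.conf --gid ${project}"
            ;;
        *)
            echo "unknown filesystem: $fs" >&2
            return 1
            ;;
    esac
}

quota_command data 70824
# prints: mmlsquota --block-size auto -j data_fs70824 data
```

Printing instead of executing keeps the sketch usable on machines where the quota tools are not installed; on a login node the printed line can be copied and run as is.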
Backup policy
- Backup of user files is solely the responsibility of each user
- Backed up filesystems:
  - $HOME (VSC-3)
  - $HOME (VSC-4)
  - $DATA (VSC-4)
- Backups are performed on a best-effort basis
- Full backup run: ~3 days
- Backups are used for disaster recovery only
- The project manager can exclude the $DATA filesystem from backup