This version (2022/06/20 09:01) was approved by msiegel. It is outdated by a newer approved version.


Storage infrastructure

Storage targets

  • Several Storage Targets available
    • $HOME
    • $TMPDIR
    • $BINFS
    • $BINFL
    • $DATA
  • For different purposes
    • Random I/O
    • Small Files
    • Huge Files / Streaming Data
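
As a quick orientation aid, the following minimal sketch prints where each of the storage-target variables listed above points in the current session. The function name `show_storage_targets` is hypothetical, and which variables are actually defined depends on the cluster and login node.

```shell
# Print each storage-target variable and its value, or "(not set)".
# The function name is illustrative only; the variable names come from the list above.
show_storage_targets() {
    for name in HOME TMPDIR BINFS BINFL DATA; do
        # Look up the variable whose name is stored in $name (POSIX-compatible indirection)
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            printf '%-8s %s\n' "$name" "$value"
        else
            printf '%-8s (not set)\n' "$name"
        fi
    done
}
```

Calling `show_storage_targets` on a login node gives one line per target.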

The HOME Filesystem (VSC-3)

  • Use for non-I/O-intensive jobs
  • Basically NFS exports over InfiniBand (no RDMA)
  • Logical volumes of projects are distributed among the servers
    • Each logical volume belongs to 1 NFS server
  • Accessible with the $HOME environment variable
    • /home/lv70XXX/username

The HOME Filesystem (VSC-4)

  • Use for software and job scripts
  • Default quota: 100GB
  • Accessible with the $HOME environment variable (VSC-4)
    • /home/fs70XXX/username
  • Also available on VSC-3
    • /gpfs/home/fs70XXX/username
  • Check quota
    mmlsquota --block-size auto -j home_fs70XXX home
VSC-4 > mmlsquota --block-size auto -j home_fs70824 home
                         Block Limits                                    |     File Limits
Filesystem type         blocks      quota      limit   in_doubt    grace |    files   quota    limit in_doubt
home       FILESET       63.7M       100G       100G          0     none |     3822 1000000  1000000        0 
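
For use in scripts, the quota report can be condensed to a single line. The sketch below assumes the column layout shown in the sample output above (block usage and quota in columns 3 and 4, file count and file quota in columns 9 and 10); the function name `summarize_mmlsquota` is hypothetical, and other `mmlsquota` versions or options may produce a different layout.

```shell
# Condense the FILESET row of an mmlsquota report (read from stdin) into one line.
# Assumes the column layout of the sample output above.
summarize_mmlsquota() {
    awk '$2 == "FILESET" {
        # $1 filesystem, $3 blocks used, $4 block quota, $9 files used, $10 file quota
        printf "%s: %s of %s blocks, %s of %s files\n", $1, $3, $4, $9, $10
    }'
}
```

Usage, with the project number filled in as above: `mmlsquota --block-size auto -j home_fs70XXX home | summarize_mmlsquota`.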

The BINFL filesystem

  • Specifically designed for Bioinformatics applications
  • Use for I/O intensive jobs
  • ~ 1 PB Space (default quota is 10GB/project)
    • Can be increased on request (subject to availability)
  • BeeGFS Filesystem
  • Accessible via $BINFL environment variable
    • $BINFL/binfl/lv70XXX/username
  • Also available on VSC-4
  • Check quota
    beegfs-ctl --getquota --cfgFile=/etc/beegfs/hdd_storage.d/beegfs-client.conf --gid 70XXX
VSC-3 > beegfs-ctl --getquota --cfgFile=/etc/beegfs/hdd_storage.d/beegfs-client.conf --gid 70824
      user/group     ||           size          ||    chunk files    
     name     |  id  ||    used    |    hard    ||  used   |  hard   
--------------|------||------------|------------||---------|---------
        p70824| 70824||    5.93 MiB|   10.00 GiB||      574|  1000000
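
The BeeGFS quota table can be reduced to one line per group in a similar way. This is a sketch that assumes the `|`-separated layout of the sample output above; the function name `summarize_beegfs_quota` is hypothetical and not part of BeeGFS.

```shell
# Condense beegfs-ctl --getquota output (read from stdin) to one line per group.
# Assumes the "|"-separated table layout of the sample output above.
summarize_beegfs_quota() {
    awk -F'|' '
        # Strip leading and trailing blanks from a table cell
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # Data rows are the ones with a numeric id in the second cell;
        # the "||" separators leave cells 3 and 6 empty
        trim($2) ~ /^[0-9]+$/ {
            printf "%s: %s of %s, %s of %s chunk files\n",
                   trim($1), trim($4), trim($5), trim($7), trim($8)
        }'
}
```

Usage: `beegfs-ctl --getquota --cfgFile=/etc/beegfs/hdd_storage.d/beegfs-client.conf --gid 70XXX | summarize_beegfs_quota`.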

The BINFS filesystem

  • Specifically designed for Bioinformatics applications
  • Use for very I/O intensive jobs
  • ~ 100 TB Space (default quota is 2GB/project)
    • Can be increased on request (subject to availability)
  • BeeGFS Filesystem
  • Accessible via $BINFS environment variable
    • $BINFS/binfs/lv70XXX/username
  • Also available on VSC-4
  • Check quota
    beegfs-ctl --getquota --cfgFile=/etc/beegfs/nvme_storage.d/beegfs-client.conf --gid 70XXX
VSC-3 > beegfs-ctl --getquota --cfgFile=/etc/beegfs/nvme_storage.d/beegfs-client.conf --gid 70824
      user/group     ||           size          ||    chunk files    
     name     |  id  ||    used    |    hard    ||  used   |  hard   
--------------|------||------------|------------||---------|---------
        p70824| 70824||      0 Byte|    2.00 GiB||        0|     2000

The TMP filesystem

  • Use for
    • Random I/O
    • Many small files
  • Size is up to 50% of main memory
  • Data gets deleted after the job
    • Write Results to $HOME or $GLOBAL
  • Disadvantages
    • Space is consumed from main memory
  • Accessible with the $TMPDIR environment variable

The DATA Filesystem

  • Use for all kinds of I/O
  • Default quota: 10TB
    • Extension can be requested
  • Accessible with the $DATA environment variable (VSC-4)
    • /data/fs70XXX/username
  • Also available on VSC-3
    • /gpfs/data/fs70XXX/username
  • Check quota
    mmlsquota --block-size auto -j data_fs70XXX data
VSC-4 > mmlsquota --block-size auto -j data_fs70824 data
                         Block Limits                                    |     File Limits
Filesystem type         blocks      quota      limit   in_doubt    grace |    files   quota    limit in_doubt 
data       FILESET           0     9.766T     9.766T          0     none |       14 1000000  1000000        0 

Backup policy

  • Backup of user files is solely the responsibility of each user
  • Backed up filesystems:
    • $HOME (VSC-3)
    • $HOME (VSC-4)
    • $DATA (VSC-4)
  • Backups are performed on a best-effort basis
    • Full backup run: ~3 days
  • Backups are used for disaster recovery only
  • The project manager can exclude the $DATA filesystem from backup

Last modified: 2022/01/04 13:31 by jz