High performance parallel storage + large memory nodes

17 Bioinformatics nodes (“BINF”-Nodes) are a new addition to VSC-3. They are intended for

Used Hardware

There are 17 nodes and each node consists of

Main memory is

Differences to other VSC-3 Nodes

The nodes are interconnected via Intel Omni-Path. If you run your calculations directly on the BINF nodes and write data to the BINFS or BINFL filesystem, the throughput of the interconnect is up to 2.5 times higher than that of the QDR InfiniBand interconnect of VSC-3.

Quotas / How to get more diskspace

Because of the high throughput rate of the SSDs, very strict quotas are set on the filesystems. If you are eligible to use the BINFS and/or BINFL filesystems, send us an email stating the disk space you need, and we will increase your quota in coordination with the bioinformatics project leaders.

To see the quota limits and the space already used, run the beegfs-ctl command (replace ProjectNumber with the group ID of your project):

# Show Quota on $BINFS
beegfs-ctl --getquota --mount=/binfs --gid ProjectNumber
# Show Quota on $BINFL
beegfs-ctl --getquota --mount=/binfl --gid ProjectNumber

Please note that you have to include the --mount=/binfs or --mount=/binfl option; otherwise the quota for the $GLOBAL filesystem is shown.

For additional info refer to BeeGFS quotas.
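Since the two quota commands differ only in the mount point, they can be wrapped in a small shell function. The following is a minimal sketch, not an official tool: the function name show_binf_quota and the BEEGFS_CTL override variable are hypothetical, while the beegfs-ctl options are the ones shown above.

```shell
# Sketch: print the quota of one group on both BINF filesystems.
# show_binf_quota is a hypothetical helper name, not part of BeeGFS.
# BEEGFS_CTL may be set to another command, e.g. "echo" for a dry run.
show_binf_quota() {
    gid="${1:-}"
    if [ -z "$gid" ]; then
        echo "usage: show_binf_quota <gid>" >&2
        return 1
    fi
    for mount in /binfs /binfl; do
        echo "# Quota on $mount"
        # --mount is required, otherwise the $GLOBAL quota is reported
        "${BEEGFS_CTL:-beegfs-ctl}" --getquota --mount="$mount" --gid "$gid"
    done
}
```

After sourcing the function, it would be called as show_binf_quota ProjectNumber.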

Available filesystems

The BINF nodes export two mount points, $BINFS (/binfs) and $BINFL (/binfl). The BINFS filesystem is more than twice as fast as the BINFL filesystem and an order of magnitude faster than the $GLOBAL filesystem. If you need very high IO rates, use the BINFS filesystem.

If you need a lot of space, use the $BINFL filesystem, which is 10 times larger than $BINFS.

Please keep in mind that $BINFS has no redundancy and should be treated as fast scratch space. A hardware defect can (and most probably will, and already has) lead to the whole filesystem being reinitialised, which wipes all data on the parallel filesystem. If you cannot live with that, use the $BINFL filesystem, which has somewhat more redundancy.

Submitting Jobs

To submit jobs that should run on the BINF nodes, add

#SBATCH --qos normal_binf
#SBATCH -C binf
#SBATCH --partition binf
#SBATCH --profile=None

to the job script. This gives a maximum runtime of 24 hours.
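For orientation, a complete job script built around the directives above might look like the following sketch. The job name, node count, time limit and payload are assumptions made for illustration; only the four BINF-specific directives are taken from this page.

```shell
#!/bin/bash
# Hypothetical job name and node count, adjust to your needs.
#SBATCH --job-name binf_example
#SBATCH --nodes 1
# Directives required for the BINF nodes (from this page).
#SBATCH --qos normal_binf
#SBATCH -C binf
#SBATCH --partition binf
#SBATCH --profile=None
# Assumed time limit; normal_binf allows at most 24 hours.
#SBATCH --time 24:00:00

# Placeholder payload; replace with the real program call.
job_tag="binf_${SLURM_JOB_ID:-local}"
echo "Starting $job_tag on $(uname -n)"
```

Saved under a name of your choice, for example job.sh, the script is submitted with sbatch job.sh.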

If you are one of the priority users of the BINF nodes, you can use

#SBATCH --qos fast_binf

to get into the priority queue, which has a maximum runtime of 72 hours.

If you only want to use the filesystems, you can refer to the $BINFS and $BINFL environment variables in your scripts.
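One way to use the two variables together is to do the IO-heavy work on the fast but non-redundant $BINFS and to copy only the results to $BINFL. The sketch below assumes that $BINFS and $BINFL point to directories you may write to; the directory names and the payload are hypothetical, and the mktemp fallback only exists so that the sketch can be tried outside the cluster.

```shell
# Fall back to temporary directories when the variables are not set
# (assumption for testing; on the cluster they should already be defined).
: "${BINFS:=$(mktemp -d)}"
: "${BINFL:=$(mktemp -d)}"

# Hypothetical per-job directories, named after the Slurm job ID if present.
workdir="$BINFS/job_${SLURM_JOB_ID:-$$}"
results="$BINFL/results_${SLURM_JOB_ID:-$$}"
mkdir -p "$workdir" "$results"

(
    cd "$workdir" || exit 1
    # Placeholder for the real, IO-heavy computation.
    echo "intermediate data" > scratch.tmp
    echo "final result" > output.txt
    # $BINFS is scratch space: copy what must survive to $BINFL.
    cp output.txt "$results/"
)

# Remove the scratch directory to stay within the strict quota.
rm -rf "$workdir"
```

Cleaning up the working directory at the end keeps the quota on $BINFS free for the next job.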

Using high memory nodes

If you need more than 512 GB of memory, you can request one of the four 1 TB nodes with

#SBATCH --mem 1000000

and if you want to use the 1.5 TB node, put

#SBATCH --mem 1500000

in your script. Slurm interprets a --mem value without a unit suffix as megabytes.

Pitfalls