====== Agenda ======
===== Introduction to Working on the MUL Cluster =====
----
===== Login =====
==== General remarks ====
* User accounts and passwords are NOT identical to your MUL account
* User name = first letter of given name + full surname
* example: Erich Mustermann … emustermann
* Connections limited to MUL network
* use MUL VPN if necessary
* Use a terminal program: xterm, putty, terminal, …
----
===== Login =====
==== Linux and OS X ====
ssh -X username@mul-hpc-81a.unileoben.ac.at
ssh -X username@mul-hpc-81b.unileoben.ac.at
=== OS X ===
If, e.g., you have a US keyboard layout, you may have to include the following lines in your ''%%.bashrc%%'' file on the login node in order to be able to submit jobs:
export LC_CTYPE=en_US.UTF-8
export LC_ALL=en_US.UTF-8
Alternatively, you can change this setting directly on your Mac in
Terminal ➠ Preferences ➠ Profiles ➠ Advanced ➠ International
Uncheck the option “Set locale environment variables on startup” and restart the terminal.
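A minimal sketch of adding these lines on the login node (assuming bash is your login shell; run once, then reload):
# append the locale settings to ~/.bashrc (run only once)
cat >> ~/.bashrc <<'EOF'
export LC_CTYPE=en_US.UTF-8
export LC_ALL=en_US.UTF-8
EOF
source ~/.bashrc   # apply to the current session
locale             # verify the settings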
----
===== Filesystems =====
We have 3 filesystems for users:
* ''%%/home%%'' … essentially only for config files
* NOT for data
* NOT for calculations
* NOT for compiling codes
* ''%%/calc%%'' … for data (also longer time) and for running calculations
* ''%%/scratch%%'' … fast filesystem for running calculations only
----
The filesystems are mounted:
* via Infiniband on login nodes and compute nodes c1-xx, c2-xx
* via Ethernet on all other nodes (c3-xx, c4-xx, c5-xx)
We use ZFS filesystems with these RAID Levels and disk types:
* RAID10 on SSDs for /home and /scratch
* RAID(Z3)0 on HDDs for /calc
* this is similar to RAID60 but with 3 redundancy disks
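To see where these filesystems are mounted from on the node you are currently on, standard Linux tools suffice (a sketch):
df -h /home /calc /scratch   # size, usage and NFS source of each filesystem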
----
===== Home directories =====
* home directories are located in ''%%/home%%''
* mounted via NFS from fileserver f1
* For performance reasons ''%%/home%%'' is separate from ''%%/calc%%''
* even if /calc is heavily loaded, /home should still be responsive
* … important for interactive work
* quota per user: 10 GB
* /home is only for config files
* not for compilation of codes (use /calc for this)
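A quick way to check your usage against the 10 GB quota (a sketch using standard tools):
du -sh ~   # total size of your home directory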
----
===== Home directories =====
The directory structure is:
/home/username
By default, permissions are set to 0700:
rwx------
* Directories and files are accessible only by the user,
* no access for the group,
* no access for others.
* The settings can be changed by the user who owns the directory.
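For example, if you want to grant your group read access (a sketch; 0750 is just one possible choice, and ''%%username%%'' is a placeholder):
chmod 750 /home/username    # group may read and enter, others still excluded
ls -ld /home/username       # verify: rwxr-x---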
----
===== /calc directories =====
* located in ''%%/calc%%'' and mounted via NFS from fileserver f2
* directory structure is the same as for ''%%/home%%'':
/calc/username
* permissions are the same as for ''%%/home%%''
rwx------
----
===== Purpose of /calc =====
* this is the place for
* compiling codes
* data
* calculations
* especially for calculations with low IO requirements
* on our previous clusters > 90% of calculations were not IO intensive
----
===== /calc quota =====
* /calc has a size of approx. 200 TB
* the group quota for /calc depends on your cluster share
* e.g. if you contributed 20% of the money you can use 20% of 200 TB = 40 TB
* every group can have an arbitrary number of users; therefore, there is no uniform user quota
----
===== /scratch directories =====
* mounted via NFS from fileserver f1
* directory structure is the same as for ''%%/home%%'' or ''%%/calc%%'':
/scratch/username
* permissions are the same as for ''%%/home%%'' or ''%%/calc%%''
rwx------
----
===== Purpose of /scratch =====
* this is the place for IO heavy calculations
* this can be used for software that supports a scratch directory
* e.g. the ''%%scratch=/scratch/username%%'' option of Abaqus (see the sketch after this list)
* this is NOT the place for storing files for a longer time
* there will be some policy for automatic deletion of data
* this is also not the place for jobs with almost no IO
* these would gain nothing
* but would only waste space
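For example, an Abaqus job could be pointed at the fast filesystem like this (a sketch; job and input names are placeholders):
abaqus job=myjob input=myjob.inp scratch=/scratch/username interactive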
----
===== /scratch quota =====
* /scratch has a size of approx. 16 TB
* there is no need for a quota if you use /scratch as intended:
* copy files for a job from /calc to /scratch
* run job
* copy results of job back to /calc
* delete job data immediately from /scratch
* the best method is to do the copying and removing in your job script
----
===== Using /scratch in a job script =====
example (non-working skeleton):
#!/bin/bash
#SBATCH ... see slurm slides ...
[...]
CALC=/calc/username/someproject
# create a private job directory on the fast scratch filesystem
MYTMP=$(mktemp -p /scratch/username -d)
# copy the job input from /calc to scratch and work there
cp -a "$CALC/." "$MYTMP"
cd "$MYTMP"
# run the software here
# copy the results back to /calc, then leave and remove the scratch directory
cp -a "$MYTMP/." "$CALC"
cd
rm -rf "$MYTMP"
----
===== Transferring data =====
=== copy data to cluster: ===
[...]$ scp file username@servername.unileoben.ac.at:~/
The servername can be one of these:
* ''%%mul-hpc-81a%%'' , ''%%mul-hpc-81b%%''
* the 2 login nodes provide slow access to all filesystems
* ''%%mul-hpc-81-fs1%%''
* fileserver 1 is fastest for ''%%/home%%'' and ''%%/scratch%%''
* ''%%mul-hpc-81-fs2%%''
* fileserver 2 is fastest for ''%%/calc%%''
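For example, to copy a local input file into your ''%%/calc%%'' directory via fileserver 2 (file and path names are placeholders):
scp myinput.inp username@mul-hpc-81-fs2.unileoben.ac.at:/calc/username/someproject/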
----
===== Transferring data =====
==== backup of data ➠ rsync ====
* reduces the amount of data sent over the network
* “quick check” algorithm ➠ only changes are sent
* [[https://wiki.vsc.ac.at/doku.php?id=doku:backupcontinuous_backup_of_user_data_to_remote_machines|rsync options:]] ➠ ''%%rsync -av%%''
* recursive, copies symlinks, preserves permissions, modification times, group, special files, owner
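A minimal sketch of such an rsync backup from the cluster to a local machine (all paths are placeholders):
rsync -av username@mul-hpc-81-fs2.unileoben.ac.at:/calc/username/someproject/ ~/backup/someproject/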
Alternatively, data can be copied using, e.g., ➠ FileZilla or ➠ WinSCP
Backup Policy: no backup of user data ➠ backup is solely the responsibility of each user
----
====== Policies ======
===== How to get an account =====
* write an e-mail to the MUL-HPC support address
* with CC to the head of your chair
* Address: ''%%mul-hpc-support@unileoben.ac.at%%''
----
====== Policies ======
===== Fair share usage =====
* jobs are scheduled in a way that ensures a fair usage of the resources (mostly CPU cores)
* according to predefined percentages
* there is no static allocation of resources
* this would be inflexible and bad for overall utilization
* instead resources are scheduled dynamically
see [[06_background_info.html#(10)|Background information]] slides for more technical details
----
====== Policies ======
===== Getting help =====
* write an e-mail to the MUL-HPC support address
* Address: ''%%mul-hpc-support@unileoben.ac.at%%''
----
===== Scratch directories =====
* there is a hierarchy of scratch directories
* ''%%/dev/shm%%'' … RAM disk which uses a maximum of 1/2 of the memory
* ''%%/tmp%%'' … local /tmp directory of the node
* ''%%/scratch/username%%'' … global (on fileserver) scratch directory
* you can use a script like the one shown above to copy/remove data
* n.b.: everything you copy into ''%%/dev/shm%%'' reduces the available memory
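For example, the skeleton script shown earlier could place its temporary directory on the RAM disk instead of ''%%/scratch%%'' (a sketch; remember that this space is taken from the node's memory):
# create the temporary job directory on the RAM disk instead of /scratch
MYTMP=$(mktemp -p /dev/shm -d)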
----