====== Agenda ======

===== Introduction to Working on the MUL Cluster =====

----

===== Login =====

==== General remarks ====

  * User accounts and passwords are NOT identical to your MUL account
  * User name = first letter of given name + full surname
    * example: Erich Mustermann … emustermann
  * Connections are limited to the MUL network
    * use the MUL VPN if necessary
  * Use a terminal program: xterm, PuTTY, Terminal, …

----

===== Login =====

==== Linux and OS X ====

  ssh -X @mul-hpc-81a.unileoben.ac.at
  ssh -X @mul-hpc-81b.unileoben.ac.at

=== OS X ===

If you have, e.g., a US keyboard layout, you may have to add the following lines to your ''%%.bashrc%%'' file on the login node in order to be able to submit jobs:

  export LC_CTYPE=en_US.UTF-8
  export LC_ALL=en_US.UTF-8

Alternatively, you can change this directly on your Mac under Terminal ➠ Preferences ➠ Profiles ➠ Advanced ➠ International: uncheck the option “Set locale environment variables on startup” and restart the terminal.

----

===== Filesystems =====

We have 3 filesystems for users:

  * ''%%/home%%'' … essentially only for config files
    * NOT for data
    * NOT for calculations
    * NOT for compiling codes
  * ''%%/calc%%'' … for data (also for longer-term storage) and for running calculations
  * ''%%/scratch%%'' … fast filesystem for running calculations only

----

The filesystems are mounted:

  * via Infiniband on login nodes and compute nodes c1-xx, c2-xx
  * via Ethernet on all other nodes (c3-xx, c4-xx, c5-xx)

We use ZFS filesystems with these RAID levels and disk types:

  * RAID10 on SSDs for /home and /scratch
  * RAID(Z3)0 on HDDs for /calc
    * this is similar to RAID60 but with 3 redundancy disks

----

===== Home directories =====

  * home directories are located in ''%%/home%%''
  * mounted via NFS from fileserver f1
  * For performance reasons ''%%/home%%'' is separate from ''%%/calc%%''
    * even if /calc is heavily loaded, /home should still be responsive
    * … important for interactive work
  * quota per user: 10 GB
  * /home is only for config files
    * not for compilation of codes (use /calc for this)

----

===== Home directories =====

The directory structure is:

  /home/

By default, permissions are set to 0700:

  rwx------

  * Directories and files are only accessible by the user
    * no access for group
    * no access for others
  * The settings can be changed by the user who owns the directory.

----

===== /calc directories =====

  * located in ''%%/calc%%'' and mounted via NFS from fileserver f2
  * directory structure is the same as for ''%%/home%%'':

  /calc/

  * permissions are the same as for ''%%/home%%'':

  rwx------

----

===== purpose of /calc =====

  * this is the place for
    * calculations
      * especially for calculations with low IO requirements
      * on our previous clusters > 90% of calculations were not IO intensive
    * compiling codes
    * data

----

===== /calc quota =====

  * /calc has a size of approx. 200 TB
  * the group quota for /calc depends on your cluster share
    * e.g. if you contributed 20% of the money you can use 20% of 200 TB = 40 TB
  * every group can have an arbitrary number of users; therefore there is no uniform user quota

----

===== /scratch directories =====

  * mounted via NFS from fileserver f1
  * directory structure is the same as for ''%%/home%%'' or ''%%/calc%%'':

  /scratch/

  * permissions are the same as for ''%%/home%%'' or ''%%/calc%%'':

  rwx------

----

===== purpose of /scratch =====

  * this is the place for IO heavy calculations
  * this can be used for software that supports a scratch directory
    * e.g. option ''%%scratch=/scratch/username%%'' of Abaqus
  * this is NOT the place for storing files for a longer time
    * there will be some policy for automatic deletion of data
  * this is also not the place for jobs with almost no IO
    * these would gain nothing
    * but would only waste space

----

===== /scratch quota =====

  * /scratch has a size of approx. 16 TB
  * there is no need for a quota if you use /scratch as intended:
    * copy files for a job from /calc to /scratch
    * run job
    * copy results of job back to /calc
    * delete job data immediately from /scratch
  * the best method is to do the copying and removing in your job script

----

===== using /scratch in job script =====

example (non-working skeleton):

  #!/bin/bash
  #SBATCH ... see slurm slides ...
  [...]
  # project directory on /calc (job data belongs there, not in /home)
  CALC=/calc/username/someproject
  # private temporary directory on the global scratch filesystem
  MYTMP=$(mktemp -p /scratch/username -d)
  # copy the contents of the project directory (not the directory itself)
  cp -a "$CALC/." "$MYTMP/"
  cd "$MYTMP" || exit 1
  # run the software here
  # copy the results back and clean up
  cp -a "$MYTMP/." "$CALC/"
  rm -rf "$MYTMP"

----

===== Transferring data =====

=== copy data to cluster: ===

  [...]$ scp @.unileoben.ac.at:~/

The servername can be one of these:

  * ''%%mul-hpc-81a%%'' , ''%%mul-hpc-81b%%''
    * the 2 login nodes provide slow access to all filesystems
  * ''%%mul-hpc-81-fs1%%''
    * fileserver 1 is fastest for ''%%/home%%'' and ''%%/scratch%%''
  * ''%%mul-hpc-81-fs2%%''
    * fileserver 2 is fastest for ''%%/calc%%''

----

===== Transferring data =====

==== backup of data ➠ rsync ====

  * reduces amount of data sent over the network
    * “quick check” algorithm ➠ only changes sent
  * [[https://wiki.vsc.ac.at/doku.php?id=doku:backup&#continuous_backup_of_user_data_to_remote_machines|rsync options:]] ➠ ''%%rsync -av%%''
    * recursive, copies symlinks, preserves permissions, modification times, group, special files, owner
  * Alternatively, data can be copied using, e.g., either ➠ FileZilla or ➠ WinSCP

Backup Policy: no backup of user data ➠ backup is solely the responsibility of each user

----

====== Policies ======

===== How to get an account =====

  * write an e-mail to the MUL-HPC support address
    * with CC to the head of your chair
  * Address: ''%%mul-hpc-support@unileoben.ac.at%%''

----

====== Policies ======

===== Fair share usage =====

  * jobs are scheduled in a way that ensures a fair usage of the resources (mostly CPU cores)
    * according to predefined percentages
  * there is no static allocation of resources
    * this would be inflexible and bad for overall utilization
    * instead resources are scheduled dynamically

see [[06_background_info.html#(10)|Background information]] slides for more technical details

----

====== Policies ======

===== Getting help =====

  * write an e-mail to the MUL-HPC support address
  * Address: ''%%mul-hpc-support@unileoben.ac.at%%''

----

===== Scratch directories =====

  * there is a hierarchy of scratch directories
    * ''%%/dev/shm%%'' … RAM disk which uses a maximum of 1/2 of the memory
    * ''%%/tmp%%'' … local /tmp directory of the node
    * ''%%/scratch/username%%'' … global (on fileserver) scratch directory
  * you can use a script like the one shown above to copy/remove data
  * n.b.: everything you copy into ''%%/dev/shm%%'' reduces the available memory

----
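The scratch hierarchy above can also be used from within a job script. The following is only a minimal sketch, not a cluster-provided tool: the helper name ''%%first_writable_dir%%'' is hypothetical, the candidate order (global scratch first, then node-local ''%%/tmp%%'', then the RAM disk) is an assumption, and it assumes that your scratch directory is named after ''%%$USER%%''. It picks the first existing, writable directory and creates a private temporary directory in it that is removed when the script exits.

```shell
#!/bin/bash
# Hypothetical helper (not part of the cluster setup): print the first
# argument that is an existing, writable directory; return 1 if none is.
first_writable_dir() {
    local dir
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Assumption: the scratch directory is named after the login name.
ME=${USER:-$(id -un)}

# Assumed preference order: global scratch, node-local /tmp, RAM disk.
# Put /dev/shm first only if the job data comfortably fits into memory.
SCRATCH_BASE=$(first_writable_dir "/scratch/$ME" /tmp /dev/shm) || {
    echo "no writable scratch directory found" >&2
    exit 1
}

# Private temporary directory, removed automatically when the script exits.
MYTMP=$(mktemp -d -p "$SCRATCH_BASE")
trap 'rm -rf "$MYTMP"' EXIT

echo "staging in $MYTMP"
```

In a job script, ''%%$MYTMP%%'' would then take the place of the fixed ''%%/scratch/username%%'' temporary directory: copy the input data into it, run the software there, and copy the results back to ''%%/calc%%'' before the script ends.

----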