
Login

ssh <username>@vsc3.vsc.ac.at

You will then be asked to type first your password and then your one-time password (OTP; SMS token).

How to connect from Windows?

VSC-3

Once you have logged into VSC-3, type:

  • module avail to get a basic idea of what is around in terms of installed software and available standard tools
  • module list to see what is currently loaded into your session
  • module unload xyz to unload a particular package xyz from your session
  • module load xyz to load a particular package xyz into your session
Note:
  1. the name xyz must match the output of module avail exactly. Thus, in order to load or unload a selected module, copy and paste the name exactly as it is listed by module avail.
  2. a list of module load/unload directives may also be included in the top part of a job submission script

When all required/intended modules have been loaded, user packages may be compiled as usual.
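
As a minimal sketch of such a session, one might load a compiler and an MPI library before compiling; the module names intel and intel-mpi as well as the source file code.c are only placeholders and have to be replaced by names exactly as printed by module avail:

module avail                   # see which software is installed
module load intel              # placeholder name: load a compiler
module load intel-mpi          # placeholder name: load an MPI library
module list                    # verify what is currently loaded
mpicc -O2 -o a.out code.c      # compile the user package as usual
module unload intel-mpi        # remove a package from the session again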

SLURM (Simple Linux Utility for Resource Management)

In contrast to previous VSC systems, the scheduler on VSC-3 is SLURM. For basic information type:

  • sinfo to find out which 'queues'='partitions' are available for job submission. Note: what was termed a 'queue' under SGE is called a 'partition' under SLURM.
  • scontrol show partition gives more or less the same information as the previous command, except that with scontrol much more detail can be obtained and basic settings can be modified/reset/abandoned.
  • squeue to see the current list of submitted jobs, their state and resource allocation.
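
A quick look at the partitions and at one's own jobs could, for instance, look like this (the partition name mem_0064 is only an assumed example; use a name reported by sinfo):

sinfo                                  # list all partitions and their node states
scontrol show partition mem_0064       # detailed settings of one partition (example name)
squeue -u $USER                        # restrict the job listing to your own jobs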

vi check.slrm

#!/bin/bash
#
#SBATCH -J chk
#SBATCH -N 2
#SBATCH --ntasks-per-node=16
#SBATCH --ntasks-per-core=1

mpirun -np 32 a.out
  • -J some name for the job
  • -N number of nodes requested (16 cores per node available)
  • --ntasks-per-node number of processes run in parallel on a single node
  • --ntasks-per-core number of tasks a single core should work on
  • mpirun -np 32 a.out standard invocation of a parallel program (a.out) running 32 processes in parallel
  • Note: under SLURM, srun is preferred over mpirun, so an equivalent call to the one on the final line above could have been srun -l -N2 -n32 a.out, where -l just adds task-specific labels to the beginning of all output lines (see the sketch below).
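
As a sketch, the same job script written with srun instead of mpirun (identical resource requests, a.out as above):

#!/bin/bash
#
#SBATCH -J chk
#SBATCH -N 2
#SBATCH --ntasks-per-node=16
#SBATCH --ntasks-per-core=1

# srun takes the allocation from SLURM; -l labels each output line with its task id
srun -l -N2 -n32 a.out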
Job submission
[username@l31 ~]$ sbatch check.slrm    # to submit the job             
[username@l31 ~]$ squeue               # to check the status  
[username@l31 ~]$ scancel  JOBID       # for premature removal, where JOBID
                                       # is obtained from the previous command   
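
Once a job has been submitted, scontrol can also be used to inspect it in more detail; JOBID is again taken from the squeue output:

[username@l31 ~]$ squeue -u $USER            # show only your own jobs
[username@l31 ~]$ scontrol show job JOBID    # full details: state, assigned nodes, pending reason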

Another simple job submission script

This example uses a set of 4 nodes to compute a series of jobs in two stages, each of which is split into two separate subjobs.

vi check.slrm

#!/bin/bash
#
#SBATCH -J chk
#SBATCH -N 4
#SBATCH --ntasks-per-node=16
#SBATCH --ntasks-per-core=1

export I_MPI_PMI_LIBRARY=/cm/shared/apps/slurm/current/lib64/libpmi.so

scontrol show hostnames $SLURM_NODELIST  > ./nodelist

srun -l -N2 -r0 -n32 job1.scrpt &
srun -l -N2 -r2 -n32 job2.scrpt &
wait

srun -l -N2 -r2 -n32 job3.scrpt &
srun -l -N2 -r0 -n32 job4.scrpt &
wait
Note:
  1. the file 'nodelist' is written for information only;
  2. it is important to send the jobs into the background (&) and to insert 'wait' at each synchronization point;
  3. with -r one defines an offset in the node list; in particular, -r2 means taking nodes number 2 and 3 from the set of four (the list starts with node number 0). Hence the combination of -N, -r and -n allows full control over all involved cores and the tasks they are used for (see the sketch below).
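
As a purely illustrative sketch of the -N/-r/-n mechanism, the same four nodes could also be split into four single-node subjobs; the script name subjob.scrpt is a placeholder:

#!/bin/bash
#
#SBATCH -J chk
#SBATCH -N 4
#SBATCH --ntasks-per-node=16
#SBATCH --ntasks-per-core=1

# start one independent 16-task subjob per node; -r selects the node
# offset (0..3) within the allocation, and 'wait' synchronizes at the end
for offset in 0 1 2 3; do
    srun -l -N1 -r${offset} -n16 ./subjob.scrpt &
done
wait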
