====== Quick start guide for VSC-5 ======
  
===== Connecting =====
  
  
===== Software installations =====
  
==== New SPACK without environments ====
  
Having worked with spack environments for some time, we have encountered several severe issues which have convinced us that we need to find a more practical way of maintaining software packages at VSC.
  
There are now three separate spack installation trees corresponding to the CPU/GPU architectures on VSC:
  
  * skylake - Intel CPUs; works on Intel Skylake and Cascadelake CPUs
  * zen - AMD CPUs; works on Zen 2 and 3 CPUs
  * cuda-zen - AMD CPUs + NVIDIA GPUs; works on all nodes equipped with graphics cards
  
By default the spack installation tree suitable for the current compute/login node is activated and will be indicated by a **prefix** on the command line, e.g.:
  
<code>
zen [user@l51 ~]$
</code>
  
Read more about SPACK at:
  * [[doku:spack-transition | Transition to new SPACK without Environments]]
  * [[doku:spack]]
  * [[https://spack.readthedocs.io/en/latest/basic_usage.html|Official documentation of SPACK]]
  
  
==== Load a module ====
  
Most software is installed via SPACK, so you can use spack commands like ''spack find -ld xyz'' to get details about the installation. All these installations also provide modules; find available modules with ''module avail xyz'', and load with ''module load xyz''. See [[doku:spack|SPACK - a package manager for HPC systems]] for more information.
  
Some software is still installed by hand; find available modules with ''module avail xyz'', and load with ''module load xyz''.
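For example, to locate and load a package (''xyz'' is only a placeholder; substitute the name of the software you actually need):

<code>
# show matching SPACK installations with hashes and install details
$ spack find -ld xyz

# list the environment modules that match the name
$ module avail xyz

# load one of them into the current shell
$ module load xyz

# check which modules are currently loaded
$ module list
</code>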
  
  
===== Compile code =====
  
A program needs to be compiled on the hardware it will later run
128 physical cores (core-id 0-127) and 256 virtual cores available.
  
The A100 GPU nodes have 512GB RAM and the two NVIDIA A100 cards have 40GB RAM each.
60 A100 nodes are installed.
  
The A40 GPU nodes have 256GB RAM and the two NVIDIA A40 cards have 46GB each.
45 A40 nodes are installed.
<code>
$ nvidia-smi
  
===== SLURM =====
For the partition/queue setup see [[doku:vsc5_queue|Queue | Partition setup on VSC-5]].
Type ''sinfo -o %P'' to see the available partitions.
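For example (a sketch only; the partition names below are the ones used elsewhere on this page, the actual list may be longer, and the default partition is marked with ''*''):

<code>
$ sinfo -o %P
PARTITION
zen3_0512*
zen3_1024
zen3_2048
cascadelake_0384
zen3_0512_a100x2
...
</code>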
  
==== Submit a Job ====
#SBATCH -J <meaningful name for job>
#SBATCH -N 1
./my_program
</file>
  
This will submit a job in the default partition (zen3_0512) using the default QoS (zen3_0512).
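The script itself is handed to the scheduler with ''sbatch''; a short sketch of the usual workflow (''job.sh'' stands for whatever you named your script):

<code>
# submit the job script to SLURM
$ sbatch job.sh

# list your pending and running jobs
$ squeue -u $USER

# cancel a job by its job id if needed
$ scancel <job_id>
</code>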
  
To submit a job to the cascadelake nodes:
#SBATCH -N 1
#SBATCH --partition=cascadelake_0384
#SBATCH --qos cascadelake_0384
./my_program
</file>
#SBATCH -N 1
#SBATCH --partition=zen3_0512
#SBATCH --qos zen3_0512
./my_program
</file>
#SBATCH -N 1
#SBATCH --partition=zen3_1024
#SBATCH --qos zen3_1024
./my_program
</file>
#SBATCH -N 1
#SBATCH --partition=zen3_2048
#SBATCH --qos zen3_2048
./my_program
</file>
#SBATCH -J <meaningful name for job>
#SBATCH -N 1
#SBATCH --partition=zen3_0512_a100x2
#SBATCH --qos zen3_0512_a100x2
#SBATCH --gres=gpu:2
./my_program
#!/bin/sh
#SBATCH -J <meaningful name for job>
#SBATCH --partition=zen3_0512_a100x2
#SBATCH --qos zen3_0512_a100x2
#SBATCH --gres=gpu:1
./my_program
  
Official Slurm documentation: https://slurm.schedmd.com

===== Intel MPI =====

When **using Intel-MPI on the AMD nodes and mpirun**, please set the following environment variable in your job script to allow for correct process pinning:

<code>
export I_MPI_PIN_RESPECT_CPUSET=0
</code>
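A minimal sketch of where this fits in a job script (the node count, partition/QoS and the ''mpirun'' process count are placeholders, and it is assumed that an Intel MPI installation is already loaded):

<code>
#!/bin/sh
#SBATCH -J <meaningful name for job>
#SBATCH -N 2
#SBATCH --partition=zen3_0512
#SBATCH --qos zen3_0512

# let Intel MPI pin processes correctly on the AMD nodes
export I_MPI_PIN_RESPECT_CPUSET=0

# <n> is a placeholder for the number of MPI processes
mpirun -np <n> ./my_program
</code>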
  