This version (2023/02/23 11:31) was approved by katrin. It is an old revision of the document and has been superseded by a newer approved version.


Python

Python is a comparatively fast-evolving programming language, so different versions can behave very differently. We provide multiple Python installations; please always use spack to find and load them.

The VSC team makes sure to have the most used packages readily available via spack. The installed Python packages are always named in the format py-mypackagename, e.g. py-numpy or py-scipy. If you can, please always consider using those packages first. See spack on how to find and load them.

There are many additional Python packages, some with long dependency chains. Because of this we simply cannot install all of them for all the different Python versions we provide. If you need a specific package and it is a very popular one, consider dropping us a mail so we can make it generally available.

Apart from loading packages via spack, you should always consider creating a virtual environment for your project. This makes it easier to install other packages or specific package versions, and it is also possible to track them exactly to produce consistent results. For most Python packages this is the easiest way to get up and running.

Note: Before you start, make sure that you have loaded the python version you need! The virtual environment will be created using this version.

cd my_project_folder
python -m venv venv --system-site-packages
source venv/bin/activate
pip install autopep8

The above commands create a new virtual environment in the folder 'venv' (including the system provided packages), activate it and install the package autopep8 into it.
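To confirm that the environment is really active before installing anything, you can inspect the VIRTUAL_ENV variable, which the activate script of venv exports. The following is a minimal sketch; the function name venv_status is only an illustration and not part of any VSC tooling.

```shell
# venv_status (hypothetical helper): report whether a virtual environment is active.
# The activate script of venv exports VIRTUAL_ENV; "deactivate" unsets it again.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active venv: $VIRTUAL_ENV"
    else
        echo "no venv active"
    fi
}

venv_status
```

After sourcing venv/bin/activate, the output should name the venv folder inside your project directory.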

To be able to reproduce the venv, consider specifying the exact versions of the packages as well as tracking your packages in a requirements.txt file (also see python4HPC Development Tools Lecture and Installing packages using pip and virtual environments for more information).
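To check whether a requirements.txt file really pins exact versions, a small filter like the one below may help. It is only a sketch: the function name unpinned_requirements, the example file and the version number in it are assumptions made for illustration.

```shell
# unpinned_requirements FILE (hypothetical helper): print every requirement
# line that lacks an exact "==" pin; comments and blank lines are ignored.
unpinned_requirements() {
    grep -vE '^[[:space:]]*(#|$)' "$1" | grep -v '==' || true
}

# Example file with one pinned and one unpinned entry
# (the version number is illustrative only).
example=$(mktemp)
printf '%s\n' '# project requirements' 'autopep8==2.0.1' 'numpy' > "$example"

# Lists the entries that still need an exact version, here: numpy
unpinned_requirements "$example"
```

A fully pinned file can be generated from the active venv with python -m pip freeze --local > requirements.txt (the --local option leaves out packages that are only visible through --system-site-packages) and installed elsewhere with python -m pip install -r requirements.txt.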

Sometimes, especially in a scientific context, there will be cases where you cannot use pip, e.g. because some packages need to be compiled. Again this creates the problem that we simply cannot provide each and every package and package version in our infrastructure.

In this case you can use conda (see Anaconda) instead of pip to set up a consistent local Python environment. Conda provides ready-made binary distributions for many scientific packages and can thus be used to circumvent this problem.

To load conda on our clusters, search for the miniconda3 package. At the time of writing, the miniconda3 package is available on both VSC-4 and VSC-5 and can be loaded via

spack load miniconda3@4.12.0

If you plan to use conda more frequently, you can simply add the spack load statement to your ~/.bashrc file to load it automatically after logging in.

Also make sure that you run the following statements if you are setting up conda for the first time:

# append (>>) the conda init block to ~/.bashrc; a single > would overwrite the whole file
conda init bash --dry-run --verbose | grep "# >>> conda initialize" -A 100 | grep "# <<< conda initialize" -B 100 | sed 's/+//g' >> ~/.bashrc
source ~/.bashrc

This will add some necessary startup code for conda to your ~/.bashrc file.
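Because this step modifies ~/.bashrc, it can be worth checking first whether the init code is already there, in which case the step can be skipped. A minimal sketch, assuming the marker comment that conda init writes ("# >>> conda initialize >>>"); the function name has_conda_init is hypothetical:

```shell
# has_conda_init FILE (hypothetical helper): succeed if FILE already
# contains the start marker of the block written by "conda init".
has_conda_init() {
    grep -q '^# >>> conda initialize >>>' "$1" 2>/dev/null
}

if has_conda_init "$HOME/.bashrc"; then
    echo "conda init code already present in ~/.bashrc"
else
    echo "no conda init code found in ~/.bashrc"
fi
```

Keeping a copy of the file before changing it (for example cp ~/.bashrc ~/.bashrc.bak) makes it easy to go back if something goes wrong.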

After executing these steps you will see that your prompt has changed to (base) [myname@l51 ~]$, which signifies that conda is active and the base environment is loaded.

The default conda channel points to Anaconda's repository. However, this repository does not always contain the latest packages. There is a community-driven channel called conda-forge that has many more packages and, most of the time, newer versions readily available.

To use conda-forge you need to specify --channel conda-forge when executing conda install commands.

If you always want to use conda-forge, you can execute the following statement to make it the default for your user:

conda config --add channels conda-forge

To create your own user environment, follow the steps below. As a short example of a package that we do not provide via spack, we will install phono3py (available on conda-forge) into our conda environment (myenv):

# create conda env 'myenv' from the conda-forge channel with the latest Python 3.11 release
conda create --name myenv --channel conda-forge python=3.11
conda activate myenv
conda install --channel conda-forge numpy phono3py

With the above statements conda will create a new environment, activate it and install the requested packages into it. You should see that your prompt has now changed to (myenv) [myname@l51 ~]$.

The following commands provide a bit of introspection to make sure that everything is set up as expected:

(myenv) [myname@l51 ~]$ which python
~/.conda/envs/myenv/bin/python
(myenv) [myname@l51 ~]$ python --version
Python 3.11.0
(myenv) [myname@l51 ~]$ which phono3py
~/.conda/envs/myenv/bin/phono3py

Starting python in this conda environment (myenv) and loading the packages also works:

(myenv) myname@l51:~$ python
Python 3.11.0 | packaged by conda-forge | (main, Jan 14 2023, 12:27:40) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> import phono3py
>>> exit()
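As with requirements.txt for pip, a conda environment can be written to a file and recreated from it, which helps to get consistent results across projects or clusters. The sketch below wraps the two conda subcommands conda env export and conda env create in small functions; the function names and the file name environment.yml are assumptions chosen for illustration.

```shell
# export_conda_env ENV_NAME FILE (hypothetical helper):
# write the package list of the environment ENV_NAME to FILE.
export_conda_env() {
    conda env export --name "$1" > "$2"
}

# recreate_conda_env NEW_NAME FILE (hypothetical helper):
# build a new environment called NEW_NAME from a previously exported FILE.
recreate_conda_env() {
    conda env create --name "$1" --file "$2"
}
```

For example, export_conda_env myenv environment.yml stores the environment created above, and recreate_conda_env myenv_copy environment.yml rebuilds it under a new name.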

The following minimal example shows how to use conda with slurm in your batch script (it assumes that you have the conda init code in your ~/.bashrc file; see below for an alternative). See slurm for detailed information about slurm in general.

#!/bin/bash --login
 
#SBATCH --job-name=slurm_conda_example
#SBATCH --time 00-00:05:00
#SBATCH --ntasks=2
#SBATCH --mem=2GB
 
spack load miniconda3@4.12.0
conda activate myenv
which python
python --version

Please note the --login flag passed to the shell above. This way the shell in the job will be a login shell and thus load your ~/.bashrc file (usually via ~/.bash_profile) before executing the script code. If you cannot or do not want to do this, you can also extract the conda init code from your ~/.bashrc file, put it into a separate file (e.g. conda-init.sh) in your home directory, and source this file manually in the batch script.
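One possible way to produce such a file is to copy everything between the two marker comments that conda init writes. This is a sketch under the assumption that your ~/.bashrc contains exactly one such block; the function name extract_conda_init is hypothetical.

```shell
# extract_conda_init SRC DST (hypothetical helper): copy the block between
# "# >>> conda initialize >>>" and "# <<< conda initialize <<<" (markers
# included) from SRC to DST. All other lines of SRC are left out.
extract_conda_init() {
    sed -n '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</p' "$1" > "$2"
}
```

Calling extract_conda_init ~/.bashrc ~/conda-init.sh would then create the file; an empty result means that no init block was found in ~/.bashrc.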

See the following example:

#!/bin/bash
 
#SBATCH --job-name=slurm_conda_example
#SBATCH --time 00-00:05:00
#SBATCH --ntasks=2
#SBATCH --mem=2GB
 
spack load miniconda3@4.12.0
source ~/conda-init.sh
conda activate myenv
which python
python --version
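In a batch job nobody is watching the prompt, so it can be useful to stop early if the environment was not activated. The following sketch relies on the CONDA_DEFAULT_ENV variable that conda activate exports; the function name require_conda_env is an assumption for illustration.

```shell
# require_conda_env NAME (hypothetical helper): succeed only if the conda
# environment NAME is active. "conda activate" exports CONDA_DEFAULT_ENV.
require_conda_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" != "$1" ]; then
        echo "conda environment '$1' is not active" >&2
        return 1
    fi
}
```

In the batch scripts above, a line such as require_conda_env myenv || exit 1 placed directly after conda activate myenv would end the job before any computation starts with the wrong Python.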

In case you need visualization capabilities or need to do some preprocessing, also consider using our JupyterHub service (see jupyterhub).

Please note that you should still use slurm and batch processing for actual computation runs since JupyterHub is mainly reserved for interactive use and runs on shared nodes.

  • Last modified: 2023/02/23 11:29
  • by katrin