
JupyterHub

The VSC offers a JupyterHub service available for all VSC users at https://jupyterhub.vsc.ac.at

Login works with any cluster user and uses the OTP like on the cluster. A VPN connection is not needed.

Also make sure to check out the FAQ to see if your question has already been answered.

The started Jupyter Server instance runs on a separate hardware partition ('jupyter') in each cluster in order to not interfere with the regular operation of VSC.

Since the machines in this partition are shared between all users, it is important to be aware of the memory requirements of the processing you are doing.

To help Jupyter users, the VSC Team recently added a Memory Resources Monitor plugin to the JupyterLab IDE. It is displayed in the upper right corner of the JupyterLab IDE and shows the current memory consumption of the Jupyter Server instance.

If your job runs out of memory, it will affect your job as well as those of other users running on the same node. This is a hardware/system limitation of how Slurm and Linux work together. So make sure to select the right amount of memory for your purposes.
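Besides the Memory Resources Monitor plugin, you can get a quick look at your server's memory use from a JupyterLab terminal with standard Linux tools (a generic sketch, not a VSC-specific utility):

```shell
# Report the resident memory (RSS) of the current process in MB.
# Run this in a JupyterLab terminal, or prefix it with '!' in a notebook cell.
rss_kb=$(ps -o rss= -p $$)
echo "Current RSS: $((rss_kb / 1024)) MB"
```

Keep in mind that the node is shared, so staying well below your selected memory limit protects both your own job and those of other users.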

Important

Always make sure to select the right amount of memory suited to your requirements.

choose a profile:

  • VSC-4 Singularity Image (this is the default)
  • VSC-4 python venv
  • VSC-4 conda python env
  • VSC-5 Singularity Image
  • VSC-5 A40 GPU Singularity Image
  • VSC-5 A100 GPU Singularity Image
  • VSC-5 python venv
  • VSC-5 conda python env
  • VSC-5 A40 GPU conda python env
  • VSC-5 A100 GPU conda python env
  • If you are participating in a training, there is most likely only a single fixed profile or special training profiles available

Note: You need a VSC-4 user for the VSC-4 profiles and a VSC-5 user for the VSC-5 profiles. You also need to have logged into the respective cluster via SSH at least once after getting a username.

In all profiles, you can choose the IDE: either Jupyter Notebook or the more modern JupyterLab (which is the default).

For all the singularity profiles, you can choose between a predefined image (default) or a custom image:

  • Predefined image gives a dropdown list of different singularity images. Each image shows a short description, a version, an author and a link to the image. The GPU option only has one predefined image. See also some examples here
  • Custom image shows a line where the complete path to the image has to be entered. Note that the image has to reside on the cluster already. Web-upload is not possible at the moment.

Note: training profiles usually do not allow for customization.

Independent of the profile you can choose the number of CPUs, size of the RAM and the maximum running time.

After everything is selected, press start - your server will start and show you the JupyterLab interface or the simpler Notebook interface. In both cases, there is a list of all files in your home directory, which includes previously created notebooks.

Simply click on the desired notebook in your list - it will open in the main panel.

On first use, no notebook is present yet. To create one (or to create new ones), go to File → New → Notebook. You will be asked which kernel to use (the default is Python 3): confirm with Select and the notebook is started. It will show up as Untitled in the file list and can be renamed via right-click.

It is possible to log out of the system and leave the server running with File → Log Out. Keep in mind that this also keeps the selected resources blocked for other users.

To stop the server, use File → Hub Control Panel.

On the page that opens click the button labelled 'Stop My Server'.

If you have chosen to work with the Jupyter Notebook IDE, simply click on the desired notebook in your list - it will be opened in a new tab.

On first use, no notebook is present yet. To create one (or to create new ones), click on 'New' in the upper right area above the file list and choose the appropriate type. The new notebook will start in a new browser tab. It will show up as Untitled in the file list and can be renamed by selecting the checkbox to the left of the name and choosing 'Rename' at the top.

It is possible to log out of the system and leave the server running with the 'Logout' button at the top right. Keep in mind that this also keeps the selected resources blocked for other users.

To stop the server, click on 'Control Panel' at the top right and choose 'Stop My Server'.

It is possible to use a custom apptainer/singularity image with our JupyterHub profiles.

A good starting point for creating your own container is the documentation of the official JupyterHub Docker Stacks Images @ Docker-Stacks

Apart from adding your own software, the image also needs at least the following packages to be able to run in our JupyterHub environment:

# This package pulls in all the necessary dependencies to start a jupyter server
# Make sure this always matches the current JupyterHub version used by VSC
jupyterhub==3.1.1        

# This package provides functionality needed to run in the slurm environment of VSC (e.g. `batchspawner-singleuser` script).
git+https://github.com/vsc-ac-at/batchspawner.git@1.2.0+auth_fix

# June 2024: fix a dependency to an earlier version
nbclassic==0.3.7

In addition, we usually also install the following JupyterLab extensions in our images. They are not strictly necessary and just provide extended functionality, such as memory monitoring, for the user:

  • jupyterlab-system-monitor
  • jupyterlab-git
  • jupyterlab-widgets (or jupyterlab_widgets in conda)

An up-to-date list of packages can always be found in our repo: requirements.txt
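Since these extensions are ordinary pip packages, adding them to your own image simply means extending the requirements list, for example (versions unpinned here; pin them to match the rest of your image):

```
# optional JupyterLab extensions (not required by JupyterHub itself)
jupyterlab-system-monitor
jupyterlab-git
jupyterlab-widgets
```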

Run hooks from /usr/local/bin/before-notebook.d

If your image needs to run hooks before startup (e.g. the pyspark image depends on this PySpark Dockerfile), the docker stacks images provide a folder for such startup scripts: `/usr/local/bin/before-notebook.d`.

Unfortunately, the batchspawner package does not source them, so we use a custom start script in our images called `vsc-singleuser.sh`. All it does is run the hooks from `/usr/local/bin/before-notebook.d` before executing the `batchspawner-singleuser` script.

The script can be found at vsc-singleuser.sh.
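The hook-running logic of such a wrapper can be sketched roughly like this (a generic illustration, not the actual VSC script):

```shell
# Sketch: source every hook script in order, then hand over to the spawner.
HOOK_DIR="${HOOK_DIR:-/usr/local/bin/before-notebook.d}"
hooks_run=0
for hook in "$HOOK_DIR"/*.sh; do
    [ -e "$hook" ] || continue    # glob did not match: no hooks installed
    echo "Sourcing hook: $hook"
    . "$hook"
    hooks_run=$((hooks_run + 1))
done
echo "Ran $hooks_run hook(s)"
# The real script would now exec the spawner entry point, e.g.:
# exec batchspawner-singleuser "$@"
```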

Note: Contact us to get read rights to the repository.

Dockerfile Documentation

If you already have a docker image you want to start from, or if you are more familiar with docker image creation, you can use that image and convert it into a singularity image once you are finished.

Note: Make sure that you use the right versions for the current version of our JupyterHub (the version is displayed at the bottom of the jupyterhub page and was 3.1.1 at the time of writing).

After selecting / building the docker image, all that needs to be done is to convert it into a singularity image.

You can do this by executing the following line (assuming your docker image is available in a registry as “my_image”; for an image in your local docker daemon, use the `docker-daemon://` prefix instead):

singularity build my_image.sif docker://my_image

See the documentation for more examples: Build a (singularity) container

When the conversion process has finished, make sure that the resulting “.sif” file is placed in a folder that is accessible from all compute nodes (e.g. your home or data directory).
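For example, a dedicated images folder in your home directory is visible from every node (the paths here are just a suggestion):

```shell
# Create a shared images folder and move the built image there (hypothetical paths).
mkdir -p "$HOME/images"
# mv my_image.sif "$HOME/images/"   # run this after the build has finished
ls -ld "$HOME/images"
```

The full path to the “.sif” file in this folder is then what you enter in the custom image field of the JupyterHub profile.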

Note: the conversion can be done on our cluster since we have singularity installed on our nodes.

Instead of starting with docker you can also directly build a singularity image that is ready to use with our JupyterHub instance.

For this you need to create a so-called “.def” file - see the singularity documentation for more information on this format: Definition Files

Here is a minimal example using the datascience docker stacks image as a basis:

BootStrap: docker
From: jupyter/datascience-notebook:hub-3.1.1

%post
/opt/conda/bin/pip install jupyterhub==3.1.1 git+https://github.com/katringoogoo/batchspawner.git@1.2.0+auth_fix               

If we save the file as “my_image.def” we can use the build command to build the image with singularity/apptainer.

Please note that you have to use apptainer if you want to build on the cluster, since the singularity version that comes installed on all nodes from the OS needs root rights to build an image.

  • VSC4: module load --auto apptainer/1.1.9-gcc-12.2.0-vflkcfi
  • VSC5: module load --auto apptainer/1.1.6-gcc-12.2.0-xxfuqni
# load an apptainer module (see above)
module load --auto apptainer/<version>

# build the image
apptainer build my_image.sif my_image.def

If you cannot use the versions provided on VSC, you can of course also build the image on your own machine and upload it to the VSC.

  • My Server instance is stuck and I get a timeout when I try to reload the window. Going to the VSC jupyterhub website also results in a timeout.

To control the running instance, navigate directly to the following URL: https://jupyterhub.vsc.ac.at/hub/home. From there you can click 'Stop My Server' to stop the running instance (if there is any).
