This version (2024/10/24 10:28) is a draft.
Approvals: 0/1
overview of services
compilation
module purge
module load gnu7/7.2.0
cd /opt/install/src/slurm/slurm-17.11.0
./configure --prefix=/opt/ohpc/pub/slurm
make -j 8
make install
Prerequisites:
- hdf5
- hdf5-devel
- munge-devel
- mariadb-devel
- pam-devel
- lua-devel !!!
adjustments:
- add slurm.sh to profile.d in synclist
- add /install/postscripts/set_slurmd to postscript list of compute
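The contents of slurm.sh are not given in these notes. A minimal sketch, assuming it only has to expose the install prefix from the compilation step (the share/man location is the usual autoconf default and is an assumption here):

```shell
# /etc/profile.d/slurm.sh (sketch; the real file is not shown in these notes)
# Assumption: Slurm was installed with --prefix=/opt/ohpc/pub/slurm as above.
SLURM_PREFIX=/opt/ohpc/pub/slurm

# Prepend the Slurm user and admin binaries unless they are already on PATH.
case ":${PATH}:" in
    *":${SLURM_PREFIX}/bin:"*) ;;
    *) PATH="${SLURM_PREFIX}/bin:${SLURM_PREFIX}/sbin:${PATH}" ;;
esac
export PATH

# Assumption: man pages live in the default location under the prefix.
MANPATH="${SLURM_PREFIX}/share/man:${MANPATH:-}"
export MANPATH
```

The case guard keeps PATH from growing when the file is sourced more than once, for example in nested login shells.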
slurm config directory:
/opt/ohpc/pub/slurm/etc
db setup
Mysql/MariaDB:
create database slurm_acct_db;
create user 'slurm'@'localhost' identified by 'password';
grant all on slurm_acct_db.* TO 'slurm'@'localhost';
config files
- slurm.conf (general conf)
- slurmdbd.conf (database daemon)
- topology.conf (infiniband structure, used for scheduling, can be empty)
- cgroup.conf
- gres.conf (can be empty)
update procedure
- first: restart slurmdbd
- second: restart slurmctld
- third: restart slurmd on the nodes
- db consistency: versions may differ by at most two version numbers (+/- 2)
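The restart order above could be scripted roughly as follows. This is only a sketch: it assumes the daemons run as systemd units named slurmdbd, slurmctld and slurmd, that the nodes are reachable via ssh, and the node names are hypothetical. By default it only prints the commands.

```shell
#!/bin/sh
# Restart order sketch: slurmdbd -> slurmctld -> slurmd on the nodes.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"
# Hypothetical node names; replace with the real compute nodes.
COMPUTE_NODES="${COMPUTE_NODES:-node01 node02}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Assumption: systemd units with these names exist on the hosts.
restart_slurm_stack() {
    run systemctl restart slurmdbd || return 1
    run systemctl restart slurmctld || return 1
    for node in $COMPUTE_NODES; do
        run ssh "$node" systemctl restart slurmd || return 1
    done
}

restart_slurm_stack
```

Stopping at the first failure keeps slurmctld and the nodes untouched if slurmdbd does not come back up.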
backup
- /etc/munge/munge.key
- /etc/slurm/
mysqldump --all-databases | /bin/gzip > slurm_complete-$(date +\%Y\%m\%d\%H\%M).sql.gz
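The backslashes before % in the line above are only needed inside a crontab entry, where a bare % has a special meaning. A sketch of a standalone backup script that covers the key, the config directory and the dump could look like this; BACKUP_DIR and the name of the tar archive are assumptions, not part of these notes:

```shell
#!/bin/sh
# Backup sketch for the items listed above.
# Hypothetical destination directory; adjust to the local setup.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/slurm}"

# Timestamp in the same format as the mysqldump line above.
backup_stamp() {
    date +%Y%m%d%H%M
}

backup_slurm() {
    stamp="$(backup_stamp)"
    mkdir -p "$BACKUP_DIR" || return 1
    # munge key and Slurm config directory, stored relative to /
    tar -czf "$BACKUP_DIR/slurm_files-$stamp.tar.gz" -C / \
        etc/munge/munge.key etc/slurm || return 1
    # Dump first, compress afterwards, so a failed dump is not hidden by a pipe.
    dump="$BACKUP_DIR/slurm_complete-$stamp.sql"
    mysqldump --all-databases > "$dump" || return 1
    gzip -f "$dump"
}
```

Calling backup_slurm as root (it needs to read munge.key and the database) produces slurm_complete-<stamp>.sql.gz next to the tar archive.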
recovery of database
create user 'slurm'@'localhost' identified by 'password';
grant all on slurm_acct_db.* TO 'slurm'@'localhost';
zcat slurm_complete-xxxxxxx | mysql
pam slurm
account required pam_slurm.so
- permits ssh login if user has an active job
- synced with file /install/hpc81/etc/pam.d/sshd
pam user add
- TODO: write a script that adds each new user to the correct Slurm account (derived from the primary gid)
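One possible shape for such a script, as a sketch only: it assumes that each Slurm account is named after the Unix primary group and already exists in the accounting database. The function names are made up for illustration. By default it only prints the sacctmgr command.

```shell
#!/bin/sh
# Sketch: add a user to the Slurm account named after the user's primary group.
# DRY_RUN=1 (the default here) only prints the command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Name of the user's primary group as known to the system.
primary_group() {
    id -gn "$1"
}

# Assumption: a Slurm account with the same name as the primary group exists.
add_slurm_user() {
    user="$1"
    account="$(primary_group "$user")" || return 1
    # -i commits without the interactive confirmation prompt.
    run sacctmgr -i add user name="$user" account="$account"
}
```

For example, add_slurm_user alice prints the sacctmgr line for a hypothetical user alice; with DRY_RUN=0 it is executed instead.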