===== overview of services =====

{{.:slurm_services.png}}

===== compilation =====

<code>
module purge
module load gnu7/7.2.0
cd /opt/install/src/slurm/slurm-17.11.0
./configure --prefix=/opt/ohpc/pub/slurm
make -j 8
make install
</code>

Prerequisites:
  * hdf5
  * hdf5-devel
  * munge-devel
  * mariadb-devel
  * pam-devel
  * lua-devel !!!

===== adjustments =====

  * add slurm.sh to profile.d in the synclist
  * add /install/postscripts/set_slurmd to the postscript list of compute

Slurm config directory:

<code>
/opt/ohpc/pub/slurm/etc
</code>

===== db setup =====

MySQL/MariaDB:

<code>
create database slurm_acct_db;
create user 'slurm'@'localhost' identified by 'password';
grant all on slurm_acct_db.* TO 'slurm'@'localhost';
</code>

===== config files =====

  * slurm.conf (general configuration)
  * slurmdbd.conf (database daemon)
  * topology.conf (InfiniBand structure, used for scheduling, can be empty)
  * cgroup.conf
  * gres.conf (can be empty)

===== update procedure =====

  * first: restart slurmdbd
  * second: restart slurmctld
  * third: restart slurmd on the nodes
  * db consistency is kept within +/- two version numbers

===== backup =====

Files and directories to back up:
  * /etc/munge/munge.key
  * /etc/slurm/

Database dump:

<code>
mysqldump --all-databases | /bin/gzip > slurm_complete-$(date +\%Y\%m\%d\%H\%M).sql.gz
</code>

===== recovery of database =====

Recreate the database user and its privileges:

<code>
create user 'slurm'@'localhost' identified by 'password';
grant all on slurm_acct_db.* TO 'slurm'@'localhost';
</code>

Then load the dump:

<code>
zcat slurm_complete-xxxxxxx | mysql
</code>

===== pam slurm =====

<code>
account required pam_slurm.so
</code>

  * permits ssh login only if the user has an active job
  * synced with the file /install/hpc81/etc/pam.d/sshd

===== pam user add =====

  * write a script that adds a new user to the correct account (primary gid)

pandoc/introduction-to-mul-cluster/02_slurm/01_install_and_recovery.txt Last modified: 2020/10/20 09:13 by pandoc
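
A minimal sketch of the script mentioned under "pam user add", not the script actually used on this cluster. It assumes that a Slurm account named after the user's primary group already exists in the accounting database. The function names ''slurm_add_user_cmd'' and ''slurm_add_user'' and the ''DRY_RUN'' variable are hypothetical. By default the script only prints the ''sacctmgr'' command, so it can be checked before anything is changed.

```shell
#!/bin/bash
# Hypothetical helper: add a user to the Slurm account that matches the
# user's primary group. Assumption: an account with that name already exists.

# Print the sacctmgr command for a given user and account.
slurm_add_user_cmd() {
    local user="$1" account="$2"
    printf 'sacctmgr -i add user name=%s account=%s\n' "$user" "$account"
}

# Look up the primary group name of the user, then print the command
# (DRY_RUN=1, the default) or run it (DRY_RUN=0).
slurm_add_user() {
    local user="$1" account
    account="$(id -gn "$user")" || return 1
    if [ "${DRY_RUN:-1}" = "1" ]; then
        slurm_add_user_cmd "$user" "$account"
    else
        sacctmgr -i add user name="$user" account="$account"
    fi
}
```

Usage: after sourcing the file, ''slurm_add_user alice'' prints the command for review; ''DRY_RUN=0 slurm_add_user alice'' runs it. The ''-i'' option makes ''sacctmgr'' commit the change without asking for confirmation.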