
VASP

Load the required modules for building VASP with the GNU 7 toolchain:

module purge
module load autotools 
module load gnu7/7.2.0 
module load openmpi3/3.0.0 
module load openblas/0.2.20
module load scalapack/2.0.2 
module load fftw/3.3.6 
module load prun

Link the appropriate makefile, set the C compiler, and start the build:

ln -s makefile.linux_gfortran makefile
export CC=gcc
make

Adjust the following variables in the makefile:

OFLAG      = -O2 -march=broadwell
LIBDIR     = /opt/ohpc/pub/libs/gnu7/openblas/0.2.20/lib
BLAS       = -L$(LIBDIR) -lopenblas
LAPACK     =
BLACS      =
SCALAPACK  = -L$(LIBDIR) -lscalapack $(BLACS)
FFTW       ?= /opt/ohpc/pub/libs/gnu7/openmpi3/fftw/3.3.6

Build the executables (use make veryclean to remove the results of a previous build):

make veryclean
make all
make std
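
If the build succeeds, the executables end up in the bin/ subdirectory (assuming the standard VASP 5.4.4 layout):

ls bin/    # expect vasp_std (plus vasp_gam and vasp_ncl after make all)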

Example job script in:

/opt/ohpc/pub/examples/slurm/mul/vasp

#!/bin/bash
#
#SBATCH -J vasp
#SBATCH -N 2
#SBATCH -o job.%j.out
#SBATCH -p E5-2690v4
#SBATCH -q E5-2690v4-batch
#SBATCH --ntasks-per-node=28
#SBATCH --threads-per-core=1
#SBATCH --mem=16G

export OMP_NUM_THREADS=1

exe=/path/to/my/vasp/vasp.5.4.4/bin/vasp_std

time mpirun -np $SLURM_NPROCS $exe
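
Place the usual VASP input files (INCAR, POSCAR, KPOINTS, POTCAR) in the working directory and submit the script; assuming it was saved as job.sh (the name is arbitrary):

sbatch job.sh        # submit the job
squeue -u $USER      # check its status in the queue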

LAMMPS

Example job script in:

/opt/ohpc/pub/examples/slurm/mul/lammps

#!/bin/bash
#
#SBATCH -J lammps 
#SBATCH -N 2
#SBATCH -o job.%j.out
#SBATCH -p E5-2690v4
#SBATCH -q E5-2690v4-batch
#SBATCH --ntasks-per-node=28
#SBATCH --threads-per-core=1
#SBATCH --mem=16G

module purge
module load lammps

mpirun -np $SLURM_NTASKS lmp_mul -in ./in.put
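
The script expects a LAMMPS input file ./in.put in the submission directory. Its contents are entirely problem-specific; purely as an illustration, a minimal Lennard-Jones melt input could look like this:

# 3d Lennard-Jones melt (illustrative example only)
units           lj
atom_style      atomic
lattice         fcc 0.8442
region          box block 0 10 0 10 0 10
create_box      1 box
create_atoms    1 box
mass            1 1.0
velocity        all create 1.44 87287 loop geom
pair_style      lj/cut 2.5
pair_coeff      1 1 1.0 1.0 2.5
neighbor        0.3 bin
fix             1 all nve
thermo          100
run             1000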

Wien2k


Example job script in:

/opt/ohpc/pub/examples/slurm/mul/wien2k

COMSOL


Load the COMSOL module:

module load COMSOL/5.2a

For a shared-memory run on a single node, create a job file called job_smp.sh (an example is provided in the directory below):

/opt/ohpc/pub/examples/slurm/mul/comsol

#!/bin/bash
#
#SBATCH -J comsol
#SBATCH -N 1
#SBATCH -o job.%j.out
#SBATCH -p E5-2690v4
#SBATCH -q E5-2690v4-batch
#SBATCH --ntasks-per-node=28
#SBATCH --threads-per-core=1
#SBATCH --time=04:00:00
#SBATCH --mem=16G

# Details of your input and output files
INPUTFILE=micromixer_cluster.mph
OUTPUTFILE=micromixer.out

# Load our comsol module
module purge
module load COMSOL/5.2a

# create tmpdir 
TMPDIR="/tmp1/comsol"
mkdir -p $TMPDIR

## Now, run COMSOL in batch mode with the input and output detailed above.
comsol batch -np $SLURM_NTASKS -inputfile $INPUTFILE -outputfile $OUTPUTFILE -tmpdir $TMPDIR

To run COMSOL across more than one node with MPI, create a job file called job_mpi.sh (an example is provided in the directory below):

/opt/ohpc/pub/examples/slurm/mul/comsol

#!/bin/bash
#
#SBATCH -J comsol
#SBATCH -N 2
#SBATCH -o job.%j.out
#SBATCH -p E5-2690v4
#SBATCH -q E5-2690v4-batch
#SBATCH --ntasks-per-node=28
#SBATCH --threads-per-core=1
#SBATCH --time=04:00:00
#SBATCH --mem=16G

# Details of your input and output files
INPUTFILE=micromixer_cluster.mph
OUTPUTFILE=micromixer.out

# Load our comsol module
module purge
module load COMSOL/5.2a
module load intel-mpi/2018

# create tmpdir 
TMPDIR="/tmp1/comsol"
mkdir -p $TMPDIR

## Now, run COMSOL in batch mode with the input and output detailed above.
comsol -clustersimple batch \
-inputfile $INPUTFILE \
-outputfile $OUTPUTFILE \
-tmpdir $TMPDIR \
-mpiroot $MPIROOT -mpi intel -mpifabrics shm:dapl

Submit the job with:

sbatch job_smp.sh      # or: sbatch job_mpi.sh

To continue a calculation from the last saved state, add the following options to the comsol command:

-recover -continue
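
For instance, restarting the batch run from the job script above could look like this (a sketch; keep the remaining options from your original command):

comsol batch -recover -continue \
    -np $SLURM_NTASKS \
    -inputfile $INPUTFILE \
    -outputfile $OUTPUTFILE \
    -tmpdir $TMPDIR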

ANSYS Fluent

Check available Fluent versions:

module avail 2>&1 | grep Fluent

Load the desired version of Fluent:

module load ANSYS-Fluent/17.1

Create a journal file (fluent.jou), written in a dialect of Lisp called Scheme, which contains all the instructions to be executed during the run. An example can be found in:

/opt/ohpc/pub/examples/slurm/mul/ansys

A basic form of this file is as follows:

# -----------------------------------------------------------
# SAMPLE JOURNAL FILE
#
# read case file (*.cas.gz) that had previously been prepared
file/read-case "tubench1p4b.cas.gz"
file/autosave/data-frequency 10
solve/init/initialize-flow
solve/iterate 500
file/write-data "tubench1p4b.dat.gz"
exit yes

The autosave/data-frequency setting will save a data file every 10 iterations.


A script for running ANSYS Fluent, called fluent_run.sh, is shown below.

#!/bin/sh
#SBATCH -J fluent
#SBATCH -N 2
#SBATCH -o job.%j.out
#SBATCH -p E5-2690v4
#SBATCH -q E5-2690v4-batch
#SBATCH --ntasks-per-node=28
#SBATCH --threads-per-core=1
#SBATCH --time=04:00:00
#SBATCH --mem=16G

module purge
module load ANSYS-Fluent/17.1

JOURNALFILE=fluent.jou

if [ $SLURM_NNODES -eq 1 ]; then
    # Single node with shared memory
    fluent 3ddp -g -t $SLURM_NTASKS -i $JOURNALFILE > fluent.log 
else
    # Multi-node
    # 3ddp: 3D double-precision solver; -g: run without GUI;
    # -slurm -t: run via Slurm with $SLURM_NTASKS processes;
    # -pinfiniband: use the InfiniBand interconnect; -mpi=openmpi: use OpenMPI
    fluent 3ddp \
        -g \
        -slurm -t $SLURM_NTASKS \
        -pinfiniband \
        -mpi=openmpi \
        -i $JOURNALFILE > fluent.log
fi

The command line switches are slightly different in version 18:

#!/bin/sh
#SBATCH -J fluent
#SBATCH -N 2
#SBATCH -o job.%j.out
#SBATCH -p E5-2690v4
#SBATCH -q E5-2690v4-batch
#SBATCH --ntasks-per-node=28
#SBATCH --threads-per-core=1
#SBATCH --time=04:00:00
#SBATCH --mem=16G

module purge
module load ANSYS-Fluent/18.1

JOURNALFILE=fluent.jou

if [ $SLURM_NNODES -eq 1 ]; then
    # Single node with shared memory
    fluent 3ddp -g -t $SLURM_NTASKS -i $JOURNALFILE > fluent.log
else
    # Multi-node
    # 3ddp: 3D double-precision solver; -g: run without GUI;
    # -slurm -t: run via Slurm with $SLURM_NTASKS processes;
    # -pib: use the InfiniBand interconnect; -platform=intel: Intel platform;
    # -mpi=openmpi: use OpenMPI
    fluent 3ddp \
        -g \
        -slurm -t $SLURM_NTASKS \
        -pib \
        -platform=intel \
        -mpi=openmpi \
        -i $JOURNALFILE > fluent.log
fi

The following license-server variables are set when the Fluent module is loaded:

setenv       ANSYSLI_SERVERS 2325@LICENSE.SERVER
setenv       ANSYSLMD_LICENSE_FILE 1055@LICENSE.SERVER

Submit the job with:

sbatch fluent_run.sh

To restart a fluent job, you can read in the latest data file:

# read case file (*.cas.gz) that had previously been prepared
file/read-case "MyCaseFile.cas.gz"
file/read-data "MyCase_-1-00050.dat.gz"   # read latest data file and continue calculation
file/autosave/data-frequency 10
solve/init/initialize-flow
solve/iterate 500
file/write-data "MyCase.dat.gz"
exit yes

ABAQUS

Example job script in:

/opt/ohpc/pub/examples/slurm/mul/abaqus

#!/bin/bash
#
#SBATCH -J abaqus
#SBATCH -N 2
#SBATCH -o job.%j.out
#SBATCH -p E5-2690v4
#SBATCH -q E5-2690v4-batch
#SBATCH --ntasks-per-node=8
#SBATCH --mem=16G

module purge
module load Abaqus/2016

export LM_LICENSE_FILE=<license_port>@<license_server>:$LM_LICENSE_FILE

# specify some variables:
JOBNAME=My_job_name
INPUT=My_Abaqus_input.inp
SCRATCHDIR="/scratch"

# MODE can be 'mpi' or 'threads':
#MODE="threads"
MODE="mpi"

# write the allocated hostnames (one per line) to a file and
# compute the number of tasks per node
scontrol show hostname $SLURM_NODELIST > hostlist
cpu=`expr $SLURM_NTASKS / $SLURM_JOB_NUM_NODES`
echo $cpu

mp_host_list="("
for i in $(cat hostlist)
do
  mp_host_list="${mp_host_list}('$i',$cpu),"
done

mp_host_list=`echo ${mp_host_list} | sed -e "s/,$/,)/"`

echo "mp_host_list=${mp_host_list}" >> abaqus_v6.env

abaqus interactive job=$JOBNAME cpus=$SLURM_NTASKS mp_mode=$MODE scratch=$SCRATCHDIR input=$INPUT
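
With the settings above (2 nodes, 8 tasks per node), the line appended to abaqus_v6.env looks roughly like this (node001 and node002 are hypothetical hostnames):

mp_host_list=(('node001',8),('node002',8),)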

Users sometimes find that their jobs take longer to complete than the maximum runtime permitted by the scheduler. Provided that your model does not automatically re-mesh (for example, after a fracture), you may be able to make use of Abaqus' built-in checkpointing feature.

This will create a restart file (.res file extension) from which a job that is killed can be restarted.

  1. Activate the restart feature by adding the line:
*restart, write

at the top of your input file and run your job as normal. It should produce a restart file with a .res file extension.

  2. Run the restart analysis with:

abaqus job=newJobName oldjob=oldJobName ...

where oldJobName is the name of the initial job and newJobName is the name of a new input file which contains only the line:

*restart, read

Example:

INPUT: dynam.inp
JOB SCRIPT: job.sh
INPUT FOR RESTART: dynamr.inp
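
With these files, the restart run can be launched with a command along the following lines (a sketch; keep the cpus, mp_mode and scratch settings from your job script):

abaqus interactive job=dynamr oldjob=dynam input=dynamr.inp \
       cpus=$SLURM_NTASKS mp_mode=mpi scratch=/scratch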

