This version (2022/06/20 09:01) was approved by msiegel.
Starting fewer MPI processes per node than slots available
Modifying the machine file
Manually modifying the machine file
When not all CPUs of a node are needed, the machine file can be modified so that mpirun starts only the desired number of processes on each node. The $TMPDIR/machines file on VSC-1 consists of a list of machine/node names. Each entry stands for one CPU (slot) on the given machine/node. For an exclusive job on 2 nodes the machine file looks like:
r10n01
r10n01
r10n01
r10n01
r10n01
r10n01
r10n01
r10n01
r12n10
r12n10
r12n10
r12n10
r12n10
r12n10
r12n10
r12n10
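To check how many slots each node contributes, the entries can be counted per node name. The following is a minimal sketch, assuming one node name per line; the sample file and the function name `slots_per_node` are illustrative stand-ins and not part of the VSC setup. Inside a job, `$TMPDIR/machines` would be passed instead of the sample file.

```shell
# Hypothetical stand-in for $TMPDIR/machines: 8 slots on each of 2 nodes.
machines=$(mktemp)
for node in r10n01 r12n10; do
    for i in 1 2 3 4 5 6 7 8; do
        echo "$node"
    done
done > "$machines"

# Print "<node> <number of slots>" for every node in the given machine file.
# (slots_per_node is an illustrative helper name.)
slots_per_node() {
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

slots_per_node "$machines"
```

For the two-node example above this lists each node with 8 slots.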
To run fewer than eight processes per node, the job script has to build a replacement for the $TMPDIR/machines file and pass it to mpirun:
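#$ -N test
#$ -pe mpich 16

# Slots per node granted by the parallel environment, and how many of them to use.
NSLOTS_PER_NODE_AVAILABLE=8
NSLOTS_PER_NODE_USED=4
# Total number of MPI processes: (number of nodes) * (processes used per node).
NSLOTS_REDUCED=$(( NSLOTS / NSLOTS_PER_NODE_AVAILABLE * NSLOTS_PER_NODE_USED ))

echo "starting run with $NSLOTS_REDUCED processes; $NSLOTS_PER_NODE_USED per node"

# Append the list of distinct node names once per process wanted on each node,
# then sort so that all entries of one node are adjacent.
for i in $(seq 1 $NSLOTS_PER_NODE_USED)
do
    uniq $TMPDIR/machines >> $TMPDIR/tmp
done
sort $TMPDIR/tmp > $TMPDIR/myhosts
cat $TMPDIR/myhosts

# Start the reduced number of processes using the new machine file.
mpirun -machinefile $TMPDIR/myhosts -np $NSLOTS_REDUCED sleep 2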
The reduced machine file ($TMPDIR/myhosts) would then look like:
r10n01
r10n01
r10n01
r10n01
r12n10
r12n10
r12n10
r12n10
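The same reduction can also be written without the intermediate $TMPDIR/tmp file. The sketch below is one possible variant, not the method prescribed for VSC-1: the function name `reduce_machines` and the sample machine file are assumptions made for illustration, and one node name per line is assumed.

```shell
# Hypothetical stand-in for $TMPDIR/machines: 8 slots on each of 2 nodes.
machines=$(mktemp)
for node in r10n01 r12n10; do
    for i in 1 2 3 4 5 6 7 8; do
        echo "$node"
    done
done > "$machines"

# reduce_machines FILE N
# Print every distinct node name found in FILE exactly N times,
# grouped by node. (Illustrative helper, not part of the original script.)
reduce_machines() {
    sort -u "$1" | awk -v n="$2" '{ for (i = 0; i < n; i++) print }'
}

# Build a machine file with 4 entries per node and show it.
myhosts=$(mktemp)
reduce_machines "$machines" 4 > "$myhosts"
cat "$myhosts"
```

In a job script the call would read `reduce_machines $TMPDIR/machines $NSLOTS_PER_NODE_USED > $TMPDIR/myhosts`, followed by the same mpirun line as above. Using `sort -u` instead of `uniq` means the result does not depend on the entries of one node being adjacent in the input.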