Modifying the machines file on VSC-1

When not all CPUs of a node are required, the machines file can be modified to guarantee the right behaviour of mpirun. The $TMPDIR/machines file on VSC-1 is a list of machine/node names, one per line. Each entry stands for one CPU on the given machine/node, so a node with eight CPUs appears eight times. For an exclusive job on 2 nodes the machines file looks like:

r10n01
r10n01
r10n01
r10n01
r10n01
r10n01
r10n01
r10n01
r12n10
r12n10
r12n10
r12n10
r12n10
r12n10
r12n10
r12n10
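
To check how many CPUs each node contributes, the entries of the machines file can be counted per node. The following is a minimal sketch; the function name count_slots is hypothetical and not part of the VSC-1 environment, and it only assumes the one-name-per-line layout described above.

```shell
# Hypothetical helper: print "<node> <count>" for every node
# listed in the machines file given as first argument.
count_slots() {
    # sort groups identical names, uniq -c counts each group,
    # awk swaps the columns so the node name comes first.
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Inside a job script this could be called as:
# count_slots "$TMPDIR/machines"
```

For the two-node file shown above, this would report eight entries for each of r10n01 and r12n10.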

To run a job on fewer than eight cores per node, a reduced copy of the $TMPDIR/machines file has to be created within the job script and passed to mpirun:

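#$ -N test
#$ -pe mpich 16

# Cores per node granted by the scheduler, and cores per node actually used.
NSLOTS_PER_NODE_AVAILABLE=8
NSLOTS_PER_NODE_USED=4
# Total number of processes: (number of nodes) * (cores used per node).
NSLOTS_REDUCED=$(( NSLOTS / NSLOTS_PER_NODE_AVAILABLE * NSLOTS_PER_NODE_USED ))

echo "starting run with $NSLOTS_REDUCED processes; $NSLOTS_PER_NODE_USED per node"
# Start from an empty file, then append each node name once per used core.
rm -f "$TMPDIR/tmp"
for i in $(seq 1 "$NSLOTS_PER_NODE_USED")
do
	uniq "$TMPDIR/machines" >> "$TMPDIR/tmp"
done
# Group the entries by node and show the resulting machines file.
sort "$TMPDIR/tmp" > "$TMPDIR/myhosts"
cat "$TMPDIR/myhosts"

mpirun -machinefile "$TMPDIR/myhosts" -np "$NSLOTS_REDUCED" sleep 2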

With 4 of the 8 cores used on each of the 2 nodes, the reduced machines file $TMPDIR/myhosts looks like:

r10n01
r10n01
r10n01
r10n01
r12n10
r12n10
r12n10
r12n10
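
The same reduction can also be sketched as a single awk pass that keeps only the first few entries of every node, which avoids the temporary file. This is an alternative illustration, not the method used in the job script above; the function name reduce_machines is hypothetical.

```shell
# Hypothetical helper: keep at most $2 entries per node
# from the machines file given as $1, preserving input order.
reduce_machines() {
    # seen[name] counts how often a node name has appeared so far;
    # a line is printed only while that count is still below n.
    awk -v n="$2" 'seen[$1]++ < n' "$1"
}

# Inside a job script this could be used as:
# reduce_machines "$TMPDIR/machines" 4 > "$TMPDIR/myhosts"
```

Because the count is kept per node name, this sketch does not depend on the entries of one node being adjacent in the machines file.
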
  • Last modified: 2014/06/03 07:14
  • by ir