--- doku:gpaw [2014/03/21 09:31] – external edit 127.0.0.1
+++ doku:gpaw [2014/08/27 11:27] (current) – jz
@@ Line 2: / Line 2: @@
 === Settings Scientific linux 6.5 (VSC 1+2) ===
-Following Versions of GPAW and ASE were used:
+The following revisions of GPAW and ASE were used; specify the revisions in ''install_gpaw_vsc_sl65.sh'':
-  * gpaw: svn checkout https://svn.fysik.dtu.dk/projects/gpaw/trunk gpaw -r 10428
-  * ase: svn checkout https://svn.fysik.dtu.dk/projects/ase/trunk ase -r 2801
+  * gpaw (?): 11253 ; ase (?): 3547
+  * gpaw 0.10.0: 11364 ; ase 3.8.1: 3440
   * gpaw-setups-0.9.9672
   * numpy-1.6.2
-  * MPI Version: impi-4.1.0.024
+  * MPI Version: impi-4.1.x
   * FFTW from INTEL MKL
   * files: {{:doku:gpaw:sl65:site.cfg.numpy.sl65_icc_mkl.txt}}, {{:doku:gpaw:sl65:install_gpaw_vsc_sl65.sh}},  {{:doku:gpaw:sl65:customize_sl65_icc_mkl.py}}, {{:doku:gpaw:sl65:config.py}}
@@ Line 84: / Line 85: @@
 === running gpaw jobs ===
-Job submission on VSC-1 (on VSC-2 also mpich8, mpich4 can be used instead)
+== Job submission using all cores on the compute nodes (VSC-1 and VSC-2) ==
 <code>
 #!/bin/sh
@@ Line 90: / Line 91: @@
 #$ -pe mpich 256
 #$ -V
-export OMP_NUM_THREADS=1
-NSLOTS_PER_NODE_AVAILABLE=8
-NSLOTS_PER_NODE_USED=4
-NSLOTS_REDUCED=`echo "$NSLOTS / $NSLOTS_PER_NODE_AVAILABLE * $NSLOTS_PER_NODE_USED" | bc  `
-echo "starting run with $NSLOTS_REDUCED processes; $NSLOTS_PER_NODE_USED per node"
+mpirun -machinefile $TMPDIR/machines -np $NSLOTS gpaw-python static.py --domain=None --band=1 --sl_default=4,4,64
-for i in `seq 1 $NSLOTS_PER_NODE_USED`
+</code>
-do
-	uniq $TMPDIR/machines >> $TMPDIR/tmp
-done
-sort $TMPDIR/tmp  > $TMPDIR/myhosts
+== Job submission using half of the cores on the compute nodes (VSC-2) ==
-cat $TMPDIR/myhosts
+If each of your processes requires more than 2GB (and less than 4GB) of memory, you can use the parallel environment ''mpich8''. This allocates only 8 processes on each node while still starting 256 processes, which are then distributed over 32 nodes. In that case, use the variable ''$NSLOTS_REDUCED'' instead of ''$NSLOTS''.
+<code>
-mpirun -machinefile $TMPDIR/myhosts -np $NSLOTS_REDUCED gpaw-python static.py --domain=None --band=1 --sl_default=4,4,64
+#!/bin/sh
+#$ -N Cl5_4x4x1
+#$ -pe mpich8 256
+#$ -V
+mpirun -machinefile $TMPDIR/machines -np $NSLOTS_REDUCED gpaw-python static.py --domain=None --band=1 --sl_default=4,4,64
 </code>
+If even more memory per process is required, the environments ''mpich4'', ''mpich2'', and ''mpich1'' are also available, as discussed in [[doku:ompmpi|Hybrid OpenMP/MPI jobs]].
+Alternatively, you can simply start your GPAW job with more processes, which reduces the amount of memory per process. GPAW usually scales well with the number of processes.
-Significant speed up is seen when --sl_default is set. First two parameters for the BLACS grid should be similar in size to get an optimal memory distribution on the nodes.
+A significant speedup is seen **in our test case** when ''--sl_default'' is set. The first two parameters, which define the BLACS grid, should be similar in size to get an optimal memory distribution on the nodes.
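
The advice that the first two BLACS grid parameters should be similar in size can be illustrated with a small shell sketch. It assumes you have already decided how many processes the grid should use (16 in the ''4,4'' example above; the page does not say how to choose this number) and prints the most nearly square factor pair. The function name ''nearest_square_grid'' is hypothetical and not part of GPAW or the VSC scripts, and the block size 64 is simply copied from the job scripts above.

```shell
#!/bin/sh
# Hypothetical helper (not part of GPAW): print "rows,cols" for the most
# nearly square factorisation of the process count given as $1.
nearest_square_grid() {
    n=$1
    rows=1
    i=1
    # Scan candidate divisors up to sqrt(n); the last one that divides n
    # is the divisor closest to sqrt(n), so rows and cols are as similar
    # in size as possible.
    while [ $((i * i)) -le "$n" ]; do
        if [ $((n % i)) -eq 0 ]; then
            rows=$i
        fi
        i=$((i + 1))
    done
    echo "$rows,$((n / rows))"
}

# Assumed grid size of 16 processes; block size 64 as in the job scripts.
echo "--sl_default=$(nearest_square_grid 16),64"
```

For a prime process count the only factorisation is ''1,N'', which is far from square; in that case it may be preferable to pick a different grid size.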