

http://www.r-project.org/

We installed the libraries Rmpi, doMPI and foreach, together with their dependencies, on VSC2 and VSC3. These libraries make it possible to parallelize loops across several nodes with MPI by using the foreach function, which is very similar to a for loop.
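To illustrate how closely foreach mirrors a for loop, here is a minimal sketch that runs without MPI. It assumes only that the foreach package is installed; the variable names (n_iter, loop_means, foreach_means) and the reduced sample size are chosen purely for illustration and are not part of the setup described on this page.

```r
library(foreach)  # provides foreach(), %do% and %dopar%

set.seed(42)  # make the random draws reproducible for this illustration

# Plain for loop: results have to be stored explicitly.
n_iter <- 5
loop_means <- numeric(n_iter)
for (i in 1:n_iter) {
  loop_means[i] <- mean(rnorm(1e4))
}

# foreach equivalent: the value of the body is returned for each iteration.
# .combine = c collects the results in a numeric vector instead of a list.
# %do% runs sequentially; once a parallel backend is registered,
# replacing it with %dopar% distributes the iterations.
foreach_means <- foreach(i = 1:n_iter, .combine = c) %do% {
  mean(rnorm(1e4))
}

print(length(foreach_means))
```

Unlike a for loop, foreach returns a value, so the usual pattern is to assign its result rather than to fill a vector inside the body.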

As an example, consider a simple for loop:

for (i in 1:50) {mean(rnorm(1e+07))}

Run sequentially on VSC3, this loop takes:

/opt/sw/R/current/bin/R
> system.time(for (i in 1:50) { mean(rnorm(1e+07))})
   user  system elapsed 
 83.690   1.013  84.721 

To run the loop in parallel on VSC2, use this job script:

#$ -N rstat
#$ -V
#$ -pe mpich 16
#$ -l h_rt=01:00:00
echo $NSLOTS
mpirun -machinefile $TMPDIR/machines -np 16 ~/sw/rstat/bin/R CMD BATCH berk-rmpi.R

On VSC3, use this job script:

#!/bin/sh
#SBATCH -J rstat
#SBATCH -N 1
#SBATCH --tasks-per-node=16

module unload intel-mpi/5
module load intel-mpi/4.1.3.048
module load R

export I_MPI_FABRICS=shm:tcp

mpirun R CMD BATCH berk-rmpi.R

berk-rmpi.R:

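# basic example with foreach
# start R as usual ('R') or via a batch job
library(Rmpi)
library(doMPI)

# start the MPI cluster and register it as the parallel backend for foreach
cl <- startMPIcluster()
registerDoMPI(cl)

# the iterations are distributed over the MPI workers;
# the results are returned as a list
result <- foreach(i = 1:50) %dopar% {
  mean(rnorm(1e+07))
}

# shut down the workers and terminate MPI
closeCluster(cl)
mpi.finalize()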

With 16 MPI processes, the parallel version takes on VSC2:

> proc.time()
   user  system elapsed 
  8.495   0.264   9.616 
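The two elapsed times reported on this page can be put into a rough speedup estimate. The sketch below is only indicative: the sequential timing was measured on VSC3 and the parallel one on VSC2, so the figures are not strictly comparable. It also assumes that, when started through mpirun, doMPI uses one of the 16 processes as the master, leaving 15 workers. The variable names are illustrative.

```r
# Elapsed times in seconds, copied from the timings shown on this page
seq_elapsed <- 84.721  # sequential for loop (measured on VSC3)
par_elapsed <- 9.616   # foreach with doMPI, 16 MPI processes (measured on VSC2)
n_proc <- 16           # number of MPI processes requested in the job script

# Speedup: how many times faster the parallel run finished
speedup <- seq_elapsed / par_elapsed

# Parallel efficiency relative to the number of MPI processes
efficiency <- speedup / n_proc

print(round(speedup, 2))     # 8.81
print(round(efficiency, 2))  # 0.55
```

A speedup well below 16 is plausible here, since the elapsed time of the batch run also covers starting R and loading the libraries, and because one process may act as the master rather than computing.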
  • Last modified: 2015/04/28 09:21
  • by ir