This version (2024/10/24 10:28) is a draft.
Approvals: 0/1. The previously approved version (2021/09/29 12:25) is available.
Using R on VSC2 and VSC3 with MPI libraries
We installed the libraries Rmpi, doMPI and foreach and their dependencies on VSC2 and VSC3. These libraries let you parallelize loops across multiple nodes with MPI using the foreach function, which is very similar to a for loop.
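To illustrate how foreach relates to a for loop, here is a minimal sketch that runs sequentially with %do%, so it needs no MPI backend. The variable names are chosen for illustration only. Unlike a for loop, foreach returns the value of each iteration; the .combine argument controls how those values are merged.

```r
# Minimal sketch: foreach used sequentially via %do% (no MPI backend required).
library(foreach)

# A plain for loop discards the value of its body unless it is stored explicitly.
squares_for <- numeric(5)
for (i in 1:5) {
  squares_for[i] <- i^2
}

# foreach returns the value of every iteration; .combine = c merges them
# into a vector instead of the default list.
squares_foreach <- foreach(i = 1:5, .combine = c) %do% {
  i^2
}

# Both approaches should give the same five values.
print(all(squares_for == squares_foreach))
```

Replacing %do% with %dopar% hands the iterations to whichever parallel backend has been registered, for example one registered with registerDoMPI.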
Example
Given a simple for loop:
for (i in 1:50) {mean(rnorm(1e+07))}
Sequential execution
Sequential execution on VSC3 leads to an execution time [s] of:
/opt/sw/R/current/bin/R
> system.time(for (i in 1:50) { mean(rnorm(1e+07))})
   user  system elapsed
 83.690   1.013  84.721
Parallel execution
In R, the loop may be parallelized with foreach as follows (script berk-rmpi.R):
# basic example with foreach
# start R as usual: 'R' or via a batch job
library(Rmpi)
library(doMPI)
cl <- startMPIcluster()
registerDoMPI(cl)
result <- foreach(i = 1:50) %dopar% {
  mean(rnorm(1e+07))
}
closeCluster(cl)
mpi.finalize()
On VSC3 the script reads:
#!/bin/sh
#SBATCH -J rstat
#SBATCH -N 1
#SBATCH --tasks-per-node=16
module unload intel-mpi/5
module load intel-mpi/4.1.3.048
module load R
export I_MPI_FABRICS=shm:tcp
mpirun R CMD BATCH berk-rmpi.R
yielding an execution time [s] of
> proc.time()
   user  system elapsed
  4.566   0.156   5.750
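As a rough check, the two elapsed times above can be turned into a speedup and a parallel efficiency. The sketch below assumes that the 16 tasks requested with --tasks-per-node=16 are the right divisor; the variable names are illustrative. The two timings were not taken the same way (system.time around the loop versus proc.time() for the whole batch script), so the result is only an estimate.

```r
# Elapsed times [s] copied from the sequential and parallel runs above.
t_sequential <- 84.721
t_parallel   <- 5.750
n_tasks      <- 16  # assumption: all tasks from --tasks-per-node=16 are counted

# Speedup: how many times faster the parallel run finished.
speedup <- t_sequential / t_parallel

# Efficiency: fraction of the ideal n_tasks-fold speedup actually reached.
efficiency <- speedup / n_tasks

cat(sprintf("speedup: %.1f, efficiency: %.2f\n", speedup, efficiency))
# prints "speedup: 14.7, efficiency: 0.92"
```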