====== VASP benchmarks ======
  
The following plots show the performance of the VASP code depending on the selected **mpich** environment in the grid engine, the **number of processes** and the **number of threads**.
  
A 5x4 supercell with 150 Palladium atoms and 24 Oxygen atoms, i.e., 3 pure Palladium layers with one mixed Palladium/Oxygen layer, has been used for the following benchmark tests.

===== VSC 3 =====

The code was compiled with the Intel compiler, Intel MKL (BLACS, ScaLAPACK), and Intel MPI (5.0.0.028).

Figure 6 shows that the running time of this benchmark decreases substantially with the number of MPI processes. The decrease in running time with the number of threads is less pronounced. However, other applications may exhibit a different behavior.

{{:doku:vasp:nn08_vsc3_vasp.png?400|}}
{{:doku:vasp:nn32_vsc3_vasp.png?400|}}

**Figure 6:** real running time on eight and 32 nodes depending on the number of MPI processes and the number of threads.

{{ :doku:vasp:nn1-64_tn16_th01_time.png?600 |}}

**Figure 7:** real running time for 16 tasks (MPI processes) per node and one thread depending on the number of nodes.
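
The number of MPI processes and threads shown in Figures 6 and 7 can be chosen in the batch job. The following is only a minimal sketch, assuming a SLURM batch system as on VSC 3; the module names and the path to the VASP binary are placeholders and have to be adapted to the actual installation:

<code bash>
#!/bin/bash
#SBATCH --job-name=vasp_bench
#SBATCH --nodes=8                # 8 (or 32) nodes as in Figure 6
#SBATCH --ntasks-per-node=8      # MPI processes per node
#SBATCH --cpus-per-task=2        # cores reserved per MPI process

# placeholder module names -- adjust to the modules actually installed
module load intel intel-mpi intel-mkl

# the "threads" in the plots are the threads used per MPI process
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export MKL_NUM_THREADS=$SLURM_CPUS_PER_TASK

# 8 nodes x 8 MPI processes per node = 64 processes in total
mpirun -np 64 /path/to/vasp
</code>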

===== VSC 2 =====

The code was compiled with the Intel compiler, Intel MKL (BLACS, ScaLAPACK), and Intel MPI (4.0.1.007).

The figure shows how the computing time depends on the selected mpich environment and the number of processes. **mpich1** means 1 process per node, **mpich2** 2 processes per node, and so forth. For 16 processes per node, the environment does not follow the same naming convention; it is simply called **mpich**.
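
As an illustration of this naming scheme, a grid engine job requesting 64 slots with 8 processes per node could be submitted with a script along the following lines; this is only a sketch, and the machinefile handling as well as the path to the VASP binary are assumptions that depend on the local setup:

<code bash>
#!/bin/bash
#$ -N vasp_bench
#$ -V
#$ -pe mpich8 64     # parallel environment "mpich8": 8 processes per node, 64 slots in total
                     # with 16 processes per node the line would read:  #$ -pe mpich 64

# $NSLOTS is filled in by the parallel environment; the machinefile
# location and the binary path are placeholders
mpirun -machinefile $TMPDIR/machines -np $NSLOTS /path/to/vasp
</code>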

The computing time tends to decrease with a decreasing number of processes per node (mpich) and an increasing total number of processes, which also corresponds to an increasing number of slots. However, when the number of processes increases further, the computing time rises again. The reason is the suboptimal scaling of BLACS and ScaLAPACK; an improvement can be achieved by using ELPA instead.

{{ :doku:vasp:vsc2_treal_mpich_v.01.png?400 |}}

**Figure 5:** real running time depending on the selected mpich environment and the number of processes. Here the number of threads is always 1.

===== VSC 1 =====
  
{{ :doku:vasp:corehoursbest.png |}}
**Figure 4:** real running time (dashed lines [min]) and core hours (solid lines [h]) for the best result (mpich/1 thread).