Differences
This shows you the differences between two versions of the page.
doku:vasp-vsc2 [2014/10/02 14:18] – [VASP + ELPA] ir | doku:vasp-vsc2 [2022/02/01 21:10] (current) – removed goldenberg | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== VASP ====== | ||
- | |||
- | see also | ||
- | * [[vasp-benchmarks|VASP benchmarks]] | ||
- | ==== MPI SELECTOR ==== | ||
- | To compile and run VASP successfully, select Intel MPI 4.0.1.007, which gives the best VASP performance: | ||
- | < | ||
- | # mpi-selector --query | ||
- | default: | ||
- | level:user | ||
- | </ | ||
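The selection step above can be sketched as a small shell helper. This is only a sketch, assuming the OFED `mpi-selector` tool with its `--list`, `--set`, `--user` and `--query` options. The name under which Intel MPI 4.0.1.007 is registered is site-specific and not given on this page, so the pattern `intel_mpi` and the helper `pick_mpi` are hypothetical placeholders.

```shell
#!/bin/sh
# pick_mpi PATTERN: read MPI names on stdin (one per line, as printed by
# `mpi-selector --list`) and print the first one containing PATTERN.
# Hypothetical helper, not part of mpi-selector itself.
pick_mpi() {
    grep -F -- "$1" | head -n 1
}

# Only touch the real selector when it is installed on this machine.
if command -v mpi-selector >/dev/null 2>&1; then
    # "intel_mpi" is a placeholder pattern; adapt it to the local name.
    name=$(mpi-selector --list | pick_mpi "intel_mpi")
    if [ -n "$name" ]; then
        mpi-selector --set "$name" --user   # set the per-user default
        mpi-selector --query                # show the resulting selection
    else
        echo "no matching MPI registered with mpi-selector" >&2
    fi
fi
```

A new selection normally takes effect only in a fresh login shell, so it is worth running `mpi-selector --query` again after logging in.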
- | ==== VASP + ELPA ==== | ||
- | An optimized executable, which scales very nicely on VSC-2 (scaLAPACK has been replaced with [[http:// | ||
- | |||
- | Please send an email to {{: | ||
- | to be added to the corresponding user group. | ||
- | |||
- | {{: | ||
- | |||
- | |||
- | |||
- | |||
- | The ELPA developers ask that you cite their work when using the optimized library; see the excerpt from the library's README file below: | ||
- | |||
- | < | ||
- | *** Citing: | ||
- | A description of some algorithms present in ELPA can be found in: | ||
- | |||
- | T. Auckenthaler, | ||
- | L. Kr\" | ||
- | " | ||
- | electronic structure calculations", | ||
- | Parallel Computing [volume], [page], (2011). | ||
- | accepted for publication (May 11, 2011). | ||
- | |||
- | Please cite this paper when using ELPA. We also intend to publish an | ||
- | overview description of the ELPA library as such, and ask you to | ||
- | make appropriate reference to that as well, once it appears. | ||
- | </ | ||
- | ==== Compiling ==== | ||
- | With the above mpi-selector setting, set the following optimization flags and library paths in your makefile: | ||
- | < | ||
- | OFLAG=-O2 -ip -ftz -fno-alias -msse3 | ||
- | |||
- | OFLAG_HIGH = $(OFLAG) | ||
- | OFLAG_NOOPT = -O1 -msse3 -ip -ftz | ||
- | OBJ_HIGH = | ||
- | OBJ_NOOPT = | ||
- | DEBUG = -FR -O0 | ||
- | INLINE = $(OFLAG) -ip | ||
- | |||
- | MKL_PATH=$(MKLROOT)/ | ||
- | |||
- | # Use libgoto preferentially; | ||
- | BLAS = -Wl, | ||
- | LAPACK = / | ||
- | |||
- | # Alternative BLAS and LAPACK: | ||
- | #BLAS = $(MKL_PATH)/ | ||
- | #LAPACK = $(MKL_PATH)/ | ||
- | |||
- | BLACS=-lmkl_blacs_intelmpi_lp64 | ||
- | |||
- | SCA=$(MKL_PATH)/ | ||
- | </ | ||
- | and the compiler should be set to: | ||
- | < | ||
- | FC=mpiifort | ||
- | </ | ||
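Before building, it can help to confirm that the compiler wrapper named above is actually reachable. The following is a minimal sketch, assuming a POSIX shell; `have_tool` is a hypothetical helper name, and `-show` is the Intel MPI wrapper option that prints the underlying compile command without running it.

```shell
#!/bin/sh
# have_tool NAME: succeed if NAME resolves to a command on PATH.
# Hypothetical helper used only for this illustration.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool mpiifort; then
    echo "mpiifort found at: $(command -v mpiifort)"
    # Print the real compiler invocation behind the wrapper.
    mpiifort -show
else
    echo "mpiifort not on PATH; check the MPI selection and compiler environment" >&2
fi
```

If the wrapper is missing, the MPI selection or the Intel compiler environment is the first thing to check.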
- | |||
- | VASP ships with its own FFT library, which is sufficient for normal use. In case of non-collinear spin calculations using the fftw (http:// | ||
- | |||
- | |||
- | vasp.5.2.12_mpi: | ||
- | < | ||
- | FFT3D=fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o / | ||
- | </ | ||
- | or | ||
- | < | ||
- | FFT3D = fftmpiw.o fftmpi_map.o | ||
- | </ | ||
- | |||
- | vasp.5.2.12_mpi_ncl: | ||
- | < | ||
- | FFT3D = fftmpiw.o fftmpi_map.o | ||
- | / | ||
- | </ | ||
- | |||
- | |||
- | |||
- | Complete makefile for non-collinear VASP 5.2.12 on VSC-2: | ||
- | |||
- | < | ||
- | |||
- | .SUFFIXES: .inc .f .f90 .F | ||
- | # | ||
- | # Makefile for Intel Fortran compiler for P4 systems | ||
- | # | ||
- | # The makefile was tested only under Linux on Intel platforms | ||
- | # | ||
- | |||
- | # all CPP processed fortran files have the extension .f90 | ||
- | SUFFIX=.f90 | ||
- | |||
- | # | ||
- | # START CUSTOMIZATION HERE | ||
- | # | ||
- | |||
- | # | ||
- | # whereis CPP ?? (I need CPP, can't use gcc with proper options) | ||
- | # the following works almost on all systems | ||
- | # possible cpp is located in a different directory | ||
- | # | ||
- | |||
- | # CPP_ = ./ | ||
- | # CPP_ = / | ||
- | CPP_=fpp -f_com=no -free -w0 $*.F $*$(SUFFIX) | ||
- | |||
- | |||
- | # | ||
- | # f90 compiler | ||
- | # | ||
- | |||
- | # simple version, use an MPI Fortran wrapper (here mpiifort) | ||
- | # this works only if the wrapper was built with exactly the same | ||
- | # Fortran compiler | ||
- | FC=mpiifort | ||
- | #FCL=mpif90 -i-static | ||
- | FCL=mpiifort | ||
- | |||
- | # | ||
- | # general fortran flags (there must be a trailing blank on this line) | ||
- | # | ||
- | |||
- | # INCS=-I$(VSC)/ | ||
- | |||
- | FFLAGS = -FR -lowercase -assume byterecl | ||
- | |||
- | # | ||
- | # optimization | ||
- | # for some files a lower optimization level is explicitly selected | ||
- | # at the bottom | ||
- | # | ||
- | |||
- | OFLAG=-O2 -ip -ftz -fno-alias -msse3 | ||
- | |||
- | OFLAG_HIGH = $(OFLAG) | ||
- | OFLAG_NOOPT = -O1 -msse3 -ip -ftz | ||
- | OBJ_HIGH = | ||
- | OBJ_NOOPT = | ||
- | DEBUG = -FR -O0 | ||
- | INLINE = $(OFLAG) -ip | ||
- | |||
- | # | ||
- | # the following lines specify the position of BLAS and LAPACK, | ||
- | # PBLAS and scaLAPACK | ||
- | # | ||
- | |||
- | # fastest Kazushige Goto's BLAS | ||
- | # http:// | ||
- | # mkl is almost as fast | ||
- | MKL_PATH=$(MKLROOT)/ | ||
- | |||
- | BLAS = -Wl, | ||
- | LAPACK = / | ||
- | |||
- | #BLAS = $(MKL_PATH)/ | ||
- | |||
- | # LAPACK, use vasp.5.lib/ | ||
- | # optimized LAPACK does not improve the performance | ||
- | #LAPACK = ../ | ||
- | #LAPACK= $(MKL_PATH)/ | ||
- | |||
- | # location of BLACS and SCALAPACK | ||
- | # optional only required if SCA is defined below | ||
- | # | ||
- | # | ||
- | |||
- | ## For openmpi | ||
- | # | ||
- | |||
- | ## For mpich | ||
- | BLACS=-lmkl_blacs_intelmpi_lp64 | ||
- | |||
- | ## Just a test for qlogic | ||
- | # | ||
- | |||
- | |||
- | # BLACS and SCALAPACK libraries if available | ||
- | # if SCA is defined SCALAPACK will be used | ||
- | |||
- | #SCA= $(VSC)/ | ||
- | SCA=$(MKL_PATH)/ | ||
- | |||
- | #WANLIB= | ||
- | WANLIB=../ | ||
- | |||
- | LINK=-mpif90_abi=intel11 | ||
- | # | ||
- | # END CUSTOMIZATION | ||
- | # | ||
- | |||
- | # | ||
- | # options for CPP in parallel version (see also above): | ||
- | # NGZhalf | ||
- | # wNGZhalf | ||
- | # scaLAPACK | ||
- | # | ||
- | ifdef SCA | ||
- | CPP = $(CPP_) -DMPI -DHOST=\" | ||
- | | ||
- | | ||
- | | ||
- | else | ||
- | CPP = $(CPP_) -DMPI -DHOST=\" | ||
- | | ||
- | | ||
- | | ||
- | endif | ||
- | |||
- | # | ||
- | # libraries for vasp | ||
- | # | ||
- | ifdef SCA | ||
- | LIB = -L../ | ||
- | ../ | ||
- | | ||
- | else | ||
- | LIB = -L../ | ||
- | ../ | ||
- | $(WANLIB) $(LAPACK) $(BLAS) | ||
- | endif | ||
- | |||
- | # FFT: fftmpi.o with fft3dlib of Juergen Furthmueller | ||
- | # must be used for this benchmark | ||
- | #rv,sgi FFT3D = fftmpi.o fftmpi_map.o fft3dlib.o | ||
- | # | ||
- | # | ||
- | # | ||
- | # | ||
- | # FFT3D = fftmpi.o fftmpi_map.o fftw3d.o fft3dlib.o $(VSC)/ | ||
- | # | ||
- | FFT3D=fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o / | ||
- | |||
- | # | ||
- | # general rules and compile lines | ||
- | # | ||
- | BASIC= | ||
- | |||
- | SOURCE= | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | | ||
- | |||
- | INC= | ||
- | |||
- | vasp: $(SOURCE) $(FFT3D) $(INC) main.o | ||
- | rm -f vasp | ||
- | $(FCL) -o vasp main.o $(SOURCE) $(FFT3D) $(LIB) $(LINK) | ||
- | vasp_: $(SOURCE) $(FFT3D) $(INC) main.o | ||
- | rm -f vasp_ | ||
- | $(FCL) -o vasp_ main.o | ||
- | makeparam: $(SOURCE) $(FFT3D) makeparam.o main.F $(INC) | ||
- | $(FCL) -o makeparam | ||
- | zgemmtest: zgemmtest.o base.o random.o $(INC) | ||
- | $(FCL) -o zgemmtest $(LINK) zgemmtest.o random.o base.o $(LIB) | ||
- | dgemmtest: dgemmtest.o base.o random.o $(INC) | ||
- | $(FCL) -o dgemmtest $(LINK) dgemmtest.o random.o base.o $(LIB) | ||
- | ffttest: base.o smart_allocate.o mpi.o mgrid.o random.o ffttest.o $(FFT3D) $(INC) | ||
- | $(FCL) -o ffttest $(LINK) ffttest.o mpi.o mgrid.o random.o smart_allocate.o base.o $(FFT3D) $(LIB) | ||
- | kpoints: $(SOURCE) $(FFT3D) makekpoints.o main.F $(INC) | ||
- | $(FCL) -o kpoints $(LINK) makekpoints.o $(SOURCE) $(FFT3D) $(LIB) | ||
- | |||
- | clean: | ||
- | -rm -f *.mod *.f90 *.o *.L ; touch *.F | ||
- | |||
- | main.o: main$(SUFFIX) | ||
- | $(FC) $(FFLAGS)$(DEBUG) | ||
- | xcgrad.o: xcgrad$(SUFFIX) | ||
- | $(FC) $(FFLAGS) $(INLINE) | ||
- | xcspin.o: xcspin$(SUFFIX) | ||
- | $(FC) $(FFLAGS) $(INLINE) | ||
- | |||
- | makeparam.o: | ||
- | $(FC) $(FFLAGS)$(DEBUG) | ||
- | |||
- | makeparam$(SUFFIX): | ||
- | # | ||
- | # MIND: I do not have a full dependency list for the include | ||
- | # and MODULES: here are only the minimal basic dependencies | ||
- | # if one structure is changed then touch_dep must be called | ||
- | # with the corresponding name of the structure | ||
- | # | ||
- | base.o: base.inc base.F | ||
- | mgrid.o: mgrid.inc mgrid.F | ||
- | constant.o: constant.inc constant.F | ||
- | lattice.o: lattice.inc lattice.F | ||
- | setex.o: setexm.inc setex.F | ||
- | pseudo.o: pseudo.inc pseudo.F | ||
- | poscar.o: poscar.inc poscar.F | ||
- | mkpoints.o: mkpoints.inc mkpoints.F | ||
- | wave.o: wave.inc wave.F | ||
- | nonl.o: nonl.inc nonl.F | ||
- | nonlr.o: nonlr.inc nonlr.F | ||
- | |||
- | $(OBJ_HIGH): | ||
- | $(CPP) | ||
- | $(FC) $(FFLAGS) $(OFLAG_HIGH) $(INCS) -c $*$(SUFFIX) | ||
- | $(OBJ_NOOPT): | ||
- | $(CPP) | ||
- | $(FC) $(FFLAGS) $(OFLAG_NOOPT) $(INCS) -c $*$(SUFFIX) | ||
- | |||
- | fft3dlib_f77.o: | ||
- | $(CPP) | ||
- | $(F77) $(FFLAGS_F77) -c $*$(SUFFIX) | ||
- | |||
- | .F.o: | ||
- | $(CPP) | ||
- | $(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX) | ||
- | .F$(SUFFIX): | ||
- | $(CPP) | ||
- | $(SUFFIX).o: | ||
- | $(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX) | ||
- | |||
- | # special rules | ||
- | # | ||
- | |||
- | # -tpp5|6|7 P, PII-PIII, PIV | ||
- | # -xW use SIMD (does not pay off on PII, since fft3d uses double prec) | ||
- | # all other options do not affect the code performance since -O1 is used | ||
- | |||
- | fft3dlib.o : fft3dlib.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase -O2 -c $*$(SUFFIX) | ||
- | fft3dfurth.o : fft3dfurth.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase -O1 -c $*$(SUFFIX) | ||
- | fftw3d.o : fftw3d.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase -O1 -c $*$(SUFFIX) | ||
- | |||
- | radial.o : radial.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase $(OFLAG) -c $*$(SUFFIX) | ||
- | fftmpi.o : fftmpi.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase -O1 -c $*$(SUFFIX) | ||
- | fftmpiw.o : fftmpiw.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase -O1 $(INCS) -c $*$(SUFFIX) | ||
- | symlib.o : symlib.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase $(OFLAG) -c $*$(SUFFIX) | ||
- | |||
- | symmetry.o : symmetry.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase $(OFLAG) -c $*$(SUFFIX) | ||
- | |||
- | broyden.o : broyden.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase $(OFLAG) -c $*$(SUFFIX) | ||
- | |||
- | dynbr.o : dynbr.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase $(OFLAG) -c $*$(SUFFIX) | ||
- | |||
- | paw.o : paw.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase -O3 -c $*$(SUFFIX) | ||
- | |||
- | cl_shift.o : cl_shift.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase -O3 -c $*$(SUFFIX) | ||
- | |||
- | us.o : us.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase -O3 -c $*$(SUFFIX) | ||
- | |||
- | wave.o : wave.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase -O3 -c $*$(SUFFIX) | ||
- | |||
- | wave_high.o : wave_high.F | ||
- | $(CPP) | ||
- | $(FC) -FR -lowercase -O1 -c $*$(SUFFIX) | ||
- | |||
- | LDApU.o : LDApU.F | ||
- | $(CPP) | ||
- | $(F77) -FR -lowercase -O3 -c $*$(SUFFIX) | ||
- | |||
- | </ | ||
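With a makefile like the one above in place, the build can be driven from the shell. This is a minimal sketch, assuming GNU make (the makefile uses `ifdef`) and the `vasp` target defined above. `build_vasp` and its directory argument are hypothetical; the actual source directory is not given on this page.

```shell
#!/bin/sh
# build_vasp DIR: run the `vasp` target of the makefile found in DIR.
# Hypothetical wrapper; returns non-zero if DIR contains no makefile.
build_vasp() {
    dir=$1
    if [ ! -f "$dir/makefile" ] && [ ! -f "$dir/Makefile" ]; then
        echo "no makefile found in $dir" >&2
        return 1
    fi
    # -C changes into DIR before reading the makefile (GNU make).
    make -C "$dir" vasp
}

# Example call from inside the VASP source directory:
#   build_vasp .
```

Checking for the makefile first gives a clearer error than the one make prints when started in the wrong directory.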